1 Introduction

It is often claimed that sentential particles (clitics), being base-generated in C0, stay close to a clause edge (e.g. Klavans 1985; Embick & Noyer 1999; Anderson 2005; Gobbo et al. 2015; though see e.g. Spencer & Luis 2012 on the Bulgarian interrogative li ‘whether’ and references therein). In this paper, I give a full description and an analysis of a sentential clitic that is a counterexample to this generalization.

The distribution of sentential clitics in Khanty (Finno-Ugric), as well as in the closely related Mansi, has long been of interest to both Finno-Ugric and general linguistics (Nevis 1990; Embick & Noyer 1999; Cysouw 2005). In the present paper, I introduce data from the Kazym dialect of Khanty and show that the distribution of the conditional clitic ki is on the surface much more complex than previously assumed for conditional clitics in Ob-Ugric languages. However, I show that it can be explained straightforwardly if ki is displaced at PF, more concretely in a hierarchically organized prosodic structure. The position of ki is determined by prosodic prominence: ki attaches to the most prominent phonological phrase in the clause.

This analysis makes it possible to falsify a generalization about the locality of displacement at PF (e.g. Embick & Noyer 2001). According to this generalization, clitics at PF attach to the nearest possible host, i.e. to the nearest overtly realized head in the clausal spine, or invert with the nearest full word. This predicts that phonological clitics cannot skip full heads when seeking a host. However, it has been argued that Irish pronominal clitics (Bennett et al. 2016) and the Tiwa focus clitic (Dawson 2017) are displaced in the prosodic structure and are sensitive to constituents larger than words, i.e. phonological phrases. This paper adds yet another example of PF-dislocation that is not governed by locality: the Kazym Khanty conditional clitic adds a further type of clitic, namely sentential clitics, to the set of counterexamples to Embick & Noyer’s (2001) generalization. The findings also add to a growing body of literature showing that prosodic prominence plays a role in clitic placement more generally (e.g. Bennett et al. 2016; Weisser 2020; submitted).

Khanty is an Ob-Ugric (Uralic) language spoken in North-West Siberia. Its many dialects fall into three main groups: Southern, Eastern, and Northern Khanty. The Kazym dialect, which is the main focus of this study, is one of the Northern Khanty dialects. The following features are relevant for the present study. Khanty is an SOV language with uniformly left-branching phrase structure. It has been argued to be an information-structure-configurational language (Nikolaeva 1999; É. Kiss 2021), in which the subject primarily corresponds to the topic and the direct object to new-information focus. However, direct objects can be topical as well. In such cases, they are argued to move out of their base position and occur ex situ (Smith 2020; É. Kiss 2021). The two complementizers of Khanty, ki ‘if’ and kuš ‘though’ (Kaksin 2010), can occupy various positions in a clause and show certain properties of clitics.

In this paper I concentrate on the conditional marker ki. I show that it can occupy a much greater number of positions than previously assumed: in addition to the penultimate, ultimate and second positions, ki can attach to any constituent in a clause and, in some cases, occur phrase-internally. Another proposed generalization, that ki attaches to an emphasized constituent (Embick & Noyer 1999), is fully borne out. Given the greater number of available positions, as well as the clear dependency on emphasis, I propose that the distribution of ki cannot be analyzed in syntactic terms. Instead, I propose postsyntactic displacement in a hierarchically organized prosodic structure as a better way to capture this pattern. In a nutshell, I argue that ki occupies a position right-adjacent to the prosodic phrase that contains the most prominent element in the focus domain.

The paper proceeds as follows. Section 2 describes the empirical generalizations about the placement of ki in a clause. Section 3 presents arguments that the placement of ki is to some extent sensitive to syntactic structure. In Section 4, I propose a prosody-based analysis that captures the whole pattern of ki placement. Section 5 reviews existing and possible alternative analyses of conditional clitics in Ob-Ugric and shows that they fail to capture all the Kazym Khanty data. Section 6 concludes.

2 Empirical data and generalizations

This section gives an overview of the core properties and general placement patterns of the conditional marker in Kazym Khanty.

All data in this paper come from original fieldwork conducted online in 2021. For technical reasons, only some sessions were recorded. Unless indicated otherwise, examples were constructed by me in Kazym Khanty. Some additional examples were taken from the Western Khanty corpus or from recorded texts collected by the Moscow research group in the village of Kazym. Irrespective of their origin, all examples were double-checked with my language consultants.

Two native speakers of Kazym Khanty participated in this study. They were born in the village of Kazym in 1967 and 1969. Kazym Khanty is their first language. Both learned Russian at school from the age of 6–7, and both have completed school and college education (in Russian). Today they mostly speak Russian, also at home, but use Khanty on occasion with friends.

The research consisted of two stages. In the first stage, consultants were presented with isolated sentences with different positions of ki and asked to read each sentence out loud, judge its grammaticality and translate it into Russian, so that I could check the intended meaning. When several positions for ki were judged grammatical, consultants were asked for their intuitions about meaning differences. In the second stage, consultants were presented with sentences in context: either a short text in Khanty followed by a single target sentence, or a short preface in Russian followed by two minimally different target sentences contrasted with each other (if A, then B; if C, then D). The target sentences were always given in Khanty. If the context was in Khanty, consultants were asked to read the context and the target sentence together. The contexts in Russian were given by me, and consultants were asked to read the target sentences alone. After reading, they were asked to translate the sentence and to judge both its general grammaticality and its grammaticality in the given context. For each context, all positions of ki in the corresponding target sentence were checked. The two consultants agreed in their judgements, with one exception (marked with %).

Both consultants were aware of the goals of this study and were paid for their participation. Before proceeding to a discussion, I would like to thank my consultants for their help.

2.1 General remarks

The conditional marker ki in Kazym Khanty differs from e.g. English complementizers in that it can occupy various positions in a clause. First of all, it can both precede and follow the clause-final finite verb of the conditional clause. In Kazym Khanty, conditional clauses generally precede the main clause and are verb-final. (1)–(2) illustrate these two positions of ki. The penultimate and ultimate positions are the most common ones (cf. Nevis 1990). In (1)–(2), the main clauses are identical. This does not mean, however, that these two positions of ki are in free variation. The preverbal (penultimate) position is generally less marked and hence the most common one. I discuss the distribution of the two positions in more detail below.

In all examples in the paper, the conditional clause is given in square brackets and the host is underlined for convenience. Sometimes multiple instances of ki are given in parentheses. This means that any one of these positions, but always exactly one, can be filled in the given clause. ki cannot be omitted in conditional clauses.

    (1) penultimate position
        [ Pirəś  iki  χośəm  jiŋk   juχi       ki  λɛ-s     ],
          old    man  fish   water  prev_home  if  eat-pst
        wɛr-λ-aλ            iśipa  jăm-ət.
        affair-pl-poss.3sg  maybe  good-pl
        ‘If the old man has eaten the fish soup, he seems to be doing well.’

    (2) ultimate position
        [ Pirəś  iki  χośəm  jiŋk   juχi       λɛ-s     ki ],
          old    man  fish   water  prev_home  eat-pst  if
        wɛr-λ-aλ            iśipa  jăm-ət.
        affair-pl-poss.3sg  maybe  good-pl
        ‘If the old man has eaten the fish soup, he seems to be doing well.’

As (3)–(5) make evident, ki can also occupy other positions in the middle of a conditional clause, e.g. after the subject (3), the object (4), and an adverbial (5).

    (3) position after the subject
        [ [Pirəś  iki]  ki  χośəm  jiŋk   juχi       λɛ-s     ],
          old     man   if  fish   water  prev_home  eat-pst
        wɛr-λ-aλ            iśipa  jăm-əλ.
        affair-pl-poss.3sg  maybe  fix-npst
        ‘As for the old man, if he has eaten the soup, he seems to be doing better.’

    (4) position after the direct object
        [ Pirəś  ik-en         [χośəm  jiŋk-əλ]        ki  juχi       λɛ-s     ],
          old    man-poss.2sg  fish    water-poss.3sg  if  prev_home  eat-pst
        ma  moj-ɛm          joχ-t-ɛm         λitəp   jiŋk-en         λapət-λ-əλλɛm
        I   guest-poss.1sg  man-pl-poss.1sg  caviar  water-poss.2sg  feed-npst-1sg>sg
        ‘If the old man has eaten the fish soup, I will feed my guests with the caviar soup.’

    (5) position after an adverbial
        [ Puχ-en        sora  ki  ow-əλ          lăp         pɛnt-əs    ],
          son-poss.2sg  fast  if  door-poss.3sg  prev_tight  close-pst
        iśipa  pɛλna     jana    ji-s.
        maybe  mosquito  really  become-pst
        ‘If the son started to close the door fast, then there must be a lot of mosquitoes outside.’

Note that the positions of ki in (1) and (4) can be distinguished by the presence of a preverb. Preverbs are originally adverbial elements, which are now quite similar to verbal particles in Germanic (e.g. Wurmbrand 2000; Dehé 2015). As with Germanic particles, they can be semantically compositional or opaque. In the latter case they add aspectual, perfectivizing semantics to the verb (Zakirova & Muravjev 2019). They usually occupy a position before the verb, separating it from the direct object. I address the problem of their syntax in Section 3.1. For present purposes, it is enough to say that they make it possible to distinguish the position after a direct object from the penultimate one. As (1) and (4) show, both are available for ki.

Based on the relatively free placement of ki in a clause, it is commonly accepted that ki is a clitic. Indeed, as (6) shows, ki is an enclitic and requires a constituent to its left (but not to its right) to prosodically attach to.1

    (6)
        [ (*ki)  ńăχχət-əλ   (ki)  ],
          if     laugh-npst  if
        muχti          weλ-λ-a         pa   ńeλ-λ-a
        straight_away  kill-npst-pass  and  devour-npst-pass
        ‘If she laughs, she will be killed and devoured straight away.’
        [double-checked from Western Khanty Corpus]2

Summing up, ki can be placed in any position in the clause, as long as this position satisfies the requirement that the enclitic attaches to an item on its left. However, the “free” distribution of the conditional enclitic ki in Kazym Khanty is not unrestricted.

2.2 Default position

The position before the finite verb is the default, i.e. the most common, position for ki in Khanty (see e.g. Nikolaeva 1999; Nevis 1990). “Penultimate” ki has the widest distribution: it follows a wide VoiceP focus in (7), but is also compatible with e.g. contrastive subjects, as in (8) (wɵntər ‘otter’ vs worš ‘eagle’). Hence, the ambiguous term “default” is used on purpose and means (i) the position with the widest distribution, and (ii) the position compatible with the most neutral, wide-focus reading of a clause.

    (7)
        [ Puχ-en        ow-əλ          lăp         ki  pɛnt-əs    ],
          son-poss.2sg  door-poss.3sg  prev_tight  if  close-pst
        išək-ti           mosəλ.
        praise-nfin.npst  necessary
        ‘If the son has closed the door tight, he should be praised.’

    (8)
        [ Wɵntər  (ki)  sora  wɵn  sort  nuχ      (ki)  taλ          ],
          otter   if    fast  big  pike  prev_up  if    drag-[npst]
        pɵkatńi-ja       iśipa  ji-λ.
        bad_weather-dat  maybe  become-npst
        [ Worš   (ki)  wɵn  sort  nuχ      (ki)  taλ          ],
          eagle  if    big  pike  prev_up  if    drag-[npst]
        xătəλ  λɵrij-əλ.
        sun    ring-npst
        ‘If an OTTER has caught a/the big pike, there will be bad weather. If an EAGLE has caught a/the big pike, there will be sunny weather.’

As example (8) shows, ki can often choose between two positions in a clause. There are, however, cases in which the penultimate position is not available, as in (9).

    (9)
        – χot-ɛm          jɵrən     nuχ      wu-s-i.
          house-poss.1sg  by_force  prev_up  take-pst-pass
        – [ Wuχsar-en     (ki)  jɵrən     nuχ      (*ki)  wu-s-λe          ],
            fox-poss.2sg  if    by_force  prev_up  if     take-pst-3sg>sg
        in   imuχ     kim           ńɵχλ-əλ-ɛmən.
        now  at_once  prev_outside  kick_out-npst-1du>sg
        [ Pᵾpi  puχ-en        (ki)  jɵrən     nuχ      (*ki)  wu-s-λe          ],
          bear  son-poss.2sg  if    by_force  prev_up  if     take-pst-3sg>sg
        ńot-ti          ănt  armat-λ-əm.
        help-nfin.npst  neg  be_able-npst-1sg
        ‘(– What happened with your house?) – My house was taken by force. – If the FOX has taken it by force, we will kick him out now. If the BEAR has taken it by force, I cannot help.’

The difference between (9) and (8) lies in the status of the non-contrasted part of the clause. In (9), everything except the subject is given in the preceding question. This is not the case in (8), where there is no previously given information; even the repetition in the second sentence is not necessarily expected. Hence, the correct generalization is that ki cannot attach to background information. But if a clause has both wide focus and a contrastive topic, ki can attach to either of them.

2.3 Marked positions

As mentioned in the previous subsection, the distribution of ki is restricted by information structure. Namely, ki attaches to an emphasized constituent.3 The term emphasis is used in this paper as an umbrella term for contrastive topics and new-information/contrastive foci.4

First, ki cannot be hosted by any item that is background information. In (10), the object of the conditional clause (χośəm jiŋk-əλ ‘fish soup’) is contrasted with the object of the matrix clause (λitəp jiŋk-en ‘caviar soup’). Hence, ki cannot be hosted by the non-emphasized, topical subject (pirəś iki ‘old man’).

    (10)
        [ Pirəś  iki  *ki  [emph  χośəm  jiŋk-əλ]        juχi       λɛ-s     ],
          old    man  if          fish   water-poss.3sg  prev_home  eat-pst
        ma  moj-ɛm          joχ-t-ɛm         λitəp   jiŋk-en         λapət-λ-əλɛm
        I   guest-poss.1sg  man-pl-poss.1sg  caviar  water-poss.2sg  feed-npst-1sg>sg
        ‘If the old man has eaten the FISH SOUP, I will feed my guests with the CAVIAR SOUP.’5

Secondly, any position that is neither immediately adjacent to the emphasized constituent nor the default one is ungrammatical. If an emphasized adverbial precedes a subject, ki cannot attach to the subject (11). Similarly, ki cannot follow a [Sub Adv] sequence if only the subject is emphasized (12).

    (11)
        [ χăλewət   (ki)  ńăwrɛm-ət  (*ki)  aškolaj-a   măn-λ-ət     ],
          tomorrow  if    child-pl   if     school-dat  go-npst-3pl
        tamχatəλ  siri-šɛk              at   uλ-λ-ət.
        today     early-att  prev_down  opt  sleep-npst-3pl
        ‘If children go to school TOMORROW, let them go to bed a bit earlier today.’

    (12)
        Kat’-ew       sora  kim           ki  nawrm-əs,  ow-en          ăλ    pɛnt-a.
        cat-poss.1pl  fast  prev_outside  if  jump-pst   door-poss.2sg  proh  close-imp.sg>sg
        [ Amp-ew        (ki)  sora  (*ki)  kim           (ki)  nawrm-əs  ],
          dog-poss.1pl  if    fast  if     prev_outside  if    jump-pst
        ow-en          lăp         pλnt-a.
        door-poss.2sg  prev_tight  close-imp.sg>sg
        ‘If our cat has quickly jumped outside, don’t close the door. (It will come back soon.) If our DOG has quickly jumped outside, close the door tight. (It will be away all night.)’

Similarly, in (13), contrastive emphasis on the subject (imi-λeŋk-en ‘little woman’) is incompatible with ki hosted by the direct object (λitəp jiŋk ‘caviar soup’). Here, ki can occupy the preverbal and the subject-adjacent positions, but cannot be object-adjacent.

    (13)
        Ew-əλ              λitəp   ijŋk   χošmə-ti        ănt  χotə-λ.
        daughter-poss.3sg  caviar  water  warm-nfin.npst  neg  can-npst
        [ Imi-λeŋk-en         (ki)  λitəp   jiŋk   (*ki)  nik             (ki)  χošməλ-s-əλλe    ],
          woman-dim-poss.2sg  if    caviar  water  if     prev_riverward  if    warm-pst-3sg>sg
        śirən  ɛpλəŋ  pitə-λ.
        then   tasty  become-npst
        ‘DAUGHTER cannot warm a caviar soup. If the LITTLE WOMAN warms the caviar soup, it will be tasty.’

Finally, the clause-final position of ki is allowed in two cases.6 (i) It is allowed in contexts with narrow focus on the verb (14-a). Note that the [Vfin + ki] order is not possible if a direct object is emphasized (14-b).

    (14)
    a.
        [ Imi-λeŋk-en         λitəp   jiŋk-əλ         nik             (ki)  χošməλ-s-əλλe    (ki)  ],
          woman-dim-poss.2sg  caviar  water-poss.3sg  prev_riverward  if    warm-pst-3sg>sg  if
        χujt-əw  λawəλ-λ-əw,    λɛ-λ-əw,      jămt-λ-əw
        who-dat  wait-npst-1pl  eat-npst-1pl  get_better-npst-1pl
        ‘If the woman has WARMED the soup, what are we waiting for, let’s eat, it’ll be good.’
    b.
        [ Imi-λeŋk-en         χošəm  jiŋk   jukana   λitəp   jiŋk   (ki)  nik             (ki)  χošməλ-s-əλλe    (*ki)  ],
          woman-dim-poss.2sg  fish   water  instead  caviar  water  if    prev_riverward  if    warm-pst-3sg>sg  if
        jăm   śi   λɵλən     wɵ-s.
        good  foc  let_ptcl  be-pst
        Ma  χošəm  jiŋk   λɛ-ti          ăn   λăηχa-λ-əm
        I   fish   water  eat-nfin.npst  neg  want-npst-1sg
        ‘If the woman has warmed CAVIAR SOUP, instead of FISH SOUP, it is good. I don’t want to eat fish soup.’

(ii) When the finite verb is the only overt element in a clause, ki can only be postverbal (15). Whether the clause has wide focus or the verb is narrowly focused plays no role here. This is due to the enclitic nature of ki and can be seen as a repair operation.

    (15)
        (*ki)  ńăχχət-əλ   (ki)  (muχti         weλ-λ-a         pa   ńeλ-λ-a).
               laugh-npst  if    straight_away  kill-npst-pass  and  devour-npst-pass
        ‘If she laughs, she will be killed and devoured straight away.’
        [double-checked from Western Khanty Corpus]

This subsection shows that ki is sensitive to information structure. Its host should be emphasized, i.e. be either a focus or a contrastive topic. Crucially, ki cannot be hosted by any background element.

2.4 Phrase-internal position

Another restriction on the placement of ki is illustrated in (16)–(17). As shown in Section 2.3, ki can generally directly follow an emphasized item. But not every emphasized item can host ki: it cannot attach to a demonstrative (16) or to the complement of a postposition (17), even if these have a clearly contrastive interpretation.

    (16) demonstrative
        Tᵾm   moləpś-en          wᵾj-e.
        that  fur_coat-poss.2sg  take-imp
        [ Tăm   (*ki)  moləpś-en          (ki)  λɵmat-λ-ən       ],  potλajən.
          this  if     fur_coat-poss.2sg  if    put_on-npst-2sg      get_cold-npst-pass-2sg
        ‘Take THAT coat. If you take THIS coat, you’ll get cold.’

    (17) PP
        [ Jaj-um                [λoχs-əλ         (*ki)  piλa]  (ki)  χuλpɛλa    măn-s   ],
          big_brother-poss.1sg  friend-poss.3sg  if     with   if    somewhere  go-pst
        iśipa     wɵnt-a      măn-s.
        probably  forest-dat  go-pst
        ‘If my brother has gone somewhere with HIS FRIEND, then they must have gone to the woods. (But if he went with the FATHER, they should have gone to the city.)’

This restriction, however, cannot be generalized to a ban on any phrase-internal position. ki can attach directly to a possessor (18), to a nominal modifier (19), and to an adjective (20). Importantly, this only becomes possible under contrastive emphasis on the host. The same seems to be true of numerals, though not all native speakers allow numerals to host ki, as in (21).

    (18) possessor
        [ [dp  Aś-en            ki  λoχs-ət]   juχət-λ-ət     ],
               father-poss.2sg  if  friend-pl  come-npst-3pl
        owə-λ          nuχ      pᵾnš-e.
        door-poss.3sg  prev_up  open-imp
        ‘If FATHER’S friends come, open the door. (If UNCLE’S friends come, call him.)’

    (19) nominal modifier
        [ [dp  Karti  ki  jintəp]  tăχəmə-λ-ən     ],
               iron   if  needle   throw-npst-2sg
        sɛm-ət  at   oλ.
        eye-pl  let  lie-npst
        ‘If you throw an IRON needle (and not a needle made of bone), make it be visible’

    (20) adjectival modifier
        [ [dp  Pirəś  iki]  ki  juχət-λ    ],
               old    man   if  come-npst
        ow-əλ          nuχ      pᵾnš-e.
        door-poss.3sg  prev_up  open-imp
        [ [dp  Aiλat  ki  iki]  juχət-λ    ],
               young  if  man   come-npst
        ow-əλ          nuχ      ăλ    pᵾnš-e    pa   χosλa  oms-a.
        door-poss.3sg  prev_up  proh  open-imp  add  quiet  sit-imp
        ‘If an OLD man comes, open the door. If a YOUNG man comes, don’t open the door and sit quiet.’

    (21) numeral
        [ [dp  Wet  (%ki)  sorńi  wuχ]   (ki)  ma-λ-ən        ],
               5    if     gold   money  if    give-npst-2sg
        naŋ-ti   ɛsλəmt-λ-ɛm.
        you-acc  let_go-npst-1sg>sg
        [ [dp  Jaŋ  (%ki)  sorńi  wuχ]   (ki)  ma-λ-ən        ],
               10   if     gold   money  if    give-npst-2sg
        naŋ-ti   ɛsλəmt-λɛm          pa   λow-en          ma-λ-əm
        you-acc  let_go-npst-1sg>sg  add  horse-poss.2sg  give-npst-1sg
        ‘If you give me FIVE golden coins, I’ll let you go. If you give me TEN golden coins, I’ll let you go and give you a horse.’

Importantly, emphasized modifiers stay DP-internal. Though topicalization of a nominal modifier from an object-DP is possible in a conditional clause (22-a), scrambling of a modifier to a position between an indirect object and a subject is ungrammatical (22-c). Hence, I assume that karti in (22-b) does not move out of the DP.7 ki can be hosted by a modifier both in a noun-adjacent position (22-b), and in a topicalized one (22-a).

    (22)
    a.
        [ [Karti  ki]ᵢ  ɛwij-en            [dp  tᵢ  jintəp]             pawert-əs  ],
          iron    if    daughter-poss.2sg           needle   prev_down  drop-pst
        magnit-ən   kănš-e.
        magnet-loc  search-imp.sg>sg
        ‘If the girl has dropped the IRON needle, search for it with a magnet.’
    b.
        [ ɛwij-en            [dp  [karti  ki]  jintəp]             pawert-əs  ]…
          daughter-poss.2sg       iron    if   needle   prev_down  drop-pst
        ‘If the girl has dropped the IRON needle…’
    c.
        [ ɛwij-en            (*karti  ki)  năŋena   [dp  (karti  ki)  jintəp]  t-ɵλ        ],
          daughter-poss.2sg  iron     if   you-DAT       iron    if   needle   bring-npst
        tašəŋ-a   ji-λ-ən.
        rich-adv  become-npst-2sg
        ‘If the girl brings you an IRON needle, you will be rich.’

These data allow for two generalizations. First, it is ki, and not the host, that undergoes displacement, since modifiers seem to stay DP-internal when hosting it. Secondly, a contrastive interpretation of a modifier allows ki to attach only to possessors, nominal/adjectival modifiers and, for some speakers, numerals, but not to demonstratives or complements.8 From a syntactic point of view, this looks arbitrary: demonstratives, numerals, adjectives and possessors are all considered to be phrasal modifiers (e.g. Svenonius 2008; Dékány 2011; Danon 2012). Nevertheless, demonstratives pattern with complements and not with the other modifiers. One could argue that demonstratives are structurally different and are not phrasal modifiers, in contrast to adjectives and possessors (consider Dékány’s (2011) analysis of Hungarian non-inflecting demonstratives). However, Khanty demonstratives can be used independently as pronouns (23), which suggests that they are phrasal (Dékány 2011).

    (23)
        Aj     puχ-en        aŋki    aśi     kasλ-əpt-ijλ-λ-əŋən        tum-et   χuśa  wan-a-šək
        small  boy-poss.2sg  mother  father  wander-caus-ipfv-npst-3du  that-pl  to    near-adv-cmpr
        ‘This boy’s parents wander towards those ones.’
        [Western Khanty Corpus, orthography adjusted]

In Section 4 I suggest an analysis, based on syntax-prosody mapping, as a way to capture this pattern. I argue that the crucial difference between demonstratives vs numerals, adjectival/nominal modifiers and possessors comes from syntactic positions and different prosodic phrasing options.

2.5 Interim summary

This section summarizes the main facts about the distribution of the conditional marker ki in the Kazym dialect of Khanty.

  • ki behaves as an enclitic and always attaches to the right of a prosodically stronger element. If a clause has a single overt item, ki can only follow it, not precede it.

  • The position of ki depends on the information structure of the clause. Namely, ki must occupy a position adjacent to an emphasized (focused/contrasted) XP; e.g. the position of ki after a direct object strictly requires contrastive emphasis on the object, except when it is simultaneously the preverbal position. ki can never be hosted by background material. If a clause has both contrastive emphasis and wide focus, ki can attach to either of them.

  • The default position of ki is the “penultimate” position, i.e. the position between a preverb and a finite verb. In the next section, I argue that this is not in fact a “penultimate position” and that ki is attached to the VoiceP. This position is available when VoiceP is the new-information focus, and it is compatible with a narrow/contrastive topic on any of the preceding phrases. This leads to apparent optionality of ki-placement in a clause. Another case of such apparent optionality arises in clauses with emphasis on NP-modifiers. If an NP-modifier (e.g. an adjective) is contrastively emphasized, ki can attach directly to this modifier or follow the whole DP. In Section 4, I argue that in both cases the optionality is resolved before ki is displaced.

In a nutshell, the conditional clitic ki in Kazym Khanty must be right-adjoined to a focused or contrastively topical constituent. When several constituents satisfy this condition, the most prosodically prominent one is chosen as the host for ki (more on that in Section 4).
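The descriptive generalizations summarized above can be restated as a small procedure. The following Python sketch is only an illustration under simplifying assumptions, not the analysis itself: the names `Item`, `emphasized`, `voicep_focus` and `ki_slots` are hypothetical, the information-structural annotation is supplied by hand, the finite verb is assumed to be the last item of the clause, and the sketch ignores prosodic phrasing as well as the DP-internal restrictions on demonstratives and postpositional complements.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One overt element of a conditional clause (hypothetical annotation)."""
    form: str
    emphasized: bool = False  # narrow focus or contrastive topic


def ki_slots(items, voicep_focus):
    """Return the indices i such that ki may directly follow items[i].

    Assumptions of this sketch: the finite verb is the last item, and
    voicep_focus says whether VoiceP is a (wide) new-information focus.
    """
    verb = len(items) - 1
    # Marked positions: ki right-adjacent to any emphasized item.
    slots = {i for i, item in enumerate(items) if item.emphasized}
    if verb == 0:
        # Single overt element: the enclitic can only follow it.
        slots.add(0)
    elif voicep_focus:
        # Default position: immediately before the finite verb.
        slots.add(verb - 1)
    return sorted(slots)


# Clause from (12): contrastive subject, wide focus on the rest.
dog_clause = [Item("Amp-ew", emphasized=True), Item("sora"),
              Item("kim"), Item("nawrm-əs")]
print(ki_slots(dog_clause, voicep_focus=True))

# Clause from (9): contrastive subject, everything else is given.
fox_clause = [Item("Wuχsar-en", emphasized=True), Item("jɵrən"),
              Item("nuχ"), Item("wu-s-λe")]
print(ki_slots(fox_clause, voicep_focus=False))
```

Under these assumptions, the clause from (12) yields the subject-adjacent and the preverbal slot but not the slot after the adverb, while the clause from (9) yields only the subject-adjacent slot, in line with the judgements reported for those examples.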

In the next section, I review properties of Khanty syntax which will be important for the analysis. In Section 4, I propose a prosodic analysis of the placement of ki, that is able to capture the pattern presented in the current section.

3 ki is sensitive to hierarchical structure

Before turning to the actual analysis of the ki-placement mechanism, some comments on the syntactic structure of Khanty are necessary. In this section I argue that ki is sensitive to hierarchical (syntactic) structure. So far, I have shown that ki has two possible positions: (i) the default one, adjacent to wide focus, and (ii) the marked one, adjacent to emphasis. In what follows I take a closer look at the default case and at two marked cases (object-adjacent and DP-internal ki). For all these cases I argue that syntactic movement of the verb or of emphasized items is necessary to make the adjacent position of ki available. Note that these instances of syntactic movement are not triggered by ki. They are part of the syntax of Khanty and happen independently of the presence or absence of ki in a clause.

3.1 Penultimate position is VoiceP-adjacent

In Section 2.2 it was shown that the penultimate position of ki (i.e. the position before the finite verb) is the default one. The core argument comes from its compatibility with almost all information-structural configurations. If this position is analyzed in linear terms, two problems arise. First, the ultimate position of ki is only compatible with narrow focus on the verb. This is unexpected under a linear analysis of ki-placement: in the ultimate position, ki follows a longer string of words, and yet this position is more restricted. Secondly, on closer investigation, it turns out that the default position of ki is in fact not penultimate, as is evident from contexts with negation (see details below).

Both problems can be solved if we assume (i) that the finite verb moves out of its base position and (ii) that a “penultimate” ki is attached to the VoiceP. Hence, when in the ultimate position, ki should be attached to the verb alone, which yields the requirement of narrow focus on the verb in this case.

As a first step of the analysis, the position of the verb in the clause should be established. In a head-final language, the linear position of the finite verb does not provide any information about its position in the tree structure. Therefore, given the possibility of head movement, verbs cannot be used to track the VP. Preverbs are more reliable as VP-markers.

It has been observed that Germanic verbal particles behave differently depending on whether they form a semantically compositional or an idiomatic unit with the verb (e.g. Wurmbrand 2000). Semantically non-compositional particles are more restricted in their syntactic behavior: they stay in situ and cannot be modified separately from the verb. This is also true for Khanty preverbs, which add a perfective interpretation to verbs (Zakirova & Muravjev 2019). (24)–(26) sum up the main tests for the mobility of preverbs; the preverbs fail all of them to a greater or lesser degree.

    1. (24)
    1. topicalization
    1. ?/*Juχii
    2. prev_home
    1. wɵn
    2. big
    1. sort-en
    2. pike-poss.2sg
    1. săm-əλ
    2. heart-poss.3sg
    1. ti
    2.  
    1. λɛ-s-λe
    2. eat-pst-3sg>sg
    1. ‘The big pike ate the/his heart.’
    1. (25)
    1. preverb scrambled over an object
    1. *In
    2. now
    1. imi-leŋk-en
    2. woman-dim-poss.2sg
    1. niki
    2. prev_riverward
    1. λitəp
    2. caviar
    1. jiŋk(-eλ)
    2. water(-poss.3sg)
    1. ti
    2.  
    1. xošməλ-s-əλλe
    2. warm-pst-3sg>sg
    1. Intend.: Now the woman has warmed (up) the caviar soup.
    1. (26)
    1. preverb scrambled over a low adverb
    1. *In
    2. Now
    1. imi-leŋke-n
    2. woman-dim-poss.2sg
    1. λitəp
    2. caviar
    1. jiŋk(-eλ)
    2. water(-poss.3sg)
    1. niki
    2. prev_riverward
    1. sora
    2. fast
    1. ti
    2.  
    1. xošməλ-s-əλλe
    2. warm-pst-3sg>sg
    1. Intend.: Now the woman has fast warmed the caviar soup.

The test in (24) suggests that aspectual preverbs resist movement to the C-domain. The tests in (25)–(26) show that aspectual preverbs cannot move shorter distances either, e.g. they cannot scramble over an object or a low adverbial. Based on these tests, it can be assumed that aspectual preverbs stay in situ inside the VP, at least when not fronted.9

Since aspectual preverbs stay VP-internal, an element that intervenes between a preverb and a finite verb and is not base-generated inside the VP provides an argument that the verb moves out of the VP. There are indeed several elements possible in this position:

    1. (27)
    1. Negation and focus marker
    1. Mᵾŋ
    2. we
    1. nuχ
    2. prev_up
    1. ănt
    2. neg
    1. śi
    2. foc
    1. amăt-s-əw.
    2. get_happy-pst-1pl
    1. ‘We were not happy.’
    1. (28)
    1. Discourse adverbials
    1. In
    2. now
    1. imi-leŋke-n
    2. woman-dim-poss.2sg
    1. λitəp
    2. caviar
    1. jiŋk(-eλ)
    2. water(-poss.3sg)
    1. (aλpa)
    2. probably
    1. nik
    2. prev_riverward
    1. (aλpa)
    2. probably
    1. χošməλ-s-əλλe.
    2. warm-pst-3sg>sg
    1. ‘Now the woman has probably warmed the caviar soup.’

Negation and discourse adverbials can separate a preverb (=VP) from the overt copy of a finite verb in the linear order. For semantic reasons, none of these elements is expected to be base-generated inside the VP, or even VoiceP. I also assume that they are not clitics and that they are displaced in syntax.10

Since preverbs most likely stay VP-internal, it must be the verb that moves out of the VP. Moreover, the verb moves higher than VoiceP, because clausal negation and discourse adverbials are not expected to lower from their base position across a phase boundary to a position inside VoiceP.

The second necessary step of the analysis is to show that penultimate ki is better defined as VoiceP-adjoined, rather than as penultimate/adjoined to Vfin. An argument for this analysis comes from negation. (29) shows that ki can surface both before and after the negation marker ănt. If the unmarked position of ki were truly penultimate, it would correspond to (29-b) in negative clauses. This is, however, not the case. Native speakers, when asked explicitly about the difference between these two sentences, judge the […prev ki ănt verb] order to be the neutral one. Hence, (29-a) has a wide focus interpretation, i.e. the whole VoiceP is focused and ki follows the VoiceP. By contrast, (29-b), where ki occupies the penultimate position and directly precedes the finite verb, is judged as marked and as sounding “like a threat”. I interpret this as emphasis (narrow focus) on negation.

    1. (29)
    1. a.
    1. [
    2.  
    1. Tuχλ-ɛm
    2. wing-poss.1sg
    1. nuχ
    2. prev_up
    1. ki
    2. if
    1. ănt
    2. neg
    1. wɛr-λ-ən
    2. do-npst-2sg
    1. ],
    2.  
    1. ma
    2. I
    1. năŋ-ti
    2. you-acc
    1. śi
    2. this
    1. art-ən
    2. time-loc
    1. tuχəλ-ən
    2. wing-loc
    1. ănt
    2. neg
    1. to-λ-ɛm.
    2. bring-npst-1sg>sg
    1. ‘If you don’t heal my wing, I will not bring you on my wings now.’
    1.  
    1. b.
    1. [
    2.  
    1. Tuχλ-ɛm
    2. wing-poss.1sg
    1. nuχ
    2. prev_up
    1. ănt
    2. neg
    1. ki
    2. if
    1. wɛr-λ-ən
    2. do-npst-2sg
    1. ]…
    2.  
    1. ‘If you only do NOT heal my wing …’

Since both the preverb and the negation marker can be absent in a clause, ki in the neutral position can end up adjacent to an object and a verb, as illustrated in (30).

    1. (30)
    1. obj (prev) ki (neg) Vfin

Given that the verb moves out of its base position, the order in (30) can be formalized as in (31), with ki being adjacent to VoiceP-internal material, rather than to the clause-final Vfin.

    1. (31)
    1. [CP … (obj) [VoiceP(obj) (prev) <V>] +ki (neg) v]

This allows us to contrast “penultimate” and ultimate ki more explicitly. Contexts with negation show that “penultimate” ki is not sensitive to the clause-final verb; rather, it follows the VoiceP.11 The ultimate ki, by contrast, looks for a finite verb and attaches to it. This analysis automatically predicts that the ultimate position of ki is restricted to narrow focus on the verb (see Section 2.2), while the VoiceP-adjacent position corresponds to wide focus.

3.2 Direct objects

As was mentioned several times before, ki can immediately follow a direct object, but precede other VoiceP/VP-internal items, e.g. example (4) repeated as (33-b).

    1. (32)
    1. [
    2.  
    1. Pirəś
    2. Old
    1. iki
    2. man
    1. [dp
    2.  
    1. χośəm
    2. fish
    1. jiŋk-əλ]
    2. water-poss.3sg
    1. ki
    2. if
    1. [VoiceP
    2.  
    1. juχi
    2. prev_home
    1. ]
    2.  
    1. λɛ-s
    2. eat-pst
    1. ]…
    2.  
    1. ‘If the old man has eaten the FISH SOUP…’

Non-agreeing direct objects in Khanty are usually treated as staying in situ (e.g. Nikolaeva 1999). However, Smith (2020) and É. Kiss (2021) show that direct objects can move out of their base position when they trigger object agreement and have a non-new-information status, i.e. are (secondary) topics. The link between scrambling, and short object scrambling in particular, and topicality is not unique to Khanty (e.g. Diesing 1992; Choi 1998; 1999 and references therein for German and Korean, É. Kiss 2003 and references therein for Hungarian, Biskup 2006 for Czech, Bailyn 2001 for Russian, Hinterhölzl 2012 for German, Neeleman & Van de Koot 2008; Schoenmakers et al. 2022 and references therein for Dutch, and many others). The exact landing position of the raised objects in Khanty is not important for the purposes of this paper. Crucially, it is either Spec,VoiceP or higher.

When a direct object hosts ki, it obligatorily moves out of its base position: (33) shows that the direct object χośəm jiŋk-əλ ‘fish soup’ obligatorily precedes the manner adverbial sora ‘fast’. In clauses where the direct object does not host ki, both orders would be possible.

    1. (33)
    1. a.
    1. *[
    2.  
    1. Pirəś
    2. Old
    1. iki
    2. man
    1. sora
    2. fast
    1. [dp
    2.  
    1. χośəm
    2. fish
    1. jiŋk-əλ]
    2. water-poss.3sg
    1. ki
    2. if
    1. juχi
    2. prev_home
    1. λɛ-s
    2. eat-pst
    1. ]…
    2.  
    1. ‘If the old man has eaten the FISH SOUP…’
    1.  
    1. b.
    1. [
    2.  
    1. Pirəś
    2. Old
    1. iki
    2. man
    1. [dp
    2.  
    1. χośəm
    2. fish
    1. jiŋk-əλ]i
    2. water-poss.3sg
    1. ki
    2. if
    1. sora
    2. fast
    1. ti
    2.  
    1. juχi
    2. prev_home
    1. λɛ-s
    2. eat-pst
    1. ]…
    2.  
    1. ‘If the old man has eaten the FISH SOUP…’

The tree in (34) illustrates the grammatical example in (33-b). When hosting ki, the object moves from its base position to Spec,VoiceP (and maybe higher).

    1. (34)

As (33) and the tree in (34) show, only scrambled objects are able to host ki. Nikolaeva (1999) and É. Kiss (2021) argue that only topical objects can and should scramble out of VP. It follows that only topical objects are possible hosts for ki. And indeed, as (35) shows, an object is interpreted as a contrastive topic when it hosts ki.

    1. (35)
    1. [
    2.  
    1. Pirəś
    2. Old
    1. ik-en
    2. man-poss.2sg
    1. χośəm
    2. fish
    1. jiŋk-əλ
    2. water-poss.3sg
    1. ki
    2. if
    1. juχi
    2. prev_home
    1. λɛ-s
    2. eat-pst
    1. ],
    2.  
    1. ma
    2. I
    1. moj-ɛm
    2. guest-poss.1sg
    1. joχ-t-ɛm
    2. man-pl-poss.1sg(-dat)
    1. λitəp
    2. caviar
    1. jiŋk-en
    2. water-poss.2sg
    1. λapət-λ-əλλɛm
    2. feed-npst-1sg>sg
    1. ‘(Yesterday, I made fish soup and caviar soup.) If the old man has eaten the FISH SOUP, I will feed my guests with the CAVIAR SOUP.’

Summing up, the order [DO+ki] imposes specific requirements on the information-structural value of the DO. This is parallel to the requirement that NP-modifiers be contrastively emphasized when hosting ki (compare Section 2.4). In addition, contrastively topical direct objects differ syntactically from new-information direct objects: topical objects move out of the VoiceP (Smith 2020; É. Kiss 2021). Hence, ki adjoins to an object not inside the VP, but in the scrambled position.

3.3 DP-internal ki

In Section 2.5 I argued that adjectives, numerals and nominal modifiers stay DP-internal, when hosting ki. Here I argue that they additionally move to an outer specifier of the DP.12

As (36-a) shows, an emphasized nominal modifier karti ‘iron’ can be separated from the head noun by a focus particle tɵp ‘only’.

    1. (36)
    1. a.
    1. [
    2.  
    1. ɛwij-en
    2. daughter-poss.2sg
    1. [dp
    2.  
    1. [karti]i
    2. iron
    1. =ki
    2. if
    1. tɵp
    2. only
    1. [ti
    2.  
    1. jintəp]]
    2. needle
    1. prev_down
    1. pawert-əs
    2. drop-pst
    1. ]…
    2.  
    1. ‘If the girl has dropped only the IRON needle…’
    1.  
    1. b.
    1. [
    2.  
    1. ɛwij-en
    2. daughter-poss.2sg
    1. [karti
    2. iron
    1. (*tɵp)
    2. only
    1. ki
    2. if
    1. jintəp]
    2. needle
    1. prev_down
    1. pawert-əs
    2. drop-pst
    1. ]…
    2.  

I assume that tɵp is a DP-adjunct, and not a separate projection in the clausal spine (compare a similar proposal in Smeets & Wagner (2018) for focus particles in German and Dutch). The reasons are the following: (i) as shown in Section 2.4, a nominal modifier can be topicalized, but cannot be scrambled out of a DP, and (ii) no NP-modifier can be raised over an adjunct (37).

    1. (37)
    1. a.
    1. high adverbial and subject
    1. *Pirəśi
    2. old
    1. muλχatəλ
    2. yesterday
    1. ti
    2.  
    1. iki
    2. man
    1. juχt-əs
    2. come-pst
    1. Intend.: ‘Yesterday an/the OLD man came.’
    1.  
    1. b.
    1. low adverbial and object
    1. *Karti
    2. iron
    1. (ki)
    2. if
    1. sora
    2. fast
    1. jintəp
    2. needle
    1. tăχəm-əs-n.
    2. throw-pst-2sg
    1. Intend.: ‘If you throw an/the IRON needle fast…’

Since tɵp ‘only’ is a DP-modifier and scopes over the nominal modifier karti ‘iron’ in (36-a), I assume that tɵp is merged above this modifier. After that, karti moves to an outer specifier of the DP, over the only-projection, as shown in (38). In this position a modifier can host the conditional clitic. This explains why the order [modifier only if noun] in (36-b) is not possible. I suggest that such movement of an emphasized modifier to the left edge of the DP takes place whenever it hosts ki.

    1. (38)

The question arises at this point why a demonstrative cannot do the same, namely, why it cannot raise to a position where it would host ki. Apparently, demonstratives are base-generated too high, and this movement step is banned as too short.

    1. (39)

Note that modifier raising is, apparently, syntactically optional. Native speakers allow two positions for ki in clauses with a contrastive NP-modifier: after the modifier and after the whole DP. I suggest that when ki is attached to the whole DP, the modifier stays in situ, and, conversely, when a modifier moves, ki can only be attached to this modifier, and not to the whole DP.

3.4 Interim summary

At this point a conclusion can be drawn that ki is at least partially sensitive to syntax. The arguments in favor are the following.

First, a “penultimate” position is better defined as VoiceP-adjacent, while the verb moves higher. This is supported by the fact that the DP-internal position (Section 2.4) is restricted by very specific requirements on information structure, while the position between the preverb and the verb (=“penultimate”) has the widest distribution and is compatible with almost any information structure.

Secondly, the ultimate, postverbal position has a strict requirement of narrow focus on the finite verb. In a flat linearized sequence of elements this is rather inexplicable. However, if we assume that the placement of ki is determined with access to hierarchical structure, this pattern can be explained via verb movement.

Thirdly, the position after a direct object, if it does not coincide with the post-VoiceP (=“penultimate”) position, is only available if the object is emphasized. Before linearization, a direct object is either in situ and non-emphasized or raised out of the VoiceP. Crucially, objects can only be emphasized in a raised position. Since ki adjoins to emphasized hosts and since in situ objects cannot be emphasized, a direct object is a possible host for ki only when it is raised out of VoiceP.

Fourthly, the DP-internal position of ki seems to be connected to the raising of low modifiers to an outer specifier of DP. A high modifier, a demonstrative, cannot be raised, because the movement step would be too short, and hence it is not able to host ki.

At the same time, it should be noted that a purely syntactic analysis faces complications, e.g. from the phrase-internal position and from the position after two separate XPs, of which only one is emphasized. If ki is base-generated as a complementizer, these positions cannot be derived by syntactic movement. For the position after two XPs, one would have to postulate a separate movement operation for each XP. However, only movement of the ki-adjacent XP could be properly motivated. DP-internal positions cannot be properly captured either. A model in which the hosting modifier moves across ki overgenerates: it is unable to explain why demonstratives behave differently from other modifiers. In addition, this analysis leads to an island violation: when ki is hosted by a modifier of a DP in subject position, we would need to postulate extraction from a subject island to a position above C0, which is undesirable.

4 Prosodic placement of ki

This section presents an analysis of the conditional clitic in Khanty. Tentatively, the same analysis is expected to extend to Ob-Ugric languages in general. I argue that syntax alone cannot predict the position of ki, since there is no way to derive a phrase-internal position without violating the Strict Cycle Condition (Chomsky 1973). An account referring to the prosodic structure of a clause is more promising in this respect. Prosodic structure is based on syntactic structure, but is not identical to it. Moreover, postsyntactic operations are not restricted by the Strict Cycle Condition and, hence, are better able to capture the phrase-internal position of ki. In the existing theories of cliticization, there are two main ways to model prosodic displacement of clitics. The first option is a Prosodic Inversion operation (Halpern 1995), or Local Dislocation (Embick & Noyer 1999; 2001). The second approach to prosody-triggered clitic displacement is an Optimality Theoretic approach. In what follows I argue that Prosodic Inversion cannot describe the behavior of ki. An Optimality Theoretic analysis, however, can, and it is further supported by e.g. the analysis of Irish pronominal clitics (Bennett et al. 2016), Tagalog second-position clitics (Anderson 2005), the focus clitic in Tiwa (Dawson 2017) and agreement clitics in Degema serial verb constructions (Rolle 2020).

4.1 Prosodic phrasing in Khanty

Crucially for the present purposes, prosody has a hierarchical organization (e.g. Truckenbrodt 1995; Selkirk 2009; Selkirk 2011; Samek-Lodovici 2015; Elfner 2018 for discussion). There are four main types of prosodic constituents: prosodic words (ω), prosodic phrases (ϕ), intonational phrase (ι), and utterance phrase (U). They are organized in a hierarchical structure, where prosodic words are included into prosodic phrases and these are included into an intonational phrase and an utterance phrase. Importantly, boundaries of each prosodic and intonational phrase are dependent on syntactic structure. In (40), I provide core mapping principles (after Bennett et al. 2016, adopted from Selkirk 2009; 2011 and Elfner 2012).13

    1. (40)
    1. Core mapping principles [Bennett et al. 2016: (34–35)]
    1.  
    1. i.
    1. Match word
    2. Prosodic words correspond to the heads from which phrases are projected in the syntax (heads that will often have a complex internal structure determined by head movement).
    1.  
    1. ii.
    1. Match phrase
    2. Given a maximal projection XP in a syntactic representation S, where XP dominates all and only the set of terminal elements {a, b, c, …, n}, there must be in the phonological representation P corresponding to S a ϕ-phrase that includes all and only the phonological exponents of a, b, c, …, n.
    1.  
    1. iii.
    1. Match clause
    2. Intonational phrases correspond to those clausal projections that have the potential to express illocutionary force (assertoric or interrogative force, for instance).

These rules predict that prosodic structure mirrors syntactic structure to a certain extent (41-a). I also adopt the view that prosodic phrasing is blind to traces (41-b), as well as binary branching of the prosodic structure (Bennett et al. 2016; Elfner 2018).
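The effect of Match phrase combined with trace-blindness can be illustrated with a minimal sketch. It is an illustration under simplifying assumptions, not part of the analysis itself: the class `XP`, the marker `TRACE`, the function names and the node labels (including the root label "TP") are hypothetical, binary branching is not enforced, and the landing site of the raised verb is not modelled, so the verb is entered as a bare terminal. The input is a simplified version of (32) with ki left out.

```python
from dataclasses import dataclass

TRACE = None  # hypothetical marker for an unpronounced copy (trace)


@dataclass
class XP:
    label: str      # e.g. "DP-obj", "VoiceP"; labels are illustrative only
    children: list  # XP nodes, overt words (str), or TRACE


def overt_words(node):
    """Collect the pronounced terminals dominated by a node, left to right."""
    if node is TRACE:
        return []
    if isinstance(node, str):
        return [node]
    words = []
    for child in node.children:
        words.extend(overt_words(child))
    return words


def match_phrase(node):
    """Return one (label, words) pair per phi-phrase, outermost first.

    An XP that dominates only traces receives no phi-phrase.
    """
    if node is TRACE or isinstance(node, str):
        return []
    words = overt_words(node)
    if not words:
        return []
    phis = [(node.label, words)]
    for child in node.children:
        phis.extend(match_phrase(child))
    return phis


# Simplified (32): the object has left VoiceP and the verb has moved out,
# so VoiceP dominates only the preverb plus two traces.
clause = XP("TP", [
    XP("DP-subj", ["Pirəś", "iki"]),
    XP("DP-obj", ["χośəm", "jiŋk-əλ"]),
    XP("VoiceP", [TRACE, "juχi", TRACE]),
    "λɛ-s",
])

for label, words in match_phrase(clause):
    print(label, words)
```

In this sketch the ϕ that corresponds to VoiceP contains only the preverb, because the traces of the object and the verb contribute no phonological material.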

    1. (41)
    1. a.
    1.  
    1. b.

Apart from phrase boundaries, both prosodic and intonational phrases are assigned a head, i.e. the most prosodically prominent element inside each phrase. As argued by e.g. Truckenbrodt (1995), Samek-Lodovici (2015) and Ishihara (2011), head alignment is language-specific. Kazym Khanty seems to group with Japanese in this respect. All prosodic units smaller than the intonational phrase are left-headed (42), i.e. for each prosodic phrase, the most prominent, high-pitched element is located at the left boundary (see the H*+L contour of the ϕ χošəm ńań ‘fish pie’ in Figures 1 and 2, and Filchenko 2011 for Eastern Khanty).14

Figure 1: Default intonation in a conditional clause with wide focus.

    1. (42)
    1. Head-alignment constraints in Kazym Khanty and Japanese [Ishihara 2011: (7)]
    1.  
    1. a.
    1. Align-ϕ-left = align (ϕ, Left, H(ϕ), Left)
    1.  
    1. b.
    1. Align-ι-right = align (ι, Right, H(ι), Right)

Intonational phrases have been assumed to be left-headed as well (Sosa 2020 for Surgut Khanty). However, my data do not show the characteristic descending pitch contour in wide-focus clauses (see the H* tone on the clause-final verb in Figures 1 and 2). Moreover, Sosa (2020) never controls for pro-drop, where topical subjects and objects are phonologically zero. His left-headed descending intonation seems to be an epiphenomenon of pro-drop, combined with narrow focus on direct objects. Alternatively, the high pitch on the finite verb in Figure 1 may be characteristic of conditional clauses, as opposed to main clauses. For now, I assume that phonological phrases in Kazym Khanty are left-headed and intonational phrases are right-headed.

In wide focus clauses, as in Figure 1, each ϕ has a descending H*+L or L+H*+L pitch contour (the high tone on ńań can be attributed to a tracking error, because it is absent in Figure 2). Align-ι-Right ensures that the rightmost ϕ has a pitch rise after the L-tone at the right boundary of the previous ϕ.

Another part of the theory required to model the distribution of ki is prosodic focus-marking (or emphasis-marking). Féry & Ishihara (2010) and Ishihara (2011) argue in detail that a focused ϕ-phrase is universally marked by a high(er) pitch, or pitch range expansion (Burdin et al. 2015). This holds for different dialects of Khanty (Filchenko 2011; Plotnikov 2021). Compare this with the focus-prominence constraint introduced by Truckenbrodt (1995) in (43).

Figure 2: Contrastive focus on the direct object.

    1. (43)
    1. FocusProminence [Truckenbrodt 1995: p.180]:
    2. If F is a focus and DF is its domain, then the highest prominence in DF will be within F.

Sahkai & Mihkla (2017) show that Estonian contrastive topics are marked by an emphatic realization of the pitch accent on the constituent together with a reduced emphasis on the narrow focus. Crucially, prosodic realisation of contrastive topics clearly differs from that of aboutness (background) topics. This is consistent with Khanty data, where contrastive topics, but not background topics, can host ki. Hence, ki-displacement to foci/contrastive topics can be linked to one common prosodic feature, namely prosodic prominence.

The prosodic phrasing constraints, combined with the FocusProminence constraint are enough to predict the host of ki. Moreover, the prosody-based distribution pattern of the Kazym Khanty conditional clitic can be captured in one rule both for focal and topical hosts, as in (44).

    1. (44)
    1. Align(ki, (F)ϕn, Right)
    2. = Align ki to the right edge of (F)ϕn.

Indeed, as Figure 2 shows, the contrastively focused ϕ χošəm ńań ‘fish pie’ does host ki and is the most prominent one.15 As Figure 1 shows, the ϕ which hosts ki has a radical pitch lowering on ki. The pitch range expansion seems to be obligatory, unless the next unit begins with [i] or [j]. In the latter case, pitch only descends to the medium level. Presumably, this is due to the linear adjacency of two phonetically identical segments and is motivated by articulation. Crucially, pitch assimilation is a consequence of clitic displacement, since it takes place after ki is adjoined.

4.2 Theoretical model

In the current model, displacement of ki is not syntactic and happens in the prosodic component. This assumption is made for the following reasons: (i) lowering in syntax violates the Strict Cycle Condition; (ii) any movement of the conditional marker in syntax affects semantic interpretation, which is not the case for ki.

I assume that the placement of ki is regulated by the constraint in (45). At this stage of research, it is not quite clear whether all prosody-oriented clitics form one natural class and whether (45) can be kept as a universal constraint. For this reason, (45) is formulated as a language-specific constraint. Ideally, it would be true cross-linguistically for all prosody-oriented clitics.

    1. (45)
    1. Align-Ki: (repeated from (44)):
    2. Align(ki, (F)ϕn, Right)

An example of prosodic displacement of ki to an emphasized object is given in (46-a). If the DP χošəm ńań ‘fish pie’ is emphasized, i.e. contrastively topical, the sentence in (46-a) has the two underlying structures in (46-b). Recall from Section 3 that topical objects move, and that V is also not in its base position. This is why, when the mapping rules apply, the prosodic phrasing is such that the subject DP, the verb, the object DP and the preverb all form separate ϕs. Finally, the object DP is emphasized and the corresponding ϕ6 is the most prominent ϕ in the clause. Combined with the displacement rule for ki, this prosodic structure requires ki to adjoin to fish pie. In the case of a wide focus reading, the topical object still moves and the prosodic structure stays the same. However, now it is the VoiceP-corresponding ϕ7 that contains the highest prominence. Hence, ki would adjoin to ϕ7 and surface in the default position after the preverb.
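The derivation just described for (46-a) can be sketched procedurally. The following is a minimal illustration of Align-Ki, not an Optimality Theoretic implementation: the class `Phi`, the integer `prominence` score and the helper `clause_46` are hypothetical, the prosodic structure is flattened into a list of ϕs, and the sketch assumes that exactly one ϕ ends up as the most prominent one.

```python
from dataclasses import dataclass


@dataclass
class Phi:
    words: list          # overt words of the phi-phrase, left to right
    prominence: int = 0  # hypothetical score; the highest value marks (F)phi


def place_ki(phis, clitic="ki"):
    """Linearize the clause with the clitic at the right edge of the most
    prominent phi-phrase. A tie raises an error, on the assumption that
    exactly one phi-phrase is the most prominent one."""
    top = max(p.prominence for p in phis)
    hosts = [i for i, p in enumerate(phis) if p.prominence == top]
    if len(hosts) != 1:
        raise ValueError("prominence must single out exactly one phi-phrase")
    out = []
    for i, p in enumerate(phis):
        out.extend(p.words)
        if i == hosts[0]:
            out.append(clitic)
    return out


def clause_46(emphasis):
    """Flat phrasing of (46-a); 'emphasis' is 'object' or 'wide'."""
    return [
        Phi(["Puχ-ɛm"]),
        Phi(["χošəm", "ńań"], prominence=1 if emphasis == "object" else 0),
        Phi(["juχi"], prominence=1 if emphasis == "wide" else 0),
        Phi(["λɛ-λ"]),
    ]


print(" ".join(place_ki(clause_46("object"))))
print(" ".join(place_ki(clause_46("wide"))))
```

With emphasis on the object the clitic surfaces after χošəm ńań; with wide focus it surfaces after the preverb juχi. These are the two positions marked by (ki) in (46-a).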

    1. (46)
    1. a.
    1. [
    2.  
    1. Puχ-ɛm
    2. son-poss.1sg
    1. χošəm
    2. fish
    1. ńań
    2. bread
    1. (ki)
    2. if
    1. juχi
    2. prev_home
    1. (ki)
    2. if
    1. λɛ-λ
    2. eat-npst
    1. ]…
    2.  
    1. ‘If my son eats a/the FISH PIE…’
    1.  
    1. b.

The rule in (44) captures not only the post-DO position of ki, as opposed to the “default” position; it also predicts the correlation between intonation and the position of ki after the subject vs. the default one. In Section 3.1, I have argued at length that the default, “penultimate” position of ki is a post-VoiceP position. This allows us to explain (i) why ki precedes a finite verb in the neutral case, and (ii) why this “preverbal” position of ki corresponds to the most neutral, wide-focus interpretation of a clause. Wide focus is universally marked with default (prosodic) prominence (e.g. Féry & Ishihara 2010; Ishihara 2011; Samek-Lodovici 2015) and is, therefore, able to attract ki, as e.g. in (47). In (47) turn ‘grass’ is nonreferential and indefinite and stays in situ (Nikolaeva 1999; Smith 2020; É. Kiss 2021). This means that ϕ2 corresponds to VoiceP. As an indefinite object, ‘grass’ cannot bear a narrow focus/topic interpretation. Hence, the clause has a wide focus interpretation and ϕ2 (=VoiceP) hosts ki.16

    1. (47)
    1. a.
    1. [pro [VoiceP <pro>
    2.  
    1. turn ti]
    2. grass
    1. ki
    2. if
    1. λɛ-λ-əti]
    2. eat-NPST-3PL
    1. tur
    2. throat
    1. χări-λ-aλ
    2. place-PL-POSS.3PL
    1. pa
    2. and
    1. sɵλ-λ-aλ
    2. intestine-PL-POSS.3PL
    1. isa
    2. completely
    1. kăλij-a
    2. blood-DAT
    1. wańś-λ-a-j-ət
    2. cut-NPST-PASS-3PL
    1. ‘If they (reindeer) eat grass, they cut throat and stomach till these bleed.’
    2. [double-checked from Western Khanty Corpus]
    1.  
    1. b.

In clauses with contrastive subjects and wide focus on VoiceP, as in (48), there are two ϕs competing in prominence. And indeed, ki can be hosted by either of these ϕs. Though this looks like optionality in ki-placement, I argue that the competition in prominence between the two ϕs is resolved before ki is displaced. Namely, at some early point in the derivation, a speaker can choose which one wins, i.e. becomes more prominent in the final prosodic structure. If the subject has a contrastive focus intonation, i.e. it has a higher pitch and is followed by a pause and deaccenting of the next ϕ, ki is hosted by the subject as in (48-a). But if ki is hosted by VoiceP as in (48-b), a neutral intonation is expected, namely the ϕ corresponding to VoiceP is the most prominent one in the clause and there is no prosodic emphasis on the subject DP. This is indeed borne out in my data.

    1. (48)
    1. [
    2.  
    1. Wɵnter
    2. Otter
    1. (ki)
    2. if
    1. wɵn
    2. big
    1. sort
    2. pike
    1. nuχ
    2. prev_up
    1. (ki)
    2. if
    1. taλ-əs
    2. drag-pst
    1. ]…
    2.  
    1. ‘If the OTTER has caught a big pike…’
    1.  
    1. a.
    1. [Sub=ki obj prev Vfin]
    1.  
    1. b.
    1. [Sub obj prev=ki Vfin]

The relative placement of ki and clausal negation is another example of how the position of ki depends on prosodic phrasing and prominence. In Section 3.1, I have argued that the order […neg ki …] requires emphasis on negation. I assume negation to be merged below AspP.17 If negation is not emphasized, it forms one prosodic unit with Vfin (e.g. as a purely phonological clitic, a ‘leaner’ in Embick & Noyer (1999)’s terms). Since ϕ3 does not contain emphasis in (49), there are two options left: either the clause has wide focus, or the emphasis lies on the direct object. In both cases, ki is attached to ϕ2. This corresponds to the surface word order in (49). However, if negation is emphasized, it is not a ‘leaner’ anymore and forms a separate prosodic phrase, which in turn hosts ki, as (50) shows.
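The contrast between the leaner and the full form of negation can be made concrete with a small sketch. It is a simplification under stated assumptions: the function names are hypothetical, the VoiceP-internal material is treated as a single ϕ, the ϕs are indexed from zero within the sketch (so the indices do not match the ϕ labels used in the text), and the choice of host is hard-coded rather than derived from pitch.

```python
def phrase_clause(voice_p, neg, verb, neg_emphasized):
    """Return (phis, host_index) for a negated conditional clause.

    voice_p: overt VoiceP-internal words (object, preverb)
    neg_emphasized: True if negation bears narrow focus
    """
    if neg_emphasized:
        # Full form: negation is a phi-phrase of its own and the most
        # prominent one, so it hosts the clitic.
        return [voice_p, [neg], [verb]], 1
    # Leaner: negation is phrased together with the finite verb, so the
    # VoiceP-corresponding phi-phrase hosts the clitic.
    return [voice_p, [neg, verb]], 0


def linearize(phis, host_index, clitic="ki"):
    """Spell out the phi-phrases with the clitic at the host's right edge."""
    out = []
    for i, phi in enumerate(phis):
        out.extend(phi)
        if i == host_index:
            out.append(clitic)
    return out


voice_p = ["Tuχλ-ɛm", "nuχ"]
neutral = linearize(*phrase_clause(voice_p, "ănt", "wɛr-λ-ən", neg_emphasized=False))
marked = linearize(*phrase_clause(voice_p, "ănt", "wɛr-λ-ən", neg_emphasized=True))
print(" ".join(neutral))
print(" ".join(marked))
```

The first call yields the neutral order with ki before ănt, the second the marked order with ki after ănt.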

    1. (49)
    1. a.
    1. [
    2.  
    1. Tuχλ-ɛm
    2. wing-poss.1sg
    1. nuχ
    2. prev_up
    1. ki
    2. if
    1. ănt
    2. neg
    1. wɛr-λ-ən
    2. do-npst-2sg
    1. ]…
    2.  
    1. ‘If you don’t heal my wing…’
    1.  
    1. b.
    1. (50)
    1. a.
    1. [
    2.  
    1. Tuχλ-ɛm
    2. wing-poss.1sg
    1. nuχ
    2. prev_up
    1. ănt
    2. neg
    1. ki
    2. if
    1. wɛr-λ-ən
    2. do-npst-2sg
    1. ]…
    2.  
    1. ‘If you only do NOT heal my wing …’
    1.  
    1. b.

I follow Féry & Ishihara (2010) and Ishihara (2011) in assuming that information structure cannot influence prosodic structure per se. In particular, it has no effect on prosodic phrasing, which is defined by syntactic phrasing only. This plays a role in the analysis of ki adjoined to NP-modifiers and to direct objects. However, this view could be a potential problem for the analysis of the […neg ki …] order, because there is no evidence that neg changes its position in syntax. I argue that, in its present form, the analysis does not violate the generalization. The contrast between (49) and (50) is between a phonologically defective and a full form of ănt. The phrasing alternation emerges as a byproduct of this contrast (compare a similar conclusion in Bennett et al. (2016)).

Finally, one further argument in favor of a prosody-based analysis comes from the phrase-internal position of ki. For a syntactic analysis, it is highly problematic to capture the fact that ki is able to follow possessors, adjectives and nominal modifiers, separating them from the rest of the DP, while the same is not possible for demonstratives and complements of postpositions. In Section 3.3 I have shown that adjectives and nominal modifiers move to a higher position inside DP, where they can host ki. Demonstratives, on the other hand, cannot do so, because they are merged too high and the movement would be too local. At this point, it can be suggested that adjectival/nominal modifiers and, for some speakers, numerals are able to form a separate prosodic phrase, because they move to an outer specifier of the DP. When staying in situ, these NP-modifiers form one ϕ-phrase with the head noun. Demonstratives cannot move and, hence, cannot form a separate prosodic phrase. This is in line with Ishihara (2011)’s and Féry & Ishihara (2010)’s analysis that prosodic phrasing depends on syntax. Information structure (emphasis) per se cannot change prosodic phrasing, e.g. an X cannot form a separate ϕ-phrase only because it is focused; a syntactic movement step is always required for it.
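The DP-internal pattern can be summarized in a short sketch. It is a rough illustration only: the table `CAN_RAISE` and the function names are hypothetical, numerals (which vary across speakers) and possessors are left out, and "DEM" and "N" are placeholders rather than attested forms.

```python
# Hypothetical table: which modifier types can raise to an outer Spec,DP.
CAN_RAISE = {
    "adjective": True,
    "nominal_modifier": True,
    "demonstrative": False,  # merged too high; the step would be too local
}


def dp_phrasing(modifier, noun, category, raised):
    """Return the phi-phrases of a [modifier noun] DP."""
    if raised and not CAN_RAISE[category]:
        raise ValueError(f"{category} cannot raise inside DP")
    if raised:
        return [[modifier], [noun]]  # raised modifier forms its own phi
    return [[modifier, noun]]        # in situ: one phi with the head noun


def ki_orders(modifier, noun, category):
    """List the surface orders available when the DP hosts ki."""
    # The whole DP as host is always available (modifier in situ).
    orders = [dp_phrasing(modifier, noun, category, raised=False)[0] + ["ki"]]
    if CAN_RAISE[category]:
        first, rest = dp_phrasing(modifier, noun, category, raised=True)
        orders.append(first + ["ki"] + rest)
    return orders


print(ki_orders("karti", "jintəp", "nominal_modifier"))
print(ki_orders("DEM", "N", "demonstrative"))
```

For a nominal modifier the sketch returns both the order with ki after the whole DP and the order with ki after the modifier; for a demonstrative only the former is returned.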

Intriguingly, possessors can host ki, even when embedded in another possessor DP, e.g. in [[[mᵾŋ] wᵾλij-ew] pant] ‘our reindeer’s trail’ (51), which makes it even more complicated for a syntactic movement analysis.

    1. (51)
    1. Amp-əm
    2. Dog-poss.1sg
    1. śi
    2. foc
    1. murt
    2. extent
    1. numsəŋ
    2. smart
    1. wɵ-s.
    2. be-pst
    1. [
    2.  
    1. [Pa
    2. add
    1. χujat
    2. person
    1. wᵾλ-et
    2. reindeer-pl
    1. pant]
    2. trail
    1. ki
    2. if
    1.  
    2.  
    1. mutš-əs
    2. notice-pst
    1. ],
    2.  
    1. ănt
    2. neg
    1. χurtə-s.
    2. bark-pst
    1. [ [dp
    2.  
    1. [ [Mᵾŋ]
    2. 1pl
    1. ki
    2. if
    1. wᵾλij-ew]
    2. reindeer-poss.1pl
    1. pant]
    2. trail
    1. mutš-əs
    2. notice-pst
    1. ],
    2.  
    1. χujat
    2. someone
    1. wox-s.
    2. call-pst
    1. ‘My dog was so smart. If it noticed a trail of someone else’s reindeer, it didn’t bark. If it noticed trail of OUR reindeer, it called someone.’

As for possessors, two analyses are possible: (i) either they are base-generated low enough to be able to move to an outer Specifier of the DP (compare Dékány 2011 for Hungarian), or (ii) they are generated as high as demonstratives (e.g. Pleshak 2018 for Kazym Khanty, Assmann et al. 2013 for Udmurt, Deal 2013 for Nez Perce), but always form a separate ϕ-phrase, as adjuncts (Bárány & Nikolaeva 2021). Following the analysis suggested by Bárány & Nikolaeva (2021) for Tundra Nenets (Samoyedic, Uralic), possessors which trigger a possessive marker on the head noun are DP-level adjuncts, and not modifiers, while in the absence of a possessive suffix on the noun, possessors are merged as modifiers lower in the structure. In both cases a possessor is expected to be able to host ki: as an adjunct, it already forms a separate ϕ-phrase in situ, and as a modifier, it is merged low enough to be able to raise to Spec,DP and form a ϕ there (compare suggestions for possessor-raising to/through a Spec,DP in e.g. Szabolcsi 1983; 1994; É. Kiss 2014; Dékány 2015). This is indeed the case, as (52) shows.

    1. (52)
    1. a.
    1. [
    2.  
    1. [[[Pa
    2. add
    1. χujat]
    2. person
    1. (ki)
    2. if
    1. wᵾλ-et]
    2. reindeer-pl
    1. (ki)
    2. if
    1. pant]
    2. trail
    1. (ki)
    2. if
    1. mutš-əs
    2. notice-pst
    1. ]…
    2.  
    1. ‘If it sees trail of someone’s reindeer…’
    1.  
    1. b.
    1. [
    2.  
    1. [[[Mᵾŋ]
    2. 1pl
    1. (ki)
    2. if
    1. wᵾλij-ew]
    2. reindeer-poss.1pl
    1. (ki)
    2. if
    1. pant]
    2. trail
    1. (ki)
    2. if
    1. mutš-əs
    2. notice-pst
    1. ]…
    2.  
    1. ‘If it sees trail of our reindeer…’

For now I leave an in-depth investigation of possessor behaviour in Kazym Khanty for further research and assume that the analysis for Tundra Nenets in Bárány & Nikolaeva (2021) holds in Kazym Khanty as well. This predicts that possessors can always form a separate prosodic phrase and, hence, can always host ki. The prediction is indeed borne out.

4.3 OT-analysis

So far I have treated ki-displacement as a rule-based operation. However, most studies that make use of post-syntactic, prosody-triggered clitic displacement are couched in the framework of Optimality Theory (OT) (e.g. Truckenbrodt 1995; Selkirk 2011; Bennett et al. 2016; Dawson 2017 etc.). This section illustrates that the distribution of the Kazym Khanty conditional clitic can be successfully modeled in OT. As an example, I take a clause with a contrastively focused possessor in (54). A list of relevant constraints is given in (53).

    1. (53)
    1. a.
    1. MatchPhrase [Bennett et al. 2016: (35)]
    2. Given a maximal projection XP in a syntactic representation S, where XP dominates all and only the set of terminal elements {a, b, c, …, n}, there must be in the phonological representation P corresponding to S a ϕ-phrase that includes all and only the phonological exponents of a, b, c, …, n.
    1.  
    1. b.
    1. MatchClause [Bennett et al. 2016: (34)]
    2. Intonational phrases correspond to those clausal projections that have the potential to express illocutionary force (assertoric or interrogative force, for instance).
    1.  
    1. c.
    1. Head-alignment constraint (Align-ϕ-left):
    2. align(ϕ, Left, H(ϕ), Left)
    1.  
    1. d.
    1. Head-alignment constraint (Align-ɩ-right):
    2. align(ɩ, Right, H(ɩ), Right)
    1.  
    1. e.
    1. FocusProminence (Focus) [Truckenbrodt 1995: p.180]:
    2. If F is a focus and DF is its domain, then the highest prominence in DF will be within F.
    1.  
    1. f.
    1. Align-Ki:
    2. Align (ki, (F)ϕn, Right)
    1. (54)
    1. [ pro [dp [
    2.  
    1. [Mᵾŋ]
    2. 1pl
    1. ki
    2. if
    1. wᵾλij-ew]
    2. reindeer-poss.1pl
    1. pant]
    2. trail
    1. mutš-əs
    2. notice-pst
    1. ]…
    2.  
    1. ‘If it (dog) notices a trail of OUR reindeer…’
    1. (55)
    1. Prosodic phrasing and ki-displacement to a contrastively focused possessor

The tableau in (55) shows that candidate (a) is indeed the optimal one, despite its violation of Align-ɩ-right. It is the candidate where the embedded possessor builds a separate prosodic phrase because it is raised to a DP-adjoined position. It is the most prominent element in the ι-phrase and hosts ki. If the whole DP, including the raised possessor, builds one ϕ, as in (c), a violation of MatchPhrase makes it less optimal than (a). If we try to keep the prominence, but attach ki to any other host, as in (d), the structure incurs a fatal violation of the Align-Ki constraint. If a candidate satisfies MatchPhrase and both head-alignment constraints, as in (f), it violates the higher-ranked Focus. Candidate (b) violates matching and alignment constraints, and candidate (e) violates all constraints.
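The evaluation logic behind a tableau like (55) can be sketched in a few lines of Python. This is only an illustration and not a reproduction of the tableau: the violation marks are reconstructed from the prose description above (marks the text does not mention are left at zero), and the total ranking is an assumption made for the sake of the example, since the text only states that Focus is ranked high. The names `RANKING`, `CANDIDATES`, `profile` and `optimal` are hypothetical.

```python
# Assumed total ranking, highest first. Only the high rank of Focus comes
# from the text; the relative order of the other constraints is a guess.
RANKING = ["Focus", "Align-Ki", "MatchPhrase", "MatchClause",
           "Align-phi-left", "Align-iota-right"]

# Violation marks per candidate, reconstructed from the prose only.
# Which alignment constraints (b) violates is an assumption.
CANDIDATES = {
    "a": {"Align-iota-right": 1},
    "b": {"MatchPhrase": 1, "Align-phi-left": 1, "Align-iota-right": 1},
    "c": {"MatchPhrase": 1},
    "d": {"Align-Ki": 1},
    "e": {name: 1 for name in RANKING},
    "f": {"Focus": 1},
}


def profile(marks):
    """Return the violation counts ordered by constraint rank."""
    return tuple(marks.get(name, 0) for name in RANKING)


def optimal(candidates):
    """Pick the candidate whose profile is lexicographically smallest,
    i.e. the one that does best on the highest-ranked constraint on
    which the candidates differ."""
    return min(candidates, key=lambda name: profile(candidates[name]))


winner = optimal(CANDIDATES)
print(winner)
```

Comparing violation profiles lexicographically is the standard way to model strict domination: a single mark on a higher-ranked constraint outweighs any number of marks further down, which is why (a) can win despite its alignment violation.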

Potentially problematic for an OT analysis are cases of “optional” placement of ki, e.g. (48) repeated in (56).18 As discussed before, this optionality is resolved before ki is displaced: in order to successfully form a prosodic structure, the prominence hierarchy has to be defined, i.e. a single most prominent element has to be chosen from several candidates. If the competition between two elements is not resolved until the very end, the speaker has to make a choice, potentially based on extralinguistic factors such as the level of emotionality of their speech. Irrespective of how the choice is made, the result is the same – a well-formed prosodic structure with a single most prominent element. Only when this is defined can ki be displaced.

    1. (56)
    1. [
    2.  
    1. Wɵnter
    2. Otter
    1. (ki)
    2. if
    1. wɵn
    2. big
    1. sort
    2. pike
    1. nuχ
    2. prev_up
    1. (ki)
    2. if
    1. taλ-əs
    2. drag-pst
    1. ]…
    2.  
    1. ‘If the OTTER has caught a big pike…’
    1.  
    1. a.
    1. [Sub=ki obj prev Vfin]
    1.  
    1. b.
    1. [Sub obj prev=ki Vfin]
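The two orders in (56) can be derived by a minimal sketch of the placement rule: once a single most prominent ϕ-phrase has been chosen, ki is attached to its right edge. The phrasing used below, with the subject as one ϕ-phrase and the object plus preverb as another, followed by the finite verb, is an assumption made for illustration; `place_ki` and its parameters are hypothetical names.

```python
def place_ki(phi_phrases, prominent, verb):
    """Attach =ki to the right edge of the most prominent phi-phrase.

    phi_phrases: list of phrases, each a list of words (assumed phrasing)
    prominent:   index of the phrase already chosen as most prominent
    verb:        the finite verb, which follows the phrases
    """
    words = []
    for index, phrase in enumerate(phi_phrases):
        phrase_words = list(phrase)
        if index == prominent:
            # The clitic leans on the last word of the prominent phrase.
            phrase_words[-1] += "=ki"
        words.extend(phrase_words)
    return " ".join(words + [verb])


# Assumed phrasing of (56): [subject] [object + preverb], then Vfin.
phrases = [["Wɵnter"], ["wɵn", "sort", "nuχ"]]
subject_prominent = place_ki(phrases, 0, "taλ-əs")   # pattern (56a)
voicep_prominent = place_ki(phrases, 1, "taλ-əs")    # pattern (56b)
print(subject_prominent)
print(voicep_prominent)
```

The function takes the prominent phrase as an input rather than computing it, which mirrors the claim that the choice is made before displacement applies.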

Another example of the apparent optionality is a DP-internal ki, alternating with a DP-adjacent ki. In (54), ki is hosted by the contrastive possessor mᵾŋ ‘we’, but the same information structure is available, if ki is attached to the whole DP [[[mᵾŋ] wᵾλij-ew] pant] ‘trail of our reindeer’ or to the second level possessor [[mᵾŋ] wᵾλij-ew] ‘our reindeer’, as (57) illustrates.

    1. (57)
    1. [ pro [dp [
    2.  
    1. [Mᵾŋ]
    2. 1pl
    1. (ki)
    2. if
    1. wᵾλij-ew]
    2. reindeer-poss.1pl
    1. (ki)
    2. if
    1. pant]
    2. trail
    1. (ki)
    2. if
    1. mutš-əs
    2. notice-pst
    1. ]…
    2.  
    1. ‘If it (dog) notices a trail of OUR reindeer…’

An example like (57) looks like true optionality and, hence, is potentially problematic for the OT-model. Nevertheless, I argue that, if the optionality exists, it is resolved before ki is displaced. More concretely, in Section 3.3, it has been shown that whenever an NP-modifier hosts ki, it moves to the outer specifier of the DP. This allows the modifier to form a separate ϕ later in the derivation (see Tableau (55)). However, this movement is not obligatory. Therefore, a DP with a contrastively focused modifier can have two different structures when it is sent to PF: (i) [DP [NP Modᵢ [NP N]] (D)] and (ii) [DP Modᵢ [NP tᵢ [NP N]] (D)]. In the second case, the modifier moves to an outer Spec,DP. This creates a necessary condition for it to form a separate ϕ and allows it to host ki. In the first case, the modifier does not move; hence, it is not able to form a separate ϕ and cannot host ki separately from the rest of the DP. Since the whole DP corresponds to one ϕ, ki can be attached only to the whole DP. Hence, optionality is resolved in syntax and the OT-model can capture the data without severe problems.

To sum up, prosodic phrasing rules are able to capture the distributional pattern of the conditional enclitic ki in Kazym Khanty. Nevertheless, I want to point out that more empirical research on this topic is required. I leave testing the predictions made by the model in this paper for future work as more detailed research of prosodic phrasing in Khanty is needed.

5 Ruling out alternative analyses

5.1 A second-to-last position clitic (Nevis 1990)

The first formal analysis of the Ob-Ugric (Khanty and Mansi) conditional clitic was proposed by Nevis (1990). He bases his analysis on Klavans’ (1985) theory of cliticization. Therefore, I first give a short summary of Klavans’ theory and then turn to Nevis’ analysis of the Khanty and Mansi conditional clitics.

Klavans’ theory decomposes cliticization into three parameters:

    1. (58)
    1. a.
    1. Parameter 1 with values “initial/final” determines whether a clitic attaches to the initial or to the final constituent dominated by a specified XP.
    1.  
    1. b.
    1. Parameter 2 with values “before/after” determines direction of attachment, i.e. whether a clitic follows or precedes a constituent, determined by Parameter 1.
    1.  
    1. c.
    1. Parameter 3 has values “proclitic/enclitic” and is phonological in nature. Its main goal is to diagnose a direction of phonological attachment of a clitic.

There is an immediate consequence: the phonological host of a clitic can differ from its syntactic host, i.e. the theory predicts the existence of ditropic clitics. That said, the three parameters give rise to eight possible clitic types, represented in Table 1.

Table 1

Klavans’ Clitic Typology.

Type   | Parameter 1 (initial/final) | Parameter 2 (before/after) | Parameter 3 (proclitic/enclitic) | Structure for C-host
Type 1 | initial | before | enclitic  | [C +cl X…
Type 2 | initial | before | proclitic | [C cl+X…
Type 3 | initial | after  | enclitic  | [C X+cl…
Type 4 | initial | after  | proclitic | [C X cl+…
Type 5 | final   | before | enclitic  | …+cl X C]
Type 6 | final   | before | proclitic | …cl+X C]
Type 7 | final   | after  | enclitic  | …X+cl C]
Type 8 | final   | after  | proclitic | …X cl+ C]
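The way the three binary parameters combine into the eight types of Table 1 can be sketched as a simple enumeration. The sketch below also flags the combinations in which the phonological direction of attachment does not match the syntactic side of the host; treating exactly these four combinations as ditropic is an inference from the definition given above, and the names `P1`, `P2`, `P3`, `TYPES`, `DITROPIC` and `is_ditropic` are hypothetical.

```python
from itertools import product

P1 = ("initial", "final")       # which constituent of the domain is the host
P2 = ("before", "after")        # syntactic side of the host
P3 = ("enclitic", "proclitic")  # phonological direction of attachment


def is_ditropic(p2, p3):
    """A clitic is ditropic if it leans away from its syntactic host:
    it stands before the host but is enclitic, or stands after the host
    but is proclitic."""
    return (p2, p3) in {("before", "enclitic"), ("after", "proclitic")}


# Enumerating in this order reproduces the numbering used in Table 1.
TYPES = {n: combo for n, combo in enumerate(product(P1, P2, P3), start=1)}
DITROPIC = [n for n, (_, p2, p3) in TYPES.items() if is_ditropic(p2, p3)]

for number, (p1, p2, p3) in TYPES.items():
    flag = "ditropic" if number in DITROPIC else ""
    print(f"Type {number}: {p1:8} {p2:7} {p3:10} {flag}")
```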

Nevis (1990) adopts this classification for a study of Finno-Ugric sentential clitics. The conditional clitic in Khanty and Mansi (both Ob-Ugric) is classified as Type 5 or Type 7, depending on whether it occupies the penultimate or the ultimate position, respectively.

In an ultimate position, ki (=Mansi ke/t’e) is introduced in C. Since Nevis assumes CP to be head-final, C is the last element in the flattened structure. Thus, ki attaches to whatever happens to be linearized before C (parameter 1). It is an enclitic both in syntactic (parameter 2) and in prosodic sense (parameter 3). Hence, the result of cliticization looks as in (59):

    1. (59)

As for penultimate ki, it behaves as a type 5, ditropic clitic: its syntactic host is a final constituent, dominated by C (parameter 1), it attaches to the host on the left (parameter 2), but crucially it is an enclitic, and hence, phonologically it attaches to whatever precedes its syntactic host (parameter 3):

    1. (60)

This analysis has several problems. First, it is merely descriptive and does not provide any information about the mechanics of (syntactic) attachment. Secondly, penultimate clitics are in general controversial and have been argued not to exist (e.g. Embick & Noyer 1999; Anderson 2005). As Anderson (2005: 79) points out, (i) the term ‘penultimate position’ does not give any information about the placement of ki in the syntax, and (ii) all known examples of penultimate clitics are in fact before-the-verb clitics. Hence, they can alternatively be analyzed as pre-verbal clitics rather than penultimate ones. Thirdly, “… syntactically -ke [Mansi; =ki in Khanty] is a left-sister to the (inflected) verb, but prosodically is parsed as a suffix with the preceding phonological word. If correct, this analysis requires that -ke be treated as a truly ditropic clitic.” (Embick & Noyer 1999: 300). The last and most severe problem of this analysis is that it does not take into account the non-(pen)ultimate positions of ki and completely ignores its dependence on information structure (emphasis/contrastive topic).

An attempt to solve these problems is made by Embick & Noyer (1999) for the conditional clitic in Mansi (the most closely related Ob-Ugric language).

5.2 Phonological cliticization of a proper complementizer (Embick & Noyer 1999)

An alternative analysis of the Ob-Ugric conditional clitic was proposed by Embick & Noyer (1999). This analysis attempted to solve two main problems of Nevis’ analysis: (i) the definition of ki as a ditropic clitic, and (ii) the existence of alternative, non-(pen)ultimate positions of ki in some specific contexts. In their paper, Embick & Noyer look at the Mansi conditional clitic ke (=t’e = t’i), whose distribution is quite similar to that of Kazym Khanty ki.

5.2.1 Analysis

The proposed analysis differs crucially from that of Nevis (1990): Embick & Noyer assume that the Mansi conditional marker ke stays in situ in the C-head. They presuppose that CP is head-initial, in contrast to the rest of the clause. Hence, all preceding XPs are moved to the Spec,CP position in order to be linearized before ke. To make this model work, they make two additional assumptions: first, the finite verb moves from V to T, and secondly, negation is adjoined to vP.19 Since CP has a right-branching structure, the verb in T follows ke, as (61) shows. This corresponds to the penultimate position of ke.

    1. (61)
    1. Mansi Clause Structure (Revised) [Embick & Noyer 1999: p.302, ex.77]

Under this analysis ke is a leaner, i.e. it attaches to the preceding XP only phonologically. This solves the problem with Nevis’ analysis of ke as a ditropic clitic (for a detailed discussion of the conceptual difficulties of ditropic clitics see e.g. Embick & Noyer 1999; Cysouw 2005). Specifically, Embick & Noyer (1999) argue that ditropic clitics are impossible, because they violate the Phonological Parsing Constraint in (62) and create mismatches between phonological and morphosyntactic words.

    1. (62)
    1. Phonological Parsing Constraint: (Embick & Noyer 1999: p.289, ex.54)
    2. If a phonological word ω contains Vocabulary Items inserted into morphemes belonging to distinct MWds (morphosyntactic words), then the edges of ω correspond to MWd-edges.

Since ke in Embick & Noyer’s analysis does not undergo movement, its phonological phrasing with a preceding XP does not conflict with syntactic merger, and the Phonological Parsing Constraint is not violated.

For ke in the ultimate position of a clause, an additional operation is proposed: if ke is final, the linear order [Vfin+C] results from raising of the verb further to C.

    1. (63)
    1. Clause Structure for Clause-final -ke

In order to account for the “anomalous” position of ke, Embick & Noyer adopt É. Kiss’s (1994) proposal that there is an additional projection E(xpression) above C. The derivation proceeds as follows: ke moves to E and attracts an emphasized phrase to Spec,EP. As shown in (64), this derivation yields the second position of -ke, i.e. the position after the first constituent in a clause.

    1. (64)
    1. Clause Structure for Anomalous -ke [Embick & Noyer 1999: p.310, ex.99]

Compared to Nevis’ (1990) proposal, this analysis has the advantage that it is able to account for non-(pen)ultimate positions of the conditional marker, and it makes the prediction that the conditional clitic ke in Mansi (=ki in Khanty) attaches to an emphasized constituent. However, several complications remain when one tries to apply this analysis to the Kazym Khanty data.

5.2.2 Problems

In this subsection, I address several patterns in the distribution of ki in Khanty which cannot be straightforwardly accounted for in Embick & Noyer’s (1999) analysis.

The first problem concerns the position of ki after a non-emphasized subject DP and an emphasized low adverbial. In the model presented in (61)–(64) there are only two theoretically available options: either ki directly follows a constituent in Spec,EP, or it stays in situ in the C-head and is linearized in the (pen)ultimate position. Since EP is the highest projection in this theory, the subject DP and the adverbial in (65) should both occupy the Spec,EP position. But a subject DP and an adverbial do not form a constituent; hence, they would have to move to multiple Spec,EP positions independently. However, there is no feature on a non-emphasized subject DP that could trigger that movement. Moreover, the subject and the adverbial would have to move string-vacuously, which makes this even harder to model.

    1. (65)
    1. [
    2.  
    1. Puχ-en
    2. son-poss.2sg
    1. sora
    2. fast
    1. ki
    2. if
    1. ow-əλ
    2. door-poss.3sg
    1. lăp
    2. prev_tight
    1. pɛnt-əs
    2. close-pst
    1. ]…
    2.  
    1. ‘If the son started to close the door FAST…’

The same problem arises for the position after a direct object. The model in (64) cannot predict a situation where ki is not preverbal, but follows two separate constituents. If vP universally raises to Spec,CP, both the subject and the direct object in (66) should have raised out of the vP in an order-preserving way.

    1. (66)
    1. [
    2.  
    1. [Pirəś
    2. Old
    1. ik-en]k
    2. man-poss.2sg
    1. [χośəm jiŋk-əλ]i
    2. fish water-poss.3sg
    1. ki
    2. if
    1. [vP
    2.  
    1. tk
    2.  
    1. juχi
    2. prev_home
    1. ti]
    2.  
    1. λɛ-si
    2. eat-pst
    1. ]…
    2.  
    1. ‘If the old man has eaten the fish soup, (I will feed my guests with the caviar soup).’

É. Kiss (2021) proposes a model of Ob-Ugric clause structure which suggests exactly this. She argues that subjects and topical objects move to two separate projections above TP, which correspond to the primary and the secondary topic respectively (67). This allows for order-preserving raising of the subject and the object. Nevertheless, this model fails to save Embick & Noyer’s analysis of the conditional clitic. First, É. Kiss explicitly argues that SubP and ObjP are merged above TP, and not above CP. Hence, a CP-layer, if present, should be merged above SubP, which means that ki would again have to be lowered from the C-head. Secondly, É. Kiss’ model only argues for raising of the subject and the direct object, while low adverbs are expected to stay in situ. Hence, (65) remains problematic.

A third and most striking complication for Embick & Noyer’s analysis comes from phrase-internal ki (Section 2.4). Since the phrase-internal position is not a (pen)ultimate one, it should be captured by the ‘anomalous derivation’ in (64), repeated in (68).

    1. (68)
    1. Clause Structure for Anomalous -ke [Embick & Noyer 1999: p.310, ex.99]

In this derivation, the constituent before ki moves to Spec,EP. For (69), this would predict that the adjective aiλat ‘young’ moves out of the subject DP in order to land in Spec,EP.

    1. (69)
    1. [
    2.  
    1. Aiλat
    2. young
    1. ki
    2. if
    1. iki
    2. man
    1. juχət-λ
    2. come-npst
    1. ],
    2.  
    1. ow-əλ
    2. door-poss.3sg
    1. nuχ
    2. prev_up
    1. ăλ
    2. proh
    1. pᵾnš-e
    2. open-imp
    1. pa
    2. add
    1. χosλa
    2. quiet
    1. oms-a.
    2. sit-imp
    1. ‘If a YOUNG man comes, don’t open the door and sit quiet.’

The same should be assumed for numerals, nominal modifiers, or possessor DPs (for examples see Section 2.4). However, as already discussed in Section 2.4, modifiers stay DP-internal, when hosting ki (recall e.g. example (22-b), repeated in (70)).

    1. (70)
    1. [
    2.  
    1. ɛwij-en
    2. daughter-poss.2sg
    1. [dp
    2.  
    1. [karti]
    2. iron
    1. =ki
    2. if
    1. jintəp]
    2. needle
    1. prev_down
    1. pawert-əs
    2. drop-pst
    1. ]…
    2.  
    1. ‘If the girl has dropped the IRON needle…’

If the modifier in (70) is not moved out of the DP, the same analysis should be available for (69). Hence, both aiλat ‘young’ and karti ‘iron’ should be able to host ki without moving to Spec,EP, contrary to what the derivation in (68) requires. As far as I can judge, the theory introduced in Embick & Noyer (1999) does not provide any way to derive a phrase-internal ki and a word order as in (70).20

5.3 Morphological analysis

Another alternative to the prosody-based analysis is postsyntactic displacement in the morphological component (e.g. Embick & Noyer 2001). There is no existing analysis of the Ob-Ugric conditional clitic in morphological terms, and I argue that there cannot be any such analysis. This theory allows only two displacement operations for clitics in postsyntax: (i) Lowering, and (ii) Local Dislocation. Both of them are incompatible with the Kazym Khanty data.

Lowering is a postsyntactic dislocation operation, happening prior to Vocabulary Insertion and flattening of the structure. Hence, it operates on a hierarchical structure. (71) shows that an item X can be lowered from its base-position as a head of a XP to the head of its complement:

    1. (71)
    1. Lowering of X0 to Y0 [Embick & Noyer 2001: p.561]
    2. [XP X0 … [YP … Y0 … ]] → [XP … [YP … [Y0 Y0+X0] … ]]

Embick & Noyer (2001) argue that Lowering (i) should be local, and (ii) is blind to adjuncts. Both properties are incompatible with the data from Section 2. First, ki can be hosted by adjuncts, both low and high adverbials. Secondly, ki in the DO-adjacent position, as well as in the DP-internal position, depends on movement of its host to some higher position. This is problematic for a Lowering analysis, because (i) the moved element is in a Spec-position, i.e. it should be ignored as an adjunct, and (ii) theoretically, no movement is required for Lowering to apply, so this dependency would be redundant. Thirdly, if we assume for semantic reasons that the conditional clitic is base-generated in C0, it violates locality when it skips Vfin in Asp (or T) in order to attach to VoiceP.
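The locality of the operation in (71) can be made concrete with a small sketch. It models a phrase as a head plus an optional complement and lets a head adjoin only to the head of its own complement. The clausal spine used in the example (CP over TP over VoiceP, with Vfin in T) is a simplified assumption, and `Phrase` and `lower` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Phrase:
    label: str
    head: str
    complement: Optional["Phrase"] = None


def lower(xp):
    """Adjoin the head of xp to the head of its complement, as in (71).

    The operation sees only the immediate complement, so it cannot skip
    an intervening head, and it never inspects specifiers or adjuncts.
    """
    if xp.complement is None:
        raise ValueError("Lowering needs a complement to lower into")
    yp = xp.complement
    merged = Phrase(yp.label, f"{yp.head}+{xp.head}", yp.complement)
    return Phrase(xp.label, "", merged)


# Simplified, assumed spine: [CP ki [TP Vfin [VoiceP Voice]]]
spine = Phrase("CP", "ki", Phrase("TP", "Vfin", Phrase("VoiceP", "Voice")))
lowered = lower(spine)
print(lowered.complement.head)             # head of TP after Lowering
print(lowered.complement.complement.head)  # head of VoiceP, untouched
```

Under these assumptions the clitic can only end up on the finite verb in T; reaching VoiceP would require skipping T, which is exactly the non-local step the text argues Lowering does not allow.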

Local Dislocation takes place later in the derivation, namely simultaneously with Vocabulary Insertion (VI). Embick & Noyer (2001) base their analysis on the Late Linearization Hypothesis, i.e. they assume that linearization takes place at VI. Consequently, Local Dislocation operates on a flat structure. Another crucial property of Local Dislocation is that it is local, i.e. it can only affect string-adjacent elements. If ki is base-generated in C0, it can never be string-adjacent to the DO in the following clause:

    1. (72)
    1. [CP <ki> [TP Subj [… DO …[ VP prev]] verb] <ki>]

Hence, the crucial argument against a Local Dislocation analysis of ki comes from the fact that its displacement is not local. Furthermore, I have shown in Section 3 that the position of ki seems to be sensitive to hierarchical structure, contrary to what is argued for Local Dislocation.

The morphological theory of clitics (Embick & Noyer 2001) explicitly allows for one further purely phonological displacement operation, Prosodic Inversion (Halpern 1995).

    1. (73)
    1. Prosodic Inversion (Halpern 1995: 5)
    2. For a clitic X, which must have a prosodic host to its left (respectively right)
    1.  
    1. a.
    1. if there is a ω, Y, comprised of material which is syntactically immediately to the left (right) of X, then adjoin X to the right (left) of Y
    1.  
    1. b.
    1. else attach X to the right (left) edge of the ω composed of syntactic material immediately to its right (left)

As a prosodic operation, PI is blind to the category of its host, as is ki. However, similarly to Local Dislocation, PI can attach a clitic only to the closest element to its right or to its left. If CP in Khanty is head-final, ki would never be able to escape the ultimate position (74). If CP in Khanty is head-initial, PI can derive the ban on the clause-initial position of ki, but it only allows ki to appear in the second position (75).

    1. (74)
    1. a.
    1. [CP [TP Subj [VoiceP <Subj> Obj <V>] Vfin] =ki]
    1.  
    1. b.
    1. ok Subj Obj V=ki
    1.  
    1. c.
    1. *Subj Obj=ki V
    1. (75)
    1. a.
    1. [CP =ki [TP Subj [VoiceP <Subj> Obj <V>] Vfin] ]
    1.  
    1. b.
    1. *=ki Subj Obj V
    1.  
    1. c.
    1. ok Subj=ki Obj V
    1.  
    1. d.
    1. *Subj Obj=ki V

As discussed throughout this paper, the Khanty conditional clitic can in fact occur in all the positions banned by the Prosodic Inversion rule. Prosodic Inversion has the same disadvantage as Local Dislocation: it is too local. If a clitic is base-generated in C0, it can never occur in the middle of a clause, skipping possible hosts closer to the clause edge. Hence, PI is also insufficient for describing the behaviour of ki.
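The predictions in (74) and (75) can be reproduced with a minimal sketch of Prosodic Inversion for an enclitic that starts out at a clause edge. The function below is a simplification of (73) restricted to this case, and `prosodic_inversion` and its parameters are hypothetical names.

```python
def prosodic_inversion(words, clitic="=ki", edge="left"):
    """Place an enclitic that is base-generated at a clause edge.

    edge="right": head-final CP, the clitic follows the whole clause.
    edge="left":  head-initial CP, the clitic precedes the whole clause.
    """
    if not words:
        raise ValueError("an enclitic needs at least one word to lean on")
    hosted = list(words)
    if edge == "right":
        # Clause (a) of (73): there is a word immediately to the left,
        # so the enclitic simply attaches to it.
        hosted[-1] += clitic
    elif edge == "left":
        # Clause (b) of (73): nothing stands to the left, so the clitic
        # inverts with the first word to its right.
        hosted[0] += clitic
    else:
        raise ValueError("edge must be 'left' or 'right'")
    return " ".join(hosted)


clause = ["Subj", "Obj", "V"]
head_final = prosodic_inversion(clause, edge="right")
head_initial = prosodic_inversion(clause, edge="left")
print(head_final)
print(head_initial)
```

Whichever edge the clitic starts from, the sketch only ever produces the ultimate or the second position; a clause-medial order such as the one starred in (74c) and (75d) is out of reach.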

5.4 Interim summary

The two existing analyses of the conditional marker in Ob-Ugric (Nevis 1990 and Embick & Noyer 1999) cannot account straightforwardly for all the data in Kazym Khanty. Both are able to account for the ultimate and penultimate positions, and Embick & Noyer’s (1999) analysis additionally captures the second position of ki. However, neither of them can handle ki in the third or fourth position from the beginning of a clause, and neither can handle the phrase-internal position of ki. The same is true for the PF-model of clitic displacement in Embick & Noyer (2001). The three operations available in this model are too restrictive to capture the data. Lowering is unable to skip heads and is blind to adjuncts, unlike ki. Local Dislocation and Prosodic Inversion are too local for ki.

6 Conclusion

This paper has a threefold goal. First, it provides a full description of the behaviour of the conditional enclitic ki in Kazym Khanty. Preliminarily, the same analysis extends to all Ob-Ugric languages and dialects. So far, only three possible positions of the conditional marker in Ob-Ugric languages have been taken into account: “penultimate”, ultimate, and after a first XP (bold). However, a number of other positions are also possible in Kazym Khanty (bold, underlined) (76).

    1. (76)
    1. [dp (*Dem)/%Num/Adj/nominal modifier/possessor ki N] ki Adverb ki Obj ki prev ki (neg) ki Vfin ki.

A second empirical observation is based on Embick & Noyer (1999)’s generalization, that ki follows an emphasized constituent. In the present analysis, emphasized constituent is formalized as the most prosodically prominent phonological phrase in the prosodic structure. It can be either wide focus (VoiceP), or contrastive topic/focus phrase. Background topic can never be prosodically prominent and can never host ki.

A third empirical observation suggests a better definition of the default position of ki. In previous research (Embick & Noyer 1999; Nevis 1990), the default position was defined as penultimate. I have shown that the correlation between wide focus/emphasis on negation and the position of ki makes it possible to argue that the default position is better defined as a VoiceP-adjoined position, or rather as the position after the prosodic phrase which corresponds to VoiceP. A closer look at Khanty suggests that, at least in Ob-Ugric, there is no reason to talk about second-to-last-position clitics.

Taken together, these empirical observations make it possible to falsify the two existing analyses of conditional clitics in Ob-Ugric languages.

The second goal of this paper is to suggest a new analysis which can be more successful in explaining the placement of the conditional enclitic in Kazym Khanty (and in Ob-Ugric in general). If the placement of ki is postsyntactic and governed by prosodic phrasing and prominence, all available positions for ki can be predicted. This explains the dependence of the position of ki on information structure, namely the requirement to follow an emphasized constituent. The “exceptional” placement of ki after negation and NP-modifiers can be explained as well, as a consequence of contrastive focus (=emphasis) on these elements, which allows them to be singled out as a separate prosodic phrase (by inserting the full form of the negation marker or by moving NP-modifiers to an outer Spec,DP). Though a subsequent empirical study of prosodic phrasing in Kazym Khanty is required, the present analysis captures all the empirical generalizations for Khanty. If the present analysis is correct, it can be taken as further evidence that there exists a class of prosodic clitics which are indeed displaced postsyntactically in the hierarchically organized prosodic structure. This, in turn, allows for more operations, and importantly for non-local operations, in the PF.

The third goal of this paper is to argue for the existence of non-local clitic displacement in the PF. Embick & Noyer (2001) explicitly argue that all clitic displacement operations in the PF are locally restricted, i.e. clitics interact only with the nearest head/word that is a possible host. Recently, a couple of counterexamples to this generalization have been introduced (Bennett et al. 2016; Dawson 2017); however, I do not know of any systematic, theory-revisiting study. In this context, it remains important to enrich the empirical base of non-local behavior of PF clitics, so that such a study can be carried out.

Notes

  1. Note that a possible host for ki should be in the same clausal domain, i.e. an item in a matrix clause cannot host ki (i).
      1. (i)
      1. muχti
      2. straight_away
      1. weλ-λ-a
      2. kill-npst-pass
      1. pa
      2. and
      1. ńeλ-λ-a,
      2. devour-npst-pass
      1. [(*ki)
      2. if
      1. ńăχχǝt-ǝλ
      2. laugh-npst
      1. (ki) ]
      2. if
      1. ‘If she laughs, she will be killed and devoured straight away.’
  2. Also note that only one copy of ki is allowed per clause. If an example has multiple copies of ki, parentheses indicate that ki can occupy any of these positions, but only one of them.
  3. This is common for conditional markers in Uralic languages (compare e.g. Embick & Noyer (1999), Klumpp & Skribnik 2022: p.1020–1021 for Mansi and Komi).
  4. Klumpp & Skribnik (2022) argue that only focal material can host the conditional clitic in Uralic. However, in some examples, topical contrast is preferred, e.g. in (10), two alternatives for the object are presupposed, and in (14-b) the presence of object conjugation suggests that the object is topical (É. Kiss 2021; Klumpp & Skribnik 2022). Hence, my data so far do not show that (contrastive) topic and focus differ in the placement of ki: both emphasized (=contrastive) topical subjects and objects, as well as emphasized focal objects, have the same requirement on ki to follow them.
  5. In the remainder of the paper emphasized constituents are underlined.
  6. Additionally, the ultimate position of ki is used to express a wish, but I leave such contexts out of the scope of the present study.
  7. Modifiers also never raise over adverbials in Kazym Khanty ((i), also Pleshak 2018).
      1. (i)
      1. *Karti
      2.   iron
      1. (ki)
      2. if
      1. sora
      2. fast
      1. jintǝp
      2. needle
      1. χǝm-ǝs-n.
      2. throw-pst-2sg
      1.   Intend.: ‘If you throw an/the IRON needle fast…’
  8. I consider only PPs as a test for complements. At the present state of research, it is not clear, whether DPs in Kazym Khanty can have complements at all (Pleshak 2018).
  9. It is hard to say, whether VP or the preverb alone is topicalized in (24). For the present purposes, however, it is not important.
  10. A detailed discussion of the non-clitical status of discourse adverbials and negation marker ănt is a topic for separate research. Here I provide a couple of arguments, which allow to treat these as non-bound words. First, the adverbial aλpa ‘probably’ is phonologically heavy. Secondly, aλpa can be both a first and a last element in a clause (i), i.e. it does not have a strict requirement to be attached to a host on its right or on its left.
      1. (i)
      1. a.
      1. Aškolaj-ǝλ
      2. school-poss.3sg
      1. jăša
      2. a_bit
      1. pa
      2. add
      1. rɵpit-s,
      2. work-pst
      1. wɛtkɛm
      2. five
      1. year
      1. aλpa
      2. probably
      1. ‘The school worked a bit longer, probably five years.’ [text collection, Moscow research group]
      1.  
      1. b.
      1. aλpa
      2. probably
      1. jăša
      2. a_bit
      1. tărm-əλ
      2. fill-npst
      1. ‘Probably, it will be enough for now.’ [text collection, Moscow research group]
    As for the negation marker ănt in Kazym (Northern) Khanty, there is no theoretical or descriptive study at the moment (but see Filchenko 2015 for Eastern Khanty). Most commonly, ănt immediately precedes the finite verb and forms one phonological unit with it. It could therefore be analyzed as a proclitic hosted by the finite verb. However, I believe that ănt is not a syntactic clitic. As (27) shows, a focus marker can occur between the negation marker and the verb. As (ii) shows, the focus marker is an enclitic, i.e. it requires a host to its left.
      1. (ii)
      1. a.
      1.   Piλt
      2.   mist_net
      1. kɵrt-ew
      2. village-poss.1pl
      1. xɵλəm
      2. three
      1. pɛlək
      2. side
      1. sa
      2. ptcl
      1. towij-ən
      2. spring-loc
      1. isa
      2. entirely
      1. jiŋk
      2. water
      1. śi
      2. foc
      1.   ‘In spring, everything on the three sides, where the encampment with mist nets is located, is covered with water.’ [Western Khanty corpus]
      1.  
      1. b.
      1. *śi
      2.   foc
      1. aλəŋ
      2. in_the_morning
      1. wɛt
      2. five
      1. mɛwti
      2. crucian
      1. weλt-s-əw.
      2. catch-pst-1pl
      1.   Intend: ‘foc In the morning we’ve caught five crucians.’ [fieldwork]
    An analysis of (27) in which both śi and ki are clitics with different directions of attachment seems improbable, because it is unclear what the order of cliticization would be. Another argument against a proper clitic analysis of negation comes from Mansi, where negation can be separated from the verb by a number of full words (iii):
      1. (iii)
      1. at
      2. neg
      1. kwoss
      2. although
      1. kwon
      2. out
      1. kwāl-uŋkwe
      2. go-inf
      1. rōw-i… (=(10) Sipőcz 2015: 196)
      2. may-3sg
      1. ‘Even though he is not allowed to go out…’
    Hence, I assume that both the discourse adverbial aλpa ‘probably’ and the negation marker ănt are independent elements and not clitics. [^]
  11. There is no syntactic evidence that the default position of ki is adjoined to VoiceP and not to VP. For now, I assume that a VoiceP-adjoined position is more compatible with the wide scope of the focus. [^]
  12. Recall that an emphasized modifier that hosts ki can be topicalized out of the DP (22-a). Under some locality considerations, such as the PIC (Chomsky 2000; 2001), one could expect an intermediate step at the edge of the DP phase for low-level modifiers. Nevertheless, a detailed analysis of the DP in Kazym Khanty remains a matter for further research. A reviewer also points out that topicalization in conditional clauses is problematic. It is indeed unexpected and contradicts the data from other languages, e.g. English and German. Still, my data show that topicalization is available in both main and conditional clauses in Khanty. The problem can potentially be resolved if Khanty has a separate TopP projection, perhaps below CP. [^]
  13. There exists another tradition of prosodic research, based on Truckenbrodt (1995). In this tradition, mapping principles are formulated differently, e.g. as in (i). However, the resulting prosodic structure is similar in the two traditions, so for the present study it makes no difference which mapping principles are used.
      1. (i)
      1. a.
      1. Wrap: Each lexically headed XP is contained inside a prosodic phrase P. [=(15) Samek-Lodovici 2005: 699]
      1.  
      1. b.
      1. align-XP-L = align (XP, L, ϕ, L) ‘Align the left edge of every XP with the left edge of a phonological phrase.’ [=(4) Ishihara 2011: 1872]
    [^]
  14. Khanty has a canonical stress system, where word stress is realized with intensity and length (Kaksin 2010, V. Tyutyunnikova p.c. for Kazym Khanty, Normanskaja 2014 for Nizjam Khanty). Pitch height only marks heads of prosodic and intonational phrases. [^]
  15. Pauses between ϕs seem to increase the pitch height of the following ϕ. This affects the general pitch contour inside ι. As a consequence, the ϕ hosting ki and a ϕ following a pause can have similar pitch heights. However, this does not affect ki-placement, presumably because at normal speech tempo the (wide-)focus ϕ bears the highest pitch. [^]
  16. All trees in this paper have head-initial CPs. Kazym Khanty does not have full-word complementizers (Kaksin 2010); therefore, the actual branching direction cannot be easily established. I do not argue that the CP is necessarily head-initial in Khanty, but such a representation makes it plausible that an enclitic needs to dislocate and cannot simply be phonologically attached to a preceding word after linearization. The position of the finite verb in the syntactic structure is an open question for Khanty. However, I assume that it raises at least to Asp, because of the position of the negation marker ănt (see below). The same is true for the position of a raised DO. É. Kiss (2021) proposes an ObjP projection above TP. However, IOs and some adverbials can precede a raised DO, which is apparently incompatible with her analysis. For now, I assume the lowest possible raising position. [^]
  17. There are three main alternatives to this analysis: (i) negation is cliticized to the finite verb in the syntax; (ii) it is a VP-level adjunct; (iii) it is a clitic, base-generated in NegP right below TP and lowered to AspP. The first alternative is ruled out in footnote 10. I assume the second alternative to be out as well, because the negation marker is allowed only in clauses of the size of AspP or bigger (Bikina 2019). The analysis in (iii) is a possible alternative. In this case, it can additionally be proposed that emphasized negation does not undergo lowering (= cliticization) and stays in NegP, while the verb is raised to TP. [^]
  18. I would like to thank an anonymous reviewer for pointing out that the optionality problem can be resolved in other versions of OT (Coetzee & Pater 2011), as well as by postulating different syntactic structures (Wagner 2005; 2010; 2015). [^]
  19. I keep the original notation of the authors and do not replace vP with VoiceP throughout this section. [^]
  20. Holmberg (2014) provides an analysis of Finnish question and focus particles. For the first one, he suggests an analysis somewhat similar to Embick & Noyer (1999)’s analysis of Mansi -ke. For the focus particle, he suggests that it can be merged at any place in the structure, as long as it c-commands the focused element. However, such an analysis still cannot derive a position between a modifier and a noun. [^]

Abbreviations

1 = first person, 2 = second person, 3 = third person, 1sg>sg = agreement with a first person singular subject and a singular object, acc = accusative, add = additive marker, adv = adverb, att = attenuative, caus = causative, cmpr = comparative, cvb = converb, dat = dative, dim = diminutive, du = dual, emph = emphasis, foc = focus particle, imp = imperative, ipfv = imperfective, loc = locative, neg = negation, nfin.npst = non-finite non-past form, npst = non-past tense, opt = optative, pass = passive, pl = plural, poss.2sg = possessive marker for second person singular possessor, prev = preverb, proh = prohibitive, pst = past tense, ptcl = particle.

Acknowledgements

I am grateful to my language consultants, who were ready to work online and share their knowledge of the language with me. I would also like to thank Paula Fenger and Philipp Weisser, without whom this paper would not exist, as well as Daniel Gleim, the audience at SOUL-4 and two anonymous reviewers for their valuable comments.

Competing interests

The author has no competing interests to declare.

References

Anderson, Stephen R. 2005. Aspects of the theory of clitics. Oxford University Press. DOI:  http://doi.org/10.3366/E1750124509000348

Assmann, Anke & Edygarova, Svetlana & Georgi, Doreen & Klein, Timo & Weisser, Philipp. 2013. Possessor Case in Udmurt: Multiple case assignment feeds postsyntactic fusion. Rule Interaction in Grammar. Volume of Linguistische Arbeitsberichte. Institut für Linguistik: Universität Leipzig.

Bailyn, John Frederick. 2001. On scrambling: A reply to Bošković and Takahashi. Linguistic Inquiry 32(4). 635–658. DOI:  http://doi.org/10.1162/002438901753373023

Bárány, András & Nikolaeva, Irina. 2021. On adjoined possessors. Linguistic Inquiry 52(1). 181–194. DOI:  http://doi.org/10.1162/ling_a_00370

Bennett, Ryan & Elfner, Emily & McCloskey, James. 2016. Lightest to the right: An apparently anomalous displacement in Irish. Linguistic Inquiry 47(2). 169–234. DOI:  http://doi.org/10.1162/LING_a_00209

Bikina, Darja. 2019. Syntax of non-finite relative clauses in Kazym Khanty: Higher School of Economics, Moscow. MA thesis.

Biskup, Petr. 2006. Scrambling in Czech: Syntax, semantics, and information structure. Proceedings of NWLC 21. 1–15.

Burdin, Rachel Steindel & Phillips-Bourass, Sara & Turnbull, Rory & Yasavul, Murat & Clopper, Cynthia G. & Tonhauser, Judith. 2015. Variation in the prosody of focus in head-and head/edge-prominence languages. Lingua 165. 254–276. DOI:  http://doi.org/10.1016/j.lingua.2014.10.001

Choi, Hye-Won. 1998. Optimizing structure in context: The case of German scrambling. In Western Conference On Linguistics. 56.

Choi, Hye-Won. 1999. Optimizing structure in context: Scrambling and information structure. Stanford: CSLI Publications.

Chomsky, Noam. 1973. Conditions on transformations. In Anderson, Stephen R. & Kiparsky, Paul (eds.), A Festschrift for Morris Halle, 232–286. New York: Holt, Rinehart and Winston.

Chomsky, Noam. 2000. Minimalist Inquiries: The Framework. In Martin, Roger & Michaels, David & Uriagereka, Juan & Keyser, Samuel Jay (eds.), Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, 89–155. Cambridge, Mass.: MIT Press.

Chomsky, Noam. 2001. Derivation by Phase. In Kenstowicz, Michael (ed.), Ken Hale: A Life in Language, 1–52. Cambridge, Mass.: MIT Press. DOI:  http://doi.org/10.7551/mitpress/4056.003.0004

Coetzee, Andries W. & Pater, Joe. 2011. The place of variation in phonological theory. The handbook of phonological theory, 401–434. DOI:  http://doi.org/10.1002/9781444343069.ch13

Cysouw, Michael. 2005. Morphology in the wrong place: A survey of preposed enclitics. Amsterdam studies in the theory and history of Linguistic science series 4 264. 17. DOI:  http://doi.org/10.1075/cilt.264.02cys

Danon, Gabi. 2012. Two structures for numeral-noun constructions. Lingua 122(12). 1282–1307. DOI:  http://doi.org/10.1016/j.lingua.2012.07.003

Dawson, Virginia. 2017. Optimal clitic placement in Tiwa. In Proceedings of the Forty-Seventh Annual Meeting of the North East Linguistic Society, vol. 1. 243–256.

Deal, Amy Rose. 2013. Possessor raising. Linguistic Inquiry 44(3). 391–432. DOI:  http://doi.org/10.1162/LING_a_00133

Dehé, Nicole. 2015. Particle verbs in Germanic. In Müller, Peter O. & Ohnheiser, Ingeborg & Olsen, Susan & Rainer, Franz (eds.), Word Formation, An International Handbook of the Languages of Europe. Volume 1: Word-Formation, 611–626. Berlin, München, Boston: De Gruyter Mouton. DOI:  http://doi.org/10.1515/9783110246254-037

Dékány, Éva. 2011. A profile of the Hungarian DP. The interaction of lexicalization, agreement and linearization with the functional sequence. PhD diss., University of Tromsø, Tromsø.

Dékány, Éva. 2015. The syntax of anaphoric possessives in Hungarian. Natural Language & Linguistic Theory 33(4). 1121–1168. DOI:  http://doi.org/10.1007/s11049-014-9278-0

Diesing, Molly. 1992. Indefinites, vol. 20. Cambridge, MA: MIT Press.

É. Kiss, Katalin. 1994. Sentence structure and word order. In The syntactic structure of Hungarian, 1–90. Brill. DOI:  http://doi.org/10.1163/9789004373174

É. Kiss, Katalin. 2003. Argument scrambling, operator movement, and topic movement in Hungarian. Word order and scrambling, 22–43. DOI:  http://doi.org/10.1002/9780470758403.ch2

É. Kiss, Katalin. 2014. Ways of licensing Hungarian external possessors. Acta Linguistica Hungarica 61(1). 45–68. DOI:  http://doi.org/10.1556/ALing.61.2014.1.2

É. Kiss, Katalin. 2021. What determines the varying relation of case and agreement? Evidence from the Ugric languages. Acta Linguistica Academica 67(4). 397–428. DOI:  http://doi.org/10.1556/2062.2020.00024

Elfner, Emily. 2018. The syntax-prosody interface: Current theoretical approaches and outstanding questions. Linguistics Vanguard 4(1). DOI:  http://doi.org/10.1515/lingvan-2016-0081

Elfner, Emily Jane. 2012. Syntax-prosody interactions in Irish. University of Massachusetts Amherst. DOI:  http://doi.org/10.7275/3545-6n54

Embick, David & Noyer, Rolf. 1999. Locality in post-syntactic operations. MIT Working Papers in Linguistics 34. 265–317.

Embick, David & Noyer, Rolf. 2001. Movement operations after syntax. Linguistic Inquiry 32(4). 555–595. DOI:  http://doi.org/10.1162/002438901753373005

Féry, Caroline & Ishihara, Shinichiro. 2010. How focus and givenness shape prosody. Information structure from different perspectives 36–63. DOI:  http://doi.org/10.1093/acprof:oso/9780199570959.003.0003

Filchenko, Andrey. 2011. Prosody and pragmatics of Eastern Khanty narratives [Prosodika i pragmatika vostochno-khantyjskikh narrativov]. Vestnik Tomskogo gosudarstvennogo pedagogicheskogo universiteta 9. 139–145.

Filchenko, Andrey. 2015. Negation in Eastern Khanty. In Miestamo, Matti & Tamm, Anne & Wagner-Nagy, Beáta (eds.), Negation in Uralic languages, 159–190. John Benjamins. DOI:  http://doi.org/10.1075/tsl.108.06fil

Gobbo, Francesca Del & Munaro, Nicola & Poletto, Cecilia. 2015. On sentential particles: A crosslinguistic study, 359–386. Berlin, München, Boston: De Gruyter Mouton. DOI:  http://doi.org/10.1515/9783110375572-015

Halpern, Aaron. 1995. On the placement and morphology of clitics. Center for the Study of Language (CSLI).

Hinterhölzl, Roland. 2012. Some notes on scrambling and object shift. Discourse and grammar: A Festschrift in honour of Valeria Molnar 305–321.

Holmberg, Anders. 2014. The syntax of the Finnish question particle. Functional Structure from Top to Toe: The Cartography of Syntactic Structures 9. 266–289. DOI:  http://doi.org/10.1093/acprof:oso/9780199740390.003.0009

Ishihara, Shinichiro. 2011. Japanese focus prosody revisited: Freeing focus from prosodic phrasing. Lingua 121(13). 1870–1889. New insights into the Prosody-Syntax interface: Focus, phrasing, language evolution. DOI:  http://doi.org/10.1016/j.lingua.2011.06.008

Kaksin, Andrej D. 2010. Kazymskij dialekt khantyjskogo jazyka. (Russian) [Kazym dialect of Khanty]. Khanty-Mansijsk.

Klavans, Judith L. 1985. The independence of syntax and phonology in cliticization. Language 95–120. DOI:  http://doi.org/10.2307/413422

Klumpp, Gerson & Skribnik, Elena. 2022. Information Structuring. In Bakró-Nagy, Marianne & Laakso, Johanna & Skribnik, Elena (eds.), The Oxford Guide to the Uralic Languages, chap. 54, 1018–1036. Oxford University Press. DOI:  http://doi.org/10.1093/oso/9780198767664.003.0054

Neeleman, Ad & Van de Koot, Hans. 2008. Dutch scrambling and the nature of discourse templates. The Journal of Comparative Germanic Linguistics 11(2). 137–189. DOI:  http://doi.org/10.1007/s10828-008-9018-0

Nevis, Joel Ashmore. 1990. Sentential clitics in Finno-Ugric. DOI:  http://doi.org/10.1515/flin.1990.24.3-4.349

Nikolaeva, Irina. 1999. Ostyak. Lincom Europa.

Normanskaja, Julia. 2014. The System of Accent in the Nizjam and South Dialects of Khanty [Sistema udarenija v nizjamskom dialekte khantyjskogo jazyka i ee paralleli v juzhnokhantyjskom]. Linguistica Uralica 4(50). 283–302. DOI:  http://doi.org/10.3176/lu.2014.4.04

Pleshak, Polina. 2018. Nekotorye suzhety o khantyiskoj IG. (Russian) [Some topics about the Khanty NP].

Plotnikov, Ilya. 2021. Intonation system in Surgut dialect of Khanty [Intonatsionnaja sistema surgutskogo dialekta khantyjskogo jazyka]. Jazyki i folklor korennykh narodov Sibiri 2. 25–43. DOI:  http://doi.org/10.25205/2312-6337-2021-2-25-43

Rolle, Nicholas. 2020. In support of an OT-DM model: Evidence from clitic distribution in Degema serial verb constructions. Natural Language & Linguistic Theory 38. 201–259. DOI:  http://doi.org/10.1007/s11049-019-09444-z

Sahkai, Heete & Mihkla, Meelis. 2017. Intonation of Contrastive Topic in Estonian. In Proc. Interspeech 2017. 3181–3185. DOI:  http://doi.org/10.21437/Interspeech.2017-840

Samek-Lodovici, Vieri. 2005. Prosody–syntax interaction in the expression of focus. Natural Language & Linguistic Theory 23(3). 687–755. DOI:  http://doi.org/10.1007/s11049-004-2874-7

Samek-Lodovici, Vieri. 2015. The interaction of focus, givenness, and prosody: A study of Italian clause structure. Oxford University Press. DOI:  http://doi.org/10.1093/acprof:oso/9780198737926.001.0001

Schoenmakers, Gert-Jan & Poortvliet, Marjolein & Schaeffer, Jeannette. 2022. Topicality and anaphoricity in Dutch scrambling. Natural Language & Linguistic Theory 40(2). 541–571. DOI:  http://doi.org/10.1007/s11049-021-09516-z

Selkirk, Elisabeth. 2009. On clause and intonational phrase in Japanese: The syntactic grounding of prosodic constituent structure. Gengo Kenkyu 136. 35–73. DOI:  http://doi.org/10.11435/gengo.136.035

Selkirk, Elisabeth. 2011. The syntax-phonology interface. The handbook of phonological theory 2. 435–483. DOI:  http://doi.org/10.1002/9781444343069.ch14

Sipőcz, Katalin. 2015. Negation in Mansi. In Miestamo, Matti & Tamm, Anne & Wagner-Nagy, Beáta (eds.), Negation in Uralic languages, 191–219. John Benjamins. DOI:  http://doi.org/10.1075/tsl.108.07sip

Smeets, Liz & Wagner, Michael. 2018. Reconstructing the syntax of focus operators. Semantics and Pragmatics 11. 6. DOI:  http://doi.org/10.3765/sp.11.6

Smith, Peter W. 2020. Object agreement and grammatical functions: A re-evaluation. Agree to Agree 117. DOI:  http://doi.org/10.5281/zenodo.3541749

Sosa, Sachiko. 2020. Pilot study of intonation units in Khanty discourse. In Scripta miscellanea in honorem Ulla-Maija Forsberg. yomas symyn nékve vortur etpost samyn patum, 369–386. Suomalais-Ugrilainen Seura. DOI:  http://doi.org/10.33341/sus.11.25

Spencer, Andrew & Luis, Ana R. 2012. Clitics: An Introduction (Cambridge Textbooks in Linguistics). Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9781139033763

Svenonius, Peter. 2008. The position of adjectives and other phrasal modifiers in the decomposition of DP. Adjectives and adverbs: Syntax, semantics, and discourse 16–42.

Szabolcsi, Anna. 1983. The possessor that ran away from home 3(1). 89–102. DOI:  http://doi.org/10.1515/tlir.1983.3.1.89

Szabolcsi, Anna. 1994. The noun phrase. In The syntactic structure of Hungarian, 179–274. Brill. DOI:  http://doi.org/10.1163/9789004373174

Truckenbrodt, Hubert. 1995. Phonological phrases: Their relation to syntax, focus, and prominence: Massachusetts Institute of Technology dissertation.

Wagner, Michael. 2005. Prosody and recursion: Massachusetts Institute of Technology dissertation.

Wagner, Michael. 2010. Prosody and recursion in coordinate structures and beyond. Natural Language & Linguistic Theory 28. 183–237. DOI:  http://doi.org/10.1007/s11049-009-9086-0

Wagner, Michael. 2015. Phonological evidence in syntax. In Kiss, Tibor & Alexiadou, Artemis (eds.), Syntax – theory and analysis, volume 2, 1154–1198. De Gruyter Mouton. DOI:  http://doi.org/10.1515/9783110363708-011

Weisser, Philipp. 2020. How Germans move their ‘but’s: A case of prosodic inversion across phrases. In Proceedings of the North East Linguistic Society, vol. 50.

Weisser, Philipp. submitted. A prosodically determined second-position clitic in German: The case of the clause-internal conjunction ‘aber’.

Wurmbrand, Susi. 2000. The structure(s) of particle verbs. Ms.

Zakirova, Aigul & Muravjev, Nikita. 2019. Preverby nǒχ i jǒχi v zapadnykh dialektakh khantyiskogo jazyka: aspektualnyj i diskursivnyj analiz. (Russian) [Preverbs nᵾχ and jᵾχi in the Western dialects of Khanty: an analysis of aspectual and discourse properties.]. Uralo-Altaiskie issledovanija [Uralo-Altaic studies] 4(35). 53–70.