1 Introduction

This paper shows that focus-projection patterns exist in a language that is prosodically and syntactically considerably different from English, for which focus projection is commonly discussed.1 The language is Georgian (Kartvelian). Prosodically, sentential prominence in English is expressed by pitch accents (Pierrehumbert 1980; Selkirk 1984; Ladd 1996), but Georgian uses boundary tones/phrasing instead (Skopeteas & Féry 2010; 2016). Syntactically, narrow foci in English can occupy any position, whereas Georgian foci are immediately preverbal (Skopeteas, Féry & Asatiani 2009; Asatiani & Skopeteas 2012). This paper shows that, nonetheless, the Georgian facts can be accounted for in the Default Prosody tradition of analyzing focus projection. Moreover, the Georgian facts provide the kind of evidence in favor of the Default Prosody approaches (as opposed to F-projection approaches) that cannot be gathered from English.

The basic focus projection facts in English are as follows. In broad-focus utterances, sentential stress, realized as a nuclear pitch accent, targets the (direct) object, while the subject carries a pre-nuclear accent. Utterances with VP- and object focus have the same prosodic realization. Accordingly, the prosodic realization of the object is the same in all three contexts: it carries the nuclear pitch accent (Contreras 1976; Culicover & Rochemont 1983; Selkirk 1984; von Stechow & Uhmann 1986; Cinque 1993; Reinhart 1995; Zubizarreta 1998). In contrast, in subject-focus utterances, subjects carry the nuclear pitch accent. This differentiates them from subjects in broad-focus contexts, which carry pre-nuclear accents. This is exemplified in (1) and schematized in Table 1. In (1), boldfacing indicates the focused constituent, small caps mark prosodically prominent constituents (i.e., those that carry the nuclear pitch accent), double underscoring marks constituents realized in the same way when narrowly focused and in broad focus, and wavy underscoring marks those that are realized differently from each other when narrowly focused and in broad focus.

Table 1

The focus-projection pattern.


In contrast with English, Georgian allows for both OV and VO broad-focus utterances, and narrow foci surface immediately preverbally (Pochkhua 1962; Aronson 1982; Skopeteas & Fanselow 2010). Accordingly, in Georgian, an utterance with either subject or object focus can have a string-identical broad-focus minimal pair, as shown in (2–3). This means that the prosodic behavior of subjects and objects, when narrowly focused and in broad focus, can be directly compared in Georgian, without the complication of word order differences. The prosodic organization of Georgian is also markedly different from that of English. Georgian is a ‘phrase language’: it relies on boundary tones/phrasing in signaling information-structural notions like focus, while stress-aligned pitch accents are distributed automatically and do not have the same functional load as pitch accents in English (Skopeteas & Féry 2010; 2016; Vicenik & Jun 2014).

Transferring the pattern in Table 1 to Georgian makes the following predictions. The object in an object-focus utterance should be realized in the same way as one in an SOV broad-focus utterance; this is shown in (2). The subject in a subject-focus utterance should be realized differently from one in an SVO broad-focus utterance; this is shown in (3). As in (1) for English, the two comparisons set up here are (i) between the constituents with double underscoring and (ii) between the constituents with wavy underscoring.

The phenomenon of focus projection is commonly understood in terms of pitch accent distribution (Gussenhoven 1983a; Selkirk 1984; 1995; Ladd 1996, a.o.). Georgian, however, is a ‘phrase language’, which does not rely on pitch accents in focus marking (Skopeteas & Féry 2010; 2016). Nevertheless, the results presented here show that the predictions in (2–3) are borne out in Georgian: the realization of narrowly focused objects matches that of objects in broad-focus utterances, while narrowly focused subjects are realized differently from subjects in broad-focus contexts (in a matching linear position). This conclusion is based on several acoustic cues: (i) the type of pitch accent and final boundary tone on the focused constituent (as part of the global F0 contour), (ii) the height of the final boundary tone, and (iii) the duration of the stressed (initial) syllable. F0-based cues were investigated because focus is commonly cued by alignment with tonal events, like pitch accents, or, in phrase languages, prosodic boundaries (Pierrehumbert 1980; Ladd 1996; Féry 2017, a.o.). An increase in duration of the stressed syllable is also known to cue narrow focus, including in phrase languages (Heldner & Strangert 2001; Xu & Xu 2005; Puri 2013, a.o.). Because pitch accents do not signal prominence in a phrase language, these cues are the likely candidates for marking focus in Georgian – as well as focus projection.

The results show that both narrowly focused objects and those in broad-focus contexts carry an L* Ha F0 contour, the default one in Georgian, where L* is a low pitch accent and Ha is a high final boundary tone of an Accentual Phrase (Vicenik & Jun 2014). These contexts do not differ in the duration of the stressed syllable, either. In contrast, narrowly focused subjects differ from those in broad-focus contexts: they carry an (L)H* La F0 contour, as opposed to L* Ha, and have greater duration of the stressed syllable. Therefore, this paper shows that a phrase language exhibits a pattern consistent with focus projection, and it does so by relying on acoustic cues that a phrase language is expected to utilize.

Focus projection has long been recognized as theoretically puzzling. It has attracted numerous explanations, which either derive the focus projection facts using a dedicated mechanism (F-projection accounts) or take them to be a by-product of general rules of default prominence distribution in a given language (Default Prosody accounts). In English, the two kinds of accounts are difficult to tell apart, in part because focus prominence and default prominence are expressed by the same means, a nuclear pitch accent. In Georgian, however, focus prominence and default prominence are expressed with different F0 contours, (L)H* La and L* Ha, respectively (and, as will be shown, default prominence overrides focus prominence in object-focus contexts). This provides a novel testing ground for evaluating the existing approaches to focus projection. The results show that the Default Prosody approaches, but not the F-projection ones, can account for the Georgian facts.

This paper is structured as follows. Section 2 introduces the phenomenon of focus projection – existing theoretical approaches (2.1) and instrumental studies (2.2) – and provides initial hypotheses for focus projection in phrase languages (2.3). Section 3 discusses the relevant facts of Georgian syntax (3.1), prosody (3.2), and prosodic focus marking (3.3), and makes Georgian-specific predictions (3.4). Section 4 introduces the methods: stimuli (4.1), procedure and participants (4.2), data processing (4.3), and analysis (4.4). Section 5 reports the results of a production study: F0 properties of focused constituents (5.1) and duration of the stressed syllable (5.2), and closes with a summary (5.3). Section 6 provides the discussion. Section 7 concludes.

2 Focus projection

2.1 Theoretical approaches

The responses to questions in (4) are prosodically identical. They show that all questions in (4), each of which selects for a focused constituent of a different size (bracketed), can be felicitously answered with an utterance that carries prosodic prominence/sentential stress (indicated with small caps) on the rightmost word, sweater. Sentential stress is realized as an (intonational) nuclear pitch accent (Ladd 1996). This prosodic ambiguity between focus on a constituent and a sub-constituent is known as focus projection. A nuclear pitch accent on the word sweater may be interpreted as signaling broad focus, VP-focus, or narrow focus on the object.

(4) a. What happened? (Nini knitted a SWEATER).
b. What did Nini do? Nini (knitted a SWEATER).
c. What did Nini knit? Nini knitted (a SWEATER).

In contrast with objects, narrow focus on the subject does not lead to such focus ambiguity: the subject carries the nuclear pitch accent, which, intuitively, is incompatible with broad focus.2 To illustrate, in (5), only (d) can be felicitously answered with an utterance with the nuclear pitch accent on the subject. As in (4), bracketing marks the focused constituent, and the word carrying the nuclear pitch accent is in small caps.

(5) a. What happened? #(NINI knitted a sweater).
b. What did Nini do? #NINI (knitted a sweater).
c. What did Nini knit? #NINI knitted (a sweater).
d. Who knitted a sweater? (NINI) knitted a sweater.

Numerous analyses of focus projection have been developed. In this section, the existing accounts are presented according to the classification in Arregi (2016).3 The existing accounts of focus projection form two camps: the Default Prosody ones and F-projection ones. All accounts subscribe to some variant of (6):

(6) Focus Prominence
The focused constituent must carry the strongest prosodic prominence in a certain domain (typically, the sentence).

The two camps diverge with respect to the rules that supplement Focus Prominence. According to the Default Prosody analyses, default principles of prominence distribution in a sentence also govern prominence distribution in a focused constituent – and create focus-projection patterns. Focus projection, thus, is epiphenomenal. In particular, the ambiguity between the responses in (4) results from an ambiguity between ‘contrastive stress’ on the object, sweater, and ‘normal (default) stress’, specified by rule for every sentence (Newman 1946). This intuition also became the basis for the Nuclear Stress Rule (NSR; Chomsky & Halle 1968), according to which sentential stress in a broad-focus utterance (in VO languages) is on the rightmost constituent. In the Default Prosody accounts, some version of the NSR supplements the rule of Focus Prominence, as exemplified in (7) with Jackendoff’s (1972) analysis. As formulated, Default Prominence can also account for the cross-linguistic variation in prominence distribution. For instance, in Basque, it is the preverbal constituent that receives sentential stress, not the rightmost one; focused constituents are also preverbal (Hualde, G. Elordieta & A. Elordieta 1994; Arregi 2002, a.o.).

(7) Default Prominence
a. Within a focused constituent, prosodic prominence is determined by default principles of prosody that are independent of focus.
b. Default prominence in English is determined by the cyclic Nuclear Stress Rule.

In (4), according to the Default Prosody approaches, focus projection takes place because in each of broad, VP-, and object focus, Default Prominence assigns sentential stress to the rightmost constituent, sweater, which leads to prosodic ambiguity between nested foci of different sizes. Focus projection simply results from the fact that expanding the size of the focused constituent to the left does not change the position of sentential stress (rightmost). In contrast, in (5), Nini, the focused subject, receives sentential stress. No ambiguity arises in responses (5a–c) because they do not comply with the requirements of Default Prominence for broad, VP-, and object focus, in terms of sentential stress placement.
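The reasoning behind (6–7) can be made concrete with a toy sketch. It is only an illustration under strong simplifying assumptions: the cyclic Nuclear Stress Rule is flattened to "the rightmost word of the focused constituent", and the names `WORDS`, `FOCI`, `sentential_stress`, and `ambiguous_foci` are hypothetical, not part of any cited analysis.

```python
# Toy model of Focus Prominence (6) plus Default Prominence (7).
# Assumption (not from the cited analyses): the cyclic NSR is reduced to
# "stress the rightmost word inside the focused constituent".

WORDS = ["Nini", "knitted", "a", "sweater"]

# Focus domains as (start, end) word indices, end exclusive.
FOCI = {
    "broad": (0, 4),    # (Nini knitted a sweater)
    "VP": (1, 4),       # Nini (knitted a sweater)
    "object": (2, 4),   # Nini knitted (a sweater)
    "subject": (0, 1),  # (Nini) knitted a sweater
}


def sentential_stress(words, focus):
    """Return the word that bears sentential stress for a focus domain."""
    _start, end = focus
    # Focus Prominence: stress falls inside the focus.
    # Default Prominence (simplified): it falls on the rightmost word there.
    return words[end - 1]


STRESS_BY_FOCUS = {
    name: sentential_stress(WORDS, span) for name, span in FOCI.items()
}


def ambiguous_foci(stressed_word):
    """List the focus sizes compatible with stress on the given word."""
    return sorted(
        name for name, word in STRESS_BY_FOCUS.items() if word == stressed_word
    )
```

Under these assumptions, `ambiguous_foci("sweater")` returns the broad, VP, and object readings together, while `ambiguous_foci("Nini")` returns only the subject reading, mirroring the contrast between (4) and (5).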

Jackendoff’s (1972) approach was later disputed (e.g., based on its non-inclusion of factors related to givenness and argument structure; Ladd 1980; Gussenhoven 1983a, a.o.), but the key intuition that default prominence distribution drives focus projection remained prominent in subsequent literature. The rule of Default Prominence has been derived from word order (Ladd 1980; Culicover & Rochemont 1983, a.o.), argument structure (Schmerling 1976; Gussenhoven 1983a; Zubizarreta 1998; Büring 2006, a.o.), syntactic depth of embedding (Cinque 1993), height in a syntactic domain (Szendrői 2001), or phase-by-phase computation (Bresnan 1971; Legate 2003; Kahnemuyipour 2004; Adger 2007; Kratzer & Selkirk 2007).

In contrast with the Default Prosody accounts, F-projection approaches calculate relative prominence of adjacent syntactic constituents, based on their structural positions. Focus projection is accounted for with the help of dedicated F-projection rules, which arise from the distribution of pitch accents in a sentence. The latter is rooted in information structure as opposed to the rules of default prominence (Selkirk 1984; 1995; Rochemont 1986; von Stechow & Uhmann 1986). This is illustrated in (8–9), based on the most prominent F-projection account, Selkirk (1984; 1995):

(8) Basic F-rule
An accented word is F-marked.
(9) F-projection rules
a. Head Projection
F-marking of the head of a phrase licenses the F-marking of the phrase.
b. Argument Projection
F-marking of an internal argument of a head licenses the F-marking of the head.

According to the F-projection rules, F-marking can spread ‘horizontally’ from the complement of XP to the head X⁰ (Argument Projection), then ‘vertically’ to the whole XP (Head Projection), and so on, thereby creating a succession of nested foci. This accounts for the ambiguity in (4): when the object is accented, F-projection rules derive different focus sizes, given that an accented object can ‘spread’ F-marking up the syntactic tree. The lack of focus projection in (5) is also accounted for: an accented subject does not license F-marking of the verb because the subject is not its syntactic sister. Accordingly, F-marking is not projected any further than the subject itself (cf. also von Stechow & Uhmann 1986; Selkirk 1995). An additional set of rules governs F-marking in embedded contexts and the interaction of F-marking with givenness.
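The way rules (8–9) propagate F-marking can be sketched as a small fixpoint computation over a toy tree. This is an illustrative sketch only, not Selkirk's formalization: the `Node` class, the node labels, and the function names are hypothetical, the determiner is omitted, and the tree assumes that VP counts as the head of S so that Head Projection can reach the clause.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    label: str
    head: Optional["Node"] = None       # head daughter of a phrase
    argument: Optional["Node"] = None   # internal argument of the head
    specifier: Optional["Node"] = None  # e.g. the subject; never projects


def nodes(tree):
    """Yield every node of the tree, top down."""
    yield tree
    for child in (tree.specifier, tree.head, tree.argument):
        if child is not None:
            yield from nodes(child)


def licensed_f_marking(tree, accented_label):
    """Return the maximal set of labels whose F-marking (8)-(9) license."""
    f_marked = {accented_label}  # (8) Basic F-rule: accented word is F-marked
    changed = True
    while changed:
        changed = False
        for node in nodes(tree):
            head, arg = node.head, node.argument
            # (9b) Argument Projection: F-marked argument licenses the head.
            if arg is not None and head is not None:
                if arg.label in f_marked and head.label not in f_marked:
                    f_marked.add(head.label)
                    changed = True
            # (9a) Head Projection: F-marked head licenses the phrase.
            if head is not None:
                if head.label in f_marked and node.label not in f_marked:
                    f_marked.add(node.label)
                    changed = True
    return f_marked


# Toy structure for "Nini knitted a sweater" (assumption: VP heads S).
OBJ = Node("NP:sweater", head=Node("N:sweater"))
SUBJ = Node("NP:Nini", head=Node("N:Nini"))
VP = Node("VP", head=Node("V:knitted"), argument=OBJ)
S = Node("S", head=VP, specifier=SUBJ)
```

With an accent on the object noun, `licensed_f_marking(S, "N:sweater")` reaches the object NP, the verb, the VP, and S, which corresponds to the nested foci in (4). With an accent on the subject noun, `licensed_f_marking(S, "N:Nini")` stops at the subject NP, as in (5), because a specifier triggers neither rule.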

An advantage of F-projection analyses over Default Prosody ones is that they make more detailed predictions about the correlation between the distribution of pitch accents and information structure, which turn out to be correct. On the other hand, later work showed that F-projection rules wrongly limit F-projection to heads and internal arguments, because F-projection from modifiers is possible as well (Büring 2006). Default Prominence was shown to be indispensable for the correct analysis of focus projection, based e.g. on the difficulty of defining the relation between F-marking in a phrase and a particular constituent in it (Jacobs 1988; Schwarzschild 1999; Büring 2006).

2.2 Existing instrumental studies

The judgements about whether responses like those in (4) are indeed ambiguous have been subject to empirical scrutiny. The question guiding these investigations, on English and German, has been whether the prosodic ambiguity between nested foci of different sizes, which has been reported in e.g. Jackendoff (1972) and Selkirk (1984), can be demonstrated instrumentally. Several studies on English found no differences between the prosody of object focus and broad focus, as focus projection accounts predict: in perception experiments, listeners did not reliably distinguish replies to questions eliciting broad or VP-focus from object focus (Gussenhoven 1983b; Birch & Clifton Jr. 1995; Welby 2003). On the other hand, Rump and Collier (1996) and Bishop (2010) found, in perception studies, that listeners can reliably tell object focus from broad focus. Breen et al. (2010) also concluded that object- and broad-focus utterances differ in gradient phonetic cues, such as mean and maximum F0 values, intensity, and duration. In German, Baumann et al. (2006; 2007) found that speakers differentiate object focus from broad focus with variable patterns of upstep, downstep, and duration.

Other work has shown that focus projection does not obtain in some Romance languages. In Neapolitan Italian declaratives, narrowly focused objects carry a different pitch accent (L+H*), as compared to objects in broad-focus utterances (H+L*) (D’Imperio 1997; 2000). Similarly, in European Portuguese, sentence-final objects bear a falling accent with an early peak (H+L*) under broad focus, but a falling accent with a late peak (H*+L) under narrow focus (Frota 2000; 2012).

2.3 Focus projection in a phrase language

With respect to phrasal/sentential prosody, two language types are commonly recognized: those that mark sentential prominence culminatively, via alignment and identity of pitch accents, anchored to metrically strong syllables, and those that mark it demarcatively, via alignment of prominent syntactic constituents with prosodic boundaries (Jun 2005). Following Féry (2017), the former are called “intonation languages with lexical stress” (henceforth “intonation languages”), while the latter are “phrase languages”. To signal information structure, phrase languages add/delete phrase boundaries at constituent edges. The resulting phrases may but do not have to carry pitch accents; the distribution of pitch accents alone does not necessarily reflect information-structural notions (e.g., focused vs. given) (Skopeteas & Féry 2016).

Phrase languages form two groups (Féry 2017: 269). In those with no evidence for word stress, like French or Korean, pitch accents may be distributed in prosodic constituents larger than a prosodic word, without reference to metrically strong syllables (Vaissière 1983; Jun & Fougeron 1995). In phrase languages with (weakly implemented) lexical stress, like Hindi or Georgian, pitch accents are present, but phrasal tones are predominant. The latter language type is the object of study in this paper (more on the prosodic make-up of Georgian in Section 3.2). In addition to phrase-language status, Hindi and Georgian share another prosodic property: nuclear pitch accent/sentential stress has been explicitly described as absent from Hindi (Bansal 1976: 27) and Georgian (Dzidziguri 1954; Alkhazishvili 1959; Zhghenti 1963; 1965). When asked, Georgian speakers who took part in the present study also reported no consistent intuitions about sentential stress. This is unlike English speakers, whose intuitions about nuclear stress placement converge, as demonstrated by broad agreement about the facts in the literature since at least Newman (1946).

These properties of phrase languages – pitch accents not used to mark information structure and lack of sentential stress – make them difficult to discuss from the perspective of focus projection, because, as shown in Section 2.1, it is precisely these notions that are used to model focus projection. While the existing accounts of focus projection do not make explicit cross-linguistic predictions, such predictions can be formulated. In fact, the two types of approaches discussed in Section 2.1 make different predictions about the availability and/or prosodic expression of focus projection in phrase languages. According to the Default Prosody approaches, exemplified in (7) above, the expression of default prominence is not explicitly tied to the notions of pitch accent or nuclear stress: rather, the main idea is that the default principles of prominence distribution govern focus projection. Viewed from the cross-linguistic perspective, this means that the phonetic expression of focus projection may be subject to cross-linguistic variation. Accordingly, Default Prosody approaches give rise to the DF (= Default Prosody for Focus) Hypothesis:

(10) DF Hypothesis:
Focus projection relies on language-specific means of prominence marking (e.g., prosodic boundaries) in languages that do not utilize pitch accents to mark prominence.

In contrast, the canonical F-marking approach, Selkirk (1984; 1995), explicitly defines F-marking as a function of accenting, as was shown in (8). This stems from Selkirk’s (1984; 1995) analysis of sentential prosody, according to which placement of pitch accents (as opposed to other tonal targets, e.g. boundary tones) is of primary importance both for prosodic structure building and for focus marking. This is because stress-aligned pitch accents signal prominence in languages like English. Therefore, it is not surprising that, in Selkirk’s account, a fundamental link between pitch accent placement and F-marking is also established.4 Selkirk’s account was not meant to be English-specific (see, e.g., Kenesei & Vogel 1989 on Hungarian), but it is clear that it can only make predictions for intonation languages – i.e., those that utilize pitch accents in an English-like way. Accordingly, it gives rise to the FP (= Focus Projection) Hypothesis:

(11) FP Hypothesis:
Focus projection is established with the help of pitch accents.

What would focus projection in a phrase language look like? The key insight behind Focus Prominence in (6), which all focus projection accounts recognize, is that focused constituents should be prosodically prominent. In intonation languages, prosodic prominence is expressed with pitch accents (Cohen & ’t Hart 1967; Bolinger 1970, a.o.). In contrast, phrase languages, in marking prosodic and information-structural categories, use boundary tones as an alternative to canonical prominence marking (Féry 2013). Therefore, the DF Hypothesis predicts that phrase languages may signal focus projection with prosodic boundaries. Concretely, the same realization of the direct object (with respect to the prosodic boundary that it is aligned with) under narrow focus and in broad focus would be consistent with focus projection. Similarly, different realizations of subjects under narrow focus and in broad focus, with respect to the same prosodic parameters, would also be consistent with focus projection. The FP Hypothesis, in turn, excludes phrase languages from those in which focus projection may be found.

Phrase languages have not been systematically studied from the point of view of focus projection – except Hindi, for which some, albeit indirect, evidence is available.5 In Hindi, the existence and placement of word stress and the distribution of pitch accents are debatable, whereas phrase-final boundary tones are prominent. The literature shows that focus is marked by being right-adjacent to a prosodic boundary, which is not expressed in F0 but in other cues, like glottal stop insertion and the direction of cliticization (Patil et al. 2008; Féry 2010; Féry, Pandey & Kentner 2016). Though the literature does not discuss it explicitly, the reported F0 contours that span the constituents of interest from the perspective of focus projection – narrowly focused subjects and objects, and those in broad-focus contexts – exhibit a pattern that is consistent with focus projection expressed by means of tonal targets. Specifically, objects in [SOV]F and SOFV are intonationally identical: they have rising contours of the same shape and F0 height. In contrast, the rising contours on subjects in SFOV have significantly higher final F0 values than those on subjects in [SOV]F (and are accompanied by strong post-focal pitch target suppression). This pattern of focus marking is consistent with focus projection, and with the DF Hypothesis. The availability of such patterns in some phrase languages calls for the investigation of other languages of the same profile, such as Georgian.

3 Georgian: the basics

3.1 Relevant syntactic properties

Georgian allows for considerable flexibility of word order, including both OV and VO in broad-focus declaratives (Vogt 1971; Skopeteas & Fanselow 2010). Following the majority of authors (Pochkhua 1962; Nash 1995; Harris 2000, a.o.), I take OV to be underlying, and VO to be derived by short, semantically vacuous verb movement; see Skopeteas & Fanselow (2010) and Borise (2019) for evidence. Georgian exhibits split case marking. In the present and imperfective tenses, subjects carry nominative and objects carry dative; in the aorist, “active” (transitive and unergative) subjects carry ergative, and “inactive” (unaccusative) subjects and objects carry nominative. The structural position of the subject co-varies with case-marking: nominative subjects are generated in Spec, VoiceP, ergative ones in Spec, vP, and dative ones in Spec, ApplP; they receive case in situ and do not raise to Spec, TP. The subject, therefore, is never the most deeply embedded constituent; this will be relevant for the discussion in Section 6. Nominative and dative objects are generated within the VP (Legate 2008; Nash 2017; Thivierge 2019). The fact that, in a transitive clause (OV or VO), the object is the most deeply embedded constituent will also become relevant in Section 6. Focused constituents appear immediately preverbally, as shown in (12). This holds for all types of narrow foci: those in replies to wh-questions, contrastive foci in corrective replies, and those modified by only and even.6

(12) a. (‘Who cleaned the kitchen yesterday?’)
    Guʃin      bebia        a-lag-eb-d-a              samzareulo-s.
    yesterday  grandma.nom  ver-clean-sf-sm-ipfv.3sg  kitchen-dat
    ‘Grandma cleaned the kitchen yesterday.’
b. (‘When did grandma clean the kitchen?’)
    Bebia        guʃin      a-lag-eb-d-a              samzareulo-s.
    grandma.nom  yesterday  ver-clean-sf-sm-ipfv.3sg  kitchen-dat
    ‘Grandma cleaned the kitchen yesterday.’

Syntactic evidence suggests that preverbal narrow foci are found in situ, and adjacency with the verb results from “altruistic” movement of the intervening material (Borise 2019).7

3.2 Prosodic properties

The prosody of Georgian is relatively well studied. The properties of word stress have long been debated, but there is consensus that (i) in di- and trisyllabic words stress is initial, and (ii) in longer words, in addition, secondary stress may occur on the antepenult or penult (Selmer 1935; Robins & Waterson 1952; Alkhazishvili 1959; Tevdoradze 1978). Experimental evidence demonstrates that Georgian has fixed initial stress, cued by greater duration as compared to other syllables/vowels (Vicenik & Jun 2014; Borise & Zientarski 2018; Borise 2019).

Many rules of Georgian phrasal prosody have been established, from the realization of neutral statements to that of questions and narrow focus (Tevdoradze 1978; Bush 1999; Müller 2005; Skopeteas et al. 2009; 2018; Skopeteas & Fanselow 2010; Skopeteas & Féry 2010; 2014; 2016; Asatiani & Skopeteas 2012). Vicenik & Jun (2014) provide a comprehensive Autosegmental-Metrical analysis of Georgian prosody, establishing the levels of prosodic phrasing and inventory of F0 targets. Unless noted otherwise, the rest of this section summarizes their analysis.

Autosegmental-Metrical (AM) theory is one of the main approaches to prosodic analysis, developed in Liberman (1975), Bruce (1977), and Pierrehumbert (1980).8 The main tenet of the AM theory is that intonation can be modelled as a sequence of pitch targets, aligned with specific hosts in the abstract prosodic structure, and transitions between them (interpolation). The values of pitch targets can be high (H) or low (L). These labels are not absolute and take into account the speaker’s pitch range and the surrounding pitch targets. There are several types of pitch targets. Pitch accents are anchored to stressed syllables (e.g., H*, L*), and boundary tones are aligned with edges of prosodic domains (e.g., H%, L%). Complex pitch targets consist of several tones (typically two). In a complex pitch accent, the main pitch target, aligned with the stressed syllable, is asterisked, with leading or trailing tones preceding or following it (e.g., L+H*, L*+H). Another pitch target, phrase accent, has been analyzed either as a boundary tone for medium-level prosodic phrases, or as a distinct pitch target found between the last pitch accent and the final boundary tone (Bruce 1977; Pierrehumbert 1980; Ladd 1983; Grice, Ladd & Arvaniti 2000). Smaller prosodic units, such as prosodic words, are grouped into larger prosodic units, such as prosodic phrases and intonational phrases. In a given language, two or three levels of prosodic phrasing are distinguished (Shattuck-Hufnagel & Turk 1996). Pitch accents are assigned within smaller prosodic phrases, while all types of prosodic phrases can carry initial and/or final boundary tones.

According to Vicenik & Jun (2014), the smallest prosodic phrase in Georgian is an Accentual Phrase (AP). By default, an AP spans a single lexical word (plus postposition/discourse particle, if present). In an all-new, broad-focus declarative utterance, each AP, except for the right-most one, carries a rising F0 contour. Vicenik & Jun (2014) analyze it as a low pitch accent L* on the initial (stressed) syllable of the AP and a high final boundary tone on the final syllable, Ha (‘a’ stands for ‘AP’). This intonational contour is illustrated in Figure 1; see also Skopeteas and Féry (2010) and Skopeteas, Féry & Asatiani (2009: 112).9

Figure 1

Typical broad-focus intonation in Georgian.

(13) Giorgi-s    mosts’on-s    dzalian  lamaz-i        gogo      Tbilisi-dan.
     Giorgi-dat  like-prs.3sg  very     beautiful-nom  girl.nom  Tbilisi-from
     ‘Giorgi likes a very beautiful girl from Tbilisi.’

The key property of an AP is the presence of one pitch accent, on the initial syllable, and a final boundary tone. This holds independently of the information-structural status of an AP.10 An utterance-initial AP may also carry a falling F0 contour, H* La; this is analyzed as semantically vacuous variation by Skopeteas & Féry (2016). Similarly, Vicenik & Jun (2014) take the H* La tonal pattern to indicate that the AP is semantically/syntactically related to the next one, but not to encode a distinct information-structural notion. As shown in Section 3.3 and in the results of the current study, the H* La contour is also used to mark subject focus. Note that H* may take the shape of a peak or a high plateau (Vicenik & Jun 2014).

A prosodic constituent larger than an AP is an Intermediate Phrase (ip). Optional ip-formation in Georgian applies to two or three syntactically/semantically related APs, such as a noun and an adjective/demonstrative. Obligatory ip-formation is found in certain focus contexts; more on this in Section 3.3. Each of the APs within an ip carries its own pitch accent. The H* La tonal pattern is used on the non-ip-final APs, to signal the relationship between the ip-internal APs. The ip-final AP carries an L* pitch accent and an H- or L- final boundary tone, as shown in Figure 2. The ip-boundary tone overrides that of the final AP. The largest prosodic constituent, an Intonational Phrase (IP), corresponds to a clause and carries a final low boundary tone L%, as shown in Figure 1 and Figure 2.

Figure 2

A realization of an utterance containing an ip lamazma kalbat’onma ‘a beautiful lady’.

(14) Lamaz-ma       kalbat’on-ma  k’aba      mo-i-zom-a.
     beautiful-erg  lady-erg      dress.nom  prv-ver-try-aor.3sg
     ‘A beautiful lady tried on a dress.’

The existing descriptions demonstrate that the distribution and functions of pitch accents in Georgian differ from those in typical intonation languages in several ways. First, each AP (i.e., in most contexts, each lexical word) obligatorily carries a pitch accent. Second, the choice of a pitch accent (e.g., H* or L*) is not determined by information structure: instead, it tracks semantic/syntactic units, or may be semantically vacuous. These characteristics of Georgian make it a phrase language – i.e., one that primarily relies on boundary tones and phrasing as opposed to stress-aligned pitch accents to signal information-structural notions (Skopeteas & Féry 2010; 2016; Féry 2017). Note that the value of the boundary tone often co-varies with the value of a pitch accent (L* Ha, H* La).

With respect to syntax-prosody mapping, Georgian has been subject to conflicting analyses. This is likely because, as a rule, every lexical word forms an AP, and no tonal events consistently mark syntactic constituents larger than a word but smaller than a clause, which obscures the correspondence between prosodic and syntactic structure (other than optional ip-formation). Skopeteas & Féry (2010) propose that broad-focus SOV utterances are phrased as (SOV) or (S)(OV), while SVO ones are phrased as (SV)(O), based on the derived syntactic status of the object. In contrast, Skopeteas & Féry (2016) argue for (SO)(V) as the default. Because a full account of the interaction between the syntactic and prosodic structures in Georgian is still to be established, no such account in focus contexts is offered here.

3.3 Prosodic realization of focus

The prosodic realization of focus in Georgian has received some attention in the literature. There is a consensus that foci may be associated with (rising-)falling F0 contours or may be realized with the default rising contour instead. The reports of what conditions the variation differ, and so do the available phonological accounts.

One of the earliest experimental studies that targeted the realization of focus in Georgian was conducted by Alkhazishvili (1959). He suggested that a Georgian sentence can be divided into “subject” and “predicate” phrases, which correspond to topic and focus/comment, respectively. The “predicate” includes the verb and the immediately preverbal focused constituent, while the “subject” includes any other material in a clause. Crucially, the two phrase types were shown to have different prosodic realizations. Within “subjects”, each word carried a rising contour, while “predicates” carried a (rising-)falling contour. Falling contour notwithstanding, Georgian speakers perceived “predicates” as more prominent than “subjects”.

Skopeteas, Féry, and Asatiani (2009) show that Georgian word order is conditioned by information structure, and narrowly focused constituents either occupy the immediately preverbal position, or may surface postverbally. Skopeteas and Féry (2010) investigate the prosody of SFVO, SOFV and SVOF, and show that, in all contexts, the narrowly focused constituent has significantly greater duration than the corresponding constituent in a string-identical broad-focus utterance. Though not tested for significance, there is a trend in their data that resembles the one discovered in the current study: on average, a narrowly focused subject is 53 ms longer than a subject in broad-focus conditions (335 ms vs. 282 ms), while this difference is smaller for narrowly focused preverbal objects (35 ms; 314 ms vs. 279 ms). With respect to F0, objects in SOFV contexts were found to have the same rising F0 contour as objects in broad focus; this was also replicated in the current study.11 In contrast, subjects in both SOV and SFVO conditions were found to have both rising and falling contours, with no obvious conditioning factor. These results were not corroborated in the current study, where the falling contour on narrowly focused subjects contrasted with the rising one on subjects in broad focus. Vicenik and Jun’s (2014) findings are somewhat different. In their data, most focused constituents carried a (rising-)falling contour, analyzed as an (L)H* pitch accent and an La boundary tone. Some focused objects, in addition, were realized with the usual declarative contour, L* Ha, which is in accord with Skopeteas and Féry (2010) and the current findings.

In a phonological analysis, Skopeteas & Féry (2010) conclude that preverbal narrowly focused subjects are separated from the following material, (SF)(VO), based on the frequent absence of a high boundary tone between V and O (but not the realization of SF itself). Preverbal narrowly focused objects are taken to be prosodically grouped with the verb, (S)(OFV), based on the low height of the boundary between S and O. In contrast, Asatiani & Skopeteas (2012) conclude that all preverbal focused constituents are prosodically grouped with the verb, and separated by a prosodic boundary from the preceding material, X(YFV), based on the presence of a prosodic boundary to the left of the focused constituent. The same conclusion is reached by Vicenik and Jun (2014). Finally, Skopeteas and Féry (2016) reanalyze some of their earlier findings: they conclude that all foci are separated from the rest of the clause when clause-initial and in the presence of postverbal material, (SF)(VO) and (OF)(VS), and integrated with the preceding material if clause-medial and in the absence of postverbal material, (SOF)(V) and (OSF)(V). The conflicting phonological analyses of phrasing are mainly driven by theoretical considerations, and stem from selecting different acoustic cues as decisive for marking prosodic boundaries. The current study does not contribute to this debate, and instead adopts Vicenik and Jun’s (2014) approach to prosodic phrasing (see Section 3.2).

One of the reasons for the reported variability in the data (which gave rise to incompatible phonological accounts) is likely to be methodological: the existing studies are based on ‘lab speech’ instead of (semi-)spontaneous production. Skopeteas, Féry, and Asatiani (2009: 110) used read speech, and instructed participants “to put emphasis on the information under question”. Skopeteas and Féry (2010; 2014; 2016) asked the participants to memorize the responses to the experimental questions. This is problematic, because the prosodic characteristics of e.g. read speech differ considerably from those of spontaneously produced speech (Lieberman et al. 1985; Howell & Kadi-Hanifi 1991; Nakamura, Iwano & Furui 2008). The conclusions drawn from lab-produced speech may not be applicable to (semi-)spontaneous speech, the mode of language production that is, arguably, most used by speakers. Methodological differences are likely to contribute significantly to the differences between the current results and those reported in previous studies. These two issues – the reported variability in the data and the need for more naturally produced data – provided further motivation for the current study.

3.4 Predictions for Georgian

According to the DF Hypothesis, a focus-projection pattern in a phrase language like Georgian would rely on a language-specific distribution of prosodic prominence – more specifically, a boundary tone-based alternative to prosodic prominence (Féry 2013). In Georgian, this is the distribution of AP-final boundary tones. A focus-projection pattern would be exhibited by a narrowly focused object and an object in broad focus that are identical with respect to the tonal contour and the type and height of the boundary tone. Given that there is a consensus that, in broad focus, the object carries the default L* Ha contour, the DF Hypothesis predicts that L* Ha would appear on narrowly focused objects too, with the same parameters of Ha. On the other hand, if a narrowly focused subject and a subject in broad focus carry different tonal contours and boundary tones, this would also be consistent with focus projection. Subjects in broad focus are predicted to carry the default L* Ha, but the realization of narrow focus on subjects is hard to predict: based on the literature, they may carry (L)H* La or L* Ha. In turn, according to the FP Hypothesis, languages that do not utilize pitch accents to mark prominence should not exhibit focus-projection patterns. Based on what we know about the prosody of Georgian, the FP Hypothesis predicts that F-projection approaches would have nothing to say about Georgian, since whatever patterns can be found there would fall outside the purview of the F-marking approaches.

The F0-based prosodic parameters investigated here, therefore, are tonal targets (boundary tones and combinations of boundary tones with pitch accents) on Georgian objects and subjects, when narrowly focused and in broad focus. Specifically, the identity of these tonal targets (as part of the global F0 contour) and the relative height of boundary tones are investigated. The choice of these parameters is motivated by the phrase-language status of Georgian. The evidence from typologically similar Hindi, introduced in Section 2.3, confirms that these F0 parameters can create focus-projection patterns.

The other prosodic parameter investigated here is the duration of the stressed syllable. An increase in duration of the stressed syllable as a cue for narrow focus has been documented for English (Xu & Xu 2005), Dutch (Cambier-Langeveld & Turk 1999), Swedish (Heldner & Strangert 2001), German (Baumann et al. 2007; Kügler & Genzel 2009), and Arabic (De Jong & Zawaydeh 2002), among others. The results from Hindi, a phrase language similar to Georgian, are incomplete: duration of stressed syllables/vowels is greater in contrastively focused words than in words in broad focus (Genzel & Kügler 2010; Puri 2013), but no information is available for non-contrastively focused constituents. In Georgian, a phrase language with non-phonemic stress, stressed syllables are, nevertheless, reliably cued by duration (Borise 2019). This gives rise to the following question: does duration of stressed syllables participate in creating focus-projection patterns in Georgian (i.e., differentiate narrowly focused subjects, but not objects, from their counterparts in broad-focus contexts)? But before this question can be answered, a more general question needs to be resolved: does duration of stressed syllables increase under (contrastive and non-contrastive) narrow focus in Georgian?

Whether the duration of the stressed syllable exhibits systematic focus-projection patterns has not been tested. But, under the Default Prosody approach, it is reasonable to hypothesize that, in the absence of prominence-marking by pitch accents, and with boundary tones only serving as an alternative to prominence-marking, other language-specific cues may participate in marking prominence. Consequently, they also may participate in focus projection. Given that duration cues stress in Georgian, it is a good candidate for also marking phrasal prominence. A pattern consistent with focus projection would then be one where the duration of the stressed syllable does not differ between narrowly focused objects and those in broad focus, whereas the stressed syllable of a narrowly focused subject has greater duration than that in a subject in broad focus.12

Duration of the stressed syllable (as opposed to, e.g., intensity or vowel quality) was selected for investigation for several reasons. First, it has been shown for Georgian that duration marks prosodic prominence (stress). Second, duration is a consistent and robust cue, commonly investigated when parameters of individual syllables (as opposed to full words) are compared. Finally, targeting the duration of the stressed syllable allows Georgian to be evaluated alongside the languages mentioned above, for which the duration (but not other parameters) of the stressed syllable in different focus contexts has been investigated.

The acoustic parameters, investigated on nouns in OV and SV contexts, are schematized in Table 2. ‘X* Xa’ in Table 2 refers to the combination of a pitch accent and boundary tone. The narrow focus condition, in the discussion of the results, is further subdivided into contrastive (e.g., SCF) and non-contrastive (e.g., SF) focus.

Table 2

The set-up of the acoustic parameters tested.

4 Methods

4.1 Stimuli

The stimuli were designed in such a way as to capture possible prosodic and syntactic (i.e., word order) variability between different focus types.13 Thirty verbs (14 transitive, 9 unergative, and 7 unaccusative) were used in the study; based on the verbs, thirty scenarios were created.14 Personal names and common nouns (Mariami, Giorgi; a fisherman, children, etc.) were used as subjects and objects. To facilitate F0 analysis, lexical items containing no/few voiceless segments were used. However, naturalness of the stimuli was taken to be no less important, and some better-fitting lexical items with voiceless segments were chosen over fully voiced counterparts that were a poorer contextual fit. A sample scenario is provided in (15):

    (15) Mebadur-ma     da-i-tʃ’ir-a           zvigen-i   ʃarʃan     zapxul-ʃi.
         fisherman-erg  prv-ver-catch-aor.3sg  shark-nom  last_year  summer-loc
         ‘The fisherman caught a shark last summer.’

For each scenario, five questions were constructed, aimed at eliciting broad focus, non-contrastive (new-information) focus on the object, subject, and the VP, and contrastive focus on one of the constituents (subject, object, or the verb),15 as in (16). A sample picture prompt is provided in Figure 3; please see the Appendix for further information on the stimuli.

Figure 3

Sample picture prompt.

    (16) a. Ra    mo-xd-a             ʃarʃan     zapxul-ʃi?
            what  prv-happen-aor.3sg  last_year  summer-loc
            ‘What happened last summer?’
         b. Ra    da-i-tʃ’ir-a           mebadur-ma     ʃarʃan     zapxul-ʃi?
            what  prv-ver-catch-aor.3sg  fisherman-erg  last_year  summer-loc
            ‘What did the fisherman catch last summer?’
         c. Vin  da-i-tʃ’ir-a           zvigen-i   ʃarʃan     zapxul-ʃi?
            who  prv-ver-catch-aor.3sg  shark-nom  last_year  summer-loc
            ‘Who caught a shark last summer?’
         d. Ra    ga-a-k’et-a         mebadur-ma     ʃarʃan     zapxul-ʃi?
            what  prv-ver-do-aor.3sg  fisherman-erg  last_year  summer-loc
            ‘What did the fisherman do last summer?’
         e. Rvapexa      da-i-tʃ’ir-a           mebadur-ma     ʃarʃan     zapxul-ʃi?
            octopus.nom  prv-ver-catch-aor.3sg  fisherman-erg  last_year  summer-loc
            ‘Did the fisherman catch an octopus last summer?’

4.2 Procedure and participants

Participants were presented with picture prompts on a laptop screen. Each prompt was accompanied by a question and a recording of a native speaker of Georgian asking it, to make responding more natural. The participants were asked to answer the question based on the picture. They were instructed to speak clearly and use natural intonation but were not asked to provide full-sentence replies or use the same word order as in the question. The semi-spontaneous design allowed the participants to have a degree of freedom in their responses while maintaining some control over the lexical items.

Eight native speakers of Georgian participated in the study: two males (M1, M2) and six females (F1–F6), aged 20–35 (mean age 26.8). All speakers were from Tbilisi and had a complete or in-progress university degree. The recordings were made in Tbilisi, Georgia, using a Shure SM10A (head-worn, close-range) microphone and a Zoom H4n recorder. All data were recorded at a sampling rate of 44.1 kHz and 16 bits per sample. Data from all participants were included in the analysis.

The full set comprised 134 questions (14 transitive × 5 + 9 unergative × 4 + 7 unaccusative × 4), but only 110 were used: before the experiment, two native speakers of Georgian assessed the naturalness of the pre-recorded questions and rejected 24 as unnaturally pronounced; these were removed from the experiment. No filler sentences were used. The final set was pseudo-randomized to ensure that items belonging to the same scenario or focus type were not adjacent. Each speaker provided replies to 110 questions, each uttered once. Recording sessions took ca. 30–35 minutes. After eliminating disfluent replies (due to pauses, errors, repetitions, throat clearing, etc.), 817 replies were left.
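The counts in this paragraph can be checked with a few lines of arithmetic. The sketch below is only illustrative: the names are hypothetical, and the ceiling of 880 replies is an inference from eight speakers each answering 110 questions once, not a figure reported in the study.

```python
# Number of verbs and of question types per verb class, as reported above.
VERB_CLASSES = {
    "transitive": (14, 5),
    "unergative": (9, 4),
    "unaccusative": (7, 4),
}


def total_questions(classes):
    """Sum verbs x question types over all verb classes."""
    return sum(n_verbs * n_types for n_verbs, n_types in classes.values())


constructed = total_questions(VERB_CLASSES)  # questions originally built
rejected = 24                                # judged unnatural by two speakers
used = constructed - rejected                # questions presented

speakers = 8
max_replies = used * speakers                # inferred ceiling, not reported
retained = 817                               # fluent replies, as reported

print(constructed, used, max_replies, round(retained / max_replies, 3))
```

On these figures, the 817 retained replies amount to roughly 93% of the inferred maximum.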

4.3 Data processing and dataset compilation

The data were manually annotated in Praat (Boersma & Weenink 2021) by trained research assistants, based on the segmentation criteria in Machač & Skarnitzl (2009), and checked by the author.16 Duration of the stressed (initial) syllable and the F0 properties of the narrowly focused constituent (F0 contour, height of final boundary tone) were measured with a Praat script based on Elvira-García (2014).

An important novelty of this study is that the participants constructed their own responses. As a result, though, there was substantial variability in sentence structures used, which led to the dataset used for the analysis being smaller than the total number of responses. Given that no previous studies on focus in Georgian used semi-spontaneous speech, the non-scripted nature of the data was taken to be of primary importance, even though that led to the exclusion of many responses as not directly comparable. Eliminated utterances included single-word replies (single focused constituents or single verbs; total n = 162), complex paraphrases (n = 86), utterances with nouns modified by adjectives/demonstratives (n = 67), and responses with nouns substituted by pronouns (n = 46). Variability in focus placement also led to the exclusion of some responses. Non-contrastively focused objects appeared both preverbally and postverbally with almost equal frequency, but contrastively focused objects were placed exclusively postverbally.17 Utterances with postverbal foci of both types (total n = 46) were excluded from the dataset.18 Subjects, both non-contrastively and contrastively focused ones, were found exclusively preverbally.19

The remaining data were divided into two datasets: utterances that contain SV and OV strings, respectively, with different focus properties (henceforth ‘subject dataset’ and ‘object dataset’). This way, preverbal narrowly focused subjects and objects were compared to subjects and objects in broad-focus contexts, respectively. Since the data for VP-focus were available, and VP-focus takes part in focus projection, it was compared to object- and broad focus in the object dataset as well. Table 3 provides a summary of the clause types from which the SV/OV strings were extracted. The presence or absence of other material before or after the SV/OV string did not affect the acoustic parameters of interest. In particular, within each focus type, there were no significant differences between the F0 or durational values in utterance-initial and non-initial stimuli. For this reason, all SV/OV contexts were investigated together.

Table 3

Clause types from which SV and OV strings (underscored) were extracted; X = temporal/locative adjunct, Neg = sentential negation.

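| Subject dataset | n | Object dataset | n |
|---|---|---|---|
| SV | 62 | SXOV | 42 |
| XSV | 49 | SOV | 28 |
| SVO | 48 | XSOV | 13 |
| SVX | 34 | OV | 7 |
| XSVO | 28 | Total | 90 |
| NegSVO | 13 | | |
| NegSV | 7 | | |
| Total | 306 | | |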

4.4 Analysis

The analysis of F0 properties had two objectives with respect to narrowly focused constituents and their counterparts in broad focus: (i) a global, qualitative analysis of F0 targets and (ii) a quantitative comparison of the height of the boundary tones. The (i) global analysis of the F0 values was performed on SV strings extracted from the most frequent word orders in the subject dataset (SV, XSV, and SVO). They contained only SV strings in broad focus (n = 83) and with narrow focus on the subject (non-contrastive, n = 73; contrastive, n = 3). In the object dataset, all OV strings in broad focus (n = 25) and with narrow focus on the object (n = 24) were analyzed. The tonal events in the responses were manually annotated by the author. The quantitative analysis targeted boundary tones on subjects and objects in the subject and object datasets, respectively.

Because Georgian carries F0 targets on both the right and left edges of an AP, words of different syllable counts could not be directly compared to each other. To enable the (ii) quantitative comparison, only syllables that have the potential to carry F0 targets were measured: the initial syllable, penult, and ultima, which can carry the pitch accent, optional phrase accent, and boundary tone, respectively.20 Consequently, for the purposes of F0 analysis, each word in the SV/OV strings was reduced to the initial syllable (coded ‘1’), penult (‘–2’), and ultima (‘–1’) of the original word. This means that, in words of more than three syllables, medial syllables were discarded from the analysis, and from the figures in Section 5.1.2. This is illustrated in (17) for an SV string. F0 properties of syllables S–1 and O–1 in SV/OV strings, respectively, are the most informative ones, because they are the ones that carry the final boundary tone. Accordingly, they were selected for the statistical analysis of boundary tone height.

    (17) S1        S–2  S–1    V1        V–2   V–1
         Ma | ri | a  | mi     sa | di | lob | da
         Mariami               sadilobda.
         ‘Mariami              dined.’
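The reduction of each word to its three F0-bearing syllables can be sketched as follows. This is a minimal illustration, not the script used in the study: the function name and the dictionary keys are hypothetical, and the syllabification is taken directly from (17).

```python
def reduce_to_tonal_syllables(syllables):
    """Keep the initial syllable, penult, and ultima of a word.

    These are the positions described above as able to carry the pitch
    accent, the optional phrase accent, and the boundary tone. Medial
    syllables are dropped. Words shorter than three syllables are rejected,
    mirroring their exclusion from the boundary tone analysis.
    """
    if len(syllables) < 3:
        raise ValueError("word must have at least three syllables")
    return {"1": syllables[0], "-2": syllables[-2], "-1": syllables[-1]}


# The SV string from (17): Mariami sadilobda.
subject = reduce_to_tonal_syllables(["Ma", "ri", "a", "mi"])
verb = reduce_to_tonal_syllables(["sa", "di", "lob", "da"])
print(subject)
print(verb)
```

In a trisyllabic word all three syllables are retained, so only words of four or more syllables lose material.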

Because the stimuli had to be at least three syllables long, to ensure comparability, 20 mono- and disyllabic words were discounted from the object dataset.21 The stimuli counts are provided in Table 4; for a break-down by focus type, see Table 8.22

Table 4

Stimuli counts in the boundary tone investigation.

| String | Test word | n |
|---|---|---|
| SV | subject | 306 |
| OV | object | 70 |

For the analysis of stressed syllable duration, initial syllables in nouns in the subject and object datasets were compared. Mono- and disyllabic stimuli were not discarded, but Syllable Count was included as a random factor into the linear mixed-effects model (see Section 5.2). Table 5 provides the counts of items used for duration measurements; for a break-down by focus type, see Table 12.

Table 5

Stimuli counts in the investigation of stressed syllable duration.

| String | Test word | n |
|---|---|---|
| SV | subject | 306 |
| OV | object | 90 |

5 Results

5.1 F0 properties

5.1.1 Global F0 contour

In the subject dataset, broad-focus utterances (SV, XSV, and SVO) carried the typical ‘default’ F0 contour. Each lexical word formed an AP: utterance-initial and -medial APs received an L* Ha or, rarely, (L)H* La realization, and final ones an L* L% realization. In line with the literature (Skopeteas & Féry 2016), the L* Ha/ (L)H* La variation utterance-initially seems semantically vacuous. All subjects in broad focus carried the L* Ha contour (n = 83; boldfaced in Table 6). This is illustrated in (18) and Figure 4.

Table 6

Tonal realizations of subjects in broad-focus utterances vs. utterances with subject focus (non-contrastive and contrastive).

| Tonal events | S, broad focus | SF | SCF |
|---|---|---|---|
| L* Ha | 83 | 8 | 0 |
| H* La | 0 | 58 | 2 |
| LH* La | 0 | 7 | 1 |

Figure 4

Tonal realization of a broad-focus utterance (subject dataset).

In contrast, most narrowly focused subjects (total n = 68/76; boldfaced in Table 6) received the (L)H* La realization. This is illustrated in (18) and Figure 5. Note that the verb carries a shallow falling contour, analyzed as H* (H+)L L%, containing a phrase accent (H+)L, anchored to (antepenult+)penult, in accordance with Vicenik & Jun (2014) and Borise (2017). The H*/LH* variation was found in both contrastive and non-contrastive subject foci and did not seem meaningful. The summary of the tonal realizations of subjects is provided in Table 6.

Figure 5

Tonal realization of a subject-focus utterance (subject dataset).

    (18) (‘What happened last night?’/‘Who had dinner last night?’)
         Gasul  ɣame-s     Mariami/MariamiF  sadil-ob-d-a.
         last   night-dat  Mariami.nom       dine-sf-sm-ipfv.3sg
         ‘Mariami/MariamiF had dinner last night.’

In the object dataset, the realization of broad-focus utterances (SXOV, SOV, XSOV, and OV) adhered to the same generalizations as that described for broad-focus sentences in the subject dataset. It is exemplified in (19) and Figure 6. In contrast with subject foci, however, the majority of object foci (18/24; boldfaced in Table 7) received the same realization as objects in broad focus, L* Ha; this is shown in (19) and Figure 7. There was more variability in the object dataset, though, with 6 focused objects and, surprisingly, 4 objects in broad-focus contexts carrying the contour typical of subject foci, H* La. It should be noted that 3/6 of these contours in the narrow-focus condition and 2/4 in the broad-focus condition come from the same speaker, F4. The summary of the tonal realizations of objects is provided in Table 7.

Figure 6

Tonal realization of a broad-focus utterance (object dataset).

Table 7

Tonal realizations of objects in broad focus utterances vs. utterances with object focus.

| Tonal events | O, broad | OF |
|---|---|---|
| L* Ha | 21 | 18 |
| H* La | 4 | 6 |
| LH* La | 0 | 0 |

Figure 7

Tonal realization of an object-focus utterance (object dataset).

    (19) (‘What did grandma clean yesterday morning?’/‘What happened yesterday morning?’)
         Bebia        guʃin      dilas    samzareulo-sF/samzareulo-s  a-lag-eb-d-a.
         grandma.nom  yesterday  morning  kitchen-dat                 ver-clean-sf-sm-ipfv.3sg
         ‘Grandma cleaned the kitchenF/kitchen yesterday morning.’

5.1.2 Final boundary tone (Ha/La)

A summary of the F0 realizations of subject-final syllables, S–1, in the subject dataset is provided in Table 8 and Figure 8. The F0 values are converted into semitones (st), given that the participants included males and females, and calculated with reference to the mean F0 of each speaker. In addition to the main object of comparison – the values for broad-focus subjects and narrowly focused ones (contrastive and non-contrastive) – they also include the values for subjects in other contexts for which the data were available (contrastive and non-contrastive object focus, and VP-focus). Here, the mean values of S–1 are considerably lower in contrastively and non-contrastively focused subjects than in subjects in broad focus (and other contexts).
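The conversion formula is not spelled out in the text. A minimal sketch, assuming the standard semitone conversion, 12 · log2(F0 / reference), with each speaker's mean F0 as the reference, would look as follows. The function name and the example values are hypothetical.

```python
import math


def hz_to_semitones(f0_hz, speaker_mean_hz):
    """Express an F0 value in semitones relative to a speaker's mean F0.

    Assumes the standard conversion 12 * log2(f0 / reference); the text
    does not state which exact formula was applied.
    """
    if f0_hz <= 0 or speaker_mean_hz <= 0:
        raise ValueError("F0 values must be positive")
    return 12.0 * math.log2(f0_hz / speaker_mean_hz)


# Hypothetical values: a syllable at 220 Hz from a speaker whose mean F0 is 200 Hz.
print(round(hz_to_semitones(220.0, 200.0), 2))
```

On this scale, 0 st corresponds to the speaker's own mean, so positive values in Table 8 lie above that mean and negative ones below it, which makes male and female speakers directly comparable.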

Table 8

Summary of the F0 values of the subject-final syllable (S–1) in the subject dataset, by focus type.

| Focus type | n | Mean F0 (S–1), st | SD (S–1), st |
|---|---|---|---|
| SF | 95 | 0.449 | 2.735 |
| SCF | 34 | –0.74 | 2.535 |
| broad | 83 | 2.404 | 2.338 |
| VPF | 73 | 2.477 | 3.151 |
| OF | 14 | 2.089 | 2.508 |
| OCF | 7 | 4.136 | 1.043 |

Figure 8

F0 values of subject-final syllables (S–1) in the subject dataset, by focus type.

Figure 9 shows averaged F0 contours over SV strings in the subject dataset. It also demonstrates that the boundary tone on S–1 is considerably lower on contrastively and non-contrastively focused subjects than on subjects in broad focus (and other contexts). The qualitative difference in F0 shapes that span narrowly focused subjects and those in broad-focus utterances is also apparent: the former carry a falling F0 contour, and the latter have a rising one. To allow for better visualization, the continuation of the F0 contour on the following verb is also included.

Figure 9

Averaged F0 contours in SV strings in the subject dataset, smoothed at 0.2. On the x-axis, ticks mark syllable onsets.

The quantitative analysis of the S–1 F0 values was performed using a linear mixed-effects model, fitted with the lmer function in the lme4 package (Bates et al. 2015) for R (R Core Team 2020); p-values were calculated with the lmerTest package (Kuznetsova, Brockhoff & Christensen 2017), which uses Satterthwaite’s approximation to degrees of freedom. For each of the subject and object datasets, a model was run with Mean F0 as the dependent variable, Focus Type as the fixed factor, and Speaker, Item, and Clause Subtype (e.g., SVO, SV, XSV for subjects) as random factors, with a random intercept for each random factor but no random slopes. The broad-focus condition served as the reference level (intercept). The results in Table 9 show that the S–1 F0 values in subject focus contexts, both non-contrastive and contrastive, are significantly lower than those of preverbal subjects in broad-focus utterances. None of the other contexts differs significantly from the intercept (broad focus).
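One way to write out the model described above is sketched here. The notation is an assumption made for illustration and does not appear in the original analysis: observation i has a focus type, a speaker, an item, and a clause subtype, and the broad-focus coefficient is fixed at zero so that the intercept is the broad-focus mean.

```latex
\[
  \mathrm{F0}_{i} \;=\; \beta_{0} \;+\; \beta_{\mathrm{focus}[i]}
  \;+\; u_{\mathrm{speaker}[i]} \;+\; v_{\mathrm{item}[i]} \;+\; w_{\mathrm{subtype}[i]}
  \;+\; \varepsilon_{i},
  \qquad \beta_{\mathrm{broad}} = 0,
\]
\[
  u \sim \mathcal{N}(0, \sigma_{u}^{2}), \quad
  v \sim \mathcal{N}(0, \sigma_{v}^{2}), \quad
  w \sim \mathcal{N}(0, \sigma_{w}^{2}), \quad
  \varepsilon_{i} \sim \mathcal{N}(0, \sigma^{2}).
\]
```

Under this parameterization, each β in Table 9 other than the intercept is the estimated difference, in semitones, between that focus type and broad focus.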

Table 9

Statistical results for F0 (st) of subject-final syllables (S–1). Asterisks mark levels of significance: p < 0.001 (***).

| Focus type | p-value | β | SE | t |
|---|---|---|---|---|
| SF | <0.001*** | –2.052 | 0.368 | –5.576 |
| SCF | <0.001*** | –2.996 | 0.479 | –6.252 |
| broad (intercept) | <0.001*** | 2.489 | 0.39 | 6.376 |
| VPF | 0.76 | –0.121 | 0.396 | –0.306 |
| OF | 0.695 | –0.286 | 0.73 | –0.39 |
| OCF | 0.281 | 1.127 | 1.043 | 1.081 |

Turning to the object dataset, Table 10 and Figure 10 summarize the facts of F0 realization of object-final syllables, O–1. Note that contrastively focused objects were attested exclusively postverbally and are not considered here. Similarly, subject focus is not included, since all focused subjects in the dataset were preverbal – that is, no subject-focus OV strings (e.g., OVSF) were attested.

Table 10

Summary of the F0 values of the object-final syllable (O–1) in the object dataset, by focus type.

| Focus type | n | Mean (O–1), st | SE (O–1), st |
|---|---|---|---|
| OF | 20 | 0.77 | 2.501 |
| broad | 20 | 1.985 | 2.029 |
| VPF | 30 | 1.58 | 2.149 |

Figure 10

F0 values of object-final syllables (O–1) in the object dataset, by focus type.

Figure 11 shows average F0 values over OV strings in the object dataset. Here, all three conditions considered closely align in their prosodic realization, including the height of the final rise on O–1.

Figure 11

Averaged F0 contours in OV strings in the object dataset, smoothed at 0.2. On the x-axis, ticks mark syllable onsets.

The same set-up of a linear mixed-effects model as described for the subject dataset was used on the object dataset. Additionally, a model with VP-focus as the intercept was run, given that focus projection accounts also make predictions for VP-focus. The results in Table 11 show that neither model detected a significant difference between the F0 values on O–1 in the different focus conditions.

Table 11

Statistical results for F0 (st) of object-final syllables (O–1). Asterisks mark levels of significance: p < 0.05 (*).

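| Model | Focus type | p-value | β | SE | t |
|---|---|---|---|---|---|
| Intercept: broad | OF | 0.219 | –1.243 | 0.979 | –1.27 |
| Intercept: broad | broad | 0.049* | 1.535 | 0.722 | 2.127 |
| Intercept: broad | VPF | 0.076 | –1.747 | 0.933 | –1.873 |
| Intercept: VPF | OF | 0.62 | 0.503 | 1.002 | 0.503 |
| Intercept: VPF | broad | 0.076 | 1.747 | 0.932 | 1.873 |
| Intercept: VPF | VPF | 0.777 | –0.211 | 0.738 | –0.287 |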

To recap, narrowly focused subjects, contrastive and non-contrastive, are realized markedly differently from subjects in broad-focus contexts. They carry a falling F0 contour, as opposed to a rising contour typical of constituents in broad-focus utterances. In contrast, narrowly focused objects and those in broad and VP-focus contexts carry the same default rising contour.

5.2 Duration of the stressed (initial) syllable

Table 12 and Figure 12 summarize the durational properties of initial syllables in subjects in the subject dataset. The results show that the stressed syllable receives extra duration in narrowly focused subjects (contrastive and non-contrastive), as compared to subjects in broad focus.

Table 12

Summary of the duration values of the stressed (initial) syllable in subjects in the subject dataset.

| Focus type | n | Duration (σ́), ms | SE (σ́), ms |
|---|---|---|---|
| SF | 95 | 192 | 62 |
| SCF | 34 | 216 | 71 |
| broad | 83 | 181 | 61 |
| VPF | 73 | 160 | 45 |
| OF | 14 | 161 | 45 |
| OCF | 7 | 134 | 47 |

Figure 12

Duration of the stressed (initial) syllable in subjects in the subject dataset, by focus type.

Statistical analysis of the duration data in the subject dataset was performed with a linear mixed-effects model, using the lmer function in the lme4 package (Bates et al. 2015) and the lmerTest package (Kuznetsova et al. 2017) for R (R Core Team 2020). Duration of the initial syllable acted as the dependent variable, Focus Type as the fixed factor, and Speaker, Item, Clause Subtype and Syllable Count as random factors. Syllable Count was included to account for the fact that nouns in the subject and object datasets were of variable syllable counts, and the duration of individual syllables decreases as the syllable count increases (polysyllabic shortening; Lehiste 1972). In the subject dataset, the broad-focus condition acted as the intercept. The results in Table 13 show that the duration of the stressed syllable was significantly greater in narrowly (non-contrastively and contrastively) focused subjects than in subjects in broad-focus utterances. No significant difference between subjects in broad focus and in other contexts was detected.

Table 13

Statistical results for duration of the stressed syllable in subjects in the subject dataset. Asterisks are used to mark levels of significance: p < 0.05 (*), p < 0.01 (**).

Focus type p-value β SE t
SF 0.046* 12.885 6.413 2.009
SCF 0.006** 24.558 8.875 2.767
broad (intercept) 0.015* 183.24 17.197 10.655
VPF 0.463 –4.959 6.743 –0.735
OF 0.855 2.253 12.320 0.183
OCF 0.442 –13.452 17.462 –0.77
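
As a reading aid for Table 13, the following minimal sketch shows how the coefficients can be interpreted, assuming treatment coding with broad focus as the reference level (as stated above). It derives a population-level estimate for each focus type (intercept plus coefficient) and checks that each reported t equals β divided by SE up to rounding. The function and variable names are illustrative and are not taken from the original analysis scripts.

```python
# Fixed-effect estimates copied from Table 13 (subject dataset).
# Each entry: focus type -> (beta in ms, standard error in ms, reported t).
# Minus signs are typed as ASCII hyphens.
TABLE_13 = {
    "broad": (183.24, 17.197, 10.655),  # intercept, i.e. the reference level
    "SF": (12.885, 6.413, 2.009),
    "SCF": (24.558, 8.875, 2.767),
    "VPF": (-4.959, 6.743, -0.735),
    "OF": (2.253, 12.320, 0.183),
    "OCF": (-13.452, 17.462, -0.77),
}


def fitted_means(table, reference="broad"):
    """Population-level estimate per focus type: intercept plus its coefficient."""
    intercept = table[reference][0]
    means = {}
    for level, (beta, _se, _t) in table.items():
        means[level] = intercept if level == reference else intercept + beta
    return means


def t_matches(table, tol=0.005):
    """Check, row by row, that the reported t equals beta / SE up to rounding."""
    return {level: abs(beta / se - t) <= tol for level, (beta, se, t) in table.items()}


FITTED = fitted_means(TABLE_13)
T_OK = t_matches(TABLE_13)

for level in TABLE_13:
    print(f"{level:>5}: fitted {FITTED[level]:7.3f} ms, t consistent: {T_OK[level]}")
```

Under these assumptions, the estimated duration of the stressed syllable is about 196 ms for SF and about 208 ms for SCF, against about 183 ms in broad focus. These are model estimates that abstract away from the random factors, so they need not coincide with the raw means in Table 12.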

Turning to objects, Table 14 and Figure 13 provide a summary of the durational values of the stressed syllables in objects in the object dataset. The results show that the initial syllable in narrowly focused objects does not receive extra lengthening, as compared to objects in broad or VP-focus.

Table 14

Summary of the duration values of the stressed (initial) syllable in objects in the object dataset.

Focus type n Duration (σ́), ms SE (σ́), ms
OF 24 225 45
broad 25 236 68
VPF 41 254 98
Figure 13

Duration of the stressed (initial) syllable in objects; one outlier in the VP-focus condition, with a value above 700 ms, is not shown.

The same linear mixed-effects model as described for the subject dataset was used on the object dataset, again with the broad-focus condition as the intercept. A second model with VP-focus as the intercept was run as well. The results in Table 15 show that neither model detected a significant difference between the durations of stressed syllables in the three conditions.

Table 15

Statistical results for duration of the stressed syllable in objects in the object dataset. Asterisks are used to mark levels of significance: p < 0.01 (**).

Focus type p-value β SE t
Intercept: broad
OF 0.435 –11.725 14.955 –0.784
broad 0.006** 283.072 52.294 5.413
VPF 0.502 8.673 12.855 0.675
Intercept: VPF
OF 0.122 –20.414 13.055 –1.564
broad 0.517 –8.161 12.538 –0.651
VPF 0.006** 294.787 13.055 –1.564

To recap, the stressed syllable was found to be longer in narrowly focused subjects than in subjects in broad-focus contexts. This was not the case with narrowly focused objects: here, the duration of the initial syllable did not exceed its counterpart in broad or VP-focus.

5.3 Summary

The analysis of the acoustic parameters of narrowly focused subjects and objects and those of subjects and objects in broad focus yielded the following results. Most focused subjects received a falling F0 contour, identified as (L)H* La, while all subjects in broad focus carried a rising contour, L* Ha. In contrast, most objects, narrowly focused and found in broad and VP-focus contexts, received the rising F0 contour, L* Ha. These findings were corroborated by a quantitative analysis of the boundary tone height. Narrowly focused subjects (contrastive and non-contrastive) had significantly lower boundary tone values than subjects in broad focus. The F0 values were not different for objects in any of the contexts. The duration of the stressed syllable in subjects was found to be greater under narrow focus, as compared to the broad-focus condition. This was not the case for objects, where the duration of the stressed syllable did not vary significantly across focus conditions. The results are schematized in Table 16.

Table 16

Summary of the results.

6 Discussion

In better-studied intonation languages like English, the focus-projection facts are as follows. Narrowly focused objects and those in broad focus have the same prosodic realization (they carry the nuclear pitch accent), whereas narrowly focused subjects and those in broad focus differ (they carry the nuclear vs. prenuclear pitch accent). In English, therefore, focus projection is signaled with the distribution of pitch accents. This aligns well with the status of English as an intonation language – i.e., one in which pitch accents signal prosodic and information-structural contrasts.

This study shows that, in Georgian, subject-focus contexts are also marked as different from broad-focus ones, but object-focus contexts do not differ from broad-focus ones. The difference, however, is signaled with the help of different acoustic parameters. These are the properties (type and height) of the boundary tone and the duration of the stressed syllable, as shown in Table 16. These facts are consistent with the status of Georgian as a phrase language – i.e., one in which prosodic and information-structural contrasts are marked with acoustic cues other than pitch accents. Typically, these are boundary tones and phrasing (Féry 2017). Additionally, duration participates in prominence-marking, given that boundary tones only embody an alternative to prominence.

Neither acoustic parameter used by Georgian has been discussed as taking part in signaling focus projection before. These results, therefore, have implications for our understanding of focus projection and its workings in phrase languages. Specifically, they show that (i) focus-projection patterns exist in phrase languages and (ii) focus projection in phrase languages, in accordance with their definition, is signaled by acoustic cues other than pitch accents. In Georgian, these are boundary tones (or combinations of boundary tones with pitch accents), as well as the duration of stressed syllables. They are used to represent the same prosodic contrasts that intonation languages like English use pitch accents for. The Georgian results go hand in hand with the results for Hindi, a phrase language of the same subtype, where focus-marking facts are also consistent with focus projection: there, subject foci carry a sharper rise in F0 than subjects in broad focus, but objects in the corresponding contexts receive identical tonal realizations (Patil et al. 2008; Féry 2010; Féry et al. 2016).

Do the existing theoretical approaches to focus projection predict the Georgian (and Hindi) facts? The two hypotheses set up in Section 2.3 are repeated in (20) below. According to the DF Hypothesis, derived from the Default Prosody approaches to focus projection, languages that do not utilize pitch accents may resort to other acoustic means for marking prosodic prominence – and, accordingly, for creating focus-projection patterns. According to the FP Hypothesis, derived from the most influential F-projection approach to focus projection, Selkirk (1984; 1995), focus projection relies on F-marking, which arises from the distribution of prominence-marking pitch accents.

(20) a. DF Hypothesis:
Focus projection relies on language-specific means of prominence marking (e.g., prosodic boundaries) in languages that do not utilize pitch accents to mark prominence.
b. FP Hypothesis:
Focus projection is established with the help of pitch accents.

The Georgian (and Hindi) results are predicted by the DF Hypothesis but not the FP Hypothesis. Therefore, these results lend support to the Default Prosody approaches to focus projection as opposed to the F-projection ones. The fact that the Georgian and Hindi patterns exist, in turn, highlights the need to rethink the inventory of the phonetic and phonological phenomena that serve as the basis for the theoretical approaches to focus projection. Given the nature of phrase languages, pitch accents cannot be relied upon to create focus projection in them. Similarly, since there is no evidence for sentential stress/nuclear pitch accent in Georgian or Hindi, a focus-mapping principle cannot refer to it either. Neither can it refer directly to phrasing rules, formulated in terms of alignment with a prosodic boundary (Féry 2013), because the fundamental phrasing facts in Georgian broad and narrow foci stay the same: in all contexts, the preverbal noun and the verb form their own APs. The only things that change in subject focus are the value of the AP-boundary tone (La instead of Ha) and the creation of a higher-level prosodic constituent, the ip.

Instead, the Georgian data can be accounted for by keeping the key insight but modifying the individual components of the Default Prosody approaches, in line with the language-specific facts. In the absence of nuclear stress, the notion of Default Prominence in Georgian is only meaningful insofar as it describes the F0 contour on object APs (most deeply embedded constituents) in broad-focus declaratives, L* Ha. L* Ha is also found on narrowly focused objects. This is similar to the parallelism found in intonation languages, where broad-focus and object-focus contexts assign the nuclear pitch accent to the object. The rule of Default Prominence in Georgian is supplemented by a variant of a Focus Prominence rule. Because Georgian does not use pitch accents to mark prominence, and instead uses alignment with boundary tones as an alternative to prominence-marking, the Focus Prominence rule requires that the AP corresponding to the focused constituent carry an (L)H* La contour. The distinct property of Georgian is that the Default Prominence rule overrides the Focus Prominence rule, which makes narrowly focused objects receive the same realization as objects in broad focus. Because Default Prominence does not apply to subjects, they receive the realization required by the Focus Prominence rule.
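
The interaction of the two rules can be made explicit with a toy formalization. The sketch below is only an illustration of the rule ordering described in the preceding paragraph, under the simplifying assumption that the rules apply categorically (the data show that most, not all, tokens follow the pattern). All names in it are hypothetical and do not come from an existing analysis or library.

```python
# Toy formalization of the rule interaction described above.
# All names are hypothetical and serve illustration only.

RISING = "L* Ha"       # default contour
FALLING = "(L)H* La"   # contour required by the Focus Prominence rule


def ap_contour(role: str, narrowly_focused: bool) -> str:
    """Return the contour predicted for the AP of a preverbal subject or object."""
    if role not in ("subject", "object"):
        raise ValueError(f"unexpected role: {role!r}")
    # Default Prominence targets the most deeply embedded constituent (the object)
    # and overrides Focus Prominence, so focus makes no difference here.
    if role == "object":
        return RISING
    # Default Prominence does not apply to subjects, so Focus Prominence decides.
    if narrowly_focused:
        return FALLING
    # Otherwise the unmarked rising contour surfaces, as on broad-focus subjects.
    return RISING


PREDICTIONS = {
    ("subject", "broad focus"): ap_contour("subject", narrowly_focused=False),
    ("subject", "narrow focus"): ap_contour("subject", narrowly_focused=True),
    ("object", "broad focus"): ap_contour("object", narrowly_focused=False),
    ("object", "narrow focus"): ap_contour("object", narrowly_focused=True),
}

for (role, context), contour in PREDICTIONS.items():
    print(f"{role:>7}, {context:<12}: {contour}")
```

The sketch reproduces the focus-projection asymmetry: the predicted contour differs between broad and narrow focus for subjects but not for objects.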

The F-projection approach, in contrast, runs into problems with Georgian data – even if language-specific modifications are introduced. To make the F-projection approach work for Georgian, it would be necessary to divorce F-marking from pitch accents (since they do not cue prominence) and reduce it to an abstract theoretical ‘F-mark’ that is placed on focused constituents. F-marking would then not be dependent on the presence of the accent but would instead be fed to the phonological component of the grammar, where it can receive a language-specific phonetic realization of focus prominence. However, even with these accommodations, the F-projection approach fails to account for the fact that subject and object foci in Georgian receive different F0 realizations – (L)H* La and L* Ha, respectively – unlike in English. If both realizations followed from the presence of an F-mark, this difference would need to be accounted for by an additional ad-hoc rule.23 In contrast, the Default Prosody-style approach sketched above accounts for this difference via the primacy of Default Prominence, makes an attractive connection between broad focus and object focus, and relates focus projection in phrase languages to that in better-studied intonation languages. Owing to the difference in realization between subject and object foci, Georgian provides an argument in favor of the Default Prosody approaches that is unavailable in English.

7 Conclusion

Based on experimental evidence, this paper showed that the phenomenon of focus projection is found in Georgian. Georgian belongs to a language type, phrase languages, that is fundamentally different from the language type typically studied from the perspective of focus projection, intonation languages. In particular, phrase languages do not mark prosodic and information-structural contrasts with pitch accents, relying instead on prosodic boundaries and phrasing. The phenomenon of focus projection, in contrast, is commonly understood in terms of pitch accent distribution. Nevertheless, this paper shows that the prosodic realization of subject and object focus in Georgian fits with the focus-projection pattern. Two prosodic phenomena, duration of the stressed syllable and the parameters (type and height) of the final boundary tone, mark subject-focus contexts as distinct from broad-focus ones. Neither parameter differentiates object-focus contexts from broad-focus ones. These results are consistent with the hypothesis that languages that do not utilize pitch accents to mark prominence may use language-specific prosodic means of prominence marking (e.g., prosodic boundaries) to create focus-projection patterns. Default Prosody approaches, but not F-projection approaches, can account for the Georgian data.

Another contribution of this study is that it detects a contrast between the prosodic realization of subjects and objects in a language with a dedicated preverbal focus position – that is, it shows that focus projection (object-focus utterances realized like broad-focus ones, subject-focus utterances realized differently from broad-focus ones) holds even in a language where subject and object foci are found in the same linear position: immediately preverbal. These novel facts demonstrate that focus projection is not affected by the linear order and surface positions of arguments.

Some limitations of the current study should be pointed out. The object dataset was considerably smaller than the subject dataset, for two reasons: (i) contrastive objects were realized postverbally and excluded from the study, and (ii) only the subject dataset included intransitive verbs. Therefore, the fact that the statistical analysis did not reveal a significant difference between object-focus and broad-focus contexts does not necessarily mean that the two are identical. However, the qualitative analysis of the object data is illustrative: it shows that, even in a limited sample, the majority of focused objects indeed pattern tonally with the objects in broad focus. The same was not the case for subjects. It is also possible that the difference between the prosodic realizations of broad focus and object focus is marked with acoustic cues other than those considered here, like intensity and duration of words and silences between them (Breen et al. 2010), the types and absolute values of other F0 targets (Skopeteas & Féry 2010), and the realization of the given parts of the utterance (Patil et al. 2008). These factors remain to be investigated.

The syllable duration and F0 results reported here contrast with some earlier results for Georgian. Skopeteas and Féry (2010; 2016) report greater duration of the whole word and the stressed syllable to be a consistent correlate of narrow focus for both subjects and objects. There is also considerable variability with respect to the F0 results, both among the existing studies, as discussed in Section 3.3, and between those and the current one. Several factors are likely to contribute to the variability. The experimental set-up (read speech, memorized replies, semi-spontaneous replies) makes the responses more or less natural. The semi-spontaneous nature of the study may explain the more consistent realization of subject foci in the current study. Another factor that leads to variability in analyses is the choice of acoustic cues that are taken to signal prosodic constituency. Skopeteas & Féry (2016) aimed to bring together F0 targets, prosodic breaks, and phonation as evidence for prosodic phrasing. Here, in contrast, the presence of the final boundary tone was taken to signal phrasing, following Vicenik & Jun (2014). Because languages vary in the number and relative weight of cues for prosodic phrasing (e.g., Yang et al. 2014), and it is unclear how these cues interact in Georgian, a more conservative approach was adopted here.

Finally, an important question not considered here is the interaction between prosodic and syntactic phrasing in narrow focus contexts. At first sight, the Georgian facts seem unusual. Specifically, most narrowly focused subjects carry the falling contour, (L)H* La. According to Vicenik & Jun (2014), (L)H* La indicates that an AP has a close semantic/syntactic connection with the following AP, and the two APs form an ip. This means that narrowly focused subjects form an ip with the following verb. In contrast, most objects, both narrowly focused ones and those in broad focus, carry the default L* Ha contour. Therefore, the object and the verb do not form a prosodic constituent, which is unexpected from the perspective of syntax-prosody mapping (e.g. Selkirk 2011). This also seems counterintuitive, given that objects have a closer syntactic connection with verbs (structural sisterhood) than subjects: [XP O V] vs. (O)AP (V)AP, and [YP S [XP V]] vs. ((S)AP (V)AP)ip. Such mismatches between prosodic and syntactic phrasing are not uncommon, and usually result from the syntax-prosody matching rules being overridden by independent phonological requirements (Selkirk 2011). In Georgian, in the absence of prominence marking by pitch accents, the way to make a focused subject more prominent is to alter the unmarked pattern of prosodic phrasing. The easiest way to do that is to change the value of a boundary tone that follows the narrowly focused subject, from Ha to La. This also results in ip-formation and creation of a closer connection between the narrowly focused subject and the verb, but the motivation for that is non-syntactic.

Supplementary files

Appendix. Test words used in the experiment. DOI: https://doi.org/10.16995/glossa.5733.xx


Glosses follow the Leipzig Glossing Rules. Additional glosses: aor – aorist, prv – preverb, sf – stem formant, sm – stem/screeve marker, ver – version marker.


  1. The term ‘focus projection’ (Fokusprojektion) was introduced by Höhle (1982) to refer to focus-size ambiguity in nested foci. Selkirk (1984; 1995) introduced ‘Focus Projection’ as a technical term in a syntax-prosody analysis of the phenomenon. Here, ‘focus projection’ is used to refer to the phenomenon, and ‘Focus Projection’ is used in Selkirk’s technical sense.
  2. This holds for transitive and unergative predicates, where the subject is the external argument, but not unaccusative predicates, where the subject is the internal argument and typically carries sentential stress in broad focus: What happened? – Truman has died (Schmerling 1976: 90). Throughout the paper, ‘subject’, with respect to the English examples, refers to the external argument.
  3. For reasons of space, no detailed discussion of individual proposals is provided; see Arregi (2016) and references therein.
  4. Some other F-marking approaches remain agnostic about the exact phonological nature of prosodic prominence that takes part in focus projection, and refer to it simply as ‘sentential stress’ or ‘intonation center’ (Rochemont 1986; von Stechow & Uhmann 1986). These approaches do not apply to phrase languages of the type discussed here, since there is no evidence for ‘sentential stress’/ ‘intonation center’ in Georgian (or Hindi). These approaches are not discussed further.
  5. I thank an anonymous reviewer for bringing these facts to my attention.
  6. Georgian also allows for postverbal foci, which differ from preverbal ones in their prosodic realization (Skopeteas & Féry 2010; Skopeteas, Féry & Asatiani 2018) and syntactic status (Borise 2019); certain types of foci (wh-phrases) cannot appear postverbally. This type of focus is not discussed here.
  7. Other languages, like Hungarian, may derive preverbal focus placement via a Spec-Head configuration (É. Kiss 1998; for the prosodic facts, see Genzel, Ishihara & Surányi 2015).
  8. A prominent alternative framework is Prosodic Phonology (Selkirk 1980; 1984; Nespor & Vogel 1986; Hayes 1989). AM is used here because the existing analyses of Georgian intonation are couched in it, but the same facts can be modelled in Prosodic Phonology.
  9. The acoustic data used for illustrating the prosodic patterns of Georgian were collected during the author’s fieldwork in Georgia.
  10. See Vicenik & Jun (2014) for discussion and illustrations.
  11. The only small but significant difference between the two contexts was the steeper fall from the Ha of subject in SOFV than in SOV.
  12. F0 properties of given material in the context of narrow focus (e.g., post-focal deaccenting) may also be relevant for focus-projection patterns. They were not considered here for two reasons. First, the nature and degree of post-focal deaccenting in Georgian has been debated (Skopeteas & Féry 2010; Vicenik & Jun 2014). Second, the semi-spontaneous study design led to there being substantial variability in the given material (e.g., it was produced in the right and/or left periphery, or dropped altogether), making a systematic analysis impossible.
  13. For reasons of space, only the word order results that are directly relevant for the prosodic results are reported here.
  14. No significant interaction between verb type (transitive, unergative, or unaccusative) and prosodic realization of focus (duration of the stressed syllable, pitch accent or boundary tone) or word order was subsequently detected. Therefore, all verbs are discussed together in the remainder of the paper.
  15. Contrastive verb focus was meant to elicit a reply that corrects the verb (e.g. dance vs. sing). Because utterances with contrastive verb focus (n = 14) were verb-initial, unlike all other contexts, they were subsequently excluded from the analysis.
  16. In Praat, the following settings were used: pitch range 75–500 Hz for females and 50–450 Hz for males, voicing threshold = 0.6, octave jump cost = 0.6.
  17. Speakers F1 and F6 produced objects postverbally in all contexts (VO). Accordingly, none of their data were used in the object dataset.
  18. Postverbal foci are syntactically and prosodically different from preverbal ones. For the prosody of postverbal foci, see Skopeteas, Féry & Asatiani (2018).
  19. It is unclear what these tendencies (no preverbal contrastively focused objects, no postverbal focused subjects) are due to. Georgian has a preference for preverbal placement of contrastive foci (Skopeteas & Fanselow 2010), and postverbal focused subjects occur in the elicitation setting.
  20. This technique is borrowed from Skopeteas & Féry’s (2016) analysis. Note that they also included antepenults as potential tonal targets, but here they are discounted, given that no durational or F0 effects were found on antepenults in previous studies (Alkhazishvili 1959; Borise & Zientarski 2018; Borise 2019).
  21. Mono- and disyllabic words were introduced into the responses by the participants. All intended lexical items, used in the prompts, were over three syllables long.
  22. Several factors led to the numerical discrepancy between objects and subjects. First, only the subject dataset included intransitive sentences. Second, the participants placed focused objects preverbally or postverbally (postverbal foci were subsequently excluded from the analysis), but narrowly focused subjects occurred exclusively preverbally. The size of the object dataset raises the concern that it may be too small for potential differences between boundary tones in narrowly focused and broad-focus objects to be detected. However, the ‘global’ F0 data in Section 5.1.1 shows that, across contexts, objects consistently carry Ha of the same shape and height, which makes it not surprising that no significant differences were detected.
  23. The difference between subject and object focus-marking is also found in Hindi, where narrowly focused subjects are marked by a higher final rise, while objects are not (Patil et al. 2008; Féry 2010; Féry, Pandey & Kentner 2016).


I thank Byron Ahn, Karlos Arregi, Rusudan Asatiani, J. Terence Blaskovits, Jonathan Bobaljik, Dan Brodkin, Lauren Clemens, Marcel den Dikken, Gorka Elordieta, Katalin É. Kiss, Junko Ito, Keith Langston, Armin Mester, Léa Nash, Maria Polinsky, Margaret Renwick, Justin Royer, Kevin Ryan, Stavros Skopeteas, Balázs Surányi, Kriszta Szendrői, Michael Wagner, Danfeng Wu, and two anonymous reviewers for their helpful comments and suggestions on the earlier versions of this paper. Many thanks are due to the Georgian speakers who took part in the study, and colleagues in the US and Georgia who facilitated fieldwork. I am also grateful to Nastassia Kotava, who created the picture prompts, and to the research assistants who annotated the prosodic data. All remaining errors are my responsibility.

Funding information

This research was supported by grants NKFIH KKP 129921 and NKFIH K 135958 of the National Research, Development, and Innovation Office of Hungary. The fieldwork was funded by a Graduate Research Travel Grant from the Davis Center for Russian and Eurasian Studies at Harvard University.

Competing Interests

The author has no competing interests to declare.


Adger, David. 2007. Stress and phasal syntax. Linguistic Analysis 33. 238–266.

Alkhazishvili, Archil. 1959. Porjadok slov i intonacija v prostom povestvovateljnom predloženii gruzinskogo jazyka [Word order and intonation in simple declarative clauses in Georgian]. Fonetika 1. 367–414.

Aronson, Howard I. 1982. Georgian: a reading grammar. Chicago: University of Chicago.

Arregi, Karlos. 2002. Focus on Basque movements. Cambridge, MA: Massachusetts Institute of Technology dissertation.

Arregi, Karlos. 2016. Focus projection theories. In Féry, Caroline & Ishihara, Shinichiro (eds.), The Oxford Handbook of Information Structure, 185–202. Oxford; New York: Oxford University Press. DOI:  http://doi.org/10.1093/oxfordhb/9780199642670.013.005

Asatiani, Rusudan & Skopeteas, Stavros. 2012. Information structure in Georgian. In Krifka, Manfred & Musan, Renate (eds.), The expression of information structure 5. 127–158. Berlin; Boston: De Gruyter. DOI:  http://doi.org/10.1515/9783110261608.127

Bansal, Ram Krishna. 1976. The intelligibility of Indian English 4. Hyderabad: CIEFL.

Bates, Douglas & Maechler, Martin & Bolker, Ben & Walker, Steve. 2015. Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67(1). 1–48. DOI:  http://doi.org/10.18637/jss.v067.i01

Baumann, Stefan & Becker, Johannes & Grice, Martine & Mücke, Doris. 2007. Tonal and articulatory marking of focus in German. In Trouvain, Jürgen & Barry, William J. (eds.), International Congress of Phonetic Sciences (ICPhS) 16. 1029–1032.

Baumann, Stefan & Grice, Martine & Steindamm, Susanne. 2006. Prosodic marking of focus domains - categorical or gradient? Speech Prosody 3. 301–304.

Birch, Stacy & Clifton, Charles Jr. 1995. Focus, accent, and argument structure: Effects on language comprehension. Language and Speech 38(4). 365–391. DOI:  http://doi.org/10.1177/002383099503800403

Bishop, Jason. 2010. Information structural expectations in the perception of prosodic prominence. UCLA Working Papers in Phonetics 108. 203–225.

Boersma, Paul & Weenink, David. 2021. Praat: doing phonetics by computer [Computer program] (Version 6.1.41). Retrieved from http://www.praat.org/.

Bolinger, Dwight L. 1970. Relative height. In Leon, Pierre & Faure, Georges & Rigault, André (eds.), Prosodic feature analysis, 109–125. Montreal: Didier.

Borise, Lena. 2017. Prosody of focus in a language with a fixed focus position: Evidence from Georgian. West Coast Conference on Formal Linguistics (WCCFL) 34. 89–96.

Borise, Lena. 2019. Phrasing is key: the syntax and prosody of focus in Georgian. Cambridge, MA: Harvard University dissertation.

Borise, Lena & Zientarski, Xavier. 2018. Word stress and phrase accent in Georgian. Tonal Aspects of Languages (TAL) 6. DOI:  http://doi.org/10.21437/TAL.2018-42

Breen, Mara & Fedorenko, Evelina & Wagner, Michael & Gibson, Edward. 2010. Acoustic correlates of information structure. Language and Cognitive Processes 25(7–9). 1044–1098. DOI:  http://doi.org/10.1080/01690965.2010.504378

Bresnan, Joan. 1971. Sentence stress and syntactic transformations. Language 47. 257–281. DOI:  http://doi.org/10.2307/412081

Bruce, Gösta. 1977. Swedish word accents in sentence perspective. Lund: Lund University dissertation.

Büring, Daniel. 2006. Focus projection and default prominence. In Molnár, Valeria & Winkler, Susanne (eds.), The architecture of focus, 321–346. Berlin: De Gruyter. DOI:  http://doi.org/10.1515/9783110922011.321

Bush, Ryan. 1999. Georgian yes-no question intonation. Phonology at Santa Cruz 6. 1–11.

Cambier-Langeveld, Tina & Turk, Alice E. 1999. A cross-linguistic study of accentual lengthening: Dutch vs. English. Journal of Phonetics 27(3). 255–280. DOI:  http://doi.org/10.1006/jpho.1999.0096

Chomsky, Noam & Halle, Morris. 1968. The sound pattern of English. New York: Harper & Row.

Cinque, Guglielmo. 1993. A null theory of phrase and compound stress. Linguistic Inquiry 24. 239–297.

Cohen, Antonie & ’t Hart, Johan. 1967. On the anatomy of intonation. Lingua 19. 177–192. DOI:  http://doi.org/10.1016/0024-3841(69)90118-1

Contreras, Heles. 1976. A theory of word order with special reference to Spanish. Amsterdam: North-Holland Publishing Company.

Culicover, Peter & Rochemont, Michael. 1983. Stress and focus in English. Language 59(1). 123–165. DOI:  http://doi.org/10.2307/414063

De Jong, Kenneth & Zawaydeh, Bushra. 2002. Comparing stress, lexical focus, and segmental focus: Patterns of variation in Arabic vowel duration. Journal of Phonetics 30(1). 53–75. DOI:  http://doi.org/10.1006/jpho.2001.0151

D’Imperio, Mariapaola. 1997. Breadth of focus, modality and prominence perception in Neapolitan Italian. The Ohio State University Working Papers in Linguistics 50. 19–39.

D’Imperio, Mariapaola. 2000. The role of perception in defining tonal targets and their alignment. Columbus, OH: The Ohio State University dissertation.

Dzidziguri, Shota. 1954. Dziebani kartuli dialekt’ologiidan [Studies in Georgian dialectology]. Tbilisi: Samecniero-metoduri k’abinet’is gamomcemloba.

É. Kiss, Katalin. 1998. Identificational focus versus information focus. Language 74(2). 245–273. DOI:  http://doi.org/10.1353/lan.1998.0211

Elvira-García, Wendy. 2014. Prosodic data extraction (Version 2.1). Retrieved from http://stel.ub.edu/labfon/sites/default/files/prosodic_data-extraction-2.0.praat.

Féry, Caroline. 2010. The intonation of Indian languages: An areal phenomenon. In Hasnain, Imtiaz & Chaudhurry, Shreesh (eds.), Problematizing language studies: Festschrift for Rama Kant Agnihotri, 288–312. New Delhi: Aakar Books.

Féry, Caroline. 2013. Focus as prosodic alignment. Natural Language & Linguistic Theory 31(3). 683–734. DOI:  http://doi.org/10.1007/s11049-013-9195-7

Féry, Caroline. 2017. Intonation and Prosodic Structure. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/9781139022064

Féry, Caroline & Pandey, Pramod & Kentner, Gerrit. 2016. The prosody of focus and givenness in Hindi and Indian English. Studies in Language 40(2). 302–339. DOI:  http://doi.org/10.1075/sl.40.2.02fer

Frota, Sónia. 2000. Prosody and focus in European Portuguese: Phonological phrasing and intonation. London; New York: Garland.

Frota, Sónia. 2012. A focus intonational morpheme in European Portuguese: Production and perception. In Elordieta, Gorka & Prieto, Pilar (eds.), Prosody and meaning, 163–196. Berlin; Boston: De Gruyter.

Genzel, Susanne & Ishihara, Shinichiro & Surányi, Balázs. 2015. The prosodic expression of focus, contrast and givenness: A production study of Hungarian. Lingua 165. 183–204. DOI:  http://doi.org/10.1016/j.lingua.2014.07.010

Genzel, Susanne & Kügler, Frank. 2010. The prosodic expression of contrast in Hindi. In Speech Prosody 5. 1–4.

Grice, Martine & Ladd, D. Robert & Arvaniti, Amalia. 2000. On the place of phrase accents in intonational phonology. Phonology 17(02). 143–185. DOI:  http://doi.org/10.1017/S0952675700003924

Gussenhoven, Carlos. 1983a. Focus, mode and nucleus. Journal of Linguistics 19(2). 377–417. DOI:  http://doi.org/10.1017/S0022226700007799

Gussenhoven, Carlos. 1983b. Testing the reality of focus domains. Language and Speech 26(1). 61–80. DOI:  http://doi.org/10.1177/002383098302600104

Harris, Alice C. 2000. Word order harmonies and word order change in Georgian. In Sornicola, Rosanna & Poppe, Erich & Shisha-Halevy, Ariel (eds.), Stability, variation and change of word order patterns over time, 133–163. Amsterdam: John Benjamins. DOI:  http://doi.org/10.1075/cilt.213.13har

Hayes, Bruce Philip. 1989. Compensatory lengthening in moraic phonology. Linguistic Inquiry 20(2). 253–306.

Heldner, Mattias & Strangert, Eva. 2001. Temporal effects of focus in Swedish. Journal of Phonetics 29(3). 329–361. DOI:  http://doi.org/10.1006/jpho.2001.0143

Höhle, Tilmann N. 1982. Explikation für “normale Betonung” und “normale Wortstellung”. In Abraham, Werner (ed.), Satzglieder im Deutschen, 75–153. Tübingen: Narr.

Howell, Peter & Kadi-Hanifi, Karima. 1991. Comparison of prosodic properties between read and spontaneous speech material. Speech Communication 10(2). 163–169. DOI:  http://doi.org/10.1016/0167-6393(91)90039-V

Hualde, José Ignacio & Elordieta, Gorka & Elordieta, Arantzazu. 1994. The Basque dialect of Lekeitio. Bilbao: Universidad del País Vasco.

Jackendoff, Ray S. 1972. Semantic interpretation in generative grammar. Cambridge MA: MIT Press.

Jacobs, Joachim. 1988. Fokus-Hintergrund-Gliederung und Grammatik. In Altmann, Hans (ed.), Intonationsforschungen, 89–134. Tübingen: Niemeyer. DOI:  http://doi.org/10.1515/9783111358413.89

Jun, Sun-Ah. 2005. Prosodic typology. In Jun, Sun-Ah (ed.), Prosodic typology: The phonology of intonation and phrasing, 430–458. Oxford; New York: Oxford University Press. DOI:  http://doi.org/10.1093/acprof:oso/9780199249633.003.0016

Jun, Sun-Ah & Fougeron, Cécile. 1995. The accentual phrase and the prosodic structure of French. In International Congress of Phonetic Sciences (ICPhS) 13(2). 722–725.

Kahnemuyipour, Arsalan. 2004. The syntax of sentential stress. Toronto: University of Toronto dissertation.

Kenesei, István & Vogel, Irene. 1989. Prosodic phonology in Hungarian. Acta Linguistica Hungarica 39(1/4). 149–193.

Kratzer, Angelika & Selkirk, Elisabeth. 2007. Phase theory and prosodic spellout: The case of verbs. The Linguistic Review 24(2–3). DOI:  http://doi.org/10.1515/TLR.2007.005

Kügler, Frank & Genzel, Susanne. 2009. Sentence length, position, and information structure on segmental duration in German. Ms. University of Potsdam.

Kuznetsova, Alexandra & Brockhoff, Per B. & Christensen, Rune H. B. 2017. lmerTest package: tests in linear mixed effects models. Journal of Statistical Software 82(13). DOI:  http://doi.org/10.18637/jss.v082.i13

Ladd, D. Robert. 1980. The structure of intonational meaning. Bloomington, IN: Indiana University Press.

Ladd, D. Robert. 1983. Phonological features of intonational peaks. Language 59(4). 721–759. DOI:  http://doi.org/10.2307/413371

Ladd, D. Robert. 1996. Intonational phonology. Cambridge: Cambridge University Press.

Legate, Julie Anne. 2003. Some interface properties of the phase. Linguistic Inquiry 34(3). 506–516. DOI:  http://doi.org/10.1162/ling.2003.34.3.506

Legate, Julie Anne. 2008. Morphological and abstract case. Linguistic Inquiry 39(1). 55–101. DOI:  http://doi.org/10.1162/ling.2008.39.1.55

Lehiste, Ilse. 1972. The timing of utterances and linguistic boundaries. The Journal of the Acoustical Society of America 51(6B). 2018–2024. DOI:  http://doi.org/10.1121/1.1913062

Liberman, Mark Y. 1975. The intonational system of English. Cambridge, MA: Massachusetts Institute of Technology dissertation.

Lieberman, Philip & Katz, William & Jongman, Allard & Zimmerman, Roger & Miller, Mark. 1985. Measures of the sentence intonation of read and spontaneous speech in American English. The Journal of the Acoustical Society of America 77(2). 649–657. DOI:  http://doi.org/10.1121/1.391883

Machač, Pavel & Skarnitzl, Radek. 2009. Principles of phonetic segmentation. Praha: Epocha.

Müller, Gabriele. 2005. Frageintonation im Georgischen. Cologne: University of Cologne dissertation.

Nakamura, Masanobu & Iwano, Koji & Furui, Sadaoki. 2008. Differences between acoustic characteristics of spontaneous and read speech and their effects on speech recognition performance. Computer Speech & Language 22(2). 171–184. DOI:  http://doi.org/10.1016/j.csl.2007.07.003

Nash, Léa. 1995. Portée argumentale et marquage casuel dans les langues SOV et dans les langues ergatives: l’exemple du géorgien. Paris: Paris 8 dissertation.

Nash, Léa. 2017. The structural source of split ergativity and ergative case in Georgian. In Coon, Jessica & Massam, Diane & Travis, Lisa deMena (eds.), The Oxford Handbook of Ergativity, 175–200. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/oxfordhb/9780198739371.013.8

Nespor, Marina & Vogel, Irene. 1986. Prosodic phonology. Berlin: De Gruyter Mouton.

Newman, Stanley S. 1946. On the stress system of English. Word 2(3). 171–187. DOI:  http://doi.org/10.1080/00437956.1946.11659290

Patil, Umesh & Kentner, Gerrit & Gollrad, Anja & Kügler, Frank & Féry, Caroline & Vasishth, Shravan. 2008. Focus, word order and intonation in Hindi. Journal of South Asian Linguistics 1(1). 55–72.

Pierrehumbert, Janet. 1980. The phonetics and phonology of English intonation. Cambridge, MA: Massachusetts Institute of Technology dissertation.

Pochkhua, Bidzina. 1962. Sit’q’vatganlagebisatvis kartulši [On word order in Georgian]. Iberiul-k’avk’asiuri Enatmecniereba [Iberian-Caucasian Linguistics] 13. 119–121.

Puri, Vandana. 2013. Intonation in Indian English and Hindi late and simultaneous bilinguals. Champaign: University of Illinois at Urbana-Champaign dissertation.

R Core Team. 2020. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. Retrieved from https://www.R-project.org/.

Reinhart, Tanya. 1995. Interface strategies (OTS working papers, TL-95-002). Utrecht: Utrecht University.

Robins, Robert H. & Waterson, Natalie. 1952. Notes on the phonetics of the Georgian word. Bulletin of the School of Oriental and African Studies 14(1). 55–72. DOI:  http://doi.org/10.1017/S0041977X00084196

Rochemont, Michael Shaun. 1986. Focus in generative grammar. Amsterdam; Philadelphia: John Benjamins. DOI:  http://doi.org/10.1075/sigla.4

Rump, Hans H. & Collier, Rene. 1996. Focus conditions and the prominence of pitch-accented syllables. Language and Speech 39(1). 1–17. DOI:  http://doi.org/10.1177/002383099603900101

Schmerling, Susan F. 1976. Aspects of English sentence stress. Austin, TX: University of Texas Press.

Schwarzschild, Roger. 1999. GIVENness, AvoidF and other constraints on the placement of accent. Natural Language Semantics 7. 141–177. DOI:  http://doi.org/10.1023/A:1008370902407

Selkirk, Elisabeth. 1980. Prosodic domains in phonology: Sanskrit revisited. In Aronoff, Mark & Kean, Mary-Louise (eds.), Juncture: A collection of original papers, 107–129. Saratoga: Anma Libri.

Selkirk, Elisabeth. 1984. Phonology and syntax: the relation between sound and structure. Cambridge, MA: MIT Press.

Selkirk, Elisabeth. 1995. Sentence prosody: Intonation, stress, and phrasing. In Goldsmith, John A. (ed.), The Handbook of Phonological Theory 1. 550–569. Cambridge, MA: Blackwell. DOI:  http://doi.org/10.1002/9781444343069.ch14

Selkirk, Elisabeth. 2011. The syntax-phonology interface. In Goldsmith, John A. & Riggle, Jason & Yu, Alan C. L. (eds.), The Handbook of Phonological Theory 2. 435–483. Hoboken, NJ: Wiley Blackwell.

Selmer, Ernst W. 1935. Georgische Experimentalstudien. In Avhandlinger utgitt av det Norske Videnskaps-Akademi i Oslo. II. Historisk-filosofisk klasse, 1–55. Oslo: I Kommisjon Hos Jacob Dybwad.

Shattuck-Hufnagel, Stefanie & Turk, Alice E. 1996. A prosody tutorial for investigators of auditory sentence processing. Journal of Psycholinguistic Research 25(2). 193–247. DOI:  http://doi.org/10.1007/BF01708572

Skopeteas, Stavros & Fanselow, Gisbert. 2010. Focus in Georgian and the expression of contrast. Lingua 120(6). 1370–1391. DOI:  http://doi.org/10.1016/j.lingua.2008.10.012

Skopeteas, Stavros & Féry, Caroline. 2010. Effect of narrow focus on tonal realization in Georgian. Proceedings of Speech Prosody 5. Retrieved from https://www.isca-speech.org/archive_v0/sp2010/papers/sp10_237.pdf.

Skopeteas, Stavros & Féry, Caroline. 2014. Prosodic cues for exhaustive interpretations: a production study of Georgian intonation. In Amiridze, Nino & Reseck, Tamar & Gäumann, Manana Topadze (eds.), Advances in Kartvelian Morphology and Syntax, 79–100. Bochum: Universitätsverlag Brockmeyer.

Skopeteas, Stavros & Féry, Caroline. 2016. Focus and intonation in Georgian: Constituent structure and prosodic realization. Ms. Potsdam University.

Skopeteas, Stavros & Féry, Caroline & Asatiani, Rusudan. 2009. Word order and intonation in Georgian. Lingua 119(1). 102–127. DOI:  http://doi.org/10.1016/j.lingua.2008.09.001

Skopeteas, Stavros & Féry, Caroline & Asatiani, Rusudan. 2018. Prosodic separation of postverbal material in Georgian. In Adamou, Evangelia & Haude, Katharina & Vanhove, Martine (eds.), Information structure in lesser-described languages: Studies in prosody and syntax (Studies in Language Companion Series 199), 17–50. Amsterdam; Philadelphia: John Benjamins. DOI:  http://doi.org/10.1075/slcs.199.02sko

Szendrői, Kriszta. 2001. Focus and the syntax-phonology interface. London: University College London dissertation.

Tevdoradze, Izabela. 1978. Kartuli enis p’rosodiis sak’itxebi [Prosodic issues in the Georgian language]. Tbilisi: Tbilisi State University Publishing.

Thivierge, Sigwan. 2019. Expanding agreement domains: Phase unlocking in Georgian agreement. Ms. University of Maryland.

Vaissière, Jacqueline. 1983. Language-independent prosodic features. In Cutler, Anne & Ladd, D. Robert (eds.), Prosody: Models & measurements, 53–66. Berlin-Heidelberg: Springer. DOI:  http://doi.org/10.1007/978-3-642-69103-4_5

Vicenik, Chad & Jun, Sun-Ah. 2014. An Autosegmental-Metrical analysis of Georgian intonation. In Jun, Sun-Ah (ed.), Prosodic Typology II. 154–186. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/acprof:oso/9780199567300.003.0006

Vogt, Hans. 1971. Grammaire de la langue géorgienne. Oslo: Universitetsforlaget.

von Stechow, Arnim & Uhmann, Susanne. 1986. Some remarks on focus projection. In Abraham, Werner & de Meij, Sjaak (eds.), Topic, focus and configurationality, 295–320. Amsterdam; Philadelphia: John Benjamins. DOI:  http://doi.org/10.1075/la.4.14ste

Welby, Pauline. 2003. Effects of pitch accent position, type, and status on focus projection. Language and Speech 46(1). 53–81. DOI:  http://doi.org/10.1177/00238309030460010401

Xu, Yi & Xu, Ching X. 2005. Phonetic realization of focus in English declarative intonation. Journal of Phonetics 33(2). 159–197. DOI:  http://doi.org/10.1016/j.wocn.2004.11.001

Yang, Xiaohong & Shen, Xiangrong & Li, Weijun & Yang, Yufang. 2014. How listeners weight acoustic cues to intonational phrase boundaries. PLoS ONE 9(7). e102166. DOI:  http://doi.org/10.1371/journal.pone.0102166

Zhghenti, Sergi. 1963. Kartuli enis rit’mik’ul-melodik’uri st’rukt’ura [The rhythmic-melodic structure of Georgian]. Tbilisi: Tsodna.

Zhghenti, Sergi. 1965. Intonacionnyj stroi gruzinskogo jazyka [The intonational structure of Georgian]. In Voprosy fonetiki kartvel’skix jazykov [Phonetic issues in Kartvelian languages], 268–276. Tbilisi: Ganatleba.

Zubizarreta, Maria Luisa. 1998. Prosody, focus, and word order. Cambridge, MA: MIT Press.