1 Introduction

Prosodic structure largely reflects syntactic structure, but there are also mismatches between the two. There is a rich literature and debate on syntax-prosody correspondence (e.g., edge-based theory Selkirk 1986; 1995; Align and Wrap constraints Truckenbrodt 1995; 1999; Match Theory Selkirk 2009; 2011; Elfner 2012; 2015; Ito & Mester 2013; 2015; Bennett et al. 2016; embedding-based mapping Wagner 2010), but it has largely focused on the prosody of pronounced syntactic structure. The prosodic literature that does discuss unpronounced material often assumes that silent material does not affect prosody at all (e.g. Chen 1987; Lin 1994; Truckenbrodt 1999; Elfner 2012; 2015; Hamlaoui & Szendrői 2015; 2017).

If we apply this assumption to ellipsis, a paradigmatic type of silence, then we would expect elided material not to affect prosody. This seems like a reasonable assumption because elided material has no phonological content. But prosodic effects include not only prominence effects on pronounced material, but also effects on prosodic boundaries around (pronounced or potentially unpronounced) material.

This paper presents an experimental production study of the prosodic effects of ellipsis through an investigation of its boundary effects. I study the prosody of ellipsis in a domain where syntax-prosody correspondence is already close: English coordination. Downing (1970), Wagner (2005; 2010) and others observed that coordinated clauses are followed by a larger prosodic boundary than coordinated DPs. For example, (1) involves syntactic coordination of two clauses, and (2) coordination of DPs. This syntactic difference is realized prosodically: there is a larger prosodic boundary between lettuce and and in (1) than in (2).

    (1) Clausal coordination
        He ate the lettuce and he ate it with a fork.
    (2) DP coordination
        He ate the lettuce and the bacon.

I take advantage of this insight and ask what happens if I put ellipsis in (1), as in (3B), where elided material is struck through. I do so by making it the fragment answer to a question, that is, an answer that denotes a proposition but nevertheless appears as a subpart of a proposition.

    (3) Clausal coordination with ellipsis
        A: What did John eat?
        B: He ate the lettuce and he ate it with a fork.

Is the prosodic difference between the fully overt structures (1) and (2) still present when one of them contains ellipsis? If we call the difference in the boundary size between (1) and (2) Δ1, and the boundary difference between (3B) and (2) Δ2, then how does Δ2 compare with Δ1?1

If elided material has no prosodic representation, then, because there is less overt structure in (3B) than in (1), we may expect the boundary to be smaller in (3B) than in (1), and thus Δ2 to be smaller than Δ1. If, on the other hand, elided material has prosodic representation just like pronounced material, then Δ2 should be roughly equal to Δ1.

Furthermore, the prosodic investigations of (1)–(3B) also shed light on the syntactic representation of ellipsis. Let us follow the standard assumption in the literature that syntax affects prosody, and the prosodic difference Δ1 between (1) and (2) is due to a difference between their syntactic structures. If Δ2 is comparable to Δ1, then this suggests that there is also a syntactic difference between (3B) and (2), even though part of that syntactic structure has been silenced by ellipsis in (3B). This suggests that elided material has syntactic representation, despite being silent.

The rest of this paper presents an experimental study whose results support the presence of elided material in syntax and prosody. Section 2 lays out my key assumptions about the syntax-prosody mapping. Section 3 presents the background literature on ellipsis, with opposing views on how elided material is represented syntactically and prosodically, and section 4 presents the experimental design to test the predictions of those opposing views. Section 5 presents the methods of the production study, section 6 its results and section 7 the discussion of the results. Section 8 discusses a view that is different from the assumptions laid out in section 2, and shows that it does not affect the interpretation of the results, unless we make an additional assumption that is unlikely. Section 9 presents an auxiliary measure and related results that are consistent with the view that elided material has syntactic and prosodic representation. Section 10 concludes the paper.

2 Key assumptions

This section lays out the key assumptions on which the current study is based. Theories on syntax-prosody mapping differ in how much match and mismatch there are between syntactic structure and prosodic structure, and in particular whether the prosodic structure can replicate the recursive nature of syntactic structure. Some theories (e.g., Elfner 2012; 2015; Wagner 2010) claim that it can by proposing that a syntactic phrase that dominates another syntactic phrase corresponds to a stronger prosodic phrase than the embedded phrase. Other theories (e.g., Selkirk 1986) claim that prosodic structure is much flatter than syntactic structure in lacking recursivity.

This paper follows the former type of theories due to independent evidence suggesting that the prosodic structure can indeed be recursive (e.g. Ladd 1986; 1988; Kubozono 1989; 1992; Féry & Truckenbrodt 2005; Wagner 2005; 2010; Ito & Mester 2007; 2010; 2012; 2013; Selkirk 2009; 2011; Elfner 2015; Elordieta 2015; Selkirk & Lee 2015; Cheng & Downing 2021; Bellik et al. 2022; Jabeen 2022; Kügler 2022; Ishihara & Myrberg 2023; Lee & Riedel 2023). Furthermore, the former type of theories can easily account for the observed prosodic difference between (1) and (2). In (1) lettuce is the rightmost word in a clause (he ate the lettuce), while in (2), since he ate the lettuce in (2) does not form a constituent, lettuce is only the rightmost word in a DP (the lettuce). The clause he ate the lettuce in (1) dominates more syntactic phrases than the DP the lettuce in (2), and should therefore correspond to a stronger prosodic phrase than the DP the lettuce in (2), leading to a larger prosodic boundary following lettuce in (1) than in (2).

For concreteness, I illustrate how the prosodic contrast between (1) and (2) can be captured by a specific mapping theory that allows recursive prosodic structure: Elfner’s (2015) version of Match Theory. I will introduce the relevant components of Match Theory, and propose an auxiliary assumption that is necessary for independent reasons. These components together can capture the observed prosodic difference between (1) and (2).

2.1 Assumptions about syntax-prosody mapping

Elfner’s (2015) version of Match Theory, which is based on Selkirk (2011) and also appears in Elfner (2012), posits that prosodic structure replicates the dominance relations in the syntactic structure: a lexical head (e.g. V⁰, N⁰, Adj⁰) is mapped to a prosodic word (ω) (Match-Word, (4)), a functional head (e.g. D⁰, P⁰) is mapped to a clitic (Match-Clitic, (5)), and a syntactic maximal projection (XP) is mapped to a phonological phrase (φ) (Match-Phrase, (6)). I also follow Elfner’s (2015) assumption that a non-branching constituent that is simultaneously an X⁰ and an XP is mapped to a ω.

    (4) Match-Word
        V⁰, N⁰, Adj⁰ ➔ ω
    (5) Match-Clitic
        D⁰, P⁰ ➔ C
    (6) Match-Phrase
        XP ➔ φ
        “For every syntactic phrase (XP) in the syntactic representation that exhaustively dominates a set of one or more terminal nodes α, there must be a prosodic domain (ϕ) in the phonological representation that exhaustively dominates all and only the phonological exponents of the terminal nodes in α.”

But not all syntactic constituents are mapped onto prosody. An assumption crucial to the literature on syntax-prosody mapping is that silent material (e.g., phonologically empty heads and their projections, and movement traces) is not mapped onto prosody, based on evidence from Chichewa (Truckenbrodt 1999), Xiamen Chinese (Chen 1987; Lin 1994) and Irish (Elfner 2012; 2015). The syntactic structures assumed for (1) and (2) are in Figure 1 and Figure 2 respectively:2

Figure 1

Syntactic structure for (1).

Figure 2

Syntactic structure for (2).

Following Elfner’s Match principles and the principle that does not map phonologically empty heads to prosodic structure, (1) and (2) have the prosodic structures in Figure 3 and Figure 4 respectively.3 In all the prosodic structures, the right parenthesis marks the right boundary following lettuce, that is, the right edge of the topmost φ in which lettuce is the final word.

Figure 3

Prosodic structure for (1).

Figure 4

Prosodic structure for (2).

So far I have assumed only one prosodic category above the word level, the phonological phrase (φ). All syntactic XPs are mapped to φs, regardless of the type of the XP. This follows Elfner’s (2015) analysis of Irish, for which she noted that there was no evidence suggesting that another prosodic category was necessary. But many theories in the prosodic literature posit another prosodic category above the word level, the intonational phrase (ι) (e.g. Selkirk 1986; 1995; 2009; 2011). These theories distinguish between syntactic clauses or assertions (roughly TP or CP) and subclauses (anything smaller than a TP, e.g., a DP), and claim that clauses or assertions should correspond to ι, while subclauses correspond to φ (Match-Clause, (7)).

    (7) Match-Clause
        TP, CP ➔ ι

Because the first conjunct in (1) he ate the lettuce is a clause, but the first conjunct in (2) is just a DP the lettuce, Match-Clause would map the former to a different category (ι) from the latter (φ).

Instead of Match-Clause, some prosodic theories claim that a speech act or an illocutionary force projection is mapped to an ι (Match-Speech-Act; e.g. Güneş 2014; 2015; Ishihara 2022). These theories will not affect the results in this paper unless we say that each of the conjuncts in (1) must be a separate speech act, whereas the conjuncts in (2) are not. I do not think the conjuncts in (1) necessarily have to be separate speech acts. But if we assume that they are, in order to survey all the possible prosodic structures, then Match-Speech-Act would map the first conjunct in (1) to an ι, but the first conjunct in (2) to a φ.

Section 8 will address the prosodic structures created by Match-Clause or Match-Speech-Act, and argue that whether a syntactic clause is mapped to an ι or a φ does not affect the results in this paper, as long as the strength of a prosodic domain depends (at least in part) on the levels of syntactic embedding, a fact with empirical support (e.g. Ladd 1986; 1988; Kubozono 1989; Wagner 2005; 2010). Because Match-Clause and Match-Speech-Act do not make a difference to the results, the rest of the paper will assume that all syntactic phrases are mapped to φs.

2.2 Auxiliary assumption about the phonetic effects of prosodic structure

Abstract prosodic structures have phonetic effects that we can hear. We thus need a theory that connects the prosodic structure to gradient phonetic effects in prominence and phrasing, such as effects in duration, pitch and intensity. I therefore add the following assumption to Match Theory: the more levels a node dominates in the prosodic structure, the phonetically “stronger” this node is. Phonetic “strength” can be reflected by phonetic effects at the left and right edges of this node, such as domain-initial strengthening and domain-final lengthening. By this assumption, a φ must be phonetically stronger than its daughter φ’ because the mother φ dominates one more level of φ than the daughter. This has been observed to be the case in Basque, for example: Elordieta (2015) found that the degree of pitch reset at the left edge of a nonminimal φ (i.e. a φ that dominates another φ) depends on the number of nonminimal φs it embeds, and the more levels of embedding, the stronger the pitch reset.

Having introduced Match Theory and the auxiliary assumption about mapping from prosodic structure to gradient durational effects, let us apply this framework to (1) and (2). I focus on the highest φs in which the word lettuce is final, which are he ate the lettuce in (1) and the lettuce in (2). The φ he ate the lettuce in (1) dominates two levels of φ, while the φ the lettuce in (2) dominates no φ. The strength of a phrase is reflected at the right edge by pre-boundary lengthening effects: as Wightman et al. (1992) showed, the final rime of a word is lengthened before a phrase boundary, and the stronger / larger this boundary, the longer the rime. Because the word lettuce in (1) is followed by a stronger phrase boundary than in (2), we would thus expect the last rime of lettuce in (1) to be longer than that in (2).
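The counting of φ levels described above can be made concrete with a small sketch. This is an illustration only, not the paper's implementation: the syntactic trees are simplified stand-ins for Figures 1 and 2 (which are not reproduced here), the class and function names are hypothetical, pronouns are treated as non-branching constituents that map to ω, and the conjunction and is treated like a functional head.

```python
from dataclasses import dataclass, field
from typing import List, Optional

FUNCTIONAL = {"D", "P", "&"}  # assumption: "and" behaves like a functional head


@dataclass
class Syn:
    """Syntactic node; `word` is set only on pronounced terminals."""
    label: str
    children: List["Syn"] = field(default_factory=list)
    word: Optional[str] = None


@dataclass
class Pros:
    """Prosodic node of category "phi", "omega" or "clitic"."""
    cat: str
    children: List["Pros"] = field(default_factory=list)
    word: Optional[str] = None


def match(node: Syn) -> List[Pros]:
    """Map a syntactic node to prosodic nodes; silent material maps to nothing."""
    if not node.children:
        if node.word is None:
            return []
        cat = "clitic" if node.label in FUNCTIONAL else "omega"
        return [Pros(cat, word=node.word)]
    kids = [p for child in node.children for p in match(child)]
    # An XP contributes a phi only if it groups more than one prosodic daughter.
    return kids if len(kids) <= 1 else [Pros("phi", kids)]


def words(p: Pros) -> List[str]:
    if p.word is not None:
        return [p.word]
    return [w for c in p.children for w in words(c)]


def phi_levels(p: Pros) -> int:
    """Number of phi levels properly dominated by p."""
    return max((phi_levels(c) + 1 for c in p.children if c.cat == "phi"), default=0)


def topmost_phi_ending_in(p: Pros, word: str) -> Optional[Pros]:
    """Highest phi whose final word is `word` (top-down search)."""
    if p.cat == "phi" and words(p)[-1] == word:
        return p
    for c in p.children:
        found = topmost_phi_ending_in(c, word)
        if found is not None:
            return found
    return None


def leaf(label: str, word: str) -> Syn:
    return Syn(label, word=word)


def dp(det: str, noun: str) -> Syn:
    return Syn("DP", [leaf("D", det), leaf("N", noun)])


# Simplified, assumed trees for (1) and (2).
clause1 = Syn("TP", [leaf("DP", "he"), Syn("VP", [leaf("V", "ate"), dp("the", "lettuce")])])
clause2 = Syn("TP", [leaf("DP", "he"), Syn("VP", [
    Syn("VP", [leaf("V", "ate"), leaf("DP", "it")]),
    Syn("PP", [leaf("P", "with"), dp("a", "fork")]),
])])
ex1 = Syn("&P", [clause1, leaf("&", "and"), clause2])
ex2 = Syn("TP", [leaf("DP", "he"), Syn("VP", [
    leaf("V", "ate"),
    Syn("&P", [dp("the", "lettuce"), leaf("&", "and"), dp("the", "bacon")]),
])])


def boundary_levels(tree: Syn, word: str) -> int:
    """Phi levels dominated by the topmost phi that ends in `word`."""
    return phi_levels(topmost_phi_ending_in(match(tree)[0], word))


print(boundary_levels(ex1, "lettuce"))  # 2
print(boundary_levels(ex2, "lettuce"))  # 0
```

Under these assumed trees the sketch returns two levels for (1) and none for (2), matching the counts given in the text. The numbers are ordinal indicators of boundary strength, not durations.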

Having shown how Match Theory along with my auxiliary assumption can capture the prosodic difference between (1) and (2), I now present the main research question, that is, whether elided material has prosodic representation.

3 Syntactic and prosodic representation of ellipsis

To answer this question, I put ellipsis in (1) and turn it into (3B) by making it the fragment answer to the question in (3A), repeated in (8A). Assuming that answers to questions denote propositions (e.g., (8B1)), fragment answers are those that nevertheless appear as a subpart of a proposition (e.g., (8B2)).

    (8) A:  What did John eat?
        B1: He ate the lettuce.
        B2: The lettuce.

A common analysis of fragment answers posits that they are still a full clause, but with clausal ellipsis (e.g., Merchant 2005). According to this analysis, (3B) involves movement of the lettuce, the phrase that survives ellipsis (the remnant), to a higher position (e.g., Spec, CP), plus deletion of the clause he ate tᵢ (Figure 5). Not all syntactic analyses of fragment answers posit movement. For example, Griffiths (2019) claims that the remnants of ellipsis stay in-situ, and ellipsis deletes the non-remnants. This in-situ analysis of ellipsis makes largely the same prosodic predictions as the analysis of ellipsis that involves movement, and section 7 will discuss the in-situ analysis in depth. For now I will assume that ellipsis does involve movement.

Figure 5

Syntactic structure for (3B), if the remnant moves.

The prosodic structure for (3B) depends on two factors: (a) how elided material is represented syntactically; and (b) how elided material is represented prosodically.

There are two main approaches to the syntactic representation of ellipsis. Some argue that elided material is fully present in narrow syntax but later deleted at PF (e.g., Johnson 2001; Merchant 2001; 2005; van Craenenbroeck 2009). Others argue that elided material is not fully present in syntax by assuming that elided material is copied at LF (Chung et al. 1995), is partially present in syntax as a pronoun (e.g. Landau 2023), or has an enriched meaning by a discourse rule (Groenendijk & Stokhof 1984; Ginzburg & Sag 2001; Jacobson 2016).

If we follow the first approach that elided material is fully present in syntax, then we can further ask how it is mapped onto prosody. There are two possibilities: (a) it is not represented in prosody at all like other phonologically empty material, leading to the prosodic structure in Figure 6, where the relevant φ (in bold) dominates no φ; or (b) it is represented in prosody despite having no phonological content, leading to Figure 7, where the relevant φ dominates a level of φ. Approaches that assume elided material is not fully present in syntax in the first place would predict the prosodic structure in Figure 6, and the question of prosodic representation of ellipsis is moot.

Figure 6

Prosodic structure for (3B), if elided material is not represented in syntax or prosody.

Figure 7

Prosodic structure for (3B), if elided material is represented in syntax and prosody.

4 Design to test the research question

I test the prosody of (3B) by putting it into the paradigm introduced in section 1. First, I compare the prosodic difference between sentences with fully overt structures in the first conjunct (Control Condition; (9A1&2)). To make sure the difference between the sentences is minimal, I make (9A1&2) answers to double wh-questions (9Q1&2).

    (9) a. Control Condition; Clausal Conjuncts
           Q1: What did John eat and how did he eat it?
           A1: [He ate the lettuce] and [with a fork].
        b. Control Condition; DP Conjuncts
           Q2: Which vegetable and which meat did John eat?
           A2: He ate [the lettuce] and [the bacon].

I make (9Q1&2) double wh-questions rather than single wh-questions to ensure the answers have the same information structure and focus structure. If the question were a single wh-question like What did John eat?, (9A1&2) would not have the same information structure and focus structure, which may confound the results. (9A1) would not only answer that single wh-question, but also provide additional information on what John ate the lettuce with. In contrast, (9A2) would only answer that single wh-question and nothing more. Also, if the leading question were the single wh-question What did John eat?, (9A1) would put double focus on the lettuce and with a fork, while (9A2) would only put a single focus on the entire conjunction the lettuce and the bacon. But double wh-questions do not have these issues because (9A1&2) answer the double wh-questions and provide no further information, and they both involve focus on each conjunct.

It has been shown that both the length of the most recent constituent before the boundary and that of the upcoming constituent following the boundary affect the size of the boundary (e.g., Gee & Grosjean 1983; Jun 2000; 2003; Selkirk 2000; Watson & Gibson 2004). Therefore, to control for the total number of syllables and make sure that the number of syllables before and after the boundary is the same within the Control Condition (9A1&2) and within the Critical Condition (to be presented in (10A1&2)), I put ellipsis in the second conjunct in (9A1) and (10A1). Because the experiment mainly focuses on the prosodic boundary following the first conjunct (i.e. the right boundary following lettuce), all that matters to us here is the syllable count before and after this boundary; whether there is ellipsis in the second conjunct does not matter.

A reviewer asked whether the left boundary preceding and affects the right boundary following lettuce (i.e. the two adjacent boundaries in (He ate the lettuce) (and with a fork)): if the left boundary preceding and is very strong, will it strengthen the right boundary following lettuce? An important consequence of adopting Match Theory, which permits recursive prosodic structure, is that it will not: adjacent left and right boundaries do not need to match in strength because sisters in a recursive prosodic structure do not need to have the same prosodic category and the same level of embedding. This contrasts with the strict layer hypothesis (e.g. Selkirk 1981; 1984; Beckman & Pierrehumbert 1986; Nespor & Vogel 1986; Pierrehumbert & Beckman 1988), which does require the right boundary to match the following left boundary in strength because the sisters in the prosodic hierarchy must have the same category and the same level of embedding. Therefore, material in the second conjunct does not affect the boundary following lettuce, and neither does material in the first conjunct affect the boundary preceding and.

But the second conjunct does involve ellipsis in clausal coordination but not in DP coordination; can the boundaries around the second conjunct bear on the research question? Section 9 discusses this question and argues that, due to the setup of the items, the second conjunct is not an ideal place to study the research question. I nevertheless measured the prosodic strength of the second conjunct with an auxiliary measure of its left edge, and report the results in section 9.

I expect a significantly larger prosodic boundary following lettuce in (9A1) than in (9A2). In (9A1), the prosodic boundary corresponds to the bolded right φ-boundary in Figure 3, which dominates 2 φs (whether the elided material in the second conjunct is represented does not affect that boundary). In (9A2), the prosodic boundary corresponds to the bolded right φ-boundary in Figure 4, which dominates no φ. I call the difference in this prosodic boundary between (9A1) and (9A2) Δ1, and compare Δ1 with the prosodic difference Δ2 between a phrase that contains ellipsis (i.e. the first conjunct in (10A1)) and one that doesn’t (i.e. the first conjunct in (10A2)) (Critical Condition). (10A1&2) differ in the location of ellipsis: there is ellipsis inside the first conjunct in (10A1) but outside the first conjunct in (10A2).

    (10) a. Critical Condition; Clausal Conjuncts (elided material in angle brackets)
            Q1: What did John eat and how did he eat it?
            A1: [⟨He ate⟩ the lettuce] and [with a fork].
         b. Critical Condition; DP Conjuncts (elided material in angle brackets)
            Q2: Which vegetable and which meat did John eat?
            A2: ⟨He ate⟩ [the lettuce] and [the bacon].

Depending on whether elided material is represented in syntax and prosody or not, the clausal coordination (10A1) would be mapped to the prosodic structure in Figure 8 (if elided material has no syntactic or prosodic representation), where the prosodic boundary following lettuce dominates no φ, or the prosodic structure in Figure 9 (if elided material has syntactic and prosodic representation), where the prosodic boundary following lettuce dominates 1 φ.

Figure 8

Prosodic structure for (10A1), if elided material is not represented in syntax or prosody.

Figure 9

Prosodic structure for (10A1), if elided material is represented in syntax and prosody.

The prosodic structure of the lettuce and the bacon in (10A2) is Figure 10. I have left out the elided material outside the coordination because it does not affect the prosodic boundary following lettuce in (10A2).

Figure 10

Prosodic structure of the lettuce and the bacon in (10A2).

If elided material is not represented in syntax or prosody, then we expect there to be no prosodic difference within the Critical Condition (i.e. Δ2 = 0), and Δ2 should therefore be significantly smaller than Δ1. But if elided material is represented in syntax and prosody, then we expect there to be a significant prosodic difference within the Critical Condition (i.e. Δ2 > 0). Since the bolded φ dominates 1 φ in Figure 9 (if elided material has prosodic representation) but no φ in Figure 10, their difference in the Critical Condition is 1 φ. But within the Control Condition, the bolded φ dominates 2 φs in Figure 3 but no φ in Figure 4, with the difference being 2 φs. We thus expect Δ2, the prosodic difference within the Critical Condition, to be slightly smaller than Δ1, the prosodic difference within the Control Condition, if elided material has prosodic representation.
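The two sets of predictions can be tabulated in a few lines. The sketch below only restates the φ-level counts given in the text for the relevant boundary (Figures 3, 4, 8, 9 and 10); the dictionary keys and the function name are hypothetical labels introduced for illustration, and the resulting values are ordinal predictions about boundary strength rather than predicted durations.

```python
# Phi levels dominated by the topmost phi ending in "lettuce", as stated in the text.
CONTROL = {"clausal": 2, "DP": 0}  # Figures 3 and 4
CRITICAL = {
    "elided material not represented": {"clausal": 0, "DP": 0},  # Figures 8 and 10
    "elided material represented": {"clausal": 1, "DP": 0},      # Figures 9 and 10
}


def delta(levels: dict) -> int:
    """Clausal minus DP boundary strength, in phi levels."""
    return levels["clausal"] - levels["DP"]


delta1 = delta(CONTROL)
for hypothesis, levels in CRITICAL.items():
    delta2 = delta(levels)
    print(f"{hypothesis}: Δ1 = {delta1}, Δ2 = {delta2}, Δ1 - Δ2 = {delta1 - delta2}")
```

On the first hypothesis Δ2 is zero and the whole of Δ1 shows up as an interaction; on the second, Δ2 is positive and falls short of Δ1 by a single φ level.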

5 Methods

5.1 Materials

The materials consisted of 20 target sentences (2 conditions × 2 coordination types × 5 sets), with (9A1&2) and (10A1&2) exemplifying a set. The two conditions were Critical and Control, and the two coordination types were clausal and DP. To elicit the intended information structure, each target sentence was shown to the subjects along with a leading context sentence and a wh-question. For example, for (10A1), the following materials were presented to the speaker. Every set of items had the same context and question.

    (11) Context: John went to a diner.
         Question: What did John eat and how did he eat it?
         Answer: The lettuce and with a fork.

The speaker was to read the context silently, and say the question and the answer in the given order. Every speaker saw all 20 items. There were 104 filler items, which all contained a context, a question and an answer.
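For readers who want to check the item counts, the 2 × 2 × 5 design can be enumerated as below. This is only a bookkeeping sketch; the variable names are hypothetical, and nothing here reflects how the stimuli were actually generated or ordered.

```python
from itertools import product

CONDITIONS = ("Control", "Critical")
COORDINATION_TYPES = ("clausal", "DP")
N_SETS = 5
N_FILLERS = 104

# One target item per combination of condition, coordination type and item set.
targets = list(product(CONDITIONS, COORDINATION_TYPES, range(1, N_SETS + 1)))
trials_per_speaker = len(targets) + N_FILLERS

print(len(targets))         # 20
print(trials_per_speaker)   # 124
```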

5.2 Participants

I conducted a production study with eighteen native speakers of North American English (fourteen female, four male, aged 19 to 50), all of whom were university students and working professionals living in Boston, USA or Oxford, UK. They were paid a small sum for their time and gave written consent to being tested.

5.3 Data collection

Recording took place in two events. The first event took place in a sound-attenuated booth at the Massachusetts Institute of Technology for three of the eighteen participants, and the second event took place in a quiet, non-reverberant room at Magdalen College, University of Oxford for the other fifteen participants. In each event, participants were seated in front of a computer, which displayed one context-question-answer trio at a time. The stimuli plus fillers were presented in pseudo-randomized order, and the order of items was different for every participant. Participants were given instructions about the task at the beginning of the experiment, which asked them to first read each trio quietly to themselves, and only proceed to read it out loud when they were ready. They could take as long as they wanted. They were asked to imagine they were playing three different roles in each trio, and to act out the dialogues naturally rather than reading the sentences mechanically. If the participants were not satisfied with their rendition of an item (a common reason was that they stumbled over some words), they were allowed to say it again. If they asked to repeat an item, we only considered the rendition they were happy with, and discarded the previous renditions.

5.4 Data processing and analysis

The recordings were aligned with the Montreal Forced Aligner (McAuliffe et al. 2017), using the pretrained English (US) ARPA acoustic model (McAuliffe & Sonderegger 2024), and durations were calculated from the force-aligned boundaries. I measured the duration of the last rime of the word immediately before the prosodic boundary (e.g., for (9) & (10), -uce of lettuce), and the pause after that word (e.g., the pause following lettuce), if there was such a pause.

I fitted two linear mixed effects models, whose dependent variables were (a) the duration of the last rime; and (b) the sum of the duration of the last rime and the duration of the pause. Both models had the same fixed effects: coordination (clausal vs. DP) and condition (Critical vs. Control). I chose the sum duration rather than the pause duration as the second measure for two reasons. First, the forced aligner was not always accurate in its alignment, and the pause it aligned often included part of the last rime. Second, because the durations of the last rime and the pause are both correlated with the strength of the boundary of interest, their sum should also be correlated. I calculated p-values using Satterthwaite’s degrees of freedom method. The models included random intercepts and slopes by speaker and item group where those effects did not result in a singular fit.
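The first reason for preferring the summed measure can be illustrated with a toy example. The sketch below assumes aligned intervals are available as (label, start, end) triples in seconds; the interval values, the pause label and the function name are all hypothetical and are not taken from the study's data. It shows that if the aligner misplaces the boundary between the rime and the pause, the rime duration changes but the sum does not.

```python
from typing import List, Optional, Tuple

# (label, start time in seconds, end time in seconds); all values below are toy numbers.
Interval = Tuple[str, float, float]


def boundary_measures(rime: List[Interval],
                      pause: Optional[Interval] = None) -> Tuple[float, float]:
    """Return (rime duration, rime + pause duration) in milliseconds."""
    rime_ms = (rime[-1][2] - rime[0][1]) * 1000.0
    pause_ms = (pause[2] - pause[1]) * 1000.0 if pause is not None else 0.0
    return rime_ms, rime_ms + pause_ms


# Hypothetical alignment of the final rime of "lettuce" (ARPAbet AH0 S) plus a pause.
rime_a = [("AH0", 1.20, 1.26), ("S", 1.26, 1.38)]
pause_a = ("<pause>", 1.38, 1.45)

# Same token, but with the rime/pause boundary placed 20 ms too early.
rime_b = [("AH0", 1.20, 1.26), ("S", 1.26, 1.36)]
pause_b = ("<pause>", 1.36, 1.45)

rime_ms_a, sum_ms_a = boundary_measures(rime_a, pause_a)
rime_ms_b, sum_ms_b = boundary_measures(rime_b, pause_b)

print(round(rime_ms_a - rime_ms_b, 3))  # rime measure shifts with the misalignment
print(round(sum_ms_a - sum_ms_b, 3))    # summed measure is unaffected
```

The summed measure depends only on the onset of the rime and the offset of the pause, which is why it is insensitive to where the internal boundary falls.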

5.5 Predictions

I expect to replicate the experimental findings by Downing (1970), Wagner (2005; 2010) and others with a significant prosodic difference between coordination types in the Control Condition Δ1, which would be realized as a longer rime and pause for clausal coordination (9A1) than for DP coordination (9A2).

The question is whether there is also a significant prosodic difference within the Critical Condition, and if so, how that difference Δ2 compares with the difference within the Control Condition Δ1. If there is no significant difference in the Critical Condition (Figure 11), then elided material may not be present in the prosodic structure.

Figure 11

Predicted data if elided material is not present in the prosodic structure.

If there is a significant prosodic difference between coordination types within the Critical Condition, where the rime and the pause are both significantly longer for clausal coordination (10A1) than for DP coordination (10A2), then we can further ask what the reason for this difference is by comparing it with the difference in the Control Condition. If the difference within the Critical Condition Δ2 is slightly smaller than that of the Control Condition Δ1 (i.e., the slope in the Critical Condition is slightly less steep than the slope in the Control Condition in Figure 12), then this suggests that elided material is present prosodically.

Figure 12

Predicted data if elided material is present in the prosodic structure.

6 Results

I begin by discussing the durational results. Within the Control Condition, the final rime before and is on average 14.0 ms longer in clausal coordination than in DP coordination (p < 0.05; Figure 13), and the sum of the final rime and the pause before and is on average 30.4 ms longer in clausal coordination than in DP coordination (p < 0.01; Figure 14). This is expected and consistent with previous findings that different syntactic structures correspond to different prosodic realizations in coordination (e.g., Wagner 2005; 2010). The error bars in the figures represent standard errors.

Figure 13

Duration of the final rime before and.

Figure 14

Duration of the final rime plus the pause before and.

Within the Critical Condition, the final rime before and is on average 20.2 ms longer in clausal coordination than in DP coordination (p < 0.01), and the sum of the final rime and the pause before and is on average 39.5 ms longer in clausal coordination than in DP coordination (p < 0.001). This suggests that the prosodic boundary following that rime is larger in clausal coordination than in DP coordination, even though on the surface the first conjunct of clausal coordination looks the same as the first conjunct of DP coordination due to ellipsis.

Furthermore, there is no interaction between coordination type and condition type, that is, the differences in rime duration and the sum of rime and pause duration within the Critical Condition are not significantly different from those within the Control Condition (i.e., no difference between the differences Δ1 and Δ2). A post-hoc power analysis indicated 99.7% power for the findings of a significant main effect of coordination type and of no significant interaction effect.
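As a reading aid, the difference between the differences can be computed directly from the average differences reported above. This is plain arithmetic on the reported values, not a re-analysis: the interaction itself was tested in the mixed models, and the dictionary and function names below are hypothetical.

```python
# Reported average clausal-minus-DP differences in milliseconds (section 6).
REPORTED_MS = {
    "final rime": {"control": 14.0, "critical": 20.2},
    "final rime + pause": {"control": 30.4, "critical": 39.5},
}


def difference_of_differences(measure: str) -> float:
    """Δ2 (Critical) minus Δ1 (Control) for one measure, rounded to 0.1 ms."""
    d = REPORTED_MS[measure]
    return round(d["critical"] - d["control"], 1)


for measure in REPORTED_MS:
    print(measure, difference_of_differences(measure))
# final rime 6.2
# final rime + pause 9.1
```

Numerically, Δ2 is somewhat larger than Δ1 on both measures, although, as reported, the interaction is not significant.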

7 Discussion

The durational results indicated that within the Critical Condition, phrases that contain elided material have larger boundaries than phrases that do not contain any elided material, even though these phrases have the same surface structure. This suggests that prosody is sensitive to structural differences between clausal coordination and DP coordination, whether or not the underlying structure contains elided material. This is expected if elided material is present in prosody. The fact that there is no significant difference between the differences Δ1 and Δ2 may be somewhat surprising because even if elided material has prosodic representation, we expect Δ2 to be slightly smaller than Δ1. I suspect this difference, which amounts to only 1 φ, was too small to be detected in this experiment. In any case, the lack of interaction is not consistent with the view that elided material has no prosodic representation, which predicts Δ2 to be roughly zero and significantly smaller than Δ1.

So far I have assumed that fragment answers are derived by syntactic movement, but not all syntactic analyses of fragment answers posit movement. For example, Griffiths (2019) claims that the remnants of ellipsis stay in-situ, and ellipsis deletes the non-remnants. This in-situ analysis of ellipsis makes the same predictions as an analysis that posits movement, except that it does not predict any difference between Δ1 and Δ2, which is fully compatible with the experimental results. The rest of this section discusses the syntactic structures posited by the in-situ analysis and their corresponding prosodic structures. The in-situ analysis of ellipsis would analyze (9A1)–(9A2), the sentences in the Control Condition, as in Figure 15 and Figure 16.

Figure 15

Syntactic structure for (9A1), if the remnant stays in-situ.

Figure 16

Syntactic structure for (9A2).

Match Theory would map these syntactic structures to the prosodic structures in Figure 17 and Figure 18. I do not represent the elided material in the second conjunct of (9A1) in Figure 17 because whether or not it is represented does not matter to the prosodic boundary following lettuce (as was explained in section 4). Later for the Critical Condition (10A1) we will get to a choice point of whether elided material should be represented in the prosodic structure, and there the elided material in the second conjunct will be treated in the same way as the elided material in the first conjunct. The topmost φ in which lettuce is the last word (whose right boundary is marked by a right parenthesis in the trees) dominates 2 φs in Figure 17 but no φ in Figure 18–leading to a stronger prosodic boundary following lettuce in (9A1) than (9A2).

Figure 17

Prosodic structure for (9A1).

Figure 18

Prosodic structure for (9A2).

An in-situ analysis of ellipsis would analyze (10A1)–(10A2), the sentences in the Critical Condition, as in Figure 19 and Figure 20.

Figure 19

Syntactic structure for (10A1), if the remnant stays in-situ.

Figure 20

Syntactic structure for (10A2), if the remnant stays in-situ.

Depending on whether or not elided material is represented in syntax and prosody, the clausal coordination (10A1) would be mapped to the prosodic structure in Figure 21 (if elided material has no syntactic or prosodic representation) or Figure 22 (if elided material has syntactic and prosodic representation).

Figure 21

Prosodic structure for (10A1), if the remnant stays in-situ, and elided material is not represented in syntax or prosody.

Figure 22

Prosodic structure for (10A1), if the remnant stays in-situ, and elided material is represented in syntax and prosody.

The prosodic structure of the lettuce and the bacon in (10A2) is in Figure 23. I have left out the elided material because whether or not the elided material is represented in prosody does not matter to the prosodic boundary following lettuce in (10A2)–the relevant phonological phrase only includes overt material anyway.

Figure 23

Prosodic structure of the lettuce and the bacon in (10A2), if the remnant stays in-situ.

The topmost φ in which lettuce is the last word dominates no φ in Figure 21, 2 φs in Figure 22, and no φ in Figure 23. This means that if elided material has no prosodic representation as in Figure 21, then we expect the prosodic boundary following lettuce to be the same in (10A1) and (10A2), contrary to the result of this experiment. If elided material has prosodic representation as in Figure 22, then we expect the prosodic boundary following lettuce to be stronger in (10A1) than in (10A2), and this prosodic difference should equal the prosodic difference between (9A1) and (9A2) in the Control Condition. The experimental results are fully compatible with this prediction, suggesting that elided material has prosodic representation. Therefore, whether or not ellipsis involves movement, the experimental results are consistent with the syntactic and prosodic representation of elided material.
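The counting step used above (find the topmost φ whose last word is lettuce, then count the φs it dominates) can be sketched as follows. The trees are my own simplified stand-ins that only reproduce the counts stated in the text (0, 2 and 0); they are not transcriptions of Figures 21–23, the bracketing of the second conjuncts is an assumption, and angle brackets are an ad hoc notation for elided words.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Phi:
    """A phonological phrase; children are words (str) or nested Phi nodes."""
    children: List[Union["Phi", str]]


def words(node) -> List[str]:
    """Flatten a node into its left-to-right sequence of words."""
    if isinstance(node, str):
        return [node]
    return [w for child in node.children for w in words(child)]


def topmost_phi_ending_in(node, word) -> Optional[Phi]:
    """Top-down search for the highest Phi whose last word is `word`."""
    if isinstance(node, str):
        return None
    if words(node)[-1] == word:
        return node
    for child in node.children:
        found = topmost_phi_ending_in(child, word)
        if found is not None:
            return found
    return None


def dominated_phis(node: Phi) -> int:
    """Count the Phi nodes properly dominated by `node`."""
    return sum(1 + dominated_phis(c) for c in node.children if isinstance(c, Phi))


# Assumed, simplified second conjuncts (they do not affect the count).
pp_conjunct = Phi(["and", Phi(["with", Phi(["a", "fork"])])])
dp_conjunct = Phi(["and", Phi(["the", "bacon"])])

# (10A1) without elided material represented (cf. Figure 21).
clausal_no_ellipsis = Phi([Phi(["the", "lettuce"]), pp_conjunct])
# (10A1) with in-situ elided material represented (cf. Figure 22).
clausal_with_ellipsis = Phi(
    [Phi(["<he>", Phi(["<ate>", Phi(["the", "lettuce"])])]), pp_conjunct]
)
# "the lettuce and the bacon" in (10A2) (cf. Figure 23).
dp_coordination = Phi([Phi(["the", "lettuce"]), dp_conjunct])

for tree in (clausal_no_ellipsis, clausal_with_ellipsis, dp_coordination):
    print(dominated_phis(topmost_phi_ending_in(tree, "lettuce")))
```

Only the second tree yields a nonzero count, which is the configuration that predicts a stronger boundary after lettuce in (10A1) than in (10A2).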

8 Theories that distinguish intonational phrases and phonological phrases

As mentioned in section 2.1, some theories on the syntax-prosody mapping claimed that syntactic clauses should correspond to ι, while subclauses correspond to φ (e.g. Selkirk 1986; 1995; 2009; 2011). Other theories claimed that a speech act or an illocutionary force projection is mapped to an ι (e.g. Güneş 2014; 2015; Ishihara 2022). Assuming that the conjuncts in clausal coordination are separate speech acts, whereas the conjuncts in DP coordination are not, these theories effectively produce the same prosodic structures–the conjuncts in clausal coordination should correspond to ιs, whereas those in DP coordination should correspond to φs. This section discusses these prosodic structures produced by these theories, and argues that they do not affect the interpretation of the experimental results, unless we make an additional unlikely assumption.

Before delving into the predictions of these theories, it is crucial to first discuss how this distinction between ι and φ may lead to gradient phonetic effects such as pre-boundary lengthening. To my knowledge, these theories do not have any explicit proposal about this. There are four logical possibilities I can think of, which I call Approaches I, II, III and IV respectively. Approach I says that the label of the prosodic phrase does not have any effect on lengthening, and the degree of lengthening is only correlated with the number of levels dominated by a node in the prosodic structure, as I proposed in section 2.2. Approach II claims that the levels of embedding matter, but on top of that, an ι label boosts the pre-boundary lengthening effects compared to a φ label. Approach II would thus predict that an ι that embeds the same number of levels as a φ should induce more lengthening. Approach III claims that the levels of embedding do not have any effect on pre-boundary lengthening, and all that matters is the label of the prosodic phrase. As an extreme example, Approach III would predict that an ι that does not embed any other prosodic phrase (e.g. a one-word fragment answer) should induce more lengthening than a φ that dominates other prosodic phrases. Approach IV, which was suggested by a reviewer, claims that the levels of embedding only matter within a prosodic category (i.e. within φ and within ι), but once we cross prosodic categories from a φ to an ι, the ι should trigger categorically greater pre-boundary lengthening effects, and these effects ‘override’ or ‘swamp’ the effects of recursive φ edges. Like Approach III, Approach IV would predict that an ι that does not embed any other ι or φ should induce more lengthening than a φ that dominates other φs. Having laid out these four approaches, I will now discuss their predictions about the target sentences.

First, the syntactic structures of the sentences in the Control Condition (9A1-2) are in Figure 24 and Figure 25, assuming that ellipsis involves movement of the remnant.

Figure 24

Syntactic structure for (9A1), if the remnant moves.

Figure 25

Syntactic structure for (9A2).

Theories that map clauses to ιs would map these sentences in the Control Condition (9A1-2) to the prosodic structures in Figure 26 and Figure 27 respectively. Because the second conjunct in these sentences does not really matter to the prosodic boundary of interest (as was explained in section 4), I have not represented the elided material in the second conjunct in Figure 26.

Figure 26

Prosodic structure for (9A1), if syntactic clauses are mapped to ιs.

Figure 27

Prosodic structure for (9A2), if syntactic clauses are mapped to ιs.

In Figure 26 lettuce is at the right edge of the ι that corresponds to the first clause; this ι dominates two φs. In Figure 27 lettuce is at the right edge of a φ which dominates no φ. Approaches I, II, III and IV would all predict more lengthening for lettuce in Figure 26 than Figure 27, but for different reasons. Approach I makes this prediction because the ι in Figure 26 embeds more levels than the φ in Figure 27. Approach II makes this prediction not only because of that, but also because lettuce is ι-final in Figure 26, but lettuce is φ-final in Figure 27, and the ι label boosts domain-final lengthening. Approaches III and IV make this prediction not because of levels of embedding, but just because lettuce is ι-final in Figure 26 and φ-final in Figure 27. Therefore, all four approaches can account for the prosodic difference within the Control Condition.
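The four approaches can be sketched as scoring functions over a phrase label and the number of levels it dominates. The theories under discussion give no numeric values, so the two weights below are arbitrary assumptions of mine, and only the ordering of the scores is meaningful.

```python
# Arbitrary illustrative weights (assumptions, not values from any theory).
IOTA_BOOST = 1.0       # Approach II: extra lengthening contributed by an ι label
IOTA_OVERRIDE = 100.0  # Approach IV: ι effect large enough to swamp φ recursion


def approach_1(label: str, levels: int) -> float:
    """Only the number of dominated levels matters."""
    return float(levels)


def approach_2(label: str, levels: int) -> float:
    """Levels matter, and an ι label adds a boost on top."""
    return levels + (IOTA_BOOST if label == "ι" else 0.0)


def approach_3(label: str, levels: int) -> float:
    """Only the label matters."""
    return 1.0 if label == "ι" else 0.0


def approach_4(label: str, levels: int) -> float:
    """Levels matter within a category, but any ι outranks any φ."""
    return levels + (IOTA_OVERRIDE if label == "ι" else 0.0)


APPROACHES = [approach_1, approach_2, approach_3, approach_4]

# Control Condition, as described in the text: lettuce is final in an ι
# dominating two φs (Figure 26) versus a φ dominating no φ (Figure 27).
fig26, fig27 = ("ι", 2), ("φ", 0)
control = [f(*fig26) > f(*fig27) for f in APPROACHES]
print(control)  # every approach predicts more lengthening in Figure 26

# Extreme case: a bare ι versus a φ that dominates two φs.
# Approaches III and IV favour the ι; Approach I favours the φ; the outcome
# for Approach II depends on the arbitrary size of IOTA_BOOST.
bare_iota, deep_phi = ("ι", 0), ("φ", 2)
extreme = [f(*bare_iota) > f(*deep_phi) for f in APPROACHES]
print(extreme)
```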

These four approaches can also account for the prosodic difference within the Critical Condition, but Approaches I and II make the same predictions as an approach that does not distinguish between ιs and φs, while Approaches III and IV manage to predict the experimental results regardless of how elided material is represented. First, the syntactic structures of the sentences in the Critical Condition are in Figure 28 and Figure 29.

Figure 28

Syntactic structure for (10A1), if the remnant moves and elided material has syntactic representation.

Figure 29

Syntactic structure for (10A2), if the remnant moves and elided material has syntactic representation.

Depending on whether elided material is represented in syntax and prosody or not, all four approaches would map clausal coordination (10A1) to the prosodic structure in Figure 30 (if elided material has no syntactic or prosodic representation) or Figure 31 (if elided material has syntactic and prosodic representation). It is worth noting that if the syntax-prosody mapping principles ignore elided material, then they will view the first conjunct in (10A1) as both a non-branching CP and a DP simultaneously, leading to the crucial question of whether it should be mapped to an ι or a φ. Elfner (2015) ran into a parallel issue in Irish with a constituent that is simultaneously an XP and an X0, and claimed that such constituents are always treated like minimal projections and mapped to an ω. If we follow the same principle and map the first conjunct in (10A1) to a φ, then, since the first conjunct in (10A2) also corresponds to a φ (as we will see shortly), none of the four approaches will be able to predict a significant difference within the Critical Condition, and we will be led to the same conclusion as an approach that does not posit ι–that is, elided material does have syntactic and prosodic representation. To give the four approaches that posit ι the best chance, I will thus assume that a constituent that is simultaneously a clause and a subclause is mapped to an ι.

Figure 30

Prosodic structure for (10A1), if syntactic clauses are mapped to ιs, and elided material is not represented in syntax or prosody.

Figure 31

Prosodic structure for (10A1), if syntactic clauses are mapped to ιs, and elided material is represented in syntax and prosody.

The four approaches would map DP coordination (10A2) to the prosodic structure in Figure 32.

Figure 32

Prosodic structure for (10A2), if syntactic clauses are mapped to ιs.

I now discuss the predictions of the four approaches. Approach I makes essentially the same claim as the auxiliary assumption laid out in section 2.2, namely that lengthening effects are correlated with the levels of embedding but not with the label of the prosodic phrase. It therefore makes the same predictions as those discussed in section 5.5. Thus, according to Approach I, the experimental results suggest that elided material has prosodic representation.

Approach II also makes the same predictions as those discussed in section 5.5: it predicts a difference within the Critical Condition. If elided material is not represented syntactically or prosodically, Approach II predicts that this difference within the Critical Condition is much smaller than the difference in the Control Condition; if elided material is represented in prosody, Approach II predicts a slightly smaller difference within the Critical Condition than within the Control Condition. Thus, the lack of interaction in the experimental results is consistent with prosodic representation of ellipsis. I now explain in detail why Approach II makes these predictions. It predicts a difference within the Critical Condition, specifically more lengthening for lettuce in Figure 30 or Figure 31 than in Figure 32, because lettuce is φ-final in Figure 32 but ι-final in Figure 30 and Figure 31. If elided material is not represented syntactically or prosodically as in Figure 30, then because the ι in Figure 30 dominates the same number of levels as the φ in Figure 32, the prosodic difference within the Critical Condition is only due to the difference in labels between an ι and a φ. This should be smaller than the prosodic difference within the Control Condition, which is due to both a difference in labels and a difference in embeddings. Thus, Approach II would predict a smaller prosodic difference within the Critical Condition than within the Control Condition (i.e. a significant interaction) if elided material has no prosodic representation. If elided material has prosodic representation as in Figure 31, then the difference within the Critical Condition is due to both the labels and the embeddings, as is the difference within the Control Condition, and Approach II thus predicts only a slightly smaller difference within the Critical Condition than within the Control Condition.

Finally, Approaches III and IV predict a difference within the Critical Condition and no interaction (i.e. no difference between that difference and the difference in the Control Condition), regardless of whether elided material has prosodic representation. In Figure 30 and Figure 31 lettuce is ι-final, and is thus predicted to be lengthened more than lettuce in Figure 32. This difference within the Critical Condition should equal the difference within the Control Condition, which according to Approaches III and IV is also due to a difference between an ι and a φ. Therefore, Approaches I and II do not change our conclusion that elided material is present in syntax and prosody, but Approaches III and IV make it inconclusive.

There is a lot of empirical evidence against Approach III. It has been shown that embeddings do lead to gradient phonetic effects such as pre-boundary lengthening (e.g. Ladd 1986; 1988; Kubozono 1989; Wagner 2005; 2010). For example, Wagner (2005) showed that coordination of three names Lysander and Demetrius and Hermia can involve prosodic boundaries with different strengths between the conjuncts. Under the interpretation that Lysander and Demetrius form a group that is then conjoined with Hermia, the prosodic boundary between Lysander and and is weaker than the boundary between Demetrius and and. This challenges Approach III because coordination of three DPs should be mapped to a series of φs, but not involve any ι, and thus should not differ in boundary strength according to Approach III.

There is also evidence from perception studies that is inconsistent with Approaches III and IV: it has been shown that listeners do not interpret gradient cues as a reflection of specific categories. They can detect relative differences in the strength of boundaries, but cannot categorize boundaries reliably (e.g. Price et al. 1991; de Pijper & Sanderman 1994; Krivokapić 2023). In a perception study conducted by Krivokapić, participants listened to recordings of four types of English sentences that involve a within-ω, ω, φ and ι boundary respectively, as specified in the Autosegmental-Metrical Model. They were asked to distinguish among the prosodic boundaries in these sentences. A Gaussian mixture model analysis examined the number of clusters that best fit the observed data without a predetermined number of clusters, and found that three clusters gave the best fit, suggesting there were only three distinguishable categories, and the distinction between φ and ι was collapsed. This challenges Approaches III and IV, which predict a clear distinction in boundary strength between φ and ι. De Pijper & Sanderman showed that in Dutch, listeners have clear percepts of relative boundary strength, but the boundary strengths as perceived by listeners do not cluster around a limited number of target values. This also challenges Approaches III and IV, which predict that listeners distinguish exactly two levels of boundary strength above the word level, contrary to the many relative levels that listeners distinguished in that study.

A reviewer suggested an experimental design to test Approach IV directly, which I present here with some modifications. We can compare a clause with few embeddings with a subclause with many embeddings. For example, hibernated in (12a) is at the right edge of a clause that dominates a VP, which should be mapped to a right ι edge that dominates a φ edge in (13a). In contrast, lettuce in (12b) is at the right edge of a DP that dominates two XPs, which should be mapped to three right φ edges in (13b). Approach IV would predict more pre-boundary lengthening on hibernated than on lettuce, while the approaches that consider syntactic embeddings would not necessarily make this prediction. I leave the execution of such an experiment to future research.

    1. (12)
    1. a.
    1. [TP Amelia [VP hibernated]] and Bill vegetated.
    1.  
    1. b.
    1. John saw [DP the girl [PP with [DP the lettuce]]] and the boy with the pears.
    1. (13)
    1. a.
    1. Amelia hibernated φ)ι) and Bill vegetated.
    1.  
    1. b.
    1. John saw the girl with the lettuce φ)φ)φ) and the boy with the pears.
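The step from the bracketings in (12) to the right edges in (13) can be sketched as a small bracket reader. It assumes, for illustration only, that TP and CP are the clause labels mapped to ι and that every other labelled phrase is mapped to φ; the function name and the set of clause labels are mine.

```python
import re

# Assumption for this sketch: these labels count as clauses (mapped to ι);
# every other labelled bracket is treated as a subclausal XP (mapped to φ).
CLAUSE_LABELS = {"TP", "CP"}


def right_edges(bracketing: str) -> str:
    """Rewrite a labelled bracketing as words plus right prosodic edges."""
    # Tokens are an opening bracket with its label, a closing bracket, or a word.
    tokens = re.findall(r"\[\w+|\]|[^\s\[\]]+", bracketing)
    stack = []          # labels of the phrases that are currently open
    pieces = []         # output chunks, joined by spaces at the end
    last_was_edge = False
    for tok in tokens:
        if tok.startswith("["):
            stack.append(tok[1:])
        elif tok == "]":
            label = stack.pop()
            edge = "ι)" if label in CLAUSE_LABELS else "φ)"
            if last_was_edge:
                pieces[-1] += edge   # stack consecutive edges, innermost first
            else:
                pieces.append(edge)
            last_was_edge = True
        else:
            pieces.append(tok)
            last_was_edge = False
    return " ".join(pieces)


print(right_edges("[TP Amelia [VP hibernated]]"))
print(right_edges("[DP the girl [PP with [DP the lettuce]]]"))
```

Applied to the bracketed portions of (12a) and (12b), the sketch reproduces the edge sequences shown after hibernated and lettuce in (13).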

To summarize, Approaches I and II, which assume that the levels of embedding can affect the boundary strengths of φs and ιs, make the same predictions as those discussed in section 5.5, and therefore lead to the same conclusions. Approach III, which assumes that only the prosodic label can affect the boundary strengths, faces substantial empirical challenges. Approach IV, a moderated version of Approach III, is not completely consistent with the empirical findings either, but following a reviewer’s suggestion I have outlined an experimental design to test it directly.

9 Auxiliary measure

Besides the durational measures examined in the experiment, I had an auxiliary measure–the vowel quality of the vowel in and. This measure does not bear on the boundary of interest (i.e. the right boundary following lettuce), but rather on the left boundary before and:

    1. (14)
    1. He ate the lettuce) (and with a fork.

During the recordings, I observed variable pronunciation of and, especially in its vowel: it was sometimes pronounced as [ænd] with a full vowel, sometimes as [ənd] with a schwa, and occasionally fully reduced as [n]. I also observed a tendency: the vowel of and seemed more likely to be reduced when and is in a weaker prosodic domain. Because and is usually phrased with the following material, I assume the vowel reduction in and indicates the strength of that phrase–the one that includes the second conjunct.

The second conjunct is not a great place to test the research question because the second conjunct always involves ellipsis in clausal coordination, regardless of condition; and the second conjunct does not involve ellipsis in DP coordination, regardless of condition:

    1. (15)
    1. a.
    1. Control Condition; Clausal Conjuncts
    1.  
    1.  
    1. Q1:
    1. What did John eat and how did he eat it?
    1.  
    1.  
    1. A1:
    1. [He ate the lettuce] and [he ate it with a fork].
    1.  
    1. b.
    1. Control Condition; DP Conjuncts
    1.  
    1.  
    1. Q2:
    1. Which vegetable and which meat did John eat?
    1.  
    1.  
    1. A2:
    1. He ate [the lettuce] and [the bacon].
    1. (16)
    1. a.
    1. Critical Condition; Clausal Conjuncts
    1.  
    1.  
    1. Q1:
    1. What did John eat and how did he eat it?
    1.  
    1.  
    1. A1:
    1. [He ate the lettuce] and [he ate it with a fork].
    1.  
    1. b.
    1. Critical Condition; DP Conjuncts
    1.  
    1.  
    1. Q2:
    1. Which vegetable and which meat did John eat?
    1.  
    1.  
    1. A2:
    1. He ate [the lettuce] and [the bacon].

If elided material is represented syntactically and prosodically, then we expect the left boundary preceding and to be stronger in clausal coordination than in DP coordination because the second conjunct of clausal coordination has more levels of embedding than that of DP coordination. If elided material has no syntactic or prosodic representation, then we may expect that left boundary to be roughly the same for clausal and DP coordinations. But because the second conjunct in clausal coordination is different from that in DP coordination in other ways (e.g. it is a PP in clausal coordination but a DP in DP coordination; it has three words in clausal coordination but two in DP coordination), even if we do find a prosodic difference between clausal and DP coordinations, we cannot attribute it to ellipsis definitively, unless we compare that prosodic difference with a difference between structures without ellipsis, as we did with the first conjunct in the experiment.

But I still report the results about that left boundary before and based on the vowel quality for two reasons. First, the results are at least consistent with the hypothesis that elided material has prosodic representation. Second, the results suggest that vowel reduction in and can be a measure of boundary strength in coordination.

I used the vowel reduction in and as the auxiliary measure, and used the segmental transcriptions by the Montreal Forced Aligner. Figure 33 shows the distribution of the vowel in and.

Figure 33

Proportions of utterances that had and reduction and those that did not.

I ran a binomial generalized linear mixed effects model on the utterances that produced the vowel in and. The dependent variable of this model was binary–whether the vowel in and is [æ] or [ə], and the fixed effects were coordination and condition. The model included random intercepts and slopes by speaker and item group where those effects did not result in a singular fit.
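The construction of the binary dependent variable can be sketched as follows: tokens of and that were fully reduced to [n] have no vowel and are left out, and each remaining token is coded as reduced ([ə]) or not ([æ]). The IPA strings, the tuple layout and the function name are assumptions made for this illustration (the aligner's own label set may differ), and the tokens are invented.

```python
from collections import defaultdict


def schwa_proportions(tokens):
    """Proportion of [ə] among vowel-bearing tokens of "and", per design cell.

    `tokens` is an iterable of (coordination, condition, pronunciation).
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [n_schwa, n_with_vowel]
    for coordination, condition, pronunciation in tokens:
        if pronunciation == "n":
            continue  # fully reduced: no vowel to classify, so excluded
        cell = (condition, coordination)
        counts[cell][1] += 1
        if pronunciation.startswith("ə"):
            counts[cell][0] += 1
    return {cell: schwa / total for cell, (schwa, total) in counts.items()}


# Invented tokens, for illustration only.
toy_tokens = [
    ("dp", "control", "ənd"),
    ("dp", "control", "ænd"),
    ("dp", "control", "n"),
    ("clausal", "control", "ænd"),
    ("clausal", "control", "ænd"),
]
print(schwa_proportions(toy_tokens))
```

These per-cell proportions are descriptive only; the mixed effects model additionally accounts for speaker and item variability.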

Within the Control Condition, the vowel in and is 47.4% more likely to be reduced to [ə] in DP coordination than in clausal coordination (p < 0.05; Figure 34). Within the Critical Condition, the vowel in and is 54.0% more likely to be reduced to [ə] in DP coordination than in clausal coordination (p < 0.001; Figure 34). The interaction between coordination type and condition type is not significant–the differences in likelihood of vowel reduction within the Critical Condition are not significantly different from those within the Control Condition (i.e., no difference between the differences Δ1 and Δ2).

Figure 34

Probability that the vowel in and is reduced to [ə].

The results based on vowel reduction suggested that when the second conjunct contains elided material, it induces a larger prosodic boundary before and. This is consistent with the hypothesis that elided material is represented in syntax and prosody. Furthermore, the results suggest that the vowel of and is more likely to be reduced when it starts a weaker prosodic domain.

10 Conclusion

This paper has argued with experimental results that elided material affects prosody, despite being silent. Elided material may be mapped onto the prosodic structure, and surrounded by prosodic boundaries just like pronounced material.

Following a derivational view of the syntax-prosody mapping, the findings suggest that elided material must be present in syntactic structure to begin with. Phonological deletion of this material takes place after the creation of prosodic boundaries, so that at the point of prosodification, elided material is still present.

If previous findings were correct that other silent material does not have prosodic representation (e.g., Nespor & Vogel 1986; Chen 1987; Lin 1994; Truckenbrodt 1999; Elfner 2012; 2015; Güneş 2015), then my results here suggest a dichotomy of silence, with elided material having prosodic representation on the one hand, and null heads and their projections (and perhaps traces) not having prosodic representation on the other.

My findings are compatible with the following order of operations: Vocabulary Insertion precedes creation of prosodic structure (as claimed by theories in the Distributed Morphology tradition, e.g. Halle & Marantz 1993; Embick & Noyer 2007), which then precedes deletion of elided material, so that prosody knows which heads are lexically silent and should be ignored, and at the point of creation of prosodic structure, elided material has not been fully deleted yet. Recent developments in the syntax-prosody mapping have argued for a two-step derivation process: the morphosyntactic output (MSO) is spelled out to the phonological input (PI) following the principles of Vocabulary Insertion and Match Theory; the PI is then subject to further addition of phonologically predictable properties such as prosodic prominence, and reorganization induced by phonological constraints such as size constraints, leading to the phonological output (PO) (Kratzer & Selkirk 2020; Elordieta & Selkirk 2022; Lee & Selkirk 2022). A reviewer suggested that under this elaborated view of the syntax-prosody interface, perhaps elided material gets deleted at the PI-PO interface, and this deletion process follows Vocabulary Insertion and creation of the prosodic structure, which occur at the MSO-PI interface. Since this elaborated view of the syntax-prosody interface puts prominence assignment in the PI-PO interface, I speculate that phonological deletion of elided material occurs at the PI-PO interface because ellipsis is ultimately a type of prominence assignment–elided material has the most extreme form of prominence deprivation (Tancredi 1992), to the extent that it cannot have any prominence, even on the syllable-level, leading to its phonological “deletion”. This is what sets elided material apart from phonologically null heads and their projections: the former is the result of prominence assignment, while the latter is the result of Vocabulary Insertion.

This study also demonstrates the value of experimental methods in understanding theories of ellipsis. Theories make concrete testable predictions about prosody, which are borne out by subtle effects in prosodic boundaries. These boundary effects are so subtle that they may not be detectable impressionistically, but only by careful phonetic measures such as durations. Furthermore, one of the key effects is an interaction term–a durational difference between differences–which can only be tested by measurements in lab settings and statistical analyses.

Data availability

The test items for the experiment, including the context, question and answer, are available at: https://doi.org/10.16995/glossa.11169.s1.

Notes

  1. Technically, (1) and (2) are not a strict minimal pair because they have different numbers of syllables. Neither are (3B) and (2) a minimal pair, so we cannot really compare them directly. These examples are used only to introduce and illustrate the research question. The experiment to be presented later controls for factors like this, and compares strict minimal pairs.
  2. There have been many different syntactic analyses of coordination, such as complementation, where the second conjunct is the complement of the coordination head (e.g. Munn 1987; Collins 1988; Zoerner 1995; Johannessen 1998; Kayne 1998; de Vries 2005); or adjunction, where the second conjunct is an adjunct to the first (e.g. Munn 1992; 1993; Bošković and Franks 2000; Hartmann 2000; Zhang 2010); or mutual adjunction (e.g. Cormack and Smith 2005; Philip 2012; Al Khalaf 2015; Wu 2022; Neeleman et al. 2023). All these syntactic analyses make the same prediction for our purposes because they would all assign the following constituency to coordination: [Conjunct1 [Conj0 Conjunct2]], where the first conjunct crucially is its own XP. For concreteness, this paper follows the complementation approach where the second conjunct is the complement of the Conj0.
  3. Match Theory was actually based on Optimality Theory, a framework in which mapping principles may not be followed if there are higher-ranked constraints. Based on empirical observations, I assume that at least in English coordination, the mapping principles are ranked highly enough that they are always followed, and thus skip the OT ranking exercises for simplicity.

Ethics and consent

The study was approved by the Committee on the Use of Humans as Experimental Subjects of the Massachusetts Institute of Technology and the Central University Research Ethics Committee (CUREC) of the University of Oxford. The study was performed in accordance with the ethical standards as laid down in the Federal Regulations for the Protection of Human Subjects (45 CFR 46) and its later amendments.

Acknowledgements

This paper arises from experimental research funded by the John Fell Oxford University Press (OUP) Research Fund. I would like to thank Athulya Aravind, Edward Flemming, Danny Fox, David Pesetsky, Michael Wagner, participants of the ECBAE Workshop, McGill Syntax & Semantics Reading Group, NELS 49, Stanford Syntax and Morphology Circle, UCSC S-Circle, and ‘You’re on mute’ ellipsis seminar series, as well as the anonymous reviewers, for helpful comments. Thanks are also due to all the participants of the production study. All errors are my own.

Competing Interests

The author has no competing interests to declare.

References

Al Khalaf, Eman. 2015. Coordination and linear order. University of Delaware dissertation. Retrieved from http://udspace.udel.edu/handle/19716/17494

Beckman, Mary E. & Pierrehumbert, Janet B. 1986. Intonational structure in Japanese and English. Phonology Yearbook 3. 255–309. DOI:  http://doi.org/10.1017/S095267570000066X

Bellik, Jennifer & Ito, Junko & Kalivoda, Nick & Mester, Armin. 2022. Matching and alignment. In Kubozono, Haruo & Ito, Junko & Mester, Armin (eds.), Prosody and prosodic interfaces, 1st ed., 457–480. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/oso/9780198869740.003.0015

Bennett, Ryan & Elfner, Emily & McCloskey, James. 2016. Lightest to the right: An apparently anomalous displacement in Irish. Linguistic Inquiry 47(2). 169–234. DOI:  http://doi.org/10.1162/LING_a_00209

Bošković, Željko & Franks, Steven. 2000. Across-the-board movement and LF. Syntax 3(2). 107–128. DOI:  http://doi.org/10.1111/1467-9612.00027

Chen, Matthew Y. 1987. The syntax of Xiamen tone sandhi. Phonology Yearbook 4(1). 109–149. DOI:  http://doi.org/10.1017/S0952675700000798

Cheng, Lisa Lai-Shen & Downing, Laura J. 2021. Recursion and the definition of universal prosodic categories. Languages 6(3). 125. DOI:  http://doi.org/10.3390/languages6030125

Chung, Sandra & Ladusaw, William A. & McCloskey, James. 1995. Sluicing and logical form. Natural Language Semantics 3(3). 239–282. DOI:  http://doi.org/10.1007/BF01248819

Collins, Chris. 1988. Conjunction adverbs. MIT.

Cormack, Annabel & Smith, Neil. 2005. What is coordination? Lingua 115(4). 395–418. DOI:  http://doi.org/10.1016/j.lingua.2003.09.008

de Pijper, Jan Roelof & Sanderman, Angelien A. 1994. On the perceptual strength of prosodic boundaries and its relation to suprasegmental cues. The Journal of the Acoustical Society of America 96(4). 2037–2047. DOI:  http://doi.org/10.1121/1.410145

de Vries, Mark. 2005. Coordination and syntactic hierarchy. Studia Linguistica 59(1). 83–105. DOI:  http://doi.org/10.1111/j.1467-9582.2005.00121.x

Downing, Bruce T. 1970. Syntactic structure and phonological phrasing in English. The University of Texas at Austin dissertation.

Elfner, Emily. 2012. Syntax-prosody interactions in Irish. University of Massachusetts, Amherst dissertation.

Elfner, Emily. 2015. Recursion in prosodic phrasing: evidence from Connemara Irish. Natural Language & Linguistic Theory 33(4). 1169–1208. DOI:  http://doi.org/10.1007/s11049-014-9281-5

Elordieta, Gorka. 2015. Recursive phonological phrasing in Basque. Phonology 32(1). 49–78. DOI:  http://doi.org/10.1017/S0952675715000044

Elordieta, Gorka & Selkirk, Elisabeth. 2022. Unaccentedness and the formation of prosodic structure in Lekeitio Basque. In Kubozono, Haruo & Ito, Junko & Mester, Armin (eds.), Prosody and prosodic interfaces, 1st ed., 374–419. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/oso/9780198869740.003.0013

Embick, David & Noyer, Rolf. 2007. Distributed Morphology and the syntax—morphology interface. In Ramchand, Gillian & Reiss, Charles (eds.), The Oxford handbook of linguistic interfaces, 289–324. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/oxfordhb/9780199247455.013.0010

Féry, Caroline & Truckenbrodt, Hubert. 2005. Sisterhood and tonal scaling. Studia Linguistica 59(2–3). 223–243. DOI:  http://doi.org/10.1111/j.1467-9582.2005.00127.x

Gee, James Paul & Grosjean, François. 1983. Performance structures: A psycholinguistic and linguistic appraisal. Cognitive Psychology 15(4). 411–458. DOI:  http://doi.org/10.1016/0010-0285(83)90014-2

Ginzburg, Jonathan & Sag, Ivan A. 2001. Interrogative investigations: The form, meaning, and use of English interrogatives. Stanford, CA: CSLI Publications.

Griffiths, James. 2019. A Q-based approach to clausal ellipsis: Deriving the preposition stranding and island sensitivity generalisations without movement. Glossa: A Journal of General Linguistics 4(1). DOI:  http://doi.org/10.5334/gjgl.653

Groenendijk, Jeroen & Stokhof, Martin. 1984. Studies on the semantics of questions and the pragmatics of answers. University of Amsterdam dissertation.

Güneş, Güliz. 2014. Constraints on syntax-prosody correspondence: The case of clausal and subclausal parentheticals in Turkish. Lingua 150. 278–314. DOI:  http://doi.org/10.1016/j.lingua.2014.07.021

Güneş, Güliz. 2015. Deriving prosodic structures. University of Groningen dissertation.

Halle, Morris & Marantz, Alec. 1993. Distributed morphology and the pieces of inflection. In Hale, Kenneth & Keyser, S. Jay (eds.), The view from building 20, 111–176. Cambridge, MA: MIT Press.

Hamlaoui, Fatima & Szendrői, Kriszta. 2015. A flexible approach to the mapping of intonational phrases. Phonology 32(1). 79–110. DOI:  http://doi.org/10.1017/S0952675715000056

Hamlaoui, Fatima & Szendrői, Kriszta. 2017. The syntax-phonology mapping of intonational phrases in complex sentences: A flexible approach. Glossa: A Journal of General Linguistics 2(1). DOI:  http://doi.org/10.5334/gjgl.215

Hartmann, Katharina. 2000. Right node raising and gapping: Interface conditions on prosodic deletion. Philadelphia, PA: John Benjamins. DOI:  http://doi.org/10.1075/z.106

Ishihara, Shinichiro. 2022. On the (lack of) correspondence between syntactic clauses and intonational phrases. In Kubozono, Haruo & Ito, Junko & Mester, Armin (eds.), Prosody and prosodic interfaces, 1st ed., 420–456. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/oso/9780198869740.003.0014

Ishihara, Shinichiro & Myrberg, Sara. 2023. Match Theory and the Asymmetry Problem: An example from Stockholm Swedish. Languages 8(2). 102. DOI:  http://doi.org/10.3390/languages8020102

Ito, Junko & Mester, Armin. 2007. Prosodic adjunction in Japanese compounds. MIT Working Papers in Linguistics 55. 97–111.

Ito, Junko & Mester, Armin. 2010. The onset of the prosodic word. In Parker, Steve (ed.), Phonological argumentation: Essays on evidence and motivation, 227–260. London: Equinox. DOI:  http://doi.org/10.3138/9781845532215.009

Ito, Junko & Mester, Armin. 2012. Recursive prosodic phrasing in Japanese. In Borowsky, Toni & Kawahara, Shigeto & Shinya, Takahito & Sugahara, Mariko (eds.), Prosody matters: Essays in honor of Elisabeth Selkirk, 280–303. London: Equinox. DOI:  http://doi.org/10.3138/9781845536770.009

Ito, Junko & Mester, Armin. 2013. Prosodic subcategories in Japanese. Lingua 124. 20–40. DOI:  http://doi.org/10.1016/j.lingua.2012.08.016

Ito, Junko & Mester, Armin. 2015. The perfect prosodic word in Danish. Nordic Journal of Linguistics 38(1). 5–36. DOI:  http://doi.org/10.1017/S0332586515000049

Jabeen, Farhat. 2022. Word order, intonation, and prosodic phrasing: Individual differences in the production and identification of narrow and wide focus in Urdu. Languages 7(2). 103. DOI:  http://doi.org/10.3390/languages7020103

Jacobson, Pauline. 2016. The short answer: Implications for direct compositionality (and vice versa). Language 92(2). 331–375. DOI:  http://doi.org/10.1353/lan.2016.0038

Johannessen, Janne Bondi. 1998. Coordination, reprint. New York and Oxford: Oxford University Press.

Johnson, Kyle. 2001. What VP ellipsis can do, and what it can’t, but not why. In Baltin, Mark & Collins, Chris (eds.), The handbook of contemporary syntactic theory, 1st ed., 439–479. Wiley. DOI:  http://doi.org/10.1002/9780470756416.ch14

Jun, Sun-Ah. 2000. K-ToBI (Korean ToBI) labelling conventions, version 3. Speech Sciences 7. DOI:  http://doi.org/10.21437/ICSLP.2000-515

Jun, Sun-Ah. 2003. The effect of phrase length and speech rate on prosodic phrasing. In Solé, Maria-Josep & Recasens, Daniel & Romero, Joaquin (eds.), Proceedings of the 15th International Congress of Phonetic Sciences.

Kayne, Richard S. 1998. The antisymmetry of syntax. Cambridge, MA: MIT Press.

Kratzer, Angelika & Selkirk, Elisabeth. 2020. Deconstructing information structure. Glossa: A Journal of General Linguistics 5(1). DOI:  http://doi.org/10.5334/gjgl.968

Krivokapić, Jelena. 2023. The prosodic hierarchy: Structure and performance. Keynote talk presented at the 30th Manchester Phonology Meeting (30mfm), Manchester, UK.

Kubozono, Haruo. 1989. Syntactic and rhythmic effects on downstep in Japanese. Phonology 6(1). 39–67. DOI:  http://doi.org/10.1017/S0952675700000944

Kubozono, Haruo. 1992. Modeling syntactic effects on downstep in Japanese. In Docherty, Gerard J. & Ladd, D. Robert (eds.), Gesture, segment, prosody, 368–397. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511519918.016

Kügler, Frank. 2022. Phrase-level ATR vowel harmony in Anum—A case of recursive prosodic phrasing. Languages 7(4). 308. DOI:  http://doi.org/10.3390/languages7040308

Ladd, D. Robert. 1986. Intonational phrasing: The case for recursive prosodic structure. Phonology Yearbook 3. 311–340. DOI:  http://doi.org/10.1017/S0952675700000671

Ladd, D. Robert. 1988. Declination “reset” and the hierarchical organization of utterances. The Journal of the Acoustical Society of America 84(2). 530–544. DOI:  http://doi.org/10.1121/1.396830

Landau, Idan. 2023. Argument ellipsis as external merge after transfer. Natural Language & Linguistic Theory 41(2). 793–845. DOI:  http://doi.org/10.1007/s11049-022-09552-3

Lee, Seunghun J. & Riedel, Kristina. 2023. Recursivity and focus in the prosody of Xitsonga DPs. Languages 8(2). 150. DOI:  http://doi.org/10.3390/languages8020150

Lee, Seunghun J. & Selkirk, Elisabeth. 2022. Xitsonga tone: The syntax–phonology interface. In Kubozono, Haruo & Ito, Junko & Mester, Armin (eds.), Prosody and prosodic interfaces, 1st ed., 337–373. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/oso/9780198869740.003.0012

Lin, Jo-wang. 1994. Lexical government and tone group formation in Xiamen Chinese. Phonology 11(2). 237–275. DOI:  http://doi.org/10.1017/S0952675700001962

McAuliffe, Michael & Socolof, Michaela & Mihuc, Sarah & Wagner, Michael & Sonderegger, Morgan. 2017. Montreal forced aligner: Trainable text-speech alignment using Kaldi. In Interspeech 2017, 498–502. ISCA. DOI:  http://doi.org/10.21437/Interspeech.2017-1386

McAuliffe, Michael & Sonderegger, Morgan. 2024. English (US) ARPA acoustic model (Version 3.0.0). Retrieved from https://mfa-models.readthedocs.io/acoustic/English/English(US)ARPAacousticmodelv3_0_0.html

Merchant, Jason. 2001. The syntax of silence: Sluicing, islands, and the theory of ellipsis. New York and Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/oso/9780199243730.001.0001

Merchant, Jason. 2005. Fragments and ellipsis. Linguistics and Philosophy 27(6). 661–738. DOI:  http://doi.org/10.1007/s10988-005-7378-3

Munn, Alan. 1987. Coordinate structure and X-bar theory. McGill Working Papers in Linguistics 4(1). 121–140.

Munn, Alan. 1992. A null operator analysis of ATB gaps. The Linguistic Review 9(1). 1–26. DOI:  http://doi.org/10.1515/tlir.1992.9.1.1

Munn, Alan. 1993. Topics in the syntax and semantics of coordinate structures. University of Maryland, College Park dissertation.

Neeleman, Ad & Philip, Joy & Tanaka, Misako & van de Koot, Hans. 2023. Subordination and binary branching. Syntax 26(1). 41–84. DOI:  http://doi.org/10.1111/synt.12244

Nespor, Marina & Vogel, Irene. 1986. Prosodic phonology. Dordrecht: Foris.

Philip, Joy. 2012. Subordinating and coordinating linkers. University College London dissertation.

Pierrehumbert, Janet Breckenridge & Beckman, Mary E. 1988. Japanese tone structure. Cambridge, MA: MIT Press.

Price, Patti J. & Ostendorf, Mari & Shattuck-Hufnagel, Stefanie & Fong, Cynthia. 1991. The use of prosody in syntactic disambiguation. The Journal of the Acoustical Society of America 90(6). 2956–2970. DOI:  http://doi.org/10.1121/1.401770

Selkirk, Elisabeth. 1981. On prosodic structure and its relation to syntactic structure. In Fretheim, Thorstein (ed.), Nordic prosody, 111–140.

Selkirk, Elisabeth. 1984. Phonology and syntax: The relation between sound and structure. Cambridge, MA: MIT Press.

Selkirk, Elisabeth. 1986. On derived domains in sentence phonology. Phonology Yearbook 3. 371–405. DOI:  http://doi.org/10.1017/S0952675700000695

Selkirk, Elisabeth. 1995. The prosodic structure of function words. In Beckman, Jill & Dickey, Laura Walsh & Urbanczyk, Suzanne (eds.), Papers in Optimality Theory, 439–470. Amherst, MA: GLSA Publications.

Selkirk, Elisabeth. 2000. The interaction of constraints on prosodic phrasing. In Horne, Merle (ed.), Prosody: Theory and experiment (Text, Speech and Language Technology 14), 231–261. Dordrecht: Springer Netherlands. DOI:  http://doi.org/10.1007/978-94-015-9413-4_9

Selkirk, Elisabeth. 2009. On clause and intonational phrase in Japanese: The syntactic grounding of prosodic constituent structure. Gengo Kenkyuu 136. 35–73.

Selkirk, Elisabeth. 2011. The syntax-phonology interface. In Goldsmith, John & Riggle, Jason & Yu, Alan C. L. (eds.), The handbook of phonological theory, 1st ed., 435–484. Wiley. DOI:  http://doi.org/10.1002/9781444343069.ch14

Selkirk, Elisabeth & Lee, Seunghun J. 2015. Constituency in sentence phonology: An introduction. Phonology 32(1). 1–18. DOI:  http://doi.org/10.1017/S0952675715000020

Tancredi, Christopher. 1992. Deletion, deaccenting and presupposition. Massachusetts Institute of Technology dissertation.

Truckenbrodt, Hubert. 1995. Phonological phrases: Their relation to syntax, focus, and prominence. Massachusetts Institute of Technology dissertation.

Truckenbrodt, Hubert. 1999. On the relation between syntactic phrases and phonological phrases. Linguistic Inquiry 30(2). 219–255. DOI:  http://doi.org/10.1162/002438999554048

van Craenenbroeck, Jeroen. 2009. The syntax of ellipsis: Evidence from Dutch dialects, 1st ed. New York: Oxford University Press. DOI:  http://doi.org/10.1093/acprof:oso/9780195375640.001.0001

Wagner, Michael. 2005. Prosody and recursion. Massachusetts Institute of Technology dissertation.

Wagner, Michael. 2010. Prosody and recursion in coordinate structures and beyond. Natural Language & Linguistic Theory 28(1). 183–237. DOI:  http://doi.org/10.1007/s11049-009-9086-0

Watson, Duane & Gibson, Edward. 2004. The relationship between intonational phrasing and syntactic structure in language production. Language and Cognitive Processes 19(6). 713–755. DOI:  http://doi.org/10.1080/01690960444000070

Wightman, Colin W. & Shattuck-Hufnagel, Stefanie & Ostendorf, Mari & Price, Patti J. 1992. Segmental durations in the vicinity of prosodic phrase boundaries. The Journal of the Acoustical Society of America 91(3). 1707–1717. DOI:  http://doi.org/10.1121/1.402450

Wu, Danfeng. 2022. Syntax and prosody of coordination. Massachusetts Institute of Technology dissertation.

Zhang, Niina Ning. 2010. Coordination in syntax. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511770746

Zoerner, Cyril Edward. 1995. Coordination: The syntax of &P. University of California, Irvine dissertation.