
# Correlate not optional: PP sprouting and parallelism in “much less” ellipsis

## Abstract

Clauses that are parallel in form and meaning show processing advantages in ellipsis and coordination structures (Frazier et al. 1984; Kehler 2000; Carlson 2002). However, the constructions that have been used to show a parallelism advantage do not always require a strong semantic relationship between clauses. We present two eye tracking while reading studies on focus-sensitive coordination structures, an understudied form of ellipsis which requires the generation of a contextually salient semantic relation or scale between conjuncts. However, when the remnant of ellipsis lacks an overt correlate in the matrix clause and must be “sprouted” in the ellipsis site, the relation between clauses is simplified to entailment. Instead of facilitation for sentences with an entailment relation between clauses, our online processing results suggest that violating parallelism is costly, even when doing so could ease the semantic relations required for interpretation.

How to Cite: Harris, J. A., & Carlson, K. (2019). Correlate not optional: PP sprouting and parallelism in “much less” ellipsis. Glossa: A Journal of General Linguistics, 4(1), 83. DOI: http://doi.org/10.5334/gjgl.707
Published on 22 Jul 2019
Accepted on 18 Apr 2019
Submitted on 31 May 2018

## 1 Introduction

The online interpretation of ellipsis structures has become a popular topic among psycholinguists because it highlights an intriguing mismatch between form and meaning, and consequently reveals a unique demand on the human sentence processing system. In particular, that a meaningful interpretation is recovered from ellipsis shows that it is not enough for the processor to simply parse linguistic structure by passively interpreting the word forms presented to it; instead, the processor must actively go beyond the input and infer the correct form at the appropriate level of representation, e.g., syntactic, Logical Form, discourse, etc. We aim to expand the empirical and conceptual coverage of ellipsis processing by exploring an understudied ellipsis type known as focus-sensitive coordination, which requires a contextually salient scale between contrasting phrases. Here we use that scale to explore the role of parallelism in the processing of elided structures.

Although there are many types of ellipsis structures, perhaps the best-known case is VP (verb phrase) ellipsis, which we use to illustrate the basic inference problem faced by the human sentence processor. For example, in (1a) the auxiliary did stands in for the verb phrase ate a cheeseburger, which is made explicit in (1b).

 (1) a. John ate a cheeseburger. Bill did, too. b. John ate a cheeseburger. Bill ate a cheeseburger, too.

To interpret (1a) as (1b), the processor must therefore “fill in” the missing or elided material, presumably by consulting linguistic or discourse representations from the context. Current research on ellipsis suggests that the processor engages in some kind of cost-free mechanism, retrieving the missing form through either copying (Frazier & Clifton 2001, 2005; Frazier 2008) or a content-addressable pointer in memory (Martin & McElree 2008, 2009, 2011; Martin 2010). For concreteness, we assume that a syntactic or logical structure is present covertly at the ellipsis site (Shapiro & Hestvik 1995; Merchant 2001; Shapiro et al. 2003; Frazier 2008) at some level of representation (discourse, syntactic, Logical Form, etc.). Although we believe that the assumption of covert syntax is supported by previous research, we acknowledge that other, non-syntactic accounts are currently live options in the literature which deserve serious attention (e.g., Sag 1976; Dalrymple et al. 1991; Hardt 1993; Ginzburg & Sag 2000; Culicover & Jackendoff 2005; Nykiel & Sag 2011). Our account in no way critically hinges upon or supports this assumption, which is made primarily for convenience, and our processing account could have been formalized in non-syntactic terms. In any event, we speak as though structure from the antecedent clause is recovered and interpreted within the ellipsis site, represented by < > in (2) (see Merchant 2016 for a review).

 (2) John ate a cheeseburger. Bill did <eat a cheeseburger>, too.

If the problem of recovering the ellipsis weren’t thorny enough, the processor may also need to infer that additional material or content is missing. In cases of clausal ellipsis like sluicing (Ross 1967, 1969), the remnant of ellipsis, e.g., the interrogative pronoun what in (3), is the overt material remaining from a clause that was elided. In the standard case of merger, the remnant is directly paired with a correlate (something) in the same syntactic position within the antecedent clause. However, Chung, Ladusaw & McCloskey (1995, 2011) also identify cases of sluicing in which a remnant has no overt correlate, cases they call sprouting (3b).

 (3) a. John ate something, but I don’t know [CP what1 <John ate t1>]. b. John ate, but I don’t know [CP what1 <John ate t1>].

While there are various syntactic accounts of sluicing (e.g., Chung et al. 1995; Merchant 2001; van Craenenbroeck 2010), many approaches converge on the idea that the elided clause (John ate t1) is syntactically or semantically identical to the antecedent clause at some level of representation. An overt indefinite correlate (3a) would provide a variable (corresponding to the phonetically null trace t1) for the wh-element to bind within the elided phrase. When no correlate is present, as in (3b), a variable must be created within the ellipsis site – that is, a variable is “sprouted” at Logical Form (LF) that does not correspond to any overt material in the matrix clause. In other words, sprouting ensures that the form of the ellipsis clause of (3b) is equivalent to that of (3a), even though the actual antecedent clause in (3b) is not parallel with the ellipsis.

Sprouting has been shown to elicit online processing costs for sluicing ellipsis (Frazier & Clifton 1998; Dickey & Bunger 2011), indicating that the processor seeks to identify a correlate for the remnant of ellipsis from the surface form of the antecedent clause. Frazier & Clifton (1998) compared sprouting of two kinds of elements: the argument to a verb (something; 4a) and an adjunct (somewhere; 4b), following previous research (Carlson & Tanenhaus 1988; Mauner, Tanenhaus & Carlson 1995) suggesting that implicit arguments, but not implicit adjuncts, are inferred at LF. Although cases of sprouting were costly (i.e., when something or somewhere was absent), no processing time differences between sprouted arguments and adjuncts were observed in their self-paced reading study.

 (4) a. Argument: The secretary typed (something), but I don’t know what. b. Adjunct: The secretary typed (somewhere), but I don’t know where.

However, processing a conjunction structure is independently facilitated when the conjuncts are parallel in their syntactic, semantic, or prosodic structure (e.g., Frazier et al. 1984; Henstra 1996; Frazier, Munn & Clifton 2000; Knoeferle & Crocker 2009; Sturt, Keller & Dubey 2010; Poirier, Walenski & Shapiro 2012; Knoeferle 2014). Dickey & Bunger (2011) argued that the processing cost attributed to sprouting can be best understood in terms of a general preference for structurally parallel structures rather than an operation specific to the construction of Logical Form or some other level of representation (cf. Chung et al. 1995). In a self-paced reading paradigm, they observed a processing cost for sprouting regardless of whether there was ellipsis in the second conjunct (5a) or not (5b), in addition to replicating the lack of differences between sprouted arguments and adjuncts (Frazier & Clifton 1998).

 (5) a. Elided: The secretary typed {something/quickly}, but I don’t know what exactly. b. Non-elided: The secretary typed {something/quickly}, but I don’t know what she typed.

The existing evidence thus suggests that structural parallelism1 between remnants and correlates could be a driving factor in processing ellipsis structures (see Carlson 2002 for discussion), while more general parallelism between clauses or phrases is common in conjoined structures. However, most experimental studies have concentrated on better-known forms of ellipsis: VP ellipsis (e.g., Murphy 1985; Tanenhaus & Carlson 1990; Ward, Sproat & McKoon 1991; Shapiro & Hestvik 1995; Martin & McElree 2008; Shapiro et al. 2003, among others), gapping (Carlson 2001, 2002; Carlson, Dickey & Kennedy 2005), sluicing (e.g., Frazier & Clifton 1998, 2005; Carlson et al. 2009; Poirier et al. 2010; Martin & McElree 2011; Nykiel 2013; Harris 2015), and to a lesser extent stripping/replacive ellipsis (also known as bare argument ellipsis; see Paterson et al. 2007; Carlson 2013; Sauermann et al. 2013 for experimental findings). While these studies reveal a relatively unified view of processing ellipsis, less canonical forms of ellipsis with specialized constraints on the relation between the antecedent clause and the ellipsis might challenge our current understanding of how elliptical material is recovered and integrated into a representation in real time. We turn to focus-sensitive coordination, an ellipsis construction which imposes an additional relationship between the correlate and remnant: in this case, a contextually salient scale. In particular, we use the properties of that scale to explore the role of parallelism in the retrieval of correlates for remnants in elided structures.

### 1.1 Focus-sensitive coordination

Focus-sensitive coordination structures are a class of constructions that contain a coordinator like let alone, much less, and to some extent never mind, in the scope of explicit or implicit negation,2 which impose a suite of syntactic, pragmatic, and prosodic constraints on their conjuncts. Following current literature (Hulsey 2008; Toosarvandani 2010; Harris 2016), we assume that the second conjunct is the remnant of ellipsis, rather than a simple constituent. For example, Harris (2016) proposes that the overt conjunct vodka in (6a) corresponds to the remnant of ellipsis, which moves into a focus position FocP above the elided clause (I drink t1) as in (6b). This derivation is on par with other move-and-delete accounts of clausal ellipsis, such as those for stripping/bare argument ellipsis (Frazier et al. 2012; Sailor & Thoms 2014), sluicing (Merchant 2001), or fragment answers (Merchant 2004; Weir 2014).

 (6) a. I can’t drink beer, much less vodka. b. I can’t drink beer, much less [FocP vodka]1 <I drink t1>.

Though not reviewed in detail here, the ellipsis account of focus-sensitive coordination structures is supported by many distributional tests that align it with so-called small conjunct ellipsis (see Hulsey 2008 for a gapping account, and Harris 2016 for a move-and-delete account similar to those proposed for stripping/bare argument ellipsis). Consequently, we will continue to use key terminology from the ellipsis literature to describe the components of the structure: a “remnant” (vodka) that represents the non-elided material in the clause containing the ellipsis, and a “correlate” (beer) in the antecedent clause that contrasts with the remnant.

Focus-sensitive coordination is similar to stripping/bare argument ellipsis and sluicing ellipsis in that the correlate and the remnant bear pitch accent in the most felicitous pronunciation, typically with accents that mark contrastive focus (Harris & Carlson 2018). In the examples below, contrastive accent is indicated with CAPS. Note that the contrastive element in the remnant corresponds to a contrastive correlate in the antecedent, regardless of whether the constituents are nouns (7a), verbs (7b), verb phrases (7c), or modifiers (7d).

 (7) a. I can’t drink BEER, much less VODKA. b. I can’t DRINK beer, much less MAKE it. c. I can’t DRINK BEER, much less MAKE WINE. d. I can’t drink ONE beer, much less TWO (beers).

However, research on a range of ellipsis structures has found that overt prosodic marking on a noun increases the likelihood of it being selected as a correlate, but does not entirely disambiguate the sentence (e.g., Carlson 2001; Carlson et al. 2009; Carlson, Frazier & Clifton 2009; Harris & Carlson 2018). Listeners still can and do choose unaccented nouns as correlates, potentially due to a preference for local correlates, or an expectation about the default position of focus. For example, Rooth (1992) noted that while contrastive accent is necessary on a remnant, it is not actually required on the correlate, though it could be useful in locating the intended correlate.

Finally, and perhaps most crucially for present purposes, the antecedent and the elided clause stand in a scalar relationship. In particular, the negative antecedent (I can’t drink beer) seems to strongly imply or contextually entail the clause containing the ellipsis, also negated (I can’t drink vodka for (7a)), intuitively evoking a scalar relationship between beer and vodka (Fillmore et al. 1988; Toosarvandani 2010). Two kinds of scales are often discussed in the literature. The first are conventionalized or lexicalized scales, in which closed-class elements are logically ordered via semantic entailment or informativity. For example, lexical elements like cardinal numbers <one, two, three, …>, modals <can, must> and quantifiers <none, some, all> form conventionalized scales via entailment (Horn 1972; Hirschberg 1985). Such scales are thought to be context-independent, as they generalize to many occasions of use (e.g., Stiller, Goodman & Frank 2011). To take (7d) as an example: if you drink two beers, then, in all situations, you have also drunk one. Therefore, the (negated) proposition expressed in the antecedent clause (I can’t drink one beer) entails the (negated) proposition expressed by the ellipsis clause with its elided content (I can’t drink two beers).
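The direction of the entailment in (7d) can be spelled out as a simple contraposition. The following is only an illustrative sketch: the predicate $\mathrm{drink}(n)$, abbreviating "I can drink $n$ beers", is an expository assumption and is not notation used in the works cited.

```latex
\begin{align*}
\mathrm{drink}(2) &\Rightarrow \mathrm{drink}(1)
  && \text{(entailment on the scale } \langle \textit{one}, \textit{two}, \ldots \rangle \text{)} \\
\neg\,\mathrm{drink}(1) &\Rightarrow \neg\,\mathrm{drink}(2)
  && \text{(contraposition)}
\end{align*}
```

Negation thus reverses the direction of entailment along the scale, which is why the weaker element (one) appears in the negated antecedent and the stronger element (two) in the remnant.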

The second kind of scales are sometimes known as ad hoc scales, which are highly dependent on context or the conversation at hand, and one must therefore know a great deal about the specific situation in addition to world knowledge and perhaps speaker intention to interpret them correctly (e.g., Hirschberg 1985). For example, (7a) can be understood as implicating that if I am unlikely or lack the capacity to drink beer, I therefore also am unlikely or lack the capacity to drink vodka, given its greater potency. Similarly, (7c) somehow implies that drinking beer is an easier or more expected activity than making wine, and so if I cannot perform the former, I therefore cannot perform the latter, even though drinking beer and making wine are logically unrelated events (a teetotaling oenologist might strike us as unusual, but certainly not as contradictory).

It is currently unclear whether one kind of scale might be more difficult to recover than the other. Conventionalized scales might be psychologically privileged, in that they could be accessible in more general contexts than ad hoc relations. Such a view would be compatible with Levinson’s (2000) neo-Gricean account, in which generalized conversational implicatures constitute interpretive defaults, and are consequently more readily accessible than particularized conversational implicatures. In other words, the unconstrained nature of ad hoc scalar relations would make them less accessible to comprehenders constructing interpretations in real time. Recent experiments do not shed much light on the matter, though they have shown that children (Papafragou & Tantalou 2004) and adults (Katsos & Bishop 2011) may treat these two kinds of scales in very different ways (for more discussion, see Katsos & Cummins 2012).

Example (8) illustrates that the (contextually) weaker element in focus-sensitive coordination must be located in the first conjunct; some may precede all (8a), but in many dialects the reverse is semantically incoherent, though seemingly grammatical (8b). Of course, this restriction does not carry over to other conjunctions, even those like but and although which make similar contributions to the discourse (8c).

 (8) a. John didn’t steal SOME of the cookies, much less ALL of them. b. #John didn’t steal ALL of the cookies, much less SOME of them. c. John didn’t steal ALL of the cookies, but/although he did steal SOME of them.

The basic properties we assume for focus-sensitive coordination are summarized in (9).

 (9) Properties of focus-sensitive coordination
     1. The second conjunct consists of the remnant of ellipsis;
     2. The correlate and remnant are usually marked with contrastive focus; and
     3. Propositions formed from the clauses containing the correlate and remnant are placed on a contextually salient scale.

We believe that these properties place a unique set of demands on the sentence processing system. As we shall see, they also allow us to formulate two potentially opposing hypotheses regarding the processing routines recruited to interpret ellipsis structures in real time. First, however, we articulate our assumptions regarding what basic information the processor needs in order to process focus-sensitive coordination generally.

### 1.2 Processing ellipsis and focus-sensitive coordination

As the literature on processing ellipsis structures is rapidly growing (see Phillips & Parker 2014 for review), we will introduce only the bare essentials here. We assume that the processor must address three basic requirements (10) in order to resolve any remnant-based ellipsis, as discussed in Carlson & Harris (2017) for focus-sensitive coordination and Harris (2019) for sluicing ellipsis. Again, although we assume for concreteness that the structure of the ellipsis site is recovered from the context or from the antecedent site, this assumption is not required for the essentials of our account.

 (10) Basic tasks of the processor in ellipsis processing:
      1. Parse the remnant of the ellipsis, i.e., construct the appropriate phrase structure for the remnant given the input.
      2. Locate the correlate, if any, from the antecedent clause.
      3. Construct or infer the elided phrase, e.g., by regenerating or copying a structure at Logical Form.

We use example (6) to illustrate the three tasks in more depth in (11). We assume that each process depends on the previous one. For instance, the parser must have established the basic syntactic category of the remnant (step 1) before it can locate a correlate of the appropriate type in the antecedent clause (step 2). Similarly, generating a structure for the elided clause (step 3) depends on having selected an appropriate correlate from the antecedent clause (step 2).3 Note that the parsing mechanisms needed for step 1 are standard, and are not processes unique to ellipsis.

 (11) I can’t drink beer, much less vodka.
      Step 1. Parse the remnant: Assign the appropriate phrase structure for vodka.
              I can’t drink beer, much less [DP=Remnant vodka].
      Step 2. Locate the correlate: Retrieve an appropriate correlate that provides a suitable contrast to the remnant vodka, using various processing strategies.
              I can’t drink [DP=Correlate beer], much less [DP=Remnant vodka].
      Step 3. Construct the elided phrase: Build the ellipsis structure after the remnant.
              I can’t drink [DP=Correlate beer], much less [DP=Remnant vodka]1 < I drink t1 >.
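The dependency between the three steps can be made concrete with a small toy sketch. It is not a parser and makes no claim about the actual processing mechanism: the class and function names are hypothetical, category labels are supplied by hand, the subject is given its own label (SUBJ) so that the simple locality search skips it, and placing a sprouted variable at the end of the clause is an assumption made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Phrase:
    text: str
    category: str  # hand-annotated label (hypothetical): "SUBJ", "V", "DP", ...


def locate_correlate(antecedent: List[Phrase], remnant: Phrase) -> Optional[Phrase]:
    """Step 2: pick the most local (rightmost) phrase with the remnant's category."""
    for phrase in reversed(antecedent):
        if phrase.category == remnant.category:
            return phrase
    return None  # no overt correlate: the sprouting case


def construct_elided(antecedent: List[Phrase], correlate: Optional[Phrase]) -> str:
    """Step 3: copy the antecedent, putting a trace t1 in the correlate's position."""
    words = ["t1" if phrase is correlate else phrase.text for phrase in antecedent]
    if correlate is None:
        words.append("t1")  # assumption: a sprouted variable is added clause-finally
    return "< " + " ".join(words) + " >"


def resolve(
    antecedent: List[Phrase], remnant_text: str, remnant_category: str
) -> Tuple[Optional[Phrase], str]:
    # Step 1 is stubbed out: the remnant's category is given by hand.
    remnant = Phrase(remnant_text, remnant_category)
    correlate = locate_correlate(antecedent, remnant)
    return correlate, construct_elided(antecedent, correlate)


# Antecedents are listed without negation and modals, as in the elided clause of (11).
overt = [Phrase("I", "SUBJ"), Phrase("drink", "V"), Phrase("beer", "DP")]
sprouted = [Phrase("Michael", "SUBJ"), Phrase("study", "V")]

print(resolve(overt, "vodka", "DP")[1])          # < I drink t1 >
print(resolve(sprouted, "chemistry", "DP")[1])   # < Michael study t1 >
```

In the second call no phrase of the remnant's category is available, so step 2 returns nothing and step 3 has to introduce the variable itself, which is the sprouting configuration.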

Although focus-sensitive coordination has been studied far less than other kinds of ellipsis structures, there are a few recent results that bear on the processes outlined above. Regarding step 1, Harris (2016) found a small bias for VP remnants over DP/NP remnants in a variety of offline completion tasks when the fragment after let alone appeared without context (replicated in Harris & Carlson 2016, 2018). However, simply mentioning a salient DP object in preceding text weakened or overturned the bias. Further, VP and DP remnants did not show any categorical differences in online reading in eye tracking. He argued that these results were consistent with an ellipsis account over a simple coordination account, and that the processor constructs the clause containing the remnant upon encountering a focus-sensitive coordinator. In a follow-up study, Harris & Carlson (2016) showed that the bias towards VP remnants could not be attributed to exposure in text alone, as English corpus searches showed a general mild preference for DP, not VP, remnants. Following Toosarvandani (2010), they proposed that the remnant-correlate pair is sensitive to an accessible Question Under Discussion, i.e., the question or topic that the immediate conversation is meant to address, explicitly or implicitly (Roberts 1996, 2012).

With respect to step 2, Harris & Carlson (2016, 2018) found that the processing strategies used to pair the remnant, once parsed, with a correlate were similar to those used in other ellipsis structures. In particular, the processor appears to prefer the most local (or nearest) correlate of the appropriate syntactic type, a preference that prevails in sluicing and most likely reflects the tendency for English focus to appear late in a clause (for arguments from sluicing, see Carlson, Dickey et al. 2009). The preference for the most local correlate is exhibited not only across a variety of written and spoken corpora, but also very early in the online processing record. The general patterns are thus compatible with an ellipsis account in which the processor seeks to make the antecedent and the ellipsis clauses, especially the contrasted elements, parallel along syntactic, semantic, and prosodic dimensions (Carlson 2001; Dickey & Bunger 2011).

What, then, is the processor to do when the clause containing the remnant and ellipsis is not parallel with the antecedent? Instances of sprouting discussed above represent an extreme case, in which the remnant lacks a constituent in the antecedent. As reviewed above, prior studies have found that sprouted correlates incur a processing cost during online comprehension, due either to an ellipsis-specific operation in which the ellipsis site is modified to include a variable corresponding to the correlate (Frazier & Clifton 1998), or else to a side effect of a bias against non-parallel antecedents (Dickey & Bunger 2011). Focus-sensitive coordination, however, raises an intriguing possibility regarding sprouting. In addition to finding a correlate for the remnant (step 2), the correlate and remnant must ultimately be put onto a contextually salient scale whose properties are inferred through context (step 3). Notably, in cases of focus-sensitive coordination, sprouted correlates could facilitate recovery of a contextually salient scale via an entailment relation between the antecedent and elided clause.

In example (12a), the processor presumably must pair the remnant chemistry with the overt correlate carpentry, and accommodate an appropriate scale, e.g., one which deems carpentry a less difficult subject than chemistry, creating an inference from not being able to study carpentry to not being able to study chemistry. Without an overt correlate (12b), however, the proposition obtained from the antecedent (that Michael is not able to study) necessarily entails the proposition obtained from the clause containing the remnant and the ellipsis (that Michael is not able to study chemistry).4

 (12) Focus-sensitive coordination a. Michael couldn’t study carpentry, much less chemistry. b. Michael couldn’t study, much less chemistry.
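The entailment in the sprouting case (12b) can be sketched in the same way. The notation is an expository assumption: $m$ stands for Michael, $c$ for chemistry, and the objectless verb in the antecedent is treated as existentially quantified over what is studied.

```latex
\begin{align*}
\mathrm{study}(m, c) &\Rightarrow \exists x\, \mathrm{study}(m, x)
  && \text{(studying chemistry is a case of studying)} \\
\neg\,\exists x\, \mathrm{study}(m, x) &\Rightarrow \neg\,\mathrm{study}(m, c)
  && \text{(contraposition)}
\end{align*}
```

No contextual ranking of subjects is needed for this inference, in contrast to (12a), where the comprehender must accommodate a scale on which carpentry counts as less difficult than chemistry.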

In sluicing ellipsis, however, the relationship between clauses is much different. In the case with sprouting (13b), no such entailment relationship holds between the antecedent clause (Michael studied something) and the clause corresponding to the ellipsis (what1 he studied t1). The relation is instead one of identity, in which some aspect of the antecedent clause is unknown, unidentified, or unreported.

 (13) Sluicing ellipsis a. Michael studied something, but I don’t know what. b. Michael studied, but I don’t know what.

In the remainder of this paper, we explore two possibilities. On the one hand, it is in principle conceivable that the cost for sprouting is limited to sluicing and other types of ellipsis besides focus-sensitive coordination, in that, without a scalar relation between clauses, there would be no interpretive advantage for sprouting in these structures. That is, if ad hoc scales are costly to compute, then sprouting in focus-sensitive coordination, by virtue of providing an entailment relation between the clauses, might actually simplify the task of sentence processing. In particular, a ready-made scale might outweigh the general preference for parallelism, provided that such a scale makes the inference task easier, as in (14).

 (14) Scalar Advantage Principle (SAP): Avoid positing an ad hoc scale, if a conventionalized scale is readily accessible.

This possibility rests on the premise that ad hoc scales are costly to compute, at least compared to entailment relations. Although we have no direct evidence for or against the Scalar Advantage Principle, we considered it a very likely conceptual possibility. The principle was motivated in part by our intuition that interpreting sentences with lexicalized scales as in (7d) requires no special knowledge about discourse or the intentions of the speaker. In contrast, interpreting sentences like (7a) requires establishing a contextually salient relationship, and such relationships are, in principle, unbounded. Although our primary motivation for SAP is conceptual, we do not think it implausible in principle. For example, a branch of research in sentence processing and general cognition has concluded that processing resources are generally quite limited. As a consequence, some language processing tasks are shallow or incomplete (e.g., Ferreira et al. 2002; Sanford & Sturt 2002; Sanford & Graesser 2006, among many others). Assuming that conventionalized relations are more readily available than situation-specific or ad hoc relations, structures that do not require ad hoc scalar relationships might demand fewer processing resources. In other words, conventionalized relations might simply be less taxing to access, especially without sufficient discourse context. Of course, the validity of such a preference remains an empirical question.

On the other hand, it is possible that the processor always prefers a correlate that is maximally parallel to the remnant as a matter of course, regardless of the advantage to interpretation. We define parallelism as the presence of any of a number of similarities (morphological, prosodic, syntactic, semantic, etc.) between contrasting or conjoined phrases (Carlson 2002). On this view, we expect that parallelism between the correlate and remnant would facilitate processing focus-sensitive coordination (15), as in other cases of ellipsis processing (e.g., Carlson 2001; Carlson et al. 2009).

 (15) Parallel Contrast Principle (PCP): Prefer correlate-remnant pairs that are as parallel as possible.

As noted earlier, initial studies of parallelism mainly concentrated on unelided conjoined elements instead of ellipsis. Frazier et al. (1984) showed that conjoined sentences were read faster when the clauses were more similar to each other in a range of ways, from active/passive voice to the animacy of an object DP. Mauner, Tanenhaus, & Carlson (1995) also found that matching active/passive voice eased the processing of VP Ellipsis sentences. Frazier, Munn & Clifton (2000) showed that semantically similar but syntactically unlike conjoined phrases (prepositional phrases or PPs with Adverb Phrases, for example) were processed more slowly than syntactically parallel conjoined phrases. Henstra (1996) found a bias for conjoined DPs to match in definiteness and presence of modifiers.

In various studies of ambiguous ellipsis sentences, Carlson (2001, 2002; also Carlson et al. 2009) found that the interpretation of these sentences responded to DP similarity in many features, such as definiteness, DP form (names vs. pronouns vs. definite or indefinite phrases), and gender, with a processing preference for pairing more similar correlates and remnants. The effects held true across ellipsis sentences with the conjunction and, such as gapping or VP Ellipsis, but also ones which did not, such as comparatives and bare remnant ellipsis. In all of these sentence types, remnant phrases (usually DPs) left behind by ellipsis contrasted with correlates within the unelided antecedent clause. Similarities between potential correlates and the remnant (and lack of similarity with other potential correlates) influenced the likelihood of an interpretation in which the similar phrases contrasted. In auditory experiments, contrastive (L+H*) accents on a potential correlate and remnant (and not on other possible correlates), creating what could be called prosodic parallelism, also increased matching interpretations, though they did not disambiguate the sentences by any means. There are some remaining issues regarding the role of contrast in processing ellipsis: for example, Rooth (1992) observes that accents or even focus on the first member of a contrastively focused pair of phrases is not necessary, though Carlson’s work shows that accent placement can clearly aid processing (see Harris & Carlson 2018 for additional discussion).

We take parallelism to be an extra-grammatical factor that affects processing of ellipsis structure and conjoined structures, rather than a grammatical constraint. The most extreme nonparallel condition is one in which a remnant lacks a correlate entirely, as in sprouting examples of sluicing or focus-sensitive coordination structures, which are nonetheless entirely grammatical structures. We take these cases as a testing ground for comparing the predictions of the Scalar Advantage Principle and the Parallel Contrast Principle.

When it comes to sprouting in focus-sensitive coordination, the two principles make opposite predictions. The Scalar Advantage Principle predicts an advantage for sprouting, and the Parallel Contrast Principle predicts a penalty for sprouting. There is already some evidence for the Parallel Contrast Principle for focus-sensitive coordination. Carlson & Harris (2017) considered cases of “zero-adjective contrast” in which the remnant DP contained an adjective like red without a corresponding adjective in the correlate (16). In a search of the Corpus of Contemporary American English (COCA; Davies 2008), zero-adjective contrast was found to be extremely rare in text (less than 1% of 1644 cases of much less ellipsis).

 (16) a. I don’t own [a hat], much less [a red one]. b. She will not argue with [a fool], much less [a money-hungry one]. (COCA)

A series of auditory and written questionnaires confirmed that zero-adjective contrast was dispreferred compared to examples with parallel DPs in naturalness rating tasks, and was avoided in sentence completion. Finally, in a self-paced reading study with items like (17), they observed a penalty immediately on DP remnants (an easy one) when the correlate lacked a corresponding adjective contrast (complex), but a penalty on VP remnants (burn one) when there was an adjective in the correlate. The first result was interpreted as providing evidence against the Scalar Advantage Principle, as there was a clear cost for computing zero-adjective contrast, and the second result as an indication that the processor anticipates upcoming contrast, in which case the salient contrast evoked by complex would have initially misled the processor.

 (17) a. The chef didn’t overcook a (complex) meal, much less [DP an easy one], since he was trained by the very best. b. The chef didn’t overcook a (complex) meal, much less [VP burn one], since he was trained by the very best.

Despite the existing evidence that zero-adjective contrasts are costly to compute, the following set of experiments follows up on a number of remaining questions and concerns. First, we are cautious about analogizing the addition of an adjective to the remnant too closely with the established case of sprouting in sluicing. Where the analogy breaks down is that the remnant DP does in fact have a correlate DP in the antecedent clause in such cases, but the correlate simply lacks the expected sub-contrast within it. Thus, the processor might not need to posit a variable at Logical Form so much as readjust the kinds of scales or comparisons it can accommodate. Second, Carlson & Harris’ (2017) main focus was more on the distribution of remnants after the much less coordinator, and less on the online processing profile associated with computing zero-adjective contrast in focus-sensitive coordination. The present study utilizes eye tracking while reading in order to gain a more fine-grained picture of processing focus-sensitive coordination structures during natural reading, which allows us to identify possible tradeoffs between early measures, when the remnant is initially encountered, and later measures associated with interpretation. Finally, the structure we are studying permits an additional test of correlate-remnant semantic mismatch, which forms a kind of intermediate case between parallel correlate-remnant pairs and sprouting. It also affords us the opportunity to determine how much subjects attended to the semantic and pragmatic compatibility between the correlate and the remnant.

We present three norming studies and two eye tracking studies below, each of which contains items based on the following pattern (18). The first three conditions have PP (prepositional phrase) remnants (18a–c), whereas the last three conditions have VP remnants (18d–f), which both served as a statistical control and prevented participants from anticipating a PP remnant. The first sentence (18a) illustrates PP sprouting, in which a PP remnant completely lacks a corresponding PP correlate (the No Matrix PP condition). The second sentence (18b) contains an overt PP correlate that matches the remnant along syntactic and pragmatic dimensions (both indicate time; the Compatible PP condition). The third (18c) provides a case of moderate correlate-remnant mismatch (the Incompatible PP condition). While the correlate is of the same syntactic type, a PP, it doesn’t permit comparison along a comparable scale (the remnant is about time while the supposed correlate is about location). Bold formatting was added here and elsewhere for clarity of exposition, but did not appear in the experiment.

 (18) PP remnant a. John doesn’t want to eat out, much less on Tuesday, so I guess we’ll be staying home. b. John doesn’t want to eat out on Saturday, much less on Tuesday, so I guess we’ll be staying home. c. John doesn’t want to eat out at a steakhouse, much less on Tuesday, so I guess we’ll be staying home. VP remnant d. John doesn’t want to eat out, much less go dancing, so I guess we’ll be staying home. e. John doesn’t want to eat out on Saturday, much less go dancing, so I guess we’ll be staying home. f. John doesn’t want to eat out at a steakhouse, much less go dancing, so I guess we’ll be staying home.

The second eye tracking study explores whether the effects of PP sprouting are mitigated in supporting contexts, comparing the effect of sprouting cases like (18a) over controls (18b) with and without contexts introducing the PP remnant. In all, the results support the Parallel Contrast Principle, in that readers rely heavily on the form of the antecedent clause to identify suitable correlates, and that this process persists even in contexts supporting the PP remnant.

## 2 Experiment 1: PP sprouting

### 2.1 Experiment 1A: Completion study

#### 2.1.1 Materials and method

Materials were created by truncating the sentences to be used in the first eye tracking study (22) after much less, as shown in (19), along with 6 additional items containing much less ellipsis. A full list of items is in Appendix A. Participants were instructed to complete sentence fragments with the first natural word or short phrase that came to mind.

 (19) a. John doesn’t want to eat out, much less _____________. b. John doesn’t want to eat out on Saturday, much less _____________. c. John doesn’t want to eat out at a steakhouse, much less _____________.

The 30 experimental items were interspersed with 48 fragments from unrelated experiments, 15 non-experimental filler items, and 5 highly constrained fill-in-the-blank fillers (e.g., Abe had finally had enough. But ______ was little else he could do.) for a total of 98 items per subject. Fragments were presented in counterbalanced order and were individually randomized for each participant.

#### 2.2.3 Results and discussion

The data are presented in Table 2 in both raw and normalized form. The data were normalized by taking the centered z-score from only the ratings in the experiment to clearly illustrate the directions of the effects between conditions; see Figure 1. Subjects rated experimental items as very natural overall, at or above 5 on a 7-point Likert scale on average. The raw scores were subjected to a linear mixed effects regression model with planned contrasts of Matrix and Remnant Type as fixed effects and by-subjects and by-items random intercepts. Fixed effects were given deviation coding, with reference levels of Compatible PP for the Matrix factor, and PP Remnant for the Remnant Type factor. Although we report data from only the 24 items that also appeared in the eye tracking experiment that follows, results with all 30 items were qualitatively identical. See Table 3 for the statistical analysis.
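The normalization step can be sketched as follows. This is only an illustration under the assumption that the centered z-score subtracts the grand mean of all ratings in the experiment and divides by their standard deviation; the function name and the toy ratings are hypothetical and are not the study's data.

```python
from statistics import mean, pstdev

def z_normalize(ratings):
    """Center ratings on their grand mean and scale by their standard deviation."""
    mu = mean(ratings)
    sigma = pstdev(ratings)  # population SD; the sample SD would be an equally plausible choice
    return [(r - mu) / sigma for r in ratings]

# Toy ratings on a 7-point scale (illustrative only, not the experimental data)
toy_ratings = [5, 6, 7, 4, 6, 5, 7, 3]
z_scores = z_normalize(toy_ratings)
```

After this transformation the scores have a mean of zero, so negative values mark ratings below the experiment-wide average and positive values mark ratings above it.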

| Matrix | PP remnant | VP remnant | PP penalty |
|---|---|---|---|
| No Matrix PP | 5.00/–0.42 (0.14) | 5.63/0.03 (0.13) | 0.63 |
| Compatible PP | 5.89/0.22 (0.11) | 5.71/0.09 (0.12) | –0.18 |
| Incompatible PP | 5.46/–0.09 (0.13) | 5.84/0.18 (0.11) | 0.38 |

Table 2

Experiment 1B: Naturalness ratings. Uncorrected/z-score normalized means. Standard errors are in parentheses.
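As a worked check on the PP penalty column, the sketch below recomputes it from the raw cell means in Table 2. It assumes that the penalty is the VP remnant mean minus the PP remnant mean, which is what the tabulated values suggest; the variable and function names are hypothetical.

```python
# Raw cell means copied from Table 2 (7-point scale)
table2 = {
    "No Matrix PP": {"PP": 5.00, "VP": 5.63},
    "Compatible PP": {"PP": 5.89, "VP": 5.71},
    "Incompatible PP": {"PP": 5.46, "VP": 5.84},
}

def pp_penalty(cell):
    """Assumed definition: VP mean minus PP mean (positive = PP remnants rated lower)."""
    return round(cell["VP"] - cell["PP"], 2)

penalties = {matrix: pp_penalty(cell) for matrix, cell in table2.items()}
```

Under this definition, only the Compatible PP condition yields a negative value, that is, an advantage rather than a penalty for PP remnants.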

| Parameter | Estimate | Std. Error | t-value |
|---|---|---|---|
| (Intercept) | 5.589 | 0.156 | 35.86* |
| No Matrix PP | –0.26 | 0.059 | –4.41* |
| Incompatible PP | 0.044 | 0.059 | 0.74 |
| PP Remnant | –0.145 | 0.042 | –3.47* |
| Incompatible PP × PP Remnant | –0.056 | 0.059 | –0.95 |
| No Matrix PP × PP Remnant | –0.157 | 0.059 | –2.66* |

Table 3

Experiment 1B: Naturalness ratings. Linear mixed effects regression model. Parameters with t-values above |2| were considered significant and are marked with an asterisk (*).
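The significance criterion can be illustrated with a short sketch. It assumes that each t-value is the ratio of the estimate to its standard error; because the tabulated estimates and standard errors are rounded, recomputed ratios may differ slightly from the reported t-values. The names below are hypothetical.

```python
# (estimate, standard error) pairs copied from Table 3
table3 = {
    "No Matrix PP": (-0.26, 0.059),
    "Incompatible PP": (0.044, 0.059),
    "PP Remnant": (-0.145, 0.042),
    "Incompatible PP x PP Remnant": (-0.056, 0.059),
    "No Matrix PP x PP Remnant": (-0.157, 0.059),
}

def t_ratio(estimate, std_error):
    """Assumed definition of the t-value: estimate divided by its standard error."""
    return estimate / std_error

# Apply the |t| > 2 criterion stated in the table caption
significant = [name for name, (est, se) in table3.items() if abs(t_ratio(est, se)) > 2]
```

The recomputed ratios single out the same three parameters that are marked as significant in Table 3.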

Items from the No Matrix PP condition (M = 5.32, SE = 0.10) were, on the whole, rated less natural than those from the Compatible PP reference level (M = 5.80, SE = 0.08), t = –4.41, which, in turn, did not differ from the Incompatible PP items (M = 5.65, SE = 0.09). In addition, items with PP Remnants (M = 5.45, SE = 0.08) were rated as less natural than those with VP Remnants (M = 5.73, SE = 0.07), t = –3.62. More importantly, however, there was a differential penalty for PP Sprouting: items with PP remnants were rated lower than those with VP remnants when the matrix lacked a PP correlate (diff = 0.63) compared to the Compatible PP reference level (diff = –0.18), t = –2.66. In contrast, there was no significant penalty for Incompatible PP remnants over VP remnant counterparts (diff = 0.38). In planned paired t-test comparisons with Bonferroni corrections, PP Remnants were rated lower than VP Remnants for only the No Matrix PP condition in both by subjects [t1(29) = –3.67, p < 0.001] and by items [t2(23) = –2.77, p < 0.05] comparisons. In Incompatible PP conditions, PP Remnants were rated as less natural than VP Remnants in by-subjects comparisons only [t1(29) = –2.87, p < 0.01].

According to the Scalar Advantage Principle, PP sprouting should be preferred as a way to avoid computing ad hoc scales, since it relies on an entailment between the antecedent clause and the clause containing the ellipsis. In contrast, the Parallel Contrast Principle predicts that PP sprouting should be difficult to compute, as it violates the expectation for parallelism between clauses. In all, the results of the naturalness ratings study support the predictions of the Parallel Contrast Principle, as there was a penalty, not an advantage, for sprouted PP Remnants. Interestingly, we also found a hint of a weak, though not fully significant, penalty for Incompatible PP contrasts. The completion and corpus studies also suggested that PP sprouting would be dispreferred, since PP remnants were rare unless a PP was also present in the matrix clause. An eye tracking study will now be presented, in order to determine whether PP Sprouting is costly to compute during online processing.

### 2.3 Experiment 1C: Eye tracking study

#### 2.3.1 Materials and method

Materials consisted of 24 items (22) from the naturalness experiment, half of which were followed by comprehension questions; see Appendix A for a complete list. Items were interspersed with another 48 sentences from two unrelated experiments and 18 non-experimental filler items. Prior to analysis, sentences were partitioned into 6 regions. As the No Matrix PP conditions (22a) did not contain a PP region or any other linguistic content in that portion of the matrix clause, the PP region was coded as empty (∅) for analysis.

 (22) a. /John doesn’t want / to eat out/ ∅ /, much less/{on Tuesday / go dancing}, / so I guess / we’ll be staying home. b. /John doesn’t want / to eat out on Saturday,/ much less /{on Tuesday / go dancing}, / so I guess / we’ll be staying home. c. /John doesn’t want / to eat out at a steakhouse,/ much less /{on Tuesday / go dancing}, / so I guess / we’ll be staying home.

The number of characters in PP remnants (M = 16.67, SE = 0.80) matched the VP remnants (M = 17.00, SE = 0.81) in a paired t-test, t(23) = –0.29, as did the distance in characters from the matrix PP in the Compatible PP (on Saturday; M = 15.83, SE = 1.09) and Incompatible PP (at a steakhouse; M = 14.92, SE = 0.75) conditions, t(23) = 0.85. Nevertheless, the statistical models of reading measures on the remnant that are reported below always included region length as a covariate.

Participants were instructed to read silently and at their own pace, and were given a short practice session to illustrate the procedure. The reader’s head was stabilized with the tower mount of an SR Research Eyelink 1000 eye tracker, which sampled eye movements from the right eye at 1000 Hz. Viewing was binocular. The display monitor was situated 55 cm away from the subject. All items were presented on a single line in 13-point fixed-width (monospaced) Monaco font on a 21” LCD monitor using a Lenovo computer to display the sentences, so that three characters subtended approximately 1 degree of visual angle. Participants were calibrated before the experiment began with a three-point calibration system, and eye movement drift was corrected manually between trials. Participants were encouraged to take breaks as often as they wished, and were recalibrated if they moved away from the tower or if their fixations became unstable. A game pad was used to record responses to comprehension questions like (23), which appeared after approximately half of the sentences. Comprehension questions contained either yes-no responses or simple forced-choice options. Access to the Internet was turned off on all computers, as were all non-essential programs.
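The viewing geometry can be made concrete with a small calculation. The sketch below uses the standard visual angle formula to estimate the on-screen extent that subtends 1 degree at the reported 55 cm viewing distance; the implied width per character is our own back-of-the-envelope estimate, not a value measured in the lab, and the function name is hypothetical.

```python
import math

def size_for_visual_angle(distance_cm, angle_deg):
    """Physical extent (in cm) that subtends angle_deg at a viewing distance of distance_cm."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)

# Extent corresponding to 1 degree at the reported viewing distance of 55 cm
one_degree_cm = size_for_visual_angle(55, 1.0)

# If three characters span roughly 1 degree, each character is about a third of that extent
char_width_cm = one_degree_cm / 3
```

On these assumptions, 1 degree corresponds to a little under 1 cm on the screen, so each character would occupy roughly a third of a centimeter.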

 (23) Does John want to stay in instead of going out? a. Yes b. No

#### 2.3.2 Participants

Sixty UCLA undergraduates were recruited through the Psychology Pool for course credit. If a participant blinked on the remnant region in first pass reading on more than 3 trials for any one condition, she was removed from the data, and another participant was run in her place under the same counterbalancing list. Linguistic history was recorded for all subjects, all of whom self-reported as native speakers of English. All participants had normal or corrected to normal vision.

#### 2.3.3 Results

We report significant results for several standard eye tracking measures: first pass time, the sum of fixation durations from first entering a region until exiting to the left or right; go past time, the duration from first entering a region to first exiting to the right; second pass time, the time spent re-reading a region after having gone past the region previously; and total time, the sum of all fixations in a region at any point in reading. We also report the percentage of regressions out of a region.
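To make these definitions concrete, the following sketch derives the measures for one region from a temporally ordered fixation sequence. It is a simplified illustration rather than the analysis software used in the study: it ignores skipped regions and blinks, approximates second pass time as total time minus first pass time, and codes a regression out only when the first pass ends with a leftward exit. All names and the toy fixation record are hypothetical.

```python
def reading_measures(fixations, region):
    """fixations: list of (region_index, duration_ms) pairs in temporal order."""
    first_pass = go_past = total = 0
    entered = in_first_pass = in_go_past = regression_out = False
    for idx, dur in fixations:
        if idx == region:
            total += dur
            if not entered:
                # First entry into the region starts first pass and go past time
                entered = in_first_pass = in_go_past = True
            if in_first_pass:
                first_pass += dur
        elif entered:
            if in_first_pass:
                # The first exit ends first pass; a leftward exit counts as a regression out
                in_first_pass = False
                regression_out = idx < region
            if idx > region:
                # Moving to the right of the region ends go past time
                in_go_past = False
        if in_go_past:
            # Includes re-reading of earlier regions before the region is passed
            go_past += dur
    return {
        "first_pass": first_pass,
        "go_past": go_past,
        "second_pass": total - first_pass,
        "total": total,
        "regression_out": regression_out,
    }

# Toy trial with regions numbered left to right; region 2 plays the role of the remnant
toy_trial = [(0, 200), (1, 180), (2, 220), (2, 150), (1, 190), (2, 210), (3, 230), (2, 120)]
measures = reading_measures(toy_trial, region=2)
```

In the toy trial, the regression back to region 1 adds to go past time but not to first pass time, which is why go past time is sensitive to difficulty that prompts re-reading of earlier material.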

All statistical models were given the same deviation coding and random effects structures as the naturalness ratings experiment. Following standard practice, models of continuous measures used the Gaussian distribution, whereas binomial data were modeled with logistic regressions. Prior to analysis, outliers from first fixation and first pass distributions were censored using winsorization, so that scores below the 5th percentile and above the 95th percentile were replaced with the score at the 5th and 95th percentile, respectively (Dixon 1960; Tukey 1962). Outliers in go past and total time measures were identified visually and removed, resulting in less than 1% data loss per measure. As mentioned, remnant length was included as a predictor in models of the remnant region. Several models were created for each measure and region, increasing the complexity of the fixed effects factors by adding in trial position (the sequence in the overall experiment) as an additive and then interactive predictor. In the case of go past times, trial position was approximated in terms of first and second halves of the experiment, which resulted in a better model fit. We report the best fitting model, defined as the one with the significantly lowest AIC (Akaike 1974) or, in the case of equivalent models, the one with the fewest fixed effect parameters.
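The winsorization step can be sketched as below. This is a minimal illustration, assuming percentiles are computed by linear interpolation over the observed values; the exact percentile method used in the analysis is not stated, and the function name and toy reading times are hypothetical.

```python
from statistics import quantiles

def winsorize(values, lower_pct=5, upper_pct=95):
    """Replace scores below/above the given percentiles with the percentile scores."""
    # quantiles(..., n=100) returns 99 cut points; cuts[k - 1] is the k-th percentile
    cuts = quantiles(values, n=100, method="inclusive")
    lo, hi = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, lo), hi) for v in values]

# Toy first pass times in ms (illustrative only): one very short and one very long value
toy_times = [90, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290,
             300, 310, 320, 330, 340, 350, 360, 370, 380, 1500]
winsorized = winsorize(toy_times)
```

Unlike trimming, this procedure keeps every observation in the dataset and only pulls the extreme values in to the percentile boundaries.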

Only the remnant and the region following are reported for first fixation, first pass, go past times, and regressions out; see Table 4. All regions are reported for regressions in, second pass re-reading, and total times; see Table 5. The linear mixed effects models are presented in Tables 6 and 7. Results are presented in terms of three theoretically significant effects: a PP advantage, a PP Sprouting penalty, and an interaction between remnant type and sprouting. All significant effects are reported.

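| Contrast | Remnant type | First fixation durations: Remnant | First fixation durations: Spill over | First pass times: Remnant | First pass times: Spill over |
|---|---|---|---|---|---|
| No Matrix PP | PP Remnant | 216 (4) | 226 (4) | 424 (12) | 226 (4) |
| No Matrix PP | VP Remnant | 216 (4) | 229 (4) | 476 (17) | 229 (4) |
| Compatible PP | PP Remnant | 222 (4) | 228 (3) | 426 (13) | 228 (3) |
| Compatible PP | VP Remnant | 219 (4) | 225 (4) | 480 (16) | 225 (4) |
| Incompatible PP | PP Remnant | 223 (4) | 227 (4) | 438 (13) | 227 (4) |
| Incompatible PP | VP Remnant | 232 (5) | 226 (4) | 502 (15) | 226 (4) |

| Contrast | Remnant type | Go past times: Remnant | Go past times: Spill over | Regressions out: Remnant | Regressions out: Spill over |
|---|---|---|---|---|---|
| No Matrix PP | PP Remnant | 584 (23) | 620 (26) | 20% (3) | 4% (1) |
| No Matrix PP | VP Remnant | 592 (25) | 572 (25) | 14% (2) | 4% (1) |
| Compatible PP | PP Remnant | 488 (18) | 558 (24) | 9% (2) | 1% (1) |
| Compatible PP | VP Remnant | 607 (25) | 567 (24) | 17% (3) | 3% (1) |
| Incompatible PP | PP Remnant | 558 (24) | 593 (25) | 17% (3) | 4% (1) |
| Incompatible PP | VP Remnant | 623 (25) | 565 (24) | 14% (2) | 3% (1) |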

Table 4

Experiment 1C: Eye tracking. Means and standard errors for first fixation durations, first pass times, go past times, and regressions out on the Remnant and Spill over regions.

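Regressions in

| Contrast | Remnant type | Subject | PP region | Much less | Remnant | Spill over | Final |
|---|---|---|---|---|---|---|---|
| No Matrix PP | PP Remnant | 17% (3) | | 23% (3) | 6% (2) | 30% (3) | |
| No Matrix PP | VP Remnant | 17% (3) | | 15% (3) | 4% (1) | 31% (3) | |
| Compatible PP | PP Remnant | 26% (3) | 8% (2) | 11% (2) | 4% (1) | 29% (3) | |
| Compatible PP | VP Remnant | 24% (3) | 6% (2) | 18% (3) | 5% (2) | 31% (3) | |
| Incompatible PP | PP Remnant | 23% (3) | 8% (2) | 18% (3) | 8% (2) | 32% (3) | |
| Incompatible PP | VP Remnant | 26% (3) | 12% (2) | 17% (3) | 5% (2) | 24% (3) | |

Second pass times

| Contrast | Remnant type | Subject | PP region | Much less | Remnant | Spill over | Final |
|---|---|---|---|---|---|---|---|
| No Matrix PP | PP Remnant | 113 (20) | | 88 (11) | 60 (12) | 133 (18) | |
| No Matrix PP | VP Remnant | 87 (14) | | 63 (10) | 41 (9) | 113 (15) | |
| Compatible PP | PP Remnant | 167 (28) | 45 (9) | 44 (9) | 39 (9) | 120 (16) | |
| Compatible PP | VP Remnant | 150 (22) | 46 (8) | 55 (8) | 45 (10) | 113 (14) | |
| Incompatible PP | PP Remnant | 162 (27) | 57 (11) | 60 (9) | 49 (10) | 130 (16) | |
| Incompatible PP | VP Remnant | 133 (21) | 67 (11) | 55 (8) | 58 (11) | 102 (16) | |

Total times

| Contrast | Remnant type | Subject | PP region | Much less | Remnant | Spill over | Final |
|---|---|---|---|---|---|---|---|
| No Matrix PP | PP Remnant | 1305 (37) | | 402 (16) | 568 (22) | 710 (28) | 609 (25) |
| No Matrix PP | VP Remnant | 1243 (38) | | 381 (14) | 586 (22) | 649 (26) | 649 (25) |
| Compatible PP | PP Remnant | 1320 (38) | 529 (21) | 364 (13) | 501 (21) | 654 (26) | 625 (25) |
| Compatible PP | VP Remnant | 1296 (40) | 549 (23) | 380 (13) | 595 (25) | 658 (26) | 620 (24) |
| Incompatible PP | PP Remnant | 1351 (41) | 513 (22) | 364 (12) | 549 (20) | 680 (26) | 620 (25) |
| Incompatible PP | VP Remnant | 1314 (39) | 523 (21) | 365 (13) | 629 (25) | 637 (25) | 585 (21) |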

Table 5

Experiment 1C: Eye tracking. Means and standard errors for regressions in, second pass, and total times for all regions.

| Region | Parameter | First fixation durations: Estimate | First fixation durations: Std. Error | First fixation durations: t-value | First pass times: Estimate | First pass times: Std. Error | First pass times: t-value |
|---|---|---|---|---|---|---|---|
| Remnant | (Intercept) | 248.64 | 11.15 | 22.3* | 111.38 | 29.13 | 3.82* |
| | PP Remnant | –1.62 | 2.26 | –0.72 | –23.26 | 4.60 | –5.06* |
| | Incompatible PP | 7.15 | 3.19 | 2.24* | 12.96 | 6.47 | 2.00* |
| | No Matrix PP | –6.23 | 3.19 | –1.95+ | –6.60 | 6.48 | –1.02 |
| | Length | –1.25 | 0.60 | –2.10* | 20.44 | 1.51 | 13.54* |
| | PP Remnant × Incompatible PP | –4.52 | 3.19 | –1.42 | –5.49 | 6.48 | –0.85 |
| | PP Remnant × No Matrix PP | 1.72 | 3.20 | 0.54 | 2.85 | 6.48 | 0.44 |
| Spill over | (Intercept) | 231.96 | 5.09 | 45.57* | 231.96 | 5.09 | 45.57* |
| | PP Remnant | –0.19 | 1.98 | –0.09 | –0.19 | 1.98 | –0.09 |
| | Incompatible PP | 0.20 | 2.80 | 0.07 | 0.20 | 2.80 | 0.07 |
| | No Matrix PP | 0.16 | 2.80 | 0.06 | 0.16 | 2.80 | 0.06 |
| | PP Remnant × Incompatible PP | 1.25 | 2.81 | 0.45 | 1.25 | 2.81 | 0.45 |
| | PP Remnant × No Matrix PP | –3.04 | 2.81 | –1.08 | –3.04 | 2.81 | –1.08 |

| Region | Parameter | Go past times: Estimate | Go past times: Std. Error | Go past times: t-value | Regressions out: Estimate | Regressions out: Std. Error | Regressions out: Wald Z |
|---|---|---|---|---|---|---|---|
| Remnant | (Intercept) | 178.52 | 48.52 | 3.68* | –1.02 | 0.43 | –2.35* |
| | PP Remnant | –24.27 | 8.22 | –2.95* | 0.06 | 0.12 | 0.50 |
| | Incompatible PP | 16.06 | 11.57 | 1.39 | 0.16 | 0.12 | 1.38 |
| | No Matrix PP | 12.17 | 11.59 | 1.05 | –0.01 | 0.09 | –0.07 |
| | Length | 23.44 | 2.58 | 9.09* | –0.06 | 0.02 | –2.58* |
| | PP Remnant × Incompatible PP | –0.88 | 11.58 | –0.08 | 0.13 | 0.12 | 1.11 |
| | PP Remnant × No Matrix PP | 27.65 | 11.60 | 2.38* | 0.25 | 0.12 | 2.16* |
| Spill over | (Intercept) | 572.57 | 40.87 | 14.01* | –3.61 | 0.29 | –12.33* |
| | PP Remnant | 6.29 | 7.81 | 0.81 | 0.10 | 0.24 | 0.43 |
| | Incompatible PP | 0.38 | 11.03 | 0.03 | 0.29 | 0.23 | 1.25 |
| | No Matrix PP | 12.53 | 11.03 | 1.14 | –0.03 | 0.17 | –0.17 |
| | PP Remnant × Incompatible PP | 2.07 | 11.07 | 0.19 | 0.33 | 0.24 | 1.36 |
| | PP Remnant × No Matrix PP | 9.69 | 11.05 | 0.88 | 0.01 | 0.23 | 0.03 |

Table 6

Experiment 1C: Eye tracking. Linear mixed effects regression models for the remnant and spill over regions for first fixation durations, first pass times, go past times, and percentage of regressions out.

| Region | Parameter | Regressions in: Estimate | Regressions in: Std. Error | Regressions in: Wald Z | Total times: Estimate | Total times: Std. Error | Total times: t-value |
|---|---|---|---|---|---|---|---|
| Subject | (Intercept) | –1.54 | 0.18 | –8.69* | 1308.8 | 62.71 | 20.87* |
| | PP Remnant | –0.01 | 0.07 | –0.14 | 17.44 | 11.89 | 1.47 |
| | Incompatible PP | 0.15 | 0.10 | 1.49 | 26.15 | 16.82 | 1.55 |
| | No Matrix PP | –0.34 | 0.11 | –3.12* | –27.45 | 16.79 | –1.64 |
| | PP Remnant × Incompatible PP | –0.09 | 0.10 | –0.82 | 3.35 | 16.84 | 0.20 |
| | PP Remnant × No Matrix PP | 0.01 | 0.11 | 0.11 | 3.19 | 16.80 | 0.19 |
| PP Region | (Intercept) | –2.48 | 0.62 | –4.00* | 98.32 | 42.97 | 2.29* |
| | PP Remnant | –0.10 | 0.14 | –0.72 | –5.80 | 8.58 | –0.68 |
| | Incompatible PP | –0.19 | 0.14 | –1.37 | 7.47 | 8.69 | 0.86 |
| | Length | –0.02 | 0.04 | –0.59 | 28.21 | 2.50 | 11.28* |
| | PP Remnant × Incompatible PP | 0.18 | 0.14 | 1.34 | 4.04 | 8.57 | 0.47 |
| Much less | (Intercept) | –1.89 | 0.17 | –11.4* | 374.67 | 13.54 | 27.68* |
| | PP Remnant | 0.06 | 0.11 | 0.52 | 1.20 | 4.86 | 0.25 |
| | Incompatible PP | 0.14 | 0.11 | 1.21 | –11.02 | 6.88 | –1.60 |
| | No Matrix PP | 0.02 | 0.08 | 0.25 | 14.47 | 6.87 | 2.11* |
| | PP Remnant × Incompatible PP | 0.02 | 0.11 | 0.15 | 1.43 | 6.88 | 0.21 |
| | PP Remnant × No Matrix PP | 0.28 | 0.11 | 2.50* | 9.95 | 6.88 | 1.45 |
| Remnant | (Intercept) | –3.09 | 0.21 | –14.82* | 116.00 | 46.98 | 2.47* |
| | PP Remnant | 0.10 | 0.13 | 0.77 | –24.73 | 7.41 | –3.34* |
| | Incompatible PP | 0.22 | 0.18 | 1.24 | 17.21 | 10.43 | 1.65 |
| | No Matrix PP | –0.12 | 0.20 | –0.60 | 6.43 | 10.43 | 0.62 |
| | Length | — | — | — | 26.98 | 2.42 | 11.14* |
| | PP Remnant × Incompatible PP | 0.13 | 0.18 | 0.75 | –8.75 | 10.44 | –0.84 |
| | PP Remnant × No Matrix PP | 0.15 | 0.19 | 0.79 | 23.38 | 10.44 | 2.24* |
| Spill over | (Intercept) | –1.10 | 0.23 | –4.85* | 664.8 | 47.28 | 14.06* |
| | PP Remnant | 0.06 | 0.07 | 0.79 | 13.37 | 7.91 | 1.69 |
| | Incompatible PP | –0.09 | 0.10 | –0.88 | –5.93 | 11.19 | –0.53 |
| | No Matrix PP | 0.06 | 0.10 | 0.59 | 12.03 | 11.18 | 1.08 |
| | PP Remnant × Incompatible PP | 0.19 | 0.10 | 1.92+ | 2.20 | 11.21 | 0.20 |
| | PP Remnant × No Matrix PP | –0.06 | 0.10 | –0.62 | 12.33 | 11.19 | 1.10 |
| Final | (Intercept) | — | — | — | 612.5 | 35.75 | 17.13* |
| | PP Remnant | — | — | — | 0.63 | 7.47 | 0.08 |
| | Incompatible PP | — | — | — | –12.27 | 10.55 | –1.16 |
| | No Matrix PP | — | — | — | 12.50 | 10.55 | 1.18 |
| | PP Remnant × Incompatible PP | — | — | — | 12.54 | 10.57 | 1.19 |
| | PP Remnant × No Matrix PP | — | — | — | –17.11 | 10.56 | –1.62 |

Table 7

Experiment 1C: Eye tracking. Linear mixed effect regression models for regressions in and total times.
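Because the regressions-in models are logistic, their estimates are on the log-odds scale. The sketch below converts the intercepts from Table 7 into percentages with the inverse logit function. It is offered only as a reading aid: with deviation coding and random effects in the model, the back-transformed intercept approximates, but need not equal, the average of the raw cell percentages in Table 5. The names are hypothetical.

```python
import math

def inv_logit(log_odds):
    """Convert a log-odds value into a probability."""
    return 1 / (1 + math.exp(-log_odds))

# Intercepts of the regressions-in models, copied from Table 7 (log-odds scale)
intercepts = {
    "Subject": -1.54,
    "PP Region": -2.48,
    "Much less": -1.89,
    "Remnant": -3.09,
    "Spill over": -1.10,
}

# Back-transformed intercepts, expressed as percentages
percentages = {region: 100 * inv_logit(b) for region, b in intercepts.items()}
```

For instance, the Remnant intercept of –3.09 corresponds to a regression-in rate of a little over 4%, in line with the low rates observed for that region.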

#### 2.3.3.1 PP advantage

Compatible with the results from the norming studies, an advantage for PP remnants over VP remnants appeared on the remnant region for first pass (a 57 ms advantage), go past (a 64 ms advantage), and total times (also a 64 ms advantage); see the top left panel of Figure 2. We propose that PP remnants were faster in these measures because the matrix PP presents a highly salient constituent to contrast with the remnant, as implied by the completion norming study. Given the early, and persistent, advantage for PP remnants, the processor might have initially anticipated a PP remnant upon encountering the much less coordinator. We take the PP advantage as the statistical reference level against which to compare the case of Sprouting and Incompatible PPs.

Figure 2

Experiment 1C. Top left panel: PP Remnant advantage on the remnant region for first pass, go past, and total time measures. Remaining panels: Interaction between Remnant type and Matrix clause structure showing that the advantage for PP remnants is reduced or overturned in No Matrix PP conditions.

#### 2.3.3.2 PP Sprouting penalty

Crucially, the predicted exception to the advantage for PP remnants occurred when there was no PP in the matrix clause for the remnant to contrast with, i.e., just in the case of PP Sprouting, shown in Figure 2. The advantage for PP remnants was either eliminated (as in go past and total times) or else reversed (manifesting in increased regressions out of a region) when a PP remnant had to be paired with a correlate sprouted from the matrix clause. In go past times on the remnant region, there was a 119 ms advantage for Compatible PP remnants and a 65 ms advantage for Incompatible PP remnants, but the advantage shrank to just 8 ms in No Matrix PP conditions (Table 4), t = 2.38. Similarly, in total times on the remnant region, a 94 ms advantage for Compatible PP remnants and an 80 ms advantage for Incompatible PP remnants was reduced to an 18 ms advantage for PP remnants over VP remnants in the No Matrix PP condition (Table 5), t = 2.24. In keeping with the predicted penalty for Sprouting, the percentage of regressions out of the remnant was modulated by the presence of a matrix PP: whereas regressions out were 8 percentage points more frequent for VP remnants than for PP remnants in Compatible PP conditions, they were 6 percentage points more frequent for PP remnants than for VP remnants in the No Matrix PP condition, z = 2.16, p < 0.05.

#### 2.3.3.3 Incompatible PP penalty

A model with the first and second halves of the experiment included as an interactive predictor provided a better fit of the go past data on the remnant than other models computed. In this model, the interaction between Remnant type and the Incompatible PP condition was significantly reduced in the second half (13 ms penalty for Incompatible PP remnants over VP remnants) compared to the first half of the experiment (a 73 ms penalty for Incompatible conditions), t = –2.66, suggesting that participants might have either begun broadening the contextual dimensions along which the contrasts were to be compared, or else adopted a different reading strategy, allowing them to progress through the sentences more quickly. There were no other indications of processing costs associated with Incompatible PPs.

#### 2.3.4 Discussion

The results indicate three effects of primary interest. First, the processor appeared to encounter less processing difficulty on PP remnants than on VP remnants, suggesting an overall preference to form a contrast with the immediately preceding PP rather than the matrix VP that contained it. This interpretation is supported by the results of the completion norming study, in which PP contrasts were in general supplied at a much greater rate than any other contrast. Second, the PP advantage failed to hold in the case of PP Sprouting, where PP remnants elicited a processing cost in multiple measures (go past, total times, and regressions out) immediately on the remnant region itself. As predicted by the Parallel Contrast Principle, parallelism between the matrix and ellipsis clause appeared to facilitate processing, even though the conditions with PPs in the first clause required the processor to form an ad hoc scale between the remnant and its correlate.

Third, PP remnants that formed a semantically incompatible relation with their PP correlate were penalized in go past times on the remnant, but only in the first half of the experiment. This pattern suggests that participants were initially sensitive to the meaning incongruence of PPs relating to different aspects of the situation. Later in the study, they appeared to adopt a reading strategy of ignoring this minor mismatch, most likely due to repeated exposure to examples as the experiment wore on. This is an important conceptual control, as it confirms that subjects were attuned to the implied relationships between correlate-remnant pairs, rather than simply finding a correlate for the remnant supplied without consideration of its meaning. If readers were taking the Incompatible PPs to be part of the antecedent clause that would need to be copied, but not contrasting directly with the remnant PPs, then this condition should have patterned with the No Matrix PP condition: both would involve no contrasting correlate at all for the remnant PP. The slight dispreference instead suggests that participants did consider the PPs in different clauses to be potential correlates, albeit not as parallel as they might have liked, and that they adjusted over the course of the experiment to the mismatch in semantic content.

However, as all of the sentences were presented without context, it’s possible that the penalty attributed to PP sprouting is due instead to the introduction of a new referential DP contained within the PP remnant. This possibility is addressed in the following eye tracking experiment by adding preceding contexts that either mentioned the PP, thus making the remnant more accessible in the discourse, or did not. If violating parallelism is the central reason behind the penalty for PP sprouting in the studies above, then sprouting should continue to be costly, regardless of context.

## 3 Experiment 2: PP sprouting in context

### 3.1 Experiment 2A: Completion study with context

#### 3.1.1 Materials and method

This norming experiment had the same task as Experiment 1A, except that the target sentence fragments were preceded by contexts that either supported a PP remnant (24a) or were neutral (24b). As before, we manipulated whether there was a PP in the matrix clause (outside the state) that could serve as a possible correlate.

 (24) a. Supporting context: It surprised his friends that Oliver was about to take a long vacation outside the country. b. Neutral context: It surprised his friends that Oliver was showing a new interest in traveling. Target sentence: He hadn’t traveled (outside the state), much less _________.

#### 3.1.2 Participants

Forty-two participants completed a completion norming study for course credit over the Internet. Five participants identified as non-native speakers of English and were removed from analysis, as were six participants who answered highly predictable catch items incorrectly. Three more participants were removed for counterbalancing purposes, leaving 28 participants in the final dataset, who contributed 560 completions in total.

#### 3.1.3 Results and discussion

Twenty-two ambiguous, unclear, or nonsensical completions were removed. In the remaining completions, sprouting of any category appeared in 12% of cases, of which 60% were PP remnants. When the target contained a PP correlate in the matrix clause, there was only one instance of sprouting of any kind. However, when the target did not contain a PP correlate, there were more PP sprouting completions, regardless of context (15% vs. 0% with a PP correlate). There was also an interaction, so that when there was no PP correlate, supporting contexts produced nearly twice the number of PP sprouting completions as neutral contexts (19% vs. 11%). The completion study indicates that contexts meant to induce or facilitate PP sprouting indeed resulted in more PP sprouting, even though sprouting was still avoided in general.

### 3.2 Experiment 2B: PP sprouting in context

#### 3.2.1 Materials and methods

Twenty quartets like (25) were created in a design that crossed Context (Supporting, Neutral) and PP contrast (Matrix PP, No Matrix PP). In all items, the target sentences were derived from sentences in the first eye tracking experiment. A complete list of contexts and items appears in Appendix B. There were no incompatible PPs in this study, so all correlate PPs were on the same semantic scale as the remnants, and there were no VP remnants. The context conditions were minimally different from each other, varying mostly in whether the remnant in the target sentence was overtly mentioned: it appeared in the Supporting contexts but not the Neutral ones.⁶ Materials were presented on two lines, so that the target sentence always appeared on its own line. Regions used in analysis of the target sentence are demarcated with a slash (/) symbol.

 (25) a. Supporting context: It surprised his friends that Oliver was about to take a long vacation outside the country. b. Neutral context: It surprised his friends that Oliver was showing a new interest in traveling. Target sentence: He hadn’t traveled / (outside the state), / much less / outside the country, / until he met / his wife.

Items were interspersed with 66 items from unrelated experiments, and 20 non-experimental fillers, for a total of 106 items per experimental session. Comprehension questions like (26) appeared after approximately half of the materials.

 (26) Who was surprised at Oliver’s interest in travel? a. His friends b. His family

#### 3.2.2 Participants

Fifty-six subjects participated in the experiment, using the same recruitment and exclusion criteria described in the previous eye tracking experiment (Experiment 1C).

#### 3.2.3 Results

The data cleaning and analysis procedures from Experiment 1C were used in the present experiment, except that conditions were sum coded so that the Supporting context and the Matrix PP conditions were treated as the statistical reference levels. Means and standard errors are reported in Tables 8 and 9. The effects are presented in terms of three theoretically significant effects: the effects of context, PP Sprouting, and their interaction. Reading behavior on the context sentence was not examined. As before, only the remnant and the immediately following spill over region are reported for first fixation durations, first pass times, go past times, and regressions out; see Table 8 for means and Table 10 for statistical models. All regions are reported for measures involving re-reading of a region, that is, regressions in, second pass, and total times; see Table 9 for means and Tables 11–12 for statistical models. All significant effects are reported.
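The coding of the 2 × 2 design can be sketched as follows. The numeric values are an assumption made for illustration: the text states only that the factors were sum coded with Supporting context and Matrix PP as reference levels, so the choice of –1 for the reference level and +1 for the other level, like all names below, is hypothetical.

```python
from itertools import product

# Assumed sum coding: reference level = -1, other level = +1 (hypothetical values)
CONTEXT = {"Supporting": -1, "Neutral": 1}
MATRIX = {"Matrix PP": -1, "No Matrix PP": 1}

def design_row(context, matrix):
    """Predictor values for one condition; the interaction is the product of the main effects."""
    c, m = CONTEXT[context], MATRIX[matrix]
    return {"Neutral": c, "No Matrix PP": m, "Neutral x No Matrix PP": c * m}

# One row of predictors for each of the four conditions
design = {(c, m): design_row(c, m) for c, m in product(CONTEXT, MATRIX)}
```

With a balanced coding of this kind, each predictor sums to zero across the four conditions, so the intercept estimates the grand mean and each main effect is evaluated at the average of the other factor.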

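| Context | Matrix PP | First fixation: Remnant | First fixation: Spill over | First pass: Remnant | First pass: Spill over |
|---|---|---|---|---|---|
| Supporting | Matrix PP | 216 (5) | 228 (5) | 340 (14) | 465 (21) |
| Supporting | No Matrix PP | 217 (4) | 228 (5) | 352 (14) | 433 (18) |
| Neutral | Matrix PP | 222 (5) | 229 (5) | 346 (13) | 438 (17) |
| Neutral | No Matrix PP | 218 (5) | 231 (5) | 381 (15) | 475 (19) |

| Context | Matrix PP | Go past: Remnant | Go past: Spill over | Regressions out: Remnant | Regressions out: Spill over |
|---|---|---|---|---|---|
| Supporting | Matrix PP | 415 (21) | 531 (31) | 12% (3) | 6% (2) |
| Supporting | No Matrix PP | 433 (22) | 480 (26) | 16% (3) | 5% (2) |
| Neutral | Matrix PP | 433 (20) | 506 (31) | 15% (3) | 3% (1) |
| Neutral | No Matrix PP | 487 (25) | 563 (33) | 18% (3) | 8% (2) |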

Table 8

Experiment 2B: Eye tracking in context. Means and standard errors for first fixation durations, first pass times, go past times, regressions out.

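Regressions in

| Context | Matrix PP | Subject | PP region | Much less | Remnant | Spill over | Final |
|---|---|---|---|---|---|---|---|
| Supporting | Matrix PP | 13% (3) | 11% (3) | 15% (3) | 1% (1) | 34% (4) | |
| Supporting | No Matrix PP | 11% (3) | | 19% (4) | 6% (2) | 34% (4) | |
| Neutral | Matrix PP | 18% (3) | 8% (2) | 17% (3) | 2% (1) | 32% (4) | |
| Neutral | No Matrix PP | 13% (3) | | 20% (4) | 5% (2) | 29% (4) | |

Second pass times

| Context | Matrix PP | Subject | PP region | Much less | Remnant | Spill over | Final |
|---|---|---|---|---|---|---|---|
| Supporting | Matrix PP | 110 (31) | 50 (12) | 41 (11) | 17 (6) | 124 (19) | |
| Supporting | No Matrix PP | 96 (27) | | 62 (15) | 53 (14) | 126 (17) | |
| Neutral | Matrix PP | 118 (22) | 41 (10) | 52 (10) | 35 (10) | 107 (17) | |
| Neutral | No Matrix PP | 123 (27) | | 77 (15) | 49 (13) | 110 (20) | |

Total times

| Context | Matrix PP | Subject | PP region | Much less | Remnant | Spill over | Final |
|---|---|---|---|---|---|---|---|
| Supporting | Matrix PP | 1149 (47) | 456 (21) | 368 (17) | 385 (19) | 607 (32) | 498 (26) |
| Supporting | No Matrix PP | 1095 (45) | | 351 (20) | 439 (22) | 567 (28) | 484 (31) |
| Neutral | Matrix PP | 1102 (51) | 459 (21) | 353 (15) | 431 (21) | 544 (25) | 488 (24) |
| Neutral | No Matrix PP | 1106 (48) | | 371 (22) | 496 (31) | 634 (36) | 541 (34) |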

Table 9

Experiment 2B: Eye tracking in context. Means and standard errors for regressions in, second pass times, total times.

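| Region | Parameter | First fixation durations: Estimate | First fixation durations: Std. Error | First fixation durations: t-value | First pass times: Estimate | First pass times: Std. Error | First pass times: t-value |
|---|---|---|---|---|---|---|---|
| Remnant | (Intercept) | 218.02 | 3.52 | 62.01* | 349.77 | 19.19 | 18.23* |
| | Neutral | –2.41 | 2.25 | –1.07 | –10.53 | 5.53 | –1.91+ |
| | No Matrix PP | –0.35 | 2.24 | –0.16 | 11.61 | 5.52 | 2.10* |
| | Neutral × No Matrix PP | 1.78 | 2.27 | 0.78 | –1.28 | 5.60 | –0.23 |
| Spill over | (Intercept) | 228.98 | 4.14 | 55.33* | 433.06 | 26.88 | 16.11* |
| | Neutral | –1.56 | 2.22 | –0.70 | –2.90 | 7.10 | –0.41 |
| | No Matrix PP | 1.17 | 2.23 | 0.53 | 5.36 | 7.15 | 0.75 |
| | Neutral × No Matrix PP | 0.18 | 2.25 | 0.08+ | –11.81 | 7.19 | –1.64 |

| Region | Parameter | Go past: Estimate | Go past: Std. Error | Go past: t-value | Regressions out: Estimate | Regressions out: Std. Error | Regressions out: Wald Z |
|---|---|---|---|---|---|---|---|
| Remnant | (Intercept) | 438.32 | 21.46 | 20.43* | –2.02 | 0.22 | –9.04* |
| | Neutral | 18.22 | 10.43 | 1.75+ | 0.10 | 0.14 | 0.75 |
| | No Matrix PP | 15.85 | 10.41 | 1.52 | 0.12 | 0.14 | 0.90 |
| | Neutral × No Matrix PP | 7.35 | 10.50 | 0.70 | –0.02 | 0.14 | –0.13 |
| Spill over | (Intercept) | 505.45 | 33.58 | 15.05* | –3.81 | 0.55 | –6.89* |
| | Neutral | 12.13 | 13.55 | 0.90 | –0.10 | 0.24 | –0.42 |
| | No Matrix PP | 5.49 | 13.63 | 0.40 | 0.27 | 0.24 | 1.11 |
| | Neutral × No Matrix PP | 21.54 | 13.68 | 1.57 | 0.35 | 0.24 | 1.46 |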

Table 10

Experiment 2B: Eye tracking in context. Linear mixed effects models for first fixation durations, first pass times, go past, and regressions out.

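| Region | Parameter | Regressions in: Estimate | Regressions in: Std. Error | Regressions in: Wald Z | Total times: Estimate | Total times: Std. Error | Total times: t-value |
|---|---|---|---|---|---|---|---|
| Subject | (Intercept) | –2.08 | 0.21 | –9.83* | 1116.33 | 69.34 | 16.10* |
| Subject | Neutral | 0.15 | 0.14 | 1.11 | 4.02 | 18.94 | 0.21 |
| Subject | No Matrix PP | –0.18 | 0.14 | –1.30 | 10.01 | 18.88 | 0.53 |
| Subject | Neutral × No Matrix PP | –0.07 | 0.14 | –0.48 | 9.02 | 19.09 | 0.47 |
| Matrix PP | (Intercept) | –2.24 | 0.22 | –10.42* | 450.21 | 23.86 | 18.87* |
| Matrix PP | Neutral | –0.14 | 0.22 | –0.63 | 3.12 | 14.05 | 0.22 |
| Much less | (Intercept) | –1.83 | 0.23 | –7.85* | 357.01 | 13.51 | 26.42 |
| Much less | Neutral | 0.06 | 0.13 | 0.44 | 3.30 | 8.96 | 0.37 |
| Much less | No Matrix PP | 0.15 | 0.13 | 1.11 | 3.47 | 8.93 | 0.39 |
| Much less | Neutral × No Matrix PP | –0.03 | 0.13 | –0.20 | 7.00 | 8.99 | 0.78 |
| Remnant | (Intercept) | –3.61 | 0.40 | –9.10* | 433.17 | 25.34 | 17.09* |
| Remnant | Neutral | 0.20 | 0.32 | 0.61 | 26.41 | 11.06 | 2.39* |
| Remnant | No Matrix PP | 0.74 | 0.32 | 2.31* | 29.41 | 11.04 | 2.66* |
| Remnant | Neutral × No Matrix PP | –0.3 | 0.32 | –0.92 | –1.00 | 11.13 | –0.09 |
| Spill over | (Intercept) | –0.81 | 0.15 | –5.41* | 571.77 | 39.06 | 14.64* |
| Spill over | Neutral | –0.09 | 0.10 | –0.86 | –2.91 | 12.74 | –0.23 |
| Spill over | No Matrix PP | –0.05 | 0.10 | –0.49 | 15.12 | 12.74 | 1.19 |
| Spill over | Neutral × No Matrix PP | –0.04 | 0.10 | –0.34 | 22.91 | 12.83 | 1.79+ |
| Final | (Intercept) | — | | | 492.66 | 36.16 | 13.62* |
| Final | Neutral | — | | | 14.97 | 12.09 | 1.24 |
| Final | No Matrix PP | — | | | 13.01 | 12.10 | 1.08 |
| Final | Neutral × No Matrix PP | — | | | 12.05 | 12.2 | 0.99 |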

Table 11

Experiment 2B: Eye tracking in context. Linear mixed effects regression models for regressions in and total times.
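The regressions-in estimates in Table 11 are reported with Wald Z statistics, which suggests they come from logistic models and are therefore on the log-odds scale. Under that assumption, which the text here does not state explicitly, an intercept can be converted into a probability with the inverse logit. The sketch below does this for two intercepts copied from Table 11; the function and variable names are illustrative.

```python
import math

def inv_logit(log_odds):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Regressions-in intercepts copied from Table 11 (assumed to be log-odds).
intercepts = {"Remnant": -3.61, "Spill over": -0.81}

# Implied probability of a regression in when all predictors are at zero.
probabilities = {region: inv_logit(b) for region, b in intercepts.items()}

for region, p in probabilities.items():
    print(f"{region}: {p:.3f}")
```

These values are model-based and depend on how the predictors were coded and on the random effects structure, so they need not equal the raw percentages in Table 9.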

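| Region | Effect | By-subjects F-value | By-subjects p-value | By-items F-value | By-items p-value |
|---|---|---|---|---|---|
| Subject | Context | 0.73 | 0.40 | 0.44 | 0.52 |
| Subject | Matrix PP | 0.22 | 0.64 | 0.00 | 0.98 |
| Subject | Context × Matrix PP | 0.00 | 0.96 | 0.02 | 0.88 |
| Matrix PP | Context | 0.17 | 0.68 | 0.18 | 0.67 |
| Much less | Context | 0.05 | 0.83 | 1.16 | 0.30 |
| Much less | Matrix PP | 2.21 | 0.14 | 2.75 | 0.11 |
| Much less | Context × Matrix PP | 1.35 | 0.25 | 0.06 | 0.80 |
| Remnant | Context | 0.03 | 0.86 | 0.28 | 0.61 |
| Remnant | Matrix PP | 4.49 | <0.05 | 4.35 | <0.05 |
| Remnant | Context × Matrix PP | 2.54 | 0.12 | 0.93 | 0.35 |
| Spill over | Context | 1.16 | 0.29 | 0.37 | 0.55 |
| Spill over | Matrix PP | 0.15 | 0.70 | 0.22 | 0.65 |
| Spill over | Context × Matrix PP | 0.09 | 0.77 | 0.14 | 0.71 |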

Table 12

Experiment 2B: Eye tracking in context. By-subjects and by-items ANOVAs for second pass times.

#### 3.2.3.1 Context effects

There was a marginal 18 ms penalty for Neutral contexts in first pass times on the remnant, t = 1.83, p = 0.06. First pass times are sometimes split so that trials in which the reader elected to regress back in the text are analyzed separately from those in which she elected to move forward, as the two may represent distinct processing strategies when progressing through text (Altmann, Garnham & Dennis 1992; Rayner & Sereno 1994). Upon reaching a difficult portion of text, the reader may decide to return to previous regions, perhaps to resolve an ambiguity or to stall for more time (e.g., Mitchell et al. 2008). An alternative strategy in such cases is to continue to move forward, perhaps in hopes of finding useful information later in the sentence. Once trials with first pass regressions out of the remnant region were eliminated (leaving 34% of the observations from the total first pass data), there was a 44 ms cost for Neutral contexts, t = 2.69, which was moderated by an interaction, described in Section 3.2.3.3 below. A 51 ms penalty for Neutral contexts in total times, t = 2.39, was also observed. The cost for Neutral contexts indicates that subjects were attuned to the preceding sentence at the point of the remnant and that the Supporting contexts did successfully support the target sentences; see the left panel of Figure 3.
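The regression-contingent split described above can be sketched as follows. This is a minimal illustration rather than the analysis script used in the study: the trial records, the field names (`context`, `first_pass`, `regressed_out`), and all values are hypothetical.

```python
from statistics import mean

# Hypothetical trial records for the remnant region (illustrative values only,
# not data from the experiment).
trials = [
    {"context": "Supporting", "first_pass": 330, "regressed_out": False},
    {"context": "Supporting", "first_pass": 350, "regressed_out": True},
    {"context": "Neutral", "first_pass": 390, "regressed_out": False},
    {"context": "Neutral", "first_pass": 370, "regressed_out": False},
    {"context": "Neutral", "first_pass": 340, "regressed_out": True},
]

def split_by_regression(records):
    """Separate trials in which the reader moved forward after first pass
    from trials in which first pass ended with a regression out."""
    forward = [r for r in records if not r["regressed_out"]]
    regressed = [r for r in records if r["regressed_out"]]
    return forward, regressed

def mean_first_pass(records, context):
    """Mean first pass time (ms) for one context; None if no trials remain."""
    times = [r["first_pass"] for r in records if r["context"] == context]
    return mean(times) if times else None

forward, regressed = split_by_regression(trials)

# Context cost computed only over trials with forward progression.
neutral_cost = mean_first_pass(forward, "Neutral") - mean_first_pass(forward, "Supporting")
print(neutral_cost)
```

Each subset can then be analyzed with the same model as the full first pass data, keeping in mind that the split reduces the number of observations per condition.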

Figure 3

Experiment 2B. Left panel: Elongated reading times for Neutral compared to Supporting contexts in first pass and total time measures in the remnant region. Right panel: Reading time penalty for PP Sprouting conditions in first pass, second pass, and total time measures on the remnant.

#### 3.2.3.2 PP sprouting cost

As in the previous experiment, sprouting in the No Matrix PP conditions was found to be costly immediately on the remnant region, regardless of preceding context, as shown in the right panel of Figure 3. In first pass times, readers spent 24 ms longer in No Matrix PP conditions, t = 2.10, indicating an early cost for PP sprouting. However, the cost for sprouting manifested primarily in measures of re-reading. Readers spent nearly twice as long in second pass re-reading measures on the remnant in the No Matrix PP condition (M = 51, SE = 10) compared to the Matrix PP (M = 26, SE = 6) condition in by-subjects ANOVAs, but not in by-items analyses. Further, PP Sprouting (M = 464, SE = 19) conditions elicited longer total reading times on the remnant compared to Matrix PP (M = 413, SE = 14) conditions, t = 2.66. Finally, readers made more regressions into the remnant region in the No Matrix PP condition (M = 6%, SE = 1) compared to the Matrix PP (M = 2%, SE = 1) condition, z = 2.73. The results indicate that readers encountered immediate and sustained difficulty when they were presented with a PP remnant but no corresponding correlate, forcing them to sprout a PP.
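As a worked check, the sprouting costs reported above can be approximated by averaging the remnant cell means in Tables 8 and 9 over the two contexts. The sketch below assumes equal weighting of the two contexts (a simplification, since cell sizes are not given here); the function names are illustrative.

```python
# Cell means (ms) on the remnant region, copied from Tables 8 and 9.
first_pass = {
    ("Supporting", "Matrix PP"): 340,
    ("Supporting", "No Matrix PP"): 352,
    ("Neutral", "Matrix PP"): 346,
    ("Neutral", "No Matrix PP"): 381,
}
second_pass = {
    ("Supporting", "Matrix PP"): 17,
    ("Supporting", "No Matrix PP"): 53,
    ("Neutral", "Matrix PP"): 35,
    ("Neutral", "No Matrix PP"): 49,
}

def marginal_mean(cells, matrix_pp):
    """Unweighted mean over contexts for one level of the Matrix PP factor."""
    values = [v for (_, pp), v in cells.items() if pp == matrix_pp]
    return sum(values) / len(values)

def sprouting_cost(cells):
    """No Matrix PP minus Matrix PP, averaged over contexts."""
    return marginal_mean(cells, "No Matrix PP") - marginal_mean(cells, "Matrix PP")

first_pass_cost = sprouting_cost(first_pass)
second_pass_cost = sprouting_cost(second_pass)
print(first_pass_cost, second_pass_cost)
```

The first pass difference of 23.5 ms is consistent with the reported 24 ms cost after rounding, and the second pass marginal means (51 vs. 26 ms) match those given in the text.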

#### 3.2.3.3 PP sprouting in context

While not central to the main hypotheses, we expected that Supporting contexts would facilitate recovery from any difficulty due to sprouting a PP argument. No interactions were observed in first fixation or first pass times. However, first pass times in which regressions out were eliminated indeed showed an interaction $[\hat{\beta} = -37.26, SE = 9.41, t = -3.96, p < 0.001]$, in which Supporting contexts reduced reading times on the remnant in No Matrix PP conditions by 128 ms (p’s < 0.05 in by-subject and by-items t-tests), but had no effect on Matrix PP controls, as shown in the left panel in Figure 4. No interaction was observed for first pass times followed by regressions out of the remnant, where there was only a general facilitation for Supporting contexts $[\hat{\beta} = -20.41, SE = 10.04, t = -2.03, p < 0.05]$; see the right panel in Figure 4. The differential facilitation for PP Sprouting in Supporting contexts was therefore observed only when the reader made a forward progression through the sentence, a pattern which is perhaps related to how eager readers were to integrate the preceding context into the sentence on a particular trial.

Figure 4

Experiment 2B. Regression contingent analyses of first pass times on the remnant shown as centered z-scores. Left panel: First pass times in trials without regressions out of the remnant region showed a differential effect of context on No Matrix PP conditions, but no effect on Matrix PP conditions. Right panel: First pass times in trials with regressions out of the remnant region showed an advantage for Supporting contexts.

Other measures showed weak evidence in favor of a reduced PP sprouting penalty in Supporting contexts. Although the model was not the best fitting one, an interaction consistent with such a reduction was observed in go past times on the post-remnant spill over region once trial order was included as an interactive predictor; there was a 57 ms cost for the No Matrix PP condition in Neutral contexts, but a 51 ms advantage for PP Sprouting in Supporting contexts $[\hat{\beta} = 66.36, SE = 29.67, t = 2.24]$, along with a trend indicating that the interaction diminished over the course of the experiment, t = –1.69. Finally, there was a non-significant trend for an interaction between Sprouting and Context for total times in the spill over region: While there was a 90 ms cost for PP sprouting in Neutral contexts, there was a 40 ms benefit for PP sprouting in Supporting contexts, t = 1.79.
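The simple effects quoted above follow directly from the spill over cell means in Tables 8 and 9. The sketch below recomputes them, along with the raw difference of differences; the function names are illustrative, and the raw contrast is not the same quantity as the model coefficient, which depends on predictor coding and on the trial order term.

```python
# Cell means (ms) on the spill over region, copied from Tables 8 and 9.
go_past = {
    "Supporting": {"Matrix PP": 531, "No Matrix PP": 480},
    "Neutral": {"Matrix PP": 506, "No Matrix PP": 563},
}
total_time = {
    "Supporting": {"Matrix PP": 607, "No Matrix PP": 567},
    "Neutral": {"Matrix PP": 544, "No Matrix PP": 634},
}

def simple_effect(cells, context):
    """Sprouting cost within one context: No Matrix PP minus Matrix PP."""
    return cells[context]["No Matrix PP"] - cells[context]["Matrix PP"]

def interaction_contrast(cells):
    """Difference of differences: Neutral cost minus Supporting cost."""
    return simple_effect(cells, "Neutral") - simple_effect(cells, "Supporting")

for name, cells in [("go past", go_past), ("total time", total_time)]:
    print(name,
          simple_effect(cells, "Neutral"),
          simple_effect(cells, "Supporting"),
          interaction_contrast(cells))
```

A negative simple effect corresponds to an advantage for the sprouting condition in that context.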

#### 3.2.4 Discussion

The results confirm the findings from the previous eye tracking experiment: there was a reading time penalty for PP remnants that lacked an overt correlate, as predicted by the Parallel Contrast Principle. Although the cost for PP sprouting appeared in a variety of measures regardless of the context, it was reduced in first pass times in trials without regressions out from the region, and was marginally reduced in go past times and in total times on the spill over region. The overall pattern thus suggests that readers were sensitive to information from the context, but that context was not sufficient to completely override the cost of sprouting a correlate for the remnant. The results support the claim that the processor relies on parallelism to pair a remnant and a correlate, rather than abandoning the search for a correlate and attempting to compute an entailment relation between clauses whenever possible. The alternative hypothesis according to which the apparent sprouting penalty in Experiment 1C was solely driven by accommodating a new referent is not supported by these results.

## 4 General discussion

Two norming studies presented initial evidence that PP sprouting is dispreferred in focus-sensitive coordination. In the no-context completion study, participants very rarely supplied a PP as a remnant after the much less coordinator, unless there was a PP present in the preceding clause. This completion study also indicated that non-sprouting PP contrast in focus-sensitive coordination is quite acceptable, as sentences with a PP in the initial clause were completed with PPs in the remnant 80% of the time. In the rating study, conditions with sprouted PPs were rated lower than those with paired contrasting PPs or those with VP remnants.

In two eye tracking while reading studies, we found that PP sprouting also interferes with the on-line processing of focus-sensitive coordination sentences. In the first eye tracking study, a processing advantage for PP remnants over VP remnants was reversed in the case of PP sprouting. As long as a PP was present in the initial clause, even if the PPs evoked a different scalar dimension, the PP remnant was comparatively easy to process. Sentences in which the PP correlate and the PP remnant were incompatible showed later delays in processing only during the first half of the experiment, after which participants apparently habituated to the mismatching scalar dimensions. In a second eye tracking study, where supportive or neutral contexts preceded target sentences with or without a correlate PP, sprouting conditions still elicited slower reading on the remnant and increased re-reading compared to conditions with an overt correlate in the matrix clause. Even though supporting contexts facilitated recovery from the PP sprouting penalty, they did not eliminate the independent penalty for violating parallelism in sprouting.

These results all support the Parallel Contrast Principle over the Scalar Advantage Principle. Whether off-line or on-line, processing focus-sensitive coordination structures is facilitated when an overt PP correlate is present in the initial clause, even though the processor must also construct an ad hoc scale on which to compare the correlate and the remnant. This puts the processing of focus-sensitive coordination structures on a par with sluicing and other ellipsis structures in preferring parallelism between the remnant and correlate (Chung et al. 1995; Frazier & Clifton 1998; Carlson 2001, 2002). These results further dovetail with the research on zero-adjective contrasts in focus-sensitive coordination (Carlson & Harris 2018), in which DP remnants with adjectives elicited processing costs when the correlate DP did not contain a contrasting adjective (e.g., The chef didn’t overcook a (complex) meal, let alone an easy one).

These results relate to an important debate about the status of parallelism in sentence processing, and whether it is specific to ellipsis or conjunction or both. Prior research has routinely found that parallelism of different types eases and speeds up the processing of different types of ellipsis, including sluicing (Frazier & Clifton 1998; Carlson et al. 2009), gapping (Carlson 2001, 2002), and stripping/bare argument ellipsis (Paterson et al. 2007; Stolterfoht et al. 2007; Carlson 2013). Additional research has observed parallelism effects in conjoined structures (Frazier et al. 1984; 2000; Henstra 1996; Sturt et al. 2010). But some studies have found that parallelism facilitates processing in unconjoined and unelided structures, as well (e.g., Sturt et al. 2010; Dickey & Bunger 2011). It seems telling to us that so many ellipsis structures, which demand reuse or copying of earlier material, should turn out to be sensitive to additional similarities in the syntactic, semantic, and prosodic features of the clauses.

Focus-sensitive coordination structures, headed by conjunctions like much less and let alone, are an especially important addition to research on parallelism in sentence processing. They provide a case in which violating parallelism by sprouting could have conferred a processing advantage by removing the need to construct an ad hoc scale between the correlate and the remnant, a process that could arguably require additional resources to compute or be delayed in comprehension (Levinson 2000; Chierchia 2004; but see also Hirschberg 1985; Sperber & Wilson 1995; Katsos & Cummins 2012). Nevertheless, the studies above found that avoiding ad hoc scales does not ease processing, at least if it comes at the cost of violating parallelism between clauses.7

The construction of a scalar relation between clauses in focus-sensitive coordination is still important to the processing of these constructions, as indicated by the cost for Incompatible PP remnants observed in the first eye tracking experiment. However, we suggest that the computation of such scales is simply delayed until after a basic clausal meaning has been constructed. The importance of parallelism follows from the conceptual steps articulated in (10) above, in which the processor must locate an appropriate correlate for the remnant before an appropriate scale can be inferred. Indeed, a promising avenue of research in this vein would be to explore whether some scales are more readily accessible than others, and if so, whether they facilitate comprehension of coordination structures that require such scales during interpretation. For now, we believe that there is strong evidence that the processor prioritizes basic structure building processes when processing sentences with ellipsis, and that parallelism, which helps the processor build structure at the ellipsis site, is a particularly powerful component in recovering the intended meaning and structure, where there was once only silence.

## Appendix

Appendices A and B contain the full set of experimental items for Experiments 1–2. DOI: https://doi.org/10.5334/gjgl.707.s1

## Notes

1The type of parallelism studied in this project involves similarities between remnants and correlates, and is not the type of parallelism that relates to the syntactic and/or semantic identity condition allowing ellipsis (e.g., Merchant 2001, 2008; Takahashi & Fox 2005; Griffiths & Lipták 2014). We do not intend to enter into the debate about the presence and size of structure within an ellipsis site. See section 1.2 for more general discussion of parallelism.

2While most varieties of English license focus-sensitive coordination in the presence of explicit or implicit negation, some dialects permit a positive variant, e.g., I can swim, let alone float. This variant tends to either reverse the scalar relationship (see Mark Liberman’s commentary on Language Log, November 21, 2007, accessible as http://itre.cis.upenn.edu/~myl/languagelog/archives/005142.html and comments in Toosarvandani 2010), or else abandon the scalar component altogether, similar to an afterthought along the lines of not to mention (Cappelle, Dugas & Tobin 2015). We concentrate exclusively on the majority dialect here, in which there is a strong scalar component to its interpretation.

3Although it is conceptually possible that the processor forgoes retrieving the correlate in step 2, and simply posits a parallel structure at the ellipsis site, we think this is unlikely given evidence for similarity-based interference, characteristic of retrieval systems, from non-correlate distractor nouns in sluicing (Harris 2015, 2019).

4A reviewer suggests re-describing the entailment relation in terms of a set inclusion between properties denoted by the verb phrases. We continue to follow previous research (Fillmore et al. 1988; Toosarvandani 2010) in describing the scale in terms of entailment between propositions for several reasons. First, propositions standardly denote sets of worlds, in which entailment between propositions p ⊨ q can be equivalently stated in terms of inclusion between the set of worlds that make p true and the set of worlds that make q true. Second, and more importantly, there are instances of focus-sensitive coordination in which subjects contrast (John didn’t laugh, let alone Mary), which could not be captured by set inclusion of the verb phrases. Still, the use of entailment is intended to be descriptive, and other, perhaps more general, semantic characterizations could be explored in the future.
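The equivalence invoked in this note can be written out as follows. The symbol $W$ for the set of possible worlds is introduced here for illustration only and is not part of the original notation.

```latex
\[
  p \models q
  \iff
  \{\, w \in W : p \text{ is true in } w \,\}
  \subseteq
  \{\, w \in W : q \text{ is true in } w \,\}
\]
```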

5Half of the items contained an additional DP in the matrix clause, e.g., books in Melinda doesn’t read books (for pleasure/at work), which provided a suitable correlate for DP remnants. As shown in Table 1, participants sometimes provided DP completions, but these do not constitute DP sprouting when they contrast with a DP correlate in the matrix clause.

6Three of the contexts differed from the others. For items 1–3, the Supporting contexts included both the correlate and remnant PPs instead of only the remnant PP. For item 3, the Neutral context also included the remnant PP. The rest of the contexts matched the description above.

7In determining whether SAP is independently motivated, a reviewer raised an interesting contrast between a structure with an ad hoc scale (i.a), in which studying chemistry is harder or less expected than studying carpentry, and a structure conveying entailment between clauses (i.b), as in not studying anything entails not studying chemistry. Parallelism, in our sense, is satisfied in both sentences. It is possible that with parallelism held constant, a preference for SAP would emerge. Our intuitions are that the ease of interpretation depends on whether the context licenses the antecedent clause in (i.b), which sounds, to our ears, odd without additional context, such as The students were exhausted during finals week.

 (i) a. Michael couldn’t study carpentry, much less chemistry.
     b. Michael couldn’t study anything, much less chemistry.

We found similar cases in COCA, with any+N as a correlate for more specific DP remnants, especially with adjectival contrasts (see discussion in Carlson & Harris 2018); the examples in (ii) illustrate.

 (ii) a. “I couldn’t picture my Grandma as someone responsible for the death of anything, much less her best friend at the age of 16.”
      b. “But the idea that Susan owed anything to anyone – much less her cousin’s new husband – was intolerable.”

At any rate, the motivation for SAP remains conceptual, and we believe the finding that a conceptually plausible benefit for computing ready-made relations does not outweigh the general preference for parallel structures reveals the strength of parallelism biases during sentence processing.

## Abbreviations

AIC = Akaike information criterion, cm = centimeters, COCA = Corpus of Contemporary American English, CP = complementizer phrase, diff = difference, FocP = focus phrase, Hz = Hertz, LCD = liquid crystal display, LF = Logical Form, M = mean, ms = milliseconds, NP = noun phrase, PCP = Parallel Contrast Principle, PP = prepositional phrase, SAP = Scalar Advantage Principle, SE = standard error, UCLA = University of California Los Angeles, VP = verb phrase

## Acknowledgements

The authors would like to thank Jack Atherton, Jenny Chim, Aura Heredia Cruz, Reuben Garcia, Angela Howard, Samantha Jew, Lexi Loessberg-Zahl, Shayna Lurya, Caitlyn Wong Pickard, Ian Rigby, and Karina Ruiz for assistance running the eye tracking experiments. Portions of this research have been presented at a UC San Diego colloquium, a UMass Psycholinguistics Workshop, and the 29th CUNY Human Sentence Processing Conference; we thank the audiences for the comments and questions, especially Chuck Clifton for suggesting Experiment 2.

## Funding Information

The research reported in this publication was partially supported by the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health under grant number R15HD072713 and an Institutional Development Award from the National Institute of General Medical Sciences of the National Institutes of Health under grant number 5P20GM103436-13. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

## Competing Interests

The authors have no competing interests to declare.

## References

1. Akaike, Hirotugu. 1974. A new look at the statistical model identification. IEEE Transactions on Automatic Control 19. 716–723. DOI: https://doi.org/10.1109/TAC.1974.1100705

2. Altmann, Gerry T. M., Alan Garnham & Yvette Dennis. 1992. Avoiding the garden path: Eye movements in context. Journal of Memory and Language 31. 685–712. DOI: https://doi.org/10.1016/0749-596X(92)90035-V

3. Cappelle, Bert, Edwige Dugas & Vera Tobin. 2015. An afterthought on let alone. Journal of Pragmatics 80. 70–85. DOI: https://doi.org/10.1016/j.pragma.2015.02.005

4. Carlson, Greg N. & Michael K. Tanenhaus. 1988. Thematic roles and language comprehension. Syntax and Semantics 21. 263–288.

5. Carlson, Katy. 2001. The effects of parallelism and prosody on the processing of gapping structures. Language and Speech 44. 1–26. DOI: https://doi.org/10.1177/00238309010440010101

6. Carlson, Katy. 2002. Parallelism and prosody in the processing of ellipsis sentences. New York: Routledge.

7. Carlson, Katy. 2013. The role of only in contrasts in and out of context. Discourse Processes 50. 249–275. DOI: https://doi.org/10.1080/0163853X.2013.778167

8. Carlson, Katy & Jesse A. Harris. 2018. Zero-Adjective contrast in much-less ellipsis: The advantage for parallel syntax. Language, Cognition and Neuroscience 33. 77–97. DOI: https://doi.org/10.1080/23273798.2017.1366530

9. Carlson, Katy, Lyn Frazier & Charles Clifton, Jr. 2009. How prosody constrains comprehension: A limited effect of prosodic packaging. Lingua 119. 1066–1082. DOI: https://doi.org/10.1016/j.lingua.2008.11.003

10. Carlson, Katy, Michael Walsh Dickey & Christopher Kennedy. 2005. Structural economy in the processing and representation of gapping sentences. Syntax 8. 208–228. DOI: https://doi.org/10.1111/j.1467-9612.2005.00079.x

11. Carlson, Katy, Michael Walsh Dickey, Lyn Frazier & Charles Clifton, Jr. 2009. Information structure expectations in sentence comprehension. The Quarterly Journal of Experimental Psychology 62. 114–139. DOI: https://doi.org/10.1080/17470210701880171

12. Chierchia, Gennaro. 2004. Scalar implicatures, polarity phenomena, and the syntax/pragmatics interface. In Adriana Belletti (ed.), Structures and beyond, 39–103. Oxford: Oxford University Press.

13. Chung, Sandra, William A. Ladusaw & James McCloskey. 1995. Sluicing and logical form. Natural Language Semantics 3. 239–282. DOI: https://doi.org/10.1007/BF01248819

14. Chung, Sandra, William A. Ladusaw & James McCloskey. 2011. Sluicing (:) between structure and inference. In Rodrigo Gutiérrez-Bravo, Line Mikkelsen & Eric Potsdam (eds.), Representing language: Essays in honor of Judith Aissen, 31–50. Santa Cruz, CA: UCSC Linguistics Research Center.

15. Culicover, Peter W. & Ray S. Jackendoff. 2005. Simpler syntax. Oxford: Oxford University Press. DOI: https://doi.org/10.1093/acprof:oso/9780199271092.001.0001

16. Dalrymple, Mary, Stuart M. Shieber & Fernando Pereira. 1991. Ellipsis and higher-order unification. Linguistics and Philosophy 14. 399–452. DOI: https://doi.org/10.1007/BF00630923

17. Davies, Mark. 2008. The corpus of contemporary American English: 520 million words, 1990-present. Retrieved from http://corpus.byu.edu/coca/.

18. Dickey, Michael Walsh & Ann C. Bunger. 2011. Comprehension of elided structure: Evidence from sluicing. Language and Cognitive Processes 26. 63–78. DOI: https://doi.org/10.1080/01690961003691074

19. Dixon, Wilfrid J. 1960. Simplified estimation from censored normal samples. Annals of Mathematical Statistics 31. 385–391. DOI: https://doi.org/10.1214/aoms/1177705900

20. Ferreira, Fernanda, Karl G. D. Bailey & Vittoria Ferraro. 2002. Good-enough representations in language comprehension. Current Directions in Psychological Science 11. 11–15. DOI: https://doi.org/10.1111/1467-8721.00158

21. Fillmore, Charles J., Paul Kay & Mary Catherine O’Connor. 1988. Regularity and idiomaticity in grammatical constructions: The case of let alone. Language 64. 501–538. DOI: https://doi.org/10.2307/414531

22. Frazier, Lyn. 2008. Processing ellipsis: A processing solution to the undergeneration problem. In Charles B. Chang & Hannah J. Haynie (eds.), Proceedings of the 26th West Coast Conference on Formal Linguistics, 21–32. Somerville, MA: Cascadilla Press.

23. Frazier, Lyn, Alan Munn & Charles Clifton, Jr. 2000. Processing coordinate structures. Journal of Psycholinguistic Research 29. 343–370. DOI: https://doi.org/10.1023/A:1005156427600

24. Frazier, Lyn & Charles Clifton, Jr. 1998. Comprehension of sluiced sentences. Language and Cognitive Processes 13. 499–520. DOI: https://doi.org/10.1080/016909698386474

25. Frazier, Lyn & Charles Clifton, Jr. 2001. Parsing coordinates and ellipsis: Copy α. Syntax 4. 1–22. DOI: https://doi.org/10.1111/1467-9612.00034

26. Frazier, Lyn & Charles Clifton, Jr. 2005. The syntax-discourse divide: Processing ellipsis. Syntax 8. 121–174. DOI: https://doi.org/10.1111/j.1467-9612.2005.00077.x

27. Frazier, Lyn, Lori Taft, Tom Roeper, Charles Clifton, Jr. & Kate Ehrlich. 1984. Parallel structure: A source of facilitation in sentence comprehension. Memory & Cognition 12. 421–430. DOI: https://doi.org/10.3758/BF03198303

28. Frazier, Michael, David Potter & Masaya Yoshida. 2012. Pseudo noun phrase coordination. In Nathan Arnett & Ryan Bennett (eds.), Proceedings of the 30th West Coast Conference on Formal Linguistics, 142–152. Somerville, MA: Cascadilla Proceedings Project.

29. Ginzburg, Jonathan & Ivan Sag. 2000. Interrogative investigations. Stanford, CA: CSLI Publications.

30. Griffiths, James & Anikó Lipták. 2014. Contrast and island-sensitivity in clausal ellipsis. Syntax 17. 189–234. DOI: https://doi.org/10.1111/synt.12018

31. Hardt, Daniel. 1993. Verb phrase ellipsis: Form, meaning, and processing. Philadelphia, PA: University of Pennsylvania dissertation.

32. Harris, Jesse A. 2015. Structure modulates similarity-based interference in sluicing: An eye tracking study. Frontiers in Psychology 6. DOI: https://doi.org/10.3389/fpsyg.2015.01839

33. Harris, Jesse A. 2016. Processing let alone coordination in silent reading. Lingua 169. 70–94. DOI: https://doi.org/10.1016/j.lingua.2015.10.008

34. Harris, Jesse A. 2019. Alternatives on Demand and Locality: Resolving discourse-linked wh-phrases in sluiced structures. In Katy Carlson, Charles Clifton, Jr. & Janet Dean Fodor (eds.), Grammatical approaches to language processing: Essays in honor of Lyn Frazier, 45–75. New York: Springer. DOI: https://doi.org/10.1007/978-3-030-01563-3_4

35. Harris, Jesse A. & Katy Carlson. 2016. Keep it local (and final): Remnant preferences in “let alone” ellipsis. The Quarterly Journal of Experimental Psychology 69. 1278–1301. DOI: https://doi.org/10.1080/17470218.2015.1062526

36. Harris, Jesse A. & Katy Carlson. 2018. Information structure preferences in focus-sensitive ellipsis: How defaults persist. Language & Speech. DOI: https://doi.org/10.1177/0023830917737110

37. Henstra, Judith-Ann. 1996. On the parsing of syntactically ambiguous sentences: Coordination and relative clause attachment. Falmer: University of Sussex dissertation.

38. Hirschberg, Julia L. B. 1985. A theory of scalar implicature. Philadelphia, PA: University of Pennsylvania dissertation.

39. Horn, Laurence R. 1972. On the semantic properties of logical operators in English. Los Angeles, CA: University of California, Los Angeles dissertation. Distributed by the Indiana University Linguistics Club, 1976.

40. Hulsey, Sarah. 2008. Focus sensitive coordination. Cambridge, MA: MIT dissertation.

41. Katsos, Napoleon & Chris Cummins. 2012. Scalar implicature: Theory, processing and acquisition. Nouveaux Cahiers de Linguistique Française 30. 39–52.

42. Katsos, Napoleon & Dorothy V. Bishop. 2011. Pragmatic tolerance: Implications for the acquisition of informativeness and implicature. Cognition 120. 67–81. DOI: https://doi.org/10.1016/j.cognition.2011.02.015

43. Kehler, Andrew. 2000. Coherence and the resolution of ellipsis. Linguistics and Philosophy 23. 533–575. DOI: https://doi.org/10.1023/A:1005677819813

44. Knoeferle, Pia. 2014. Conjunction meaning can modulate parallelism facilitation: Eye-tracking evidence from German clausal coordination. Journal of Memory and Language 75. 140–154. DOI: https://doi.org/10.1016/j.jml.2014.05.002

45. Knoeferle, Pia & Matthew W. Crocker. 2009. Constituent order and semantic parallelism in on-line comprehension: Eye-tracking evidence from German. Quarterly Journal of Experimental Psychology 62. 2338–2371. DOI: https://doi.org/10.1080/17470210902790070

46. Levinson, Stephen. 2000. Presumptive meanings. Cambridge, MA: MIT Press. DOI: https://doi.org/10.7551/mitpress/5526.001.0001

47. Martin, Andrea E. 2010. Memory operations and structures in sentence comprehension: Evidence from ellipsis. New York: New York University dissertation.

48. Martin, Andrea E. & Brian McElree. 2008. A content-addressable pointer mechanism underlies comprehension of verb-phrase ellipsis. Journal of Memory and Language 58. 879–906. DOI: https://doi.org/10.1016/j.jml.2007.06.010

49. Martin, Andrea E. & Brian McElree. 2009. Memory operations that support language comprehension: Evidence from verb-phrase ellipsis. Journal of Experimental Psychology: Learning, Memory, and Cognition 35. 1231–1239. DOI: https://doi.org/10.1037/a0016271

50. Martin, Andrea E. & Brian McElree. 2011. Direct-access retrieval during sentence comprehension: Evidence from sluicing. Journal of Memory and Language 64. 327–343. DOI: https://doi.org/10.1016/j.jml.2010.12.006

51. Mauner, Gail, Michael K. Tanenhaus & Greg N. Carlson. 1995. Implicit arguments in sentence processing. Journal of Memory and Language 34. 357–382. DOI: https://doi.org/10.1006/jmla.1995.1016

52. Merchant, Jason. 2001. The syntax of silence: Sluicing, islands, and the theory of ellipsis. Oxford: Oxford University Press.

53. Merchant, Jason. 2004. Fragments and ellipsis. Linguistics and Philosophy 27. 661–738. DOI: https://doi.org/10.1007/s10988-005-7378-3

54. Merchant, Jason. 2008. Variable island repair under ellipsis. In Kyle Johnson (ed.), Topics in Ellipsis, 132–153. Cambridge: Cambridge University Press.

55. Merchant, Jason. 2016. Ellipsis: A survey of analytical approaches. In Jeroen van Craenenbroeck & Tanja Temmerman (eds.), Handbook of ellipsis. Oxford: Oxford University Press.

56. Mitchell, Don C., Xingjia Shen, Matthew J. Green & Timothy L. Hodgson. 2008. Accounting for regressive eye-movements in models of sentence processing: A reappraisal of the Selective Reanalysis hypothesis. Journal of Memory and Language 59. 266–293. DOI: https://doi.org/10.1016/j.jml.2008.06.002

57. Murphy, Gregory L. 1985. Processes of understanding anaphora. Journal of Memory and Language 24. 290–303. DOI: https://doi.org/10.1016/0749-596X(85)90029-4

58. Nykiel, Joanna. 2013. Wh-phrases in sluicing: An interaction of the remnant and the correlate. In Philip Hofmeister & Elizabeth Norcliffe (eds.), The core and the periphery. Data-driven perspectives on syntax inspired by Ivan A. Sag, 253–274. Stanford, CA: CSLI Publications.

59. Nykiel, Joanna & Ivan Sag. 2011. Remarks on sluicing. In Stefan Mueller (ed.), Proceedings of the HPSG11 Conference. Stanford, CA: CSLI Publications.

60. Papafragou, Anna & Niki Tantalou. 2004. Children’s computation of implicatures. Language Acquisition 12. 71–82. DOI: https://doi.org/10.1207/s15327817la1201_3

61. Paterson, Kevin B., Simon P. Liversedge, Ruth Filik, Barbara J. Juhasz, Sarah J. White & Keith Rayner. 2007. Focus identification during sentence comprehension: Evidence from eye movements. The Quarterly Journal of Experimental Psychology 60. 1423–1445. DOI: https://doi.org/10.1080/17470210601100563

62. Phillips, Colin & Dan Parker. 2014. The psycholinguistics of ellipsis. Lingua 151. 78–95. DOI: https://doi.org/10.1016/j.lingua.2013.10.003

63. Poirier, Josée, Katie Wolfinger, Lisa Spellman & Lewis P. Shapiro. 2010. The real-time processing of sluiced sentences. Journal of Psycholinguistic Research 39. 411–427. DOI: https://doi.org/10.1007/s10936-010-9148-9

64. Poirier, Josée, Matthew Walenski & Lewis P. Shapiro. 2012. The role of parallelism in the real-time processing of anaphora. Language and Cognitive Processes 27. 868–886. DOI: https://doi.org/10.1080/01690965.2011.601623

65. Rayner, Keith & Sara C. Sereno. 1994. Regression-contingent analyses: A reply to Altmann. Memory & Cognition 22. 291–292. DOI: https://doi.org/10.3758/BF03200857

66. Roberts, Craige. 1996. Information structure in discourse: Towards an integrated formal theory of pragmatics. In Jae-Hak Yoon & Andreas Kathol (eds.), Working Papers in Linguistics – Ohio State University Department of Linguistics 39. 91–136.

67. Roberts, Craige. 2012. Information structure: Towards an integrated formal theory of pragmatics. Semantics & Pragmatics 5. 1–69. DOI: https://doi.org/10.3765/sp.5.6

68. Rooth, Mats. 1992. A theory of focus interpretation. Natural Language Semantics 1. 75–116. DOI: https://doi.org/10.1007/BF02342617

69. Ross, John R. 1967. Constraints on variables in syntax. Cambridge, MA: MIT dissertation. Later published as Infinite syntax.

70. Ross, John R. 1969. Guess who. In Robert I. Binnick, Alice Davison, Georgia M. Green & Jerry L. Morgan (eds.), Papers from the Fifth Regional Meeting of the Chicago Linguistic Society, 252–286. Chicago, IL: University of Chicago.

71. Sag, Ivan. 1976. Deletion and Logical Form. Cambridge, MA: MIT dissertation.

72. Sailor, Craig & Gary Thoms. 2014. On the non-existence of non-constituent coordination and non-constituent ellipsis. In Robert E. Santana-LaBarge (ed.), The Proceedings of the 31st West Coast Conference on Formal Linguistics, 361–370. Somerville, MA: Cascadilla Proceedings Project.

73. Sanford, Anthony J. & Arthur C. Graesser. 2006. Shallow processing and underspecification. Discourse Processes 42. 99–108. DOI: https://doi.org/10.1207/s15326950dp4202_1

74. Sanford, Anthony J. & Patrick Sturt. 2002. Depth of processing in language comprehension: Not noticing the evidence. Trends in Cognitive Sciences 6. 382–386. DOI: https://doi.org/10.1016/S1364-6613(02)01958-7

75. Sauermann, Antje, Ruth Filik & Kevin B. Paterson. 2013. Processing contextual and lexical cues to focus: Evidence from eye movements in reading. Language and Cognitive Processes 28. 875–903. DOI: https://doi.org/10.1080/01690965.2012.668197

76. Shapiro, Lewis P. & Arild Hestvik. 1995. On-line comprehension of VP-ellipsis: Syntactic reconstruction and semantic influence. Journal of Psycholinguistic Research 24. 517–532. DOI: https://doi.org/10.1007/BF02143165

77. Shapiro, Lewis P., Arild Hestvik, Lesli Lesan & A. Rachel Garcia. 2003. Charting the time-course of VP-ellipsis sentence comprehension: Evidence for an initial and independent structural analysis. Journal of Memory and Language 49. 1–19. DOI: https://doi.org/10.1016/S0749-596X(03)00026-3

78. Sperber, Dan & Deirdre Wilson. 1995. Relevance: Communication and cognition. Oxford: Blackwell.

79. Stiller, Alex, Noah Goodman & Michael Frank. 2011. Ad-hoc scalar implicature in adults and children. In Laura Carlson, Christoph Hölscher & Thomas F. Shipley (eds.), Proceedings of the 33rd Annual Meeting of the Cognitive Science Society, 2134–2139, Boston, MA.

80. Stolterfoht, Britta, Angela D. Friederici, Kai Alter & Anita Steube. 2007. Processing focus structure and implicit prosody during reading: Differential ERP effects. Cognition 104. 565–590. DOI: https://doi.org/10.1016/j.cognition.2006.08.001

81. Sturt, Patrick, Frank Keller & Amit Dubey. 2010. Syntactic priming in comprehension: Parallelism effects with and without coordination. Journal of Memory and Language 62. 333–351. DOI: https://doi.org/10.1016/j.jml.2010.01.001

82. Takahashi, Shoichi & Danny Fox. 2005. MaxElide and the re-binding problem. In Effi Georgala & Jonathan Howell (eds.), Proceedings of Semantics and Linguistic Theory XV, 233–240. Ithaca, NY. DOI: https://doi.org/10.3765/salt.v15i0.3095

83. Tanenhaus, Michael K. & Greg N. Carlson. 1990. Comprehension of deep and surface verbphrase anaphors. Language and Cognitive Processes 5. 257–280. DOI: https://doi.org/10.1080/01690969008407064

84. Toosarvandani, Maziar. 2010. Association with foci. Berkeley, CA: University of California, Berkeley dissertation.

85. Tukey, John W. 1962. The future of data analysis. Annals of Mathematical Statistics 33. 1–67. DOI: https://doi.org/10.1214/aoms/1177704711

86. Van Craenenbroeck, Jeroen. 2010. The syntax of ellipsis: Evidence from Dutch dialects. Oxford: Oxford University Press. DOI: https://doi.org/10.1093/acprof:oso/9780195375640.001.0001

87. Ward, Gregory, Richard Sproat & Gail McKoon. 1991. A pragmatic analysis of so-called anaphoric islands. Language 67. 439–473. DOI: https://doi.org/10.2307/415034

88. Weir, Andrew. 2014. Fragments and clausal ellipsis. Amherst, MA: University of Massachusetts Amherst dissertation.