The representation and processing of distributivity and collectivity: ambiguity vs. underspecification

Sentences with plural expressions can receive at least two interpretations. For example, the sentence The boys hold a balloon could mean that the boys as a group jointly hold one balloon (the collective reading) or that each boy holds one balloon, which would imply that as many balloons are held as there are boys (the distributive reading). Building on Frazier et al. (1999), we show that the human processor favors collective readings. Crucially, the preference for collective readings is only observed when the distributive reading has to be established by means of phrasal distributivity (e.g., triggered by distributive quantifiers), and the preference disappears in the case of lexical distributivity (e.g., the distributive interpretation of win). The findings provide evidence for different mental representations of the two types of distributivity and shed light on why the processor exhibits a default preference for collective interpretations.


INTRODUCTION
Plural expressions pose a challenge to communication. Consider, for instance, a plural noun phrase like my classmates and its role in the following two sentences:
(1) a. My classmates had a coffee.
    b. My classmates surrounded the podium.
At first glance, it might seem that the interpretation of the noun phrase and the interpretation of the sentences in which the noun phrase appears are straightforward. The plural expression my classmates simply refers to those individuals that are my classmates, and the sentences are statements about those individuals. However, on closer inspection, (1a) and (1b) turn out to communicate very different statements. The predicate had a coffee in (1a) assigns a property to individual classmates: the sentence states that each of them had one coffee. In contrast to (1a), the predicate surrounded the podium does not assign a property to individual classmates, but to the whole group: (1b) makes a statement about my classmates as a group to the effect that together, they surrounded the podium.
Examples (1a) and (1b) include predicates that clearly distinguish between the two roles that the plurality my classmates plays in interpretation, but this is often not the case in regular conversation. For example, the sentence in (2) below can make either of two mutually exclusive statements. It might express that individual items cost $50, in which case all the items as a group cost more than $50, or it might express that all the items as a group cost $50, in which case the individual items cost less than that. Following the formal semantics literature, we will henceforth refer to the first interpretation as distributive and the second interpretation as collective (see Landman 1995; Nouwen 2012; to appear; Champollion to appear and references therein for more details on the two interpretations).
The fact that sentences with plural noun phrases can be understood in two very different ways raises a non-trivial issue about alignment between production and comprehension. How do speakers and hearers converge on one interpretation when talking about multiple entities? And how are the two interpretations represented in grammar?
In semantic research, the dominant account of the distributive/collective ambiguity assumes that the two interpretations are structurally different. It is assumed that language has a silent operator, D, which in most instances can be paraphrased as 'each'. When the operator is present, the resulting sentence receives the distributive reading, and when it is not, the sentence receives the collective reading (Roberts 1990; Lasersohn 1995; see Schwarzschild 1996 and Landman 2000 for a more nuanced picture). This position has been supported in psycholinguistic research, in particular, by findings in the processing of distributive and collective interpretations presented in Frazier et al. (1999). 1 In the study of Frazier et al. (1999), participants read sentences with a conjoined noun phrase subject whose verb+object predicates were compatible with both collective and distributive interpretations. The predicates were disambiguated towards their collective interpretation (by the adverb together) or their distributive interpretation (by the quantifier each). When the disambiguator followed the predicate, as in (3), encountering the distributive disambiguator (each) yielded more processing difficulties than encountering the collective disambiguator (together). Participants read the underlined region more slowly in (3a) than in (3b), and they regressed more often in (3a) than in (3b) from the same region.
(3) a. Sam and Maria carried one suitcase each at the airport.
    b. Sam and Maria carried one suitcase together at the airport.
Importantly, when the disambiguators preceded the predicate, (4), no contrast between each and together was observed. That is, (4a) did not show processing difficulties compared to (4b).
We add a third possible explanation for the pattern in (3) and (4). Frazier et al. (1999) treat the two instances of together in (3b) and (4b) as identical lexical items that are syntactically integrated in more or less the same way, and this also holds for the two instances of each in (3a) and (4a). The adverb together is, indeed, commonly treated as a manner adverb irrespective of whether it appears in a pre-VP or a post-VP position. 4 The situation for each is less straightforward. The linguistic literature takes the postverbal each in (3a) and the preverbal each in (4a) to be related but nonetheless distinct items, each associated with a specific syntax and semantics. Preverbal each is usually taken to be a VP modifier (much like together), but postverbal each, a.k.a. binominal each, is syntactically part of the NP it modifies (Burzio 1986; Safir and Stowell 1988). The two quantifiers also differ in their semantics. Postverbal, but not preverbal, each places a restriction on the meaning of the direct object, requiring it to express cardinality. This restriction is the reason for the unacceptability of (5a) below: the direct object NP every movie is a universal quantifier and does not express cardinality, unlike one movie in (5b) (Szabolcsi 2011, a.o.). The sentence in (5c) shows that preverbal each places no such restrictions on the object. Differences like the one shown in (5) led semanticists to analyze postverbal each differently from preverbal each (see Zimmermann 2002; Blaheta 2003; Dotlačil 2012; Champollion 2016b).
(5) a. *The men saw every movie each.
    b. The men saw one movie each.
    c. The men each saw every movie.
2 This parsing strategy, dubbed the Principle of Minimal Attachment, is one of the guiding principles of the Garden-Path Model (Frazier 1978).

Dotlačil and Brasoveanu
Glossa: a journal of general linguistics DOI: 10.5334/gjgl.1131
With this in mind, let us again consider the argument in Frazier et al. (1999). The observed slowdown for postverbal each compared to together, as opposed to no difference between preverbal each and together, was taken to reveal the preference of the parser for the collective interpretation of the ambiguous predicate, followed by reinterpretation in (3a). However, since the two types of each have different syntactic and semantic characteristics, the slowdown observed in (3a) could be attributed to the syntactic and semantic differences between preverbal and postverbal each. For example, it could be attributed to the fact that readers have to consider the extra semantic requirements that postverbal each places on the object. That is, the observed slowdown could be due to the complexity of postverbal each.
We present two novel self-paced reading experiments whose goal is to replicate the findings of Frazier et al. (1999) with distributive disambiguators other than each, and to explain why readers exhibit a preference for collective readings.
In the first experiment, we follow the design of Frazier et al. (1999), but instead of using each as the disambiguator of distributive readings, we employ individually, which is a manner adverb and should not show a distinct behavior in its preverbal and postverbal positions compared to other manner adverbs like together. This is supported by (6), which shows that, in contrast to each, the position of individually is not sensitive to the type of object. Furthermore, (7) shows that individually, unlike each, can be conjoined with together, corroborating the claim that individually and together are both manner adverbs.
(6) a. The women have (each) built the rafts (*each).
    b. The women have (individually) built the rafts (individually).
(7) a. *The women built rafts each and together.
    b. The women built rafts individually and together.
Given the close affinity of individually to together, the effect of individually in the two positions can be directly compared to the effect of together. In other words, using only adverbs as disambiguators minimizes the risk of confounds in the current study, and should eliminate the possibility that the findings of Frazier et al. (1999) are due to a greater complexity of postverbal each compared to preverbal each.
In the second experiment, we leverage the observation made in the semantic literature (Link 1987; Roberts 1990; Lasersohn 1989; Moltmann 1997; Winter 2000; 2001; Kratzer 2008; Dotlačil 2010; Champollion 2010; 2016a; de Vries 2017, among others) that the D operator is only needed for non-lexical (syntactic) predicates. As an example, take (8) below. The predicate smile expresses a property that holds of single individuals, that is, (8) states that every boy smiled. However, there is no reason to assume that this reading is generated by means of a D operator. It is simply part of our lexical knowledge that smiling is a property that holds of single individuals. When we hear (8), we learn that a group of boys has the property of smiling, but because of this inherent meaning of smile, which is roughly 'the individual's lips move upwards', it follows that whenever smiling is truthfully predicated of a group, it must be truthfully predicated of all (or at least a majority of) the individuals in that group (see also Krifka 1989; Yoon 1996; Landman 2000; Kratzer 2008, a.o., for more details).
The situation is different for syntactically complex predicates like carried one suitcase in (9) below. Even if we assume that carrying is a property of atomic individuals, we would only arrive at the meaning that each boy did the carrying action, and the boys all carried one (and the same) suitcase. The issue is the quantifier one suitcase: unless we assume the presence of a D operator, we do not have a way to derive that there were multiple suitcases, one per boy. That is, without a D operator, (9) relates a group of boys and one suitcase by means of the predicate carry. The carry relation might distribute down to individual boys in the group, but it still relates each of them to the one suitcase under discussion.

However, we can get a distributive reading for (9) if we postulate the presence of a covert D operator. Recall that D can be roughly interpreted as each, and note that the paraphrase the boys each carried one suitcase does express that multiple suitcases were carried, one per boy. 5 The second experiment will study whether distributive/collective reinterpretation is processed differently when the predicate is lexical, as in (8), and when it is constructed in syntax, as in (9), whose distributive reading requires a quantificational element, the D operator. As we will see, there is indeed a contrast between cases in which distributivity can be purely lexical and cases in which distributivity has to be phrasal and triggered by a quantificational element. We take that as evidence that the preference for collective readings is not due to the interpretational simplicity of collective readings, and argue that our findings are compatible with the position that the processor chooses simpler structures, i.e., structures without the D operator.

EXPERIMENT 1
The design of this experiment followed Frazier et al. (1999). We investigated the processing of sentences that had a plural entity in subject position, using plural definites, e.g., the students. Each sentence included either the adverb individually, which forces a distributive interpretation, 6 as in (10a) and (10c) below, or the adverb together, which forces a collective reading, 7 as in (10b) and (10d). The second manipulation was the position of these adverbs: they appeared either preverbally/early, as in (10a) and (10b), or postverbally/late, as in (10c) and (10d).
(10) a. The girls individually wrote a sonnet after they had read Shakespeare. (early, individually)
     b. The girls together wrote a sonnet after they had read Shakespeare. (early, together)
     c. The girls wrote a sonnet individually after they had read Shakespeare. (late, individually)
     d. The girls wrote a sonnet together after they had read Shakespeare. (late, together)
The logic behind this experiment was identical to that of Frazier et al. (1999). If readers non-randomly select one reading when encountering a plural NP + predication, and if a subsequent reinterpretation is costly, we expect to observe processing difficulties either in (10c) or (10d), once the disambiguating adverb following the predicate is read and integrated. Frazier et al. (1999) observed that distributive reinterpretation was costly, and accounted for this processing difficulty by assuming that the processor initially selected the collective reading. By the same reasoning, we would expect to see processing difficulties in (10c) as compared to (10d). If, however, (10d) turns out to be more difficult than (10c), we can conclude that the processor selects the distributive interpretation by default, and the reinterpretation towards a collective reading is what incurs an additional processing cost.
The early cases, (10a) and (10b), serve as controls: they let us check that any processing cost is due to reinterpretation rather than to the inherent costs associated with distributive or collective readings simpliciter.

5 For a good formal explanation why we assume this contrast between lexical predicates as in (8), which do not require the D operator, and phrases/syntactically-complex predicates, which do, see Champollion (2019) and references therein. To be sure, it does not follow that all multi-word expressions require the D operator for their distributive reading, since it is possible that deriving the meaning of some multi-word expressions is not necessarily syntactically mediated. We will come back to this point in the General Discussion section (§4).

6 See Moltmann (2005) for the semantics of individually. As Moltmann (2005) shows in detail, when individually modifies predicates specifying spatio-temporal locations, it also has an irrelevant, spatial-separation reading. We used predicates that exclude this interpretation of individually.

7 See especially Schwarzschild (1993), Lasersohn (1995) and Kratzer (2008) for the semantics of together. Strictly speaking, together does not force a collective reading, since it is compatible with both a collective and a cumulative reading. But this finer distinction between non-distributive readings is not relevant for our purposes. The only important point is that the distributive reading is excluded. The fact that together excludes the distributive reading has been experimentally confirmed (for adults) in Syrett and Musolino (2013).

Participants
The participants were 87 undergraduate students at the University of California, Santa Cruz, all self-identified native speakers of English. They received course (extra-)credit for their participation.

Procedure and items
Participants received a link to participate in the self-paced reading study; we used a noncumulative moving window self-paced reading paradigm (Just et al. 1982). 8 The experiment was run on a UCSC-hosted installation of the IBEX platform. Participants took the experiment online.
After clicking on the link, each participant first saw general instructions about the self-paced reading procedure and indicated whether they were a native speaker of English (non-native speakers could participate and receive (extra-)credit so that there would be no incentive to lie). After the general instructions, participants could familiarize themselves with the methodology on three practice items. The practice session was followed by the experiment.
The experiment consisted of 40 fillers and 28 target items. An example of an experimental item in all four conditions is given in (10); see the Appendix for the list of all items. Each item appeared in four conditions. The items were rotated through the four conditions in four lists, using the standard counterbalanced Latin square design. Each participant was assigned to one of the lists; the order of the stimuli was randomized for every participant. Every item and every filler was followed by a yes-no comprehension question. Every question checked whether participants had paid attention to the preceding sentence. For every item, the questions were identical across all four conditions. The questions were unambiguous and had a clear correct answer. For example, the question following (10) was:
(11) Did the girls read Dickens?
When participants answered incorrectly, they were notified of their mistake. Yes and no responses were distributed roughly evenly (33 out of 68 questions had a yes-response).
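The list construction described above can be illustrated with a minimal sketch. It assumes the usual rotation scheme for a Latin square (item index plus list index, modulo the number of conditions); the function name and condition labels are hypothetical, and the paper does not report how the lists were actually generated.

```python
from collections import Counter

# Condition labels follow (10); the label strings themselves are illustrative.
CONDITIONS = [
    "early-individually",
    "early-together",
    "late-individually",
    "late-together",
]
N_ITEMS = 28  # number of target items reported in the text


def latin_square_lists(n_items, conditions):
    """Return one {item_index: condition} mapping per presentation list."""
    k = len(conditions)
    lists = []
    for list_idx in range(k):
        # Rotate the condition assigned to each item by the list index.
        assignment = {
            item: conditions[(item + list_idx) % k] for item in range(n_items)
        }
        lists.append(assignment)
    return lists


lists = latin_square_lists(N_ITEMS, CONDITIONS)

# Each list contains every item exactly once, with conditions balanced.
for assignment in lists:
    print(Counter(assignment.values()))
```

With 28 items and four conditions, each list contains seven items per condition, and every item appears in each condition exactly once across the four lists.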

RESULTS AND DISCUSSION
As expected, responses to questions did not pose significant problems to participants. Median accuracy on fillers and target items was 95%, with 85 participants having at least 88% of responses correct. Two participants, however, were clear outliers, answering only 79% and 82% of the questions correctly. These participants were removed from all subsequent analyses, leaving data from 85 participants.
Prior to the analysis of reading times (RTs), we removed extremely fast (<50 ms) and slow (>3000 ms) responses, as is common in analyses of self-paced reading studies (see, for example, Futrell et al. 2018). Less than 1 percent of all the data was eliminated in this way. We then log-transformed RTs to mitigate their characteristic right-skewness. We probed for (very fast or very slow) outliers among readers by checking mean logRTs per participant. No participant's mean logRT was more than 2.5 standard deviations away from the grand mean, so we retained the RT data from all 85 participants.
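The trimming, log-transformation and participant-level outlier check can be sketched as follows. The cutoffs are those reported in the text; the data are made up, the function names are hypothetical, and the sketch assumes that the grand mean and standard deviation are computed over participant means, which the text does not specify.

```python
import math
import statistics

MIN_RT_MS, MAX_RT_MS = 50, 3000  # cutoffs reported in the text


def trim_and_log(rts_ms):
    """Drop RTs below 50 ms or above 3000 ms, then log-transform the rest."""
    return [math.log(rt) for rt in rts_ms if MIN_RT_MS <= rt <= MAX_RT_MS]


def outlier_participants(log_rts_by_participant, n_sd=2.5):
    """Participants whose mean logRT lies more than n_sd SDs from the grand mean.

    Assumption: the grand mean and SD are computed over participant means.
    """
    means = {p: statistics.mean(v) for p, v in log_rts_by_participant.items()}
    values = list(means.values())
    grand_mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [p for p, m in means.items() if abs(m - grand_mean) > n_sd * sd]


# Hypothetical raw RTs in milliseconds for three participants.
raw = {
    "p1": [320, 410, 45, 380],
    "p2": [290, 3500, 350, 400],
    "p3": [500, 450, 30, 610],
}
trimmed = {p: trim_and_log(rts) for p, rts in raw.items()}
outliers = outlier_participants(trimmed)
print(trimmed, outliers)
```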
Following Trueswell et al. (1994), among others, we factored out the influence of word length and word position by running a linear mixed-effects regression that had intercept-only random effects for subjects and two fixed effects: word length in characters and word position in the sentence. The resulting residualized log reading times were used for all subsequent analyses.
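To illustrate what residualization does, here is a simplified sketch. It is not the mixed-effects regression described above: as a stand-in, it removes each subject's own mean (a fixed per-subject intercept rather than a random one) and then fits ordinary least squares on word length and word position. All names and data are hypothetical.

```python
from collections import defaultdict


def residualize(rows):
    """rows: list of (subject, word_length, word_position, log_rt) tuples.

    Simplification: per-subject centering plus OLS, not a mixed model.
    Returns (length_slope, position_slope, residuals in input order).
    """
    # Accumulate per-subject sums of length, position, logRT, and a count.
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for subj, length, pos, y in rows:
        s = sums[subj]
        s[0] += length
        s[1] += pos
        s[2] += y
        s[3] += 1

    # Center every variable on its subject mean.
    centered = []
    for subj, length, pos, y in rows:
        sl, sp, sy, n = sums[subj]
        centered.append((length - sl / n, pos - sp / n, y - sy / n))

    # Solve the 2x2 normal equations for the two slopes (Cramer's rule).
    s_ll = sum(l * l for l, _, _ in centered)
    s_pp = sum(p * p for _, p, _ in centered)
    s_lp = sum(l * p for l, p, _ in centered)
    s_ly = sum(l * y for l, _, y in centered)
    s_py = sum(p * y for _, p, y in centered)
    det = s_ll * s_pp - s_lp ** 2
    b_len = (s_ly * s_pp - s_lp * s_py) / det
    b_pos = (s_ll * s_py - s_lp * s_ly) / det

    residuals = [y - b_len * l - b_pos * p for l, p, y in centered]
    return b_len, b_pos, residuals


# Synthetic, noise-free data: logRT depends only on subject, length, position.
WORDS = [(3, 1), (5, 2), (8, 3), (2, 4), (6, 5)]  # (length, position)
INTERCEPTS = {"s1": 5.8, "s2": 6.1}
rows = [
    (subj, length, pos, a + 0.02 * length - 0.01 * pos)
    for subj, a in INTERCEPTS.items()
    for length, pos in WORDS
]
b_len, b_pos, residuals = residualize(rows)
print(b_len, b_pos)
```

Because the synthetic data contain no noise, the residuals are numerically zero. With real data, whatever variance is left after removing length and position effects is what the condition-level analyses operate on.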
The main regions of interest (ROIs) were the adverb, the predicate and the three words following the disambiguated predicate (the spillover region). The ROIs are underlined in (12).
(12) a. The girls individually wrote a sonnet after they had read Shakespeare.
     b. The girls together wrote a sonnet after they had read Shakespeare.
     c. The girls wrote a sonnet individually after they had read Shakespeare.
     d. The girls wrote a sonnet together after they had read Shakespeare.
The graphical summary of logRTs starting with adverbs in the early position is presented in Figure 1. LogRTs are also summarized in a table in the Appendix.
For each region, we constructed a mixed-effects model with three treatment-coded fixed effects: Adverb (together or individually; the former was the reference level), Position (early or late; the former was the reference level), and the interaction of Adverb and Position. Following one of the recommendations in Barr et al. (2013), all our models included the maximal random-effect structure for subjects and items. Since the lme4 R package, which is commonly used for estimating mixed-effects models, did not converge with the maximal random-effect structure, we used Bayesian models estimated with Stan (Carpenter et al. 2017) and brms (Bürkner 2017). The structure of prior distributions and the sampling details of the models are provided in the Appendix.
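The treatment coding just described can be made concrete with a small sketch. The coefficient values below are invented for illustration and are not the fitted estimates; the function names are hypothetical.

```python
def design_row(adverb, position):
    """Treatment-code one observation; together and early are reference levels."""
    is_individually = 1 if adverb == "individually" else 0
    is_late = 1 if position == "late" else 0
    # Columns: intercept, Adverb, Position, Adverb:Position interaction.
    return (1, is_individually, is_late, is_individually * is_late)


def predict(coefs, adverb, position):
    """Predicted residualized logRT for one cell of the 2x2 design."""
    return sum(b * x for b, x in zip(coefs, design_row(adverb, position)))


# Hypothetical coefficients: intercept, individually, late, individually:late.
coefs = (0.0, -0.04, 0.03, 0.06)

cells = {
    (adverb, position): predict(coefs, adverb, position)
    for adverb in ("together", "individually")
    for position in ("early", "late")
}
for cell, value in cells.items():
    print(cell, round(value, 3))
```

Under this coding, the Adverb coefficient is the effect of individually in the early position only, and the effect of individually in the late position is the sum of the Adverb and interaction coefficients.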
In Figure 2, we show the posterior distributions of the fixed effects on two words, the adverb and the first word in the spillover region. We focus on these two words because these were the only cases in which the 89% credible intervals of at least one parameter exclude 0. On the adverb, we see that in the early position individually decreases residualized logRTs (posterior distribution of individually: 89% credible interval: ⟨-0.08, -0.014⟩, median = -0.05, p(β < 0) = 0.98). 9,10 We also observe an interaction of late Position:individually, which almost fully removes the speedup due to individually (posterior distribution of individually:late Position: 89% credible interval: ⟨0.005, 0.12⟩, median = 0.05, p(β > 0) = 0.96). On the first spillover word, the word after, we see that late Position leads to a slowdown in reading (posterior distribution of late Position: 89% credible interval: ⟨0.01, 0.08⟩, median = 0.04, p(β > 0) = 0.98). Second, there is a positive interaction of late Position and individually (posterior distribution of late Position:individually: 89% credible interval: ⟨0.02, 0.11⟩, median = 0.063, p(β > 0) = 0.99). No other region of interest shows a posterior distribution of a parameter that is overwhelmingly (p(β) > 0.95) positive or negative.
9 The 89% credible interval specifies the interval in which the parameter falls with a probability of 89%. p(β < 0) = x states that the probability that the parameter is negative is x (positive if we consider p(β > 0) = x). We can inspect all distributions but are particularly interested in posterior distributions that are predominantly positive or negative (i.e., p(β > 0) or p(β < 0) is 0.95 or higher and the credible intervals do not span zero). We present the 89% credible interval, rather than 95% credible intervals, following the suggestions and conventions proposed in Kruschke (2014) and McElreath (2020).
10 Based on the graph in Figure 1, this finding might look surprising since in the early position, individually and together show almost the same logRTs. However, individually is 4 letters longer than together, so the difference in length masks the effect. The effect is observable when we consider residualized log-reading times. When we examine the model that uses raw (non-residualized) logRTs as the dependent variable, we do not observe any speed-up due to adverb type.
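The two posterior summaries used here, the 89% credible interval and the tail probability p(β < 0), can be computed from posterior draws as in the sketch below. It assumes an equal-tailed (quantile-based) interval, which the text does not specify, and it uses simulated draws rather than the actual posterior; the function names are hypothetical.

```python
import random


def equal_tailed_interval(samples, mass=0.89):
    """Quantile-based credible interval covering the given posterior mass."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1 - mass) / 2
    lower = ordered[int(tail * n)]
    upper = ordered[min(n - 1, int((1 - tail) * n))]
    return lower, upper


def prob_negative(samples):
    """Proportion of posterior draws below zero, i.e. an estimate of p(beta < 0)."""
    return sum(s < 0 for s in samples) / len(samples)


# Simulated stand-in for posterior draws of one coefficient (not real data).
rng = random.Random(0)
draws = [rng.gauss(-0.05, 0.02) for _ in range(4000)]

lower, upper = equal_tailed_interval(draws)
p_neg = prob_negative(draws)
print(lower, upper, p_neg)
```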
Focusing on the effect observed on the spillover, we see that postverbal adverbs lead to a slowdown in reading, and the slowdown is even more pronounced when the postverbal adverb is the distributive disambiguator individually. The latter effect can also be observed on the adverb, even though the interaction individually:late Position is less dominantly positive on that word. Both effects are replications of Frazier et al. (1999).
The slowdown caused by postverbal adverbs on the first spillover word is orthogonal to our main research question. It possibly reflects the cost associated with adverb-predicate integration. No such cost would be observable for preverbal adverbs, since these were read before the predicate and could have already been integrated while reading the predicate itself. The effect might alternatively be due to the fact that adverbs are usually preferred and more frequent in preverbal than postverbal positions (see the frequency data in Ko 2016). While the preference is well-known for subject-oriented and speaker-oriented adverbs (e.g., Haider 2004), the results of the present experiment suggest that the preference might also hold for manner adverbs such as together and individually.
More importantly for us, the postverbal-position slowdown is further modulated by the positive interaction late:individually. This shows that the postverbal adverb disambiguating towards the distributive reading incurs a processing cost above and beyond any processing cost caused by postverbal adverbs in general. The observed interaction is compatible with the position that readers select the collective interpretation of ambiguous predicates, and that the distributive reinterpretation is costly. The findings generalize the results of Frazier et al. (1999) beyond the disambiguator each, and are incompatible with the hypothesis that the processing difficulty associated with postverbal each can be fully explained by the semantic complexity of binominal each.
There are two alternative explanations that we can discard. First, one could think that the observed slowdown caused by the postverbal adverb individually is just due to some general cost associated with this adverb (e.g., because the adverb individually is less frequent and morphologically more complex than together). If this were so, however, we would also expect individually to be read more slowly than together in preverbal position, which does not actually happen. In fact, we observed the exact opposite: individually is read faster than together preverbally when we control for word length. Furthermore, the effect of preverbal adverb type does not spill over to any of the words in the predicate wrote a sonnet, showing again that individually is not intrinsically harder to process than together.
The second alternative explanation we can discard is that distributive readings themselves cause processing difficulties, rather than the reanalysis towards distributivity. It would follow from this second alternative explanation that the early disambiguator towards distributivity should also be a cause of processing difficulties, again contrary to the experimental results.
We thus conclude that the cost of reinterpretation from collectivity to distributivity is genuine. What remains unclear is why the processor should prefer collective readings. The second experiment addresses this question.

EXPERIMENT 2
After excluding the option that the results of Frazier et al. (1999) are simply due to the complexity of postverbal each, we are left with two possible explanations for why the processor should prefer collective over distributive interpretations. One explanation is that the processor is sensitive to structural economy considerations and prefers structures without the D operator. Alternatively, the processor is sensitive to interpretational economy/complexity considerations, and prefers collective readings because they are simpler in the sense that they introduce fewer entities, or other simpler interpretational objects compared to distributive readings. For example, in the sentence Two boys carried one suitcase, the collective reading postulates just one event and one suitcase, whilst the distributive reading postulates two events and is compatible with there being two suitcases, such that each event+suitcase is associated with one of the two boys. Since the collective reading postulates fewer events, and possibly also fewer objects, than the distributive reading, the preference for interpretational economy selects the collective reading.
In semantic research, it has been argued at least since Scha (1981) that the D operator is not needed across the board to obtain distributive readings for any predicate. Consider (13).
(13) b. The girls slept.
The predicate meet does not express a property of single individuals but a property of groups. Under most semantic theories, it is assumed that the predicate applies to some kind of plurality, namely a plurality of girls in (13a). The lexical meaning of the predicate will ensure that even though the predicate applies to the group, we learn something about individual girls when hearing (13a): for example, that every girl, or at least a majority of the girls, was at the same location.
A similar analysis can be given to (13b). Even though we deal with a predicate that expresses a property of individuals (sleep), it is possible to assume that the predicate sleep applies to the plurality of girls. Due to the lexical meaning of the predicate, we learn something about individual girls from (13b), namely that all (or at least most of) the girls are asleep. This is sometimes dubbed a vagueness account of distributivity (cf. Winter 2000), or an underspecification account (cf. Champollion to appear). We will use the latter label.
It is hard to see how the underspecification account could generalize to examples like (14).
(14) The girls slept on a narrow bed.
Even if we assume that sleep can apply to a group and express that every individual, or most individuals, sleep, as we did above, we would only derive the meaning that there was one narrow bed and every girl slept on it. However, the distributive reading of (14) is more appropriately paraphrased as 'the girls each slept on a narrow bed' and requires, in the pragmatically most plausible interpretation, that there were as many beds as there were girls.
More generally, the underspecification account derives distributivity as a by-product of the lexical meaning in (13b). The same strategy, however, would only work for (14) if we assume that the whole predicate slept on a narrow bed is stored in the lexicon. If it is not, something else has to be responsible for deriving the distributive reading of (14).
That something else, so the semantic literature argues, is phrasal distributivity, triggered by the distributivity operator, or the D operator for short. In most accounts (see Champollion to appear for a recent summary), the D operator appears at the VP level and modifies the whole predicate. In (14), this would be the predicate slept on a narrow bed. For our purposes, it suffices to say that the operator is interpreted in the same way as each (Heim et al. 1991; see Schwarzschild 1996 for a more nuanced view), i.e., the D-enriched sentence (14) can be paraphrased as 'the girls each slept on a narrow bed'. This paraphrase correctly highlights that the distributive reading expresses that each girl slept on a different narrow bed, and there were as many beds as girls.
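For concreteness, the contrast can be written out using one common formulation of the D operator from the formal semantics literature. This is an assumed textbook-style definition, offered as a sketch rather than as the analysis adopted in any of the works cited above; the predicate names are illustrative labels.

```latex
% Assumed definition: x ranges over pluralities, y over their atomic parts.
\[
D \;=\; \lambda P.\,\lambda x.\,\forall y\,\bigl[(y \le x \wedge \mathrm{atom}(y)) \rightarrow P(y)\bigr]
\]
% (14) without D: a single narrow bed is related to the whole group.
\[
\exists z\,\bigl[\mathrm{narrow\text{-}bed}(z) \wedge \mathrm{sleep\text{-}on}(\mathrm{the\text{-}girls}, z)\bigr]
\]
% (14) with D applied to the VP: the bed is chosen separately for each girl.
\[
\forall y\,\bigl[(y \le \mathrm{the\text{-}girls} \wedge \mathrm{atom}(y)) \rightarrow
  \exists z\,[\mathrm{narrow\text{-}bed}(z) \wedge \mathrm{sleep\text{-}on}(y, z)]\bigr]
\]
```

In the second formula the existential quantifier over beds takes scope over the whole group, so only one bed is involved. In the third it sits inside the universal quantifier introduced by D, which is what allows beds to covary with girls.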
Before proceeding, let us reiterate one crucial issue about D operators in relation to (14). We need the D operator not because the verb sleep is understood distributively, but because the whole predicate sleep on a narrow bed is so interpreted: in particular, narrow beds have to covary with the girls. The distributive interpretation of this whole complex VP poses a challenge to the underspecification account, since the underspecification account should operate only on lexical units. In contrast to that, other VPs that are complex, but that receive a distributive interpretation exclusively driven by their verbs, would not be problematic for the underspecification account. Consider the following example:
(15) The boys heard a noise.
The most likely interpretation is that there was one particular noise, but each boy heard the noise individually, on his own. The paraphrase signals that hearing is distributive. Yet, since the noise does not covary with the boys, this paraphrase can be captured by assuming that hear is specified as signaling the relation between individuals (each individual who hears) and some object (what is heard), and the distributive reading arises through lexical means. That is, since the distributivity is fully located on the verb only, no D operator is needed. See Champollion (2019) and references therein for a more detailed discussion of lexical and phrasal distributivity.
Let us now see how the phrasal and lexical distributivity might help us explain why we observe the preference for collective readings in processing. If the parser preferred simpler structures, i.e., structures lacking a null operator such as the D operator, we should see processing preferences for collective readings of syntactically-built predicates as in (14), but we should not see such preferences for one-word predicates, (13). On the other hand, if the parser generally preferred collective readings, we should see across-the-board processing preferences for collectivity, whether the predicates are syntactically-assembled (14) or not (13b). For example, if the parser preferred interpretations with fewer semantic objects, then the parser should preferentially select the collective reading of both (14) and (13b), since in both cases the collective interpretation postulates only one collective event, unlike the distributive reading.
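The two sets of predictions can be laid out as a small table. The sketch below only restates the reasoning in the preceding paragraph; the hypothesis labels and the function name are hypothetical.

```python
HYPOTHESES = ("structural economy", "interpretational economy")
PREDICATES = ("lexical", "phrasal")  # e.g., (13b) vs. (14)


def predicts_collective_preference(hypothesis, predicate):
    """Does the hypothesis predict a processing preference for collectivity?"""
    if hypothesis == "structural economy":
        # Only phrasal distributivity requires the covert D operator.
        return predicate == "phrasal"
    if hypothesis == "interpretational economy":
        # The collective reading posits fewer events for either predicate type.
        return True
    raise ValueError(f"unknown hypothesis: {hypothesis}")


predictions = {
    (h, p): predicts_collective_preference(h, p)
    for h in HYPOTHESES
    for p in PREDICATES
}
for key, value in predictions.items():
    print(key, value)
```

The two hypotheses differ only in the lexical cell, which is why the comparison between one-word and syntactically built predicates is informative.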
To test these two hypotheses, we designed a self-paced reading experiment with 8 conditions, exemplified in (16). The predicate could be one word only and the reading could be established based on the lexical knowledge, or the predicate consisted of several (3 or 4) words. In the first case, the predicate was a verb, e.g., won. In the second case, the predicate was the same verb followed by an object, e.g., won an award. For all the items, the verb was chosen in such a way that it could be followed by a direct object, but an intransitive version in which the object is absent would also be grammatical.
Both collective and distributive readings were possible with the chosen predicates, in both the transitive and the intransitive version. As in Experiment 1, the predicate was disambiguated as collective or distributive by two adverbs, namely collectively and individually. We use collectively instead of together since collectively matches the other disambiguation adverb, individually, more closely in length and morphological structure, while still forcing the non-distributive reading. The disambiguation could happen either before or after the predicate. All conditions are presented in (16).
(16) a. The girls individually won an award during the science fair. (early, individually, object Present)
b. The girls collectively won an award during the science fair. (early, collectively, object Present)
c. The girls won an award individually during the science fair. (late, individually, object Present)
d. The girls won an award collectively during the science fair. (late, collectively, object Present)
e. The girls individually won during the science fair. (early, individually, no object)
f. The girls collectively won during the science fair. (early, collectively, no object)
g. The girls won individually during the science fair. (late, individually, no object)
h. The girls won collectively during the science fair. (late, collectively, no object)
As the items were relatively complex, we also included an acceptability subtask in the experiment. This was used to check that predicates with and without objects were equally acceptable in the distributive and the collective disambiguation, i.e., both interpretations were equally possible with either predicate.
Let us summarize our predictions for the self-paced reading part of the experiment. If the preference for collectivity is driven by structural considerations, we expect the processor to choose collective readings only for syntactically complex predicates, i.e., when the direct object is present. Consequently, the reinterpretation towards distributivity should incur processing costs only for object-present cases. When the object is absent, the processor has no reason to prefer collective readings over distributive ones. That is, we expect a three-way interaction between adverb position, adverb type and the presence of a direct object. However, if the preference for collective readings is driven only by considerations of interpretational complexity measured in terms of the number of entities postulated in the semantic model, we expect late distributive disambiguation to cause processing difficulties regardless of the presence of the object. This is because the collective reading of any predicate postulates fewer entities, i.e., fewer events, than the distributive reading of the same predicate. Thus, a two-way interaction between the position of the adverb and the type of adverb is predicted in this case.

Participants
The participants were 55 undergraduate students at the University of California, Santa Cruz, all self-identified native speakers of English. They received course (extra-)credit for their participation.

Procedure and items
As in Experiment 1, the study was run online. Participants received a link that directed them to the experiment, created on the IBEX platform and hosted on UCSC servers. As in Experiment 1, we used the non-cumulative moving window self-paced reading paradigm.
The participants first received general instructions, which were followed by three practice items and the experiment. The study consisted of 32 target items and 96 fillers. Every item had 8 conditions. An example of an experimental item in all eight conditions is given in (16); see the Appendix for the list of all items. The items were distributed across eight lists using the standard counterbalanced Latin square design. Each participant was assigned to one of the lists and was presented with the stimuli (items and fillers) in random order. Every item and every filler was followed by a yes-no comprehension question and, unlike in Experiment 1, by an acceptability judgment of the self-paced reading sentence. Yes-no comprehension questions checked whether participants paid attention to the previous self-paced reading sentence. For every item, the questions were identical across all conditions. The questions were unambiguous and had a clear correct answer. For example, the question following (16) was:
(17) Were the girls at the science fair?
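The counterbalancing described above can be illustrated with a minimal sketch. It assumes a simple rotation scheme for the Latin square; the names `CONDITIONS` and `latin_square_lists` are hypothetical and are not taken from the experiment scripts, and fillers and the random presentation order are left out.

```python
from itertools import product

# The eight conditions: adverb position x adverb type x object presence.
CONDITIONS = list(product(("early", "late"),
                          ("individually", "collectively"),
                          ("object Present", "no object")))


def latin_square_lists(n_items=32, conditions=CONDITIONS):
    """Return one {item index: condition} mapping per list (assumed rotation).

    List k shows item i in condition (i + k) mod 8, so every list contains
    each item exactly once and each condition equally often.
    """
    n = len(conditions)
    return [
        {item: conditions[(item + k) % n] for item in range(n_items)}
        for k in range(n)
    ]


lists = latin_square_lists()
```

With 32 items and 8 conditions, each list contains 4 items per condition, and each item appears in every condition exactly once across the 8 lists.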
In the acceptability judgment subtask, participants had to rate on a five-point Likert scale ⟨-2, -1, 0, 1, 2⟩ the acceptability of the sentence they had just read. The sentence did not reappear for the acceptability judgment task; that is, participants had to rely on their memory to make the judgment. No answers were considered as correct or incorrect for the acceptability judgment task, and this was also stated in the introductory instructions. However, the comprehension questions had correct answers, and whenever a participant made a mistake in answering a comprehension question, they were directly notified of that.

RESULTS AND DISCUSSION
The comprehension questions were not difficult for participants: the average accuracy was 90% of answers correct. However, two outlier participants answered only 73% and 66% of questions correctly, so they were excluded from subsequent analyses. All the other participants answered at least 79% of the questions correctly.
The descriptive summary of the acceptability data is provided in [...]. The acceptability data were analyzed with a Bayesian mixed-effects model with the following fixed effects: object (absent or Present, the former was the reference level), adverb (collectively or individually, the former was the reference level), Position (early or late, the former was the reference level); the interaction of adverb and Position; the interaction of object and Position; the interaction of adverb and object; and the three-way interaction of adverb, Position and object. The model included the full random-effect structure for subjects and items. The Bayesian model was estimated using Stan and brms (cf. Bürkner and Vuorre 2019). The structure of prior distributions and the sampling details of the model are provided in the Appendix.
Posterior distributions are graphically summarized in Figure 3. Negative values indicate decreased acceptability, while positive values indicate increased acceptability. Only one parameter has a posterior distribution with an 89% credible interval that excludes 0: the negative interaction of late Position:object (89% credible interval: ⟨-0.69, -0.15⟩, median = -0.41, p(β < 0) = 0.992). That is, sentences with adverbs following multi-word predicates were judged as worse, irrespective of adverb type. We take this to once again indicate that English speakers disprefer adverbs in postverbal position. This finding from the acceptability subtask matches one of the findings in Experiment 1, where increased RTs were observed for postverbal adverbs (recall that in Experiment 1, all predicates consisted of a verb + an object). The 89% credible intervals of all the other parameters include zero. In particular, no factor with individually shows a fully negative or a fully positive credible interval, suggesting that readers did not find distributive readings of the predicates in the experiment less or more acceptable than collective readings. This suggests, in turn, that items in this experiment were just as compatible with distributive construals as they were with collective construals. Thus, if we are to observe any difficulties with distributive readings when we turn to incremental processing in the self-paced reading part of the study, we can be reasonably confident that these are not due to the decreased acceptability of the distributive or collective interpretation of the predicates. 11
11 This point should be kept in mind when considering all the items in the experiment. While both the distributive and the collective reading are easily available for win/win an award, our example in (16), other predicates used in the experiment might intuitively be harder to interpret under the collective reading, e.g., ran/ran a mile (see the Appendix for the full list of items). However, the collective reading seems possible in this last case as well, as witnessed by the fact that the predicate can combine with collectively, irrespective of the presence of the object. The collective reading signals that the activity is shared in close time-space proximity, and the people participating in it likely share the same goal, in contrast to the distributive version of ran/ran a mile (see also the discussion in Lasersohn 1995 of the role of proximity in the interpretation of the antidistributive adverb together). In any case, what is crucial for us is that the version without an object does not differ from the version with an object with respect to the plausibility of the reading forced by collectively. The fact that predicates do not differ from each other in this respect is confirmed by the acceptability study.
For the analysis of RTs, we proceeded in the same way as in Experiment 1. We removed RTs that were faster than 50 ms or slower than 3000 ms, and we log-transformed RTs. As in Experiment 1, we probed for (very fast or very slow) outliers among readers by checking mean logRTs per subject. Two subjects had a mean logRT more than 2.5 standard deviations away from the grand mean and were removed from the final analysis of RTs. The final analysis thus included data from 51 participants. The influence of word length and word position was also factored out in the same way as in Experiment 1.
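The RT trimming and subject-exclusion steps can be sketched as follows. This is only an illustration under stated assumptions: the data layout (`rts_by_subject`) and the function name are hypothetical, and the grand mean and standard deviation are assumed here to be computed over the per-subject mean logRTs, which the text does not spell out. The residualization for word length and word position is not shown.

```python
import math
from statistics import mean, stdev


def preprocess_rts(rts_by_subject, lo=50, hi=3000, z_cut=2.5):
    """rts_by_subject: {subject id: [raw RTs in ms]} (hypothetical layout)."""
    # Step 1: drop RTs faster than `lo` or slower than `hi`, then take logs.
    log_rts = {
        subj: [math.log(rt) for rt in rts if lo <= rt <= hi]
        for subj, rts in rts_by_subject.items()
    }
    # Step 2: mean logRT per subject (subjects with no data left are skipped).
    subj_means = {s: mean(v) for s, v in log_rts.items() if v}
    # Step 3: exclude subjects whose mean lies more than z_cut SDs away from
    # the grand mean (assumption: mean and SD are taken over subject means).
    values = list(subj_means.values())
    grand, sd = mean(values), stdev(values)
    keep = [s for s, m in subj_means.items() if abs(m - grand) <= z_cut * sd]
    return {s: log_rts[s] for s in keep}
```

Note that with a sample standard deviation, a single extreme subject can only exceed the 2.5 SD cutoff when there are at least ten subjects, so the sketch is meaningful only for samples of roughly the size used here.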
The main regions of interest (ROIs) were the adverb, the predicate and the three words following the disambiguated predicate, the spillover region. The ROIs are underlined in (18).
(18) a. The girls individually won an award during the science fair.
b. The girls collectively won an award during the science fair.
c. The girls won an award individually during the science fair.
d. The girls won an award collectively during the science fair.
e. The girls individually won during the science fair.
f. The girls collectively won during the science fair.
g. The girls won individually during the science fair.
h. The girls won collectively during the science fair.
The graphical summary of logRTs is presented in Figure 4. LogRTs are also summarized in a table in the Appendix.
For each region, we estimated a Bayesian mixed-effects model with the same fixed effects as in the acceptability study (adverb, Position, object and their interactions). The models had a maximal random-effect structure and were estimated using Stan and brms. The structure of prior distributions and the sampling details of the models are provided in the Appendix.
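The posterior summaries reported for these models (median, 89% credible interval, and the posterior probability that a coefficient is negative or positive) can be computed from posterior draws along the following lines. The helper name `summarize_posterior` is hypothetical, and the sketch assumes an equal-tailed interval; whether the reported intervals are equal-tailed or highest-density is not stated in the text.

```python
from statistics import median, quantiles


def summarize_posterior(draws, mass=0.89):
    """Summarize posterior draws of one coefficient (hypothetical helper)."""
    # Equal-tailed interval: cut (1 - mass) / 2 from each tail. With
    # mass = 0.89 the bounds are the 5.5th and 94.5th percentiles.
    # n=200 yields cut points in steps of 0.5%, so `mass` must be chosen
    # such that both tails fall on such a step.
    cuts = quantiles(draws, n=200, method="inclusive")
    tail = (1 - mass) / 2
    lower = cuts[round(tail * 200) - 1]
    upper = cuts[round((1 - tail) * 200) - 1]
    # Posterior probability that the coefficient is below zero.
    p_negative = sum(d < 0 for d in draws) / len(draws)
    return {
        "median": median(draws),
        "interval": (lower, upper),
        "p_negative": p_negative,
        "p_positive": 1 - p_negative,
    }
```

An interval that lies entirely below zero goes together with a high `p_negative`, and an interval entirely above zero with a high `p_positive`.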
Posterior distributions of the parameters on the adverb and spillover regions are summarized graphically in Figure [...]. The slowdown due to late Position is most likely related to the slowdown caused by late adverbs in Experiment 1 and, possibly, to the interaction of late Position:object in the acceptability study. These results repeatedly show that readers incur a processing cost when integrating adverbs that appear in a post-predicate position. The effect, while interesting on its own, is not of primary concern to us. Another effect stands out in the same region: the interaction late Position:individually (89% credible interval: ⟨-0.21, -0.04⟩, median = -0.13, p(β < 0) = 0.991). This signals that late disambiguation towards distributivity with lexical predicates (i.e., predicates that consist just of one word, the intransitive verb) speeds up reading. The speed-up is diminished in the case of phrasal predicates, since most of the posterior probability mass of the three-way interaction object Present:late Position:individually is positive (median = 0.067, p(β > 0) = 0.82).
[... (18)] shows that object Present clearly affects reading times by causing a speed-up (89% credible interval: ⟨-0.12, -0.03⟩, median = -0.075, p(β < 0) = 0.995). This effect is probably caused by the fact that objects add extra information and increase the predictability of the follow-up regions. This facilitating role of the direct object is orthogonal to our main investigation. We also see that object Present:late Position is associated with a slowdown (89% credible interval: ⟨0.004, 0.14⟩, median = 0.074, p(β > 0) = 0.961), which matches the decreased acceptability of the same condition.
Finally, Figure 8, corresponding to the third word of the spillover (the word science in (18)), shows two effects that clearly stand out. First, there is a negative interaction of late Position and adverb individually (89% credible interval: ⟨-0.18, -0.05⟩, median = -0.114, p(β < 0) = 0.997). Second, there is a dominant positive effect associated with the three-way interaction object Present, late Position and adverb individually (89% credible interval: ⟨0.07, 0.28⟩, median = 0.19, p(β > 0) = 0.998). Thus, while the late disambiguation towards distributivity with lexical predicates (i.e., predicates that consist just of one word, the intransitive verb) speeds up reading, the effect is clearly reversed when the predicate consists of the verb + an object. In that case, disambiguation towards distributivity slows down reading. The effect can also be clearly observed in Figure 9 on the word science (the last word in the figure).
[Figure caption: Exp. 2, LogRTs on the third word of the spillover, i.e., the third word after the predicate + the adverb (the word science in (18)). The graph shows a three-way positive interaction between adverb, Position and object: late distributive disambiguation speeds up reading when the object is null (left panel), but slows down reading when the object is present (right panel).]
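To see why the three-way interaction amounts to a reversal, recall that the models use reference levels (object absent, collectively, early). Under this coding, the coefficient late Position:individually is the adverb-by-position interaction for one-word predicates, and the corresponding interaction for object-present predicates is obtained by adding the three-way coefficient. The sketch below plugs in the posterior medians reported for the third spillover word; the function name is hypothetical, and summing medians only approximates the median of the summed posterior.

```python
def adverb_by_position_interaction(two_way, three_way, object_present):
    """Interaction of late Position and individually for one predicate type.

    two_way:   coefficient late Position:individually (object absent)
    three_way: coefficient object Present:late Position:individually
    """
    return two_way + (three_way if object_present else 0.0)


# Posterior medians reported for the third word of the spillover region.
no_object = adverb_by_position_interaction(-0.114, 0.19, object_present=False)
with_object = adverb_by_position_interaction(-0.114, 0.19, object_present=True)
```

Here `no_object` is negative (a speed-up of about 0.114 log units for late individually with one-word predicates), while `with_object` is positive (a slowdown of roughly 0.076 log units when the object is present).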

GENERAL DISCUSSION
Based on the two experiments, we can conclude that readers have processing difficulties when they read a predicate that is in principle compatible with both collective and distributive interpretations, and they learn afterwards that the predicate should be interpreted distributively.
The findings extend the results of Frazier et al. (1999) in two significant ways. First, we found that the slowdown is not caused just by the postverbal disambiguator each, but appears with another distributive disambiguator, the adverb individually. This is important to establish since each in postverbal position affects interpretations in other ways than just enforcing distributivity. Observing the effect with the disambiguator individually gives us stronger evidence that it is really the reinterpretation towards distributivity that incurs processing cost.
Second, we saw that the distributive reinterpretation of predicates consisting of a verb + an object is costly. But the distributive reinterpretation of one-word predicates (intransitive verbs) is not harder than the collective reinterpretation. This second finding fits well with formal semantics research arguing that, in the case of one-word predicates, the distributive/collective reading can just follow from our lexical knowledge, but in the case of complex predicates, that is, multi-word expressions whose interpretation is syntactically mediated, something more has to be said; specifically, the presence of a D-operator is needed. 12 Thus, the distributive reinterpretation is costly only when distributive interpretations are triggered by a covert quantificational element that can take scope, the D-operator. Our findings match well with research on structural priming of distributive and non-distributive readings of pluralities, which provides evidence that the two readings involve different mental and structural representations (Maldonado et al. 2017). We will now discuss in detail what our results mean for the theory of distributivity and collectivity, and the processing of these readings. Frazier et al. (1999) use their results to argue that the distinction between collectivity and distributivity is a case of ambiguity, not underspecification. This theoretical conclusion is based on the results of Frazier and Rayner (1990), which show that the human processor encounters difficulties (observable as increased reading times) when an ambiguous word (a homonym) is introduced first and disambiguated later towards its dispreferred reading. Crucially, no such difficulty is found when a word with multiple senses is introduced first and one of its senses is instantiated later, regardless of the sense that is instantiated.
The question whether the distributive/collective distinction is a matter of ambiguity or underspecification has not been fully resolved in theoretical linguistics (see Lasersohn 1995 and Schwarzschild 1996 for opposing views, and Nouwen 2012; Champollion to appear for a summary of the issues) and thus, processing data might be very important in this debate.

AMBIGUITY
Our experiments add an important qualification to Frazier et al. (1999). Given the results of Experiment 2, we have evidence that phrasal predicates, i.e., predicates constructed in syntax, bias readers towards collective readings, while lexical predicates (consisting of only one word) exhibit a preference for distributive readings. One possibility is that we deal with ambiguity for both predicate types, but each type establishes different preferences (see below as to why the human processor might prefer one reading over the other).
Alternatively, it could be the case that only phrasal predicates are truly ambiguous. Lexical predicates (like the predicate won) are underspecified, and the observed preference for distributivity has a different source. For example, world knowledge makes the predicates used in the experiment more easily compatible with their distributive interpretation than the collective one (see Scontras and Goodman 2017 for a formally explicit account that could model this world-knowledge driven preference). This explanation would go against Frazier and Rayner (1990) and suggest that there are, after all, cases of underspecification in which reinterpretation towards one of the senses can affect reading times. The explanation is compatible with the work in theoretical linguistic research arguing that only phrasal distributivity/collectivity is a case of (structural) ambiguity.
More generally, our data provide processing evidence that phrasal and lexical distributivity differ from each other. Theoretical work that assumes two different mechanisms for the two types of distributivity is compatible with our findings. This is true for the theories that postulate a D operator for phrasal distributivity, while they let the distributivity of lexical predicates arise through lexical means, e.g., the vagueness/underspecification of the lexical element (Link 1987; Roberts 1990; Lasersohn 1989; Moltmann 1997; Winter 2000; 2001; Kratzer 2008; Champollion 2010; Dotlačil 2010; Champollion 2016a; de Vries 2017; see Glass 2018 and Champollion 2019 for discussion). The theories that postulate only one mechanism for both types of distributivity (Landman 1989; Schwarzschild 1996; Beck and Sauerland 2000; Glass 2018) would need a different explanation for why phrasal and lexical cases exhibit a different processing profile.

WHAT TRIGGERS THE PREFERENCE FOR COLLECTIVE READINGS
We mentioned two possible explanations for why readers prefer the collective interpretation of a predicate that could be interpreted as distributive or collective: (i) the processor might prefer the simplest structure, and selects the collective reading because the structure that generates this reading lacks the D operator; (ii) alternatively, the processor selects the collective reading because its interpretation is less complex in that it postulates fewer events and objects.
Our results are compatible with the first explanation. We saw that readers choose the collective reading of a predicate when the distributive reading would require the D operator, preferring a simpler syntactic structure without the operator.
But the second explanation is not immediately compatible with our results. In particular, we would have to explain why readers prefer the collective reading of a multi-word predicate (a verb + an object such as won an award) but do not prefer the collective reading of a one-word predicate (the verb won by itself). Specifically, in the case of multi-word predicates, the distributive reading leads one to assume that there are multiple objects (multiple awards that were won) and multiple winning events, one per person + award. The latter assumption follows from a standard requirement in event semantics that every event should have uniquely specified thematic roles (Parsons 1990). Under the interpretational-simplicity hypothesis, this difference between collective and distributive readings leads readers to prefer the collective interpretation. But the very same hypothesis makes incorrect predictions for one-word predicates like the verb won. The distributive reading of this predicate also establishes multiple winning events (one per agent), while the collective reading requires only one event (since there is only one agent, the group that won). Given that, we would expect that lexical predicates like won should also exhibit a preference for collective interpretations, which has not been observed in our data.
The proponents of this second explanation/hypothesis could argue that the contrast between multi-word predicates and one-word predicates is due to the different 'ambiguity status' of the two predicate types. If the processor has to choose one of the readings only for ambiguous predicates (as argued in Frazier and Rayner 1990), and if only multi-word predicates are ambiguous, we could maintain that the processor is driven by simplicity in interpretation.
The results in the second experiment would follow because only in the case of multi-word predicates does the processor have to explicitly represent the complexities of distributive/collective readings. That is, the processor has to make a choice only in the case of phrasal predicates (which are truly ambiguous).
We see this as a potential explanation for part of our data. It is still unclear under this account why one-word predicates like the intransitive verb won should show any preference for either reading, in particular, a preference for distributive interpretations. After all, if such predicates are vague with respect to collectivity/distributivity, readers should not be forced to choose either interpretation, according to Frazier and Rayner (1990). Of course, one could give up this assumption and conjecture that readers make a choice even in the case of underspecified predicates like won. But then it is unclear why readers do not choose the collective reading here, just as they do in the case of multi-word predicates, particularly since the collective interpretation of won requires fewer events than its distributive interpretation.
If we assume that the processing preference is driven by the preference to have structures without the D operator, then we straightforwardly explain why the processor prefers the collective reading for multi-word predicates like won an award. This position is, furthermore, compatible with the finding that one-word predicates are preferably interpreted as distributive. The reason is that collective and distributive readings of one-word predicates cannot be distinguished structurally, and therefore, other factors like pragmatic considerations, world knowledge etc., can and will greatly affect a reader's choice. 13 Stepping back, we see that only predicates constructed in syntax incur a processing cost when they have to be reinterpreted as distributive. There are two possible ways to explain this fact. The first option is to interpret the data as showing that the choice of the preferred interpretation is driven by simplicity in syntax. The second option is to interpret the data as showing that the processor only has to make a choice when it encounters a predicate that was constructed in syntax. Under either interpretation, it becomes clear that the parser has to be sensitive to the distinction between predicates constructed in syntax, i.e., predicates that have to carry the quantificational D operator for their distributive reading, and predicates that can carry their distributive/collective reading in their lexical meaning.

MULTI-WORD EXPRESSIONS AND LEXICAL DISTRIBUTIVITY
We took the results of Experiment 2 to indicate that syntactically-complex predicates exhibit different biases during incremental interpretation than predicates that are not syntactically complex. Strictly speaking, though, we just see that multi-word predicates (won an award) show different preferences than one-word predicates (won). This brings up one potential confound in our study: could it be that we observe the three-way interaction between adverb, Position and the presence of object just because the multi-word predicates are longer (they consist of at least three words) than one-word predicates? It is known that the cost of reanalysis towards the marked interpretation is more pronounced when readers are allowed a longer time to build and commit to the preferred, default interpretation (Frazier and Rayner 1982). Under this explanation, the observed three-way interaction could simply be a consequence of the longer span during which the default collective reading has been built up (i.e., the reading has become entrenched).
But this hypothesis cannot account for the fact that, in the case of one-word predicates, we see the exact opposite effect: readers prefer late disambiguation towards distributive, rather than collective, readings. This is unexpected if certain predicates are preferably understood as collective, and only the length of these predicates modulates the cost of reanalysis towards the marked interpretation.
A related question is whether there would be any way to distinguish between predicates that consist of several words and predicates that are constructed in syntax. Are there multi-word predicates that should be seen as single units from the perspective of semantics, and that do not need quantificational elements (like the D-operator) to resolve distributive/collective ambiguities? Addressing this question opens up an interesting research area where the study of plurality, processing and the lexicon/syntax interface interact. Clearly, though, this research program goes beyond the topic of this article. We only note that, in the semantic literature, it has been proposed that some multi-word predicates might be treated as atomic units and, consequently, that such predicates might lack a preference for collectivity. Idiomatic expressions like take a nap and predicates with bleached verbs like have a coffee belong here (see Vries 2015). 14
13 Alternatively, one-word predicates might in general prefer distributive readings. That is, once we remove structural considerations like the requirement for the D operator, the distributive reading is preferred by the processor. Notice that the acceptability study argues against the possibility that we inadvertently introduced a contrast between one-word and multi-word predicates in Experiment 2, such that predicates in the former group have a plausibility bias towards distributivity, and predicates in the latter group have a bias towards collectivity. If there was a collectivity bias for multi-word predicates, we would expect that individually should cause degraded acceptability when the object is present, as individually would clash with this hypothetical bias of the multi-word predicate. However, no such decrease in judgments is observed in the acceptability study. The adverb individually seems to be just as compatible with multi-word predicates in Experiment 2 as collectively is. In other words, both interpretations are equally acceptable, suggesting no such plausibility bias is present. We would like to thank an anonymous reviewer for discussion of this point.

PHONOLOGICALLY NULL SEMANTIC OPERATORS
One issue that has sparked interest in the semantic and psycholinguistic literature is whether real-time studies of incremental interpretation can provide processing evidence for null semantic operators, and for logical-form displacements that need to be posited for interpretational reasons (Pylkkänen and McElree 2006). This research line has mostly focused on the processing of quantifier scope (see Brasoveanu and Dotlačil to appear for a recent summary) and argument and aspect coercion (see Piñango et al. 2006; Pickering and Ferreira 2008).
In this paper, we turned our attention to the incremental processing of pluralities. The results of the two reported self-paced reading studies show that there is a difference between the processing of phrasal vs. lexical distributivity, and that this contrast aligns well with the theoretical research that postulates a null semantic operator in the former, but not in the latter, case. Insofar as this interpretation is correct, it can be seen as providing further support for the research line pursued in Pylkkänen and McElree (2006). Null semantic operators impact processing, and real-time/online methodologies like self-paced reading, which provide a window into the nature of incremental processing, can be used as an independent source of evidence for the presence of null semantic operators in an LF structure.

CONCLUSION
We studied the interpretation of plural expressions like the students, focusing on the question of how the processor establishes the distributive/collective interpretation of predicates accompanying such plural expressions.
We reported the results of two self-paced reading experiments. In the first experiment, we built on Frazier et al. (1999) and replicated their finding that the human processor favors collective readings over distributive ones. Based on the results of the second experiment, we argued that the preference for collective readings is only observed when the distributive reading arises via phrasal distributivity, and the preference disappears in the case of lexical distributivity.
The findings provide evidence that the distinction between collective and distributive readings is a case of ambiguity, at least for phrasal predicates. They also indicate that the processor might give more weight to some sources of evidence during incremental interpretation. Specifically, syntactic information and structural simplicity might contribute stronger processing constraints than mental model simplicity, at least when this simplicity/complexity is measured in terms of the number of events or entities that the processor needs to postulate during incremental interpretation.

ADDITIONAL FILE
The additional file for this article can be found as follows: • Appendix.