While ‘most’ and ‘more than half’ are generally assumed to be truth-conditionally equivalent, the former is usually interpreted as conveying greater proportions than the latter. Previous work has attempted to explain this difference in terms of pragmatic strengthening or variation in meanings. In this paper, we propose a novel explanation that preserves the truth-conditional equivalence. We argue that the difference in typical sets between the two expressions emerges as a result of two previously independently motivated mechanisms. First, the two expressions have different sets of pragmatic alternatives. Second, listeners tend to minimize the expected distance between their representation of the world and the speaker’s observation. We support this explanation with a computational model of usage in the Rational Speech Act framework. Moreover, we report the results of a quantifier production experiment. We find that our account can explain the difference in typical proportions associated with the two expressions.

According to the standard analysis of ‘most’ and ‘more than half’, the sentences ‘most cats sleep’ and ‘more than half of the cats sleep’ are truth conditionally equivalent. More generally, ‘most As are B’ and ‘more than half of the As are B’ are verified by the same As and Bs. ‘Most As are B’ is analysed as conveying that the size of

(1) | ⟦most⟧( |

(2) | ⟦MTH⟧( |

In contrast to this assumption, the behaviours of ‘most’ and ‘more than half’ differ. Early work has focused on the different behaviour of the two expressions with respect to their upper bounds (

First,

On the other hand,

In this paper, we give the first fully fleshed-out version of the pragmatic strengthening hypothesis. We argue that two independently needed mechanisms in the interpretation of quantifiers suffice to predict the difference between the two expressions without assuming a difference in scale structures or truth conditions. The first mechanism is the tendency of the listener to guess points central to a category to minimize the expected distance between their own guess and the speaker’s observation. The second mechanism is the structural theory of conceptual alternatives, which lets the alternative set of an utterance depend on the structure of the concept conveyed by the utterance. We show that these mechanisms make the correct predictions with a computational model of pragmatics, the Rational Speech Act model. We support our proposal with experimental data, showing that a hierarchical Bayesian model implementing our account can fit the quantifier production data.

Solt (

While the precise proportions reported in Solt (

Solt (

(3) | ⟦most⟧(_{S}_{S} |

(4) | ⟦MTH⟧(_{S}_{S} |

Where _{S}

Solt (

Solt accounts for the difference in the upper bounds of the two expressions with a difference in the scalar implicatures they generate, still due to the two scale-types. Solt points out that ‘more than half’ has a rich set of alternative utterances, including ‘more than two thirds’ and ‘more than three quarters’. On the other hand, the alternative utterances to ‘most’ are sparser, including ‘all’. Since the set of alternative utterances is more fine-grained for ‘more than half’ than for ‘most’, scalar implicatures constrain the upper bound of the former to be lower than that of the latter. Solt proposes that the two expressions have different sets of alternatives because each expression only alternates with expressions that use the same scale type.

Solt’s account, as summarized in this section, relies on a semantic difference between ‘most’ and ‘more than half’, specifically in the structure of the scales they use. As such, Solt (

In this section, we present our account informally, before formalizing it in Sections 4 and 5. Our account explains why ‘most’ and ‘more than half’ are typically used to convey different proportions, based on two mechanisms. The first is the idea that the listener attempts to minimize the difference between their guess and the speaker’s observation. The second is the fact that different conceptual structures cause different sets of alternatives. We next consider these two mechanisms in turn.

The members of many semantic domains, such as numbers, colors, or proportions, enter into relations of similarity to each other. For instance, two shades of blue can be closer to each other than either of them is to a shade of red. On the other hand, some semantic domains, such as nationality, football teams, or personal identity, are not usually structured by similarity relations. For instance, it is nonsensical to claim that Billy the Kid is closer, in terms of his identity, to Jesse James than to Doc Holliday.

In many cases, when communication happens in domains structured by similarity, and the listener’s task is to construct a representation of the world state given a description produced by the speaker,

This measure of communicative success has several motivations. First, if there are finitely many signals but infinitely many possible observations, the probability that the speaker’s observation coincides with a guess by the listener is 0 (except for at most a finite set of possible observations, such as the extremes of the scale). This is the case of the scale of proportions, which is the focus of this paper. Another example is the scale of heights: if a speaker observes that John has height

In this perspective, it is sensible for a listener to not simply sample from the set of possible world states given their probability after receiving the message, but rather to minimize the expected distance between their guess and the true world state. For instance, if the speaker utters ‘blue’, the listener might select a shade of blue that is located around the center of the blue category, because a point near the center of the category will have a lower expected distance to the true world state than a point that is around the margin of the category. Previous literature supports this idea that listeners tend to guess the center of a category when communicative success depends on the similarity between true state and listener’s guess, e.g., Jäger et al. (

Consistently with the previous literature (See Chapter 2 of Carcassi (

The listener’s tendency to guess a state that minimizes the expected distance to the speaker’s observation, when in a scalar semantic domain, is not only a result about rational agents, but also aligns with the way we use quantifiers in practice. For instance, imagine receiving the signal ‘between 50 and 100’, and creating a representation of the world state. Even within the part of the scale of integers covered by the expression—e.g. numbers between 50 and 100—the guess does not happen uniformly. Rather, we intuitively tend to guess an integer around the center of the category, i.e., around 75. In other words, we are less likely to select a number close to the category boundaries, such as 99. As we discuss in more detail below, the situation is subtler when multiple possible utterances are involved.
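The ‘between 50 and 100’ intuition can be checked with a few lines of Python. This is only a minimal sketch: it assumes a uniform posterior over the integers from 50 to 100 and uses absolute difference as the distance, neither of which is fixed by the text above; the names `posterior` and `expected_distance` are illustrative.

```python
# Hypothetical illustration: a listener hears 'between 50 and 100' and holds a
# uniform posterior over the integers 50..100 (an assumption of this sketch).
states = list(range(50, 101))
posterior = {s: 1 / len(states) for s in states}

def expected_distance(guess, posterior):
    """Expected absolute distance between a guess and the true state."""
    return sum(p * abs(guess - s) for s, p in posterior.items())

# The guess with the lowest expected distance is the centre of the category.
best_guess = min(states, key=lambda g: expected_distance(g, posterior))
print(best_guess)  # 75
```

Under these assumptions a guess near the boundary, such as 99, has a strictly larger expected distance than the central guess.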

In this paper, we point to the

Formally, the structural theory of alternatives starts with the idea of a structural alternative. _{str}

(5) | _{str} |

In words, the set of utterances that enter in the calculation of implicatures for

While the original criterion for alternatives in Katzir (

(i) | Every dad_{i}_{i}_{j}_{j} |

Which implicates the negation of:

(ii) | Every dad_{i}_{i} |

(iii) | Every dad_{i}_{i} |

In the syntactic approach to alternatives construction discussed above, (i) might generate the following alternatives:

(iv) | Every dad_{i}_{i} |

(v) | Every dad_{i}_{j} |

While (iv) is a correct prediction (see (ii)), it is not prima facie clear whether (v) is meaningful. Moreover, even under further assumptions that make (v) meaningful, it is hard to see how the negation of (iii) can be derived.

The second argument favouring the conceptual account of structural alternatives is that some alternatives might be inexpressible in a language. As an example, consider the English sentence:

(vi) | John broke all of his arms. |

which arguably sounds odd because it is in competition with the sentence:

(vii) | John broke both of his arms. |

However, the French counterpart to (vi) sounds as odd as its English version:

(viii) | Jean s’est cassé tous les bras. |

despite the lack in French of a word for ‘both’. A natural explanation for this is that alternatives are conceptual rather than strictly linguistic, and since the concept of ‘both’ is available even when a word for it is lacking, French speakers derive it as an alternative.

In this paper, we will assume the conceptual rather than the syntactic criterion for alternatives generation. This allows ‘more than half’ (and not only ‘more than

We make three crucial assumptions about the way alternatives are generated for the expressions under consideration. First, not every expression of the form ‘

In what follows, we illustrate the model with the fractions obtained with numbers up to 3 and consider numbers up to 4 when fitting experimental data in Section 7. We chose numbers up to 4 based on previous literature suggesting that they are cognitively simple. First, they are the numbers within the subitizing range, namely that range of numbers that can be evaluated rapidly and confidently (

The second assumption we make is that the quantifiers constructed by substitution satisfy the properties of conservativity, extensionality, and isomorphism-closure invariance discussed in Peters & Westerståhl (

The third assumption we make is that the concepts expressed by ‘most’ and ‘more than half’ are structured in the way proposed by Hackl (_{str}_{str}

In this section, we have presented two mechanisms that play a role in the way quantifiers are interpreted. These two mechanisms have already been discussed in the literature in other contexts (

Our model is in the Rational Speech Act (RSA) modelling framework. It centres on a pragmatic speaker $S_2$ who selects a signal that is most useful to a pragmatic listener $L_1$. In the standard RSA model, no matter what signal $L_1$ receives, they always interpret it against the background of a fixed set of alternative signals. In contrast, in our model the utterance produced by $S_2$ also determines the set of alternatives considered by $L_1$ in their pragmatic reasoning, in the way described by the conceptual account of alternatives discussed above.

In the case at hand, $S_2$ selects ‘most’ or ‘more than half’ not simply as a function of their extension on the scale of proportions, but instead also implicitly selecting the set of alternatives that will allow a pragmatic listener to choose a proportion that is as close as possible to the speaker’s observation. Since pragmatic listener $L_1$ guesses points closer to 0.5 for a rich alternative set such as the one induced by ‘more than half’, the speaker selects ‘more than half’ for such proportions. On the other hand, since the listener will guess points higher on the scale of proportions for ‘most’, the speaker produces ‘most’ for such proportions. As a consequence, the speaker chooses ‘more than half’ to describe points close to 0.5 and ‘most’ for points higher on the scale.

In the previous section, we have informally introduced two mechanisms in the interpretation of quantifiers. In this section, we propose a formal implementation of these two mechanisms in the RSA modelling framework before turning to the specific case of ‘most’ and ‘more than half’ in Section 5.

The RSA framework is meant to model the process of recursive mindreading that lies behind the pragmatic interpretation and production of utterances (See for instance Goodman & Stuhlmüller (

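The simplest RSA model starts with a set of utterances and a set of states. The pragmatic listener $L_1$ receives an utterance $u$ and updates their prior $P(s)$ over states with the probability that the pragmatic speaker $S_1$ would have produced the utterance given each state:

(6) | $L_1(s \mid u) \propto S_1(u \mid s)\, P(s)$ |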

The pragmatic speaker, in turn, observes a state and produces an utterance that tends to maximize the utility for the literal listener $L_0$ given the state:

(7) |

The utility

(8) |

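Finally, the probability that the literal listener $L_0$ attributes to each state given an utterance is simply 0 if the state does not verify the utterance, and proportional to the prior for the state otherwise:

(9) | $L_0(s \mid u) \propto \begin{cases} P(s) & \text{if } s \text{ verifies } u \\ 0 & \text{otherwise} \end{cases}$ |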
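The chain of literal listener, pragmatic speaker, and pragmatic listener described above can be sketched in a few lines of Python. This is a hedged illustration rather than the implementation used in the paper: the soft-max form of the speaker, the rationality parameter `alpha`, the zero utterance costs, the uniform prior, and the three-state ‘none’/‘some’/‘all’ language are all assumptions of this sketch, and every function name is illustrative.

```python
def normalize(weights):
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}

def literal_listener(u, states, prior, meaning):
    # L0: zero for states that falsify u, proportional to the prior otherwise.
    return normalize({s: prior[s] if meaning[u](s) else 0.0 for s in states})

def pragmatic_speaker(s, utterances, states, prior, meaning, alpha=1.0):
    # S1: soft-max of the utility log L0(s | u). alpha is an assumed
    # rationality parameter; utterance costs are assumed to be zero.
    weights = {}
    for u in utterances:
        p = literal_listener(u, states, prior, meaning)[s]
        weights[u] = p ** alpha  # equals exp(alpha * log p), and 0 when p == 0
    return normalize(weights)

def pragmatic_listener(u, utterances, states, prior, meaning, alpha=1.0):
    # L1: Bayesian update of the prior with S1's production probability.
    return normalize({
        s: prior[s] * pragmatic_speaker(s, utterances, states, prior, meaning, alpha)[u]
        for s in states
    })

# Toy language: state 0 = no objects, 1 = some but not all, 2 = all objects.
states = [0, 1, 2]
prior = {s: 1 / 3 for s in states}
meaning = {
    "none": lambda s: s == 0,
    "some": lambda s: s >= 1,
    "all": lambda s: s == 2,
}
utterances = list(meaning)

l1_some = pragmatic_listener("some", utterances, states, prior, meaning)
print(l1_some)  # state 1 receives probability 0.75, state 2 receives 0.25
```

Although ‘some’ is literally compatible with both state 1 and state 2, the pragmatic listener in this toy example favours state 1, which is the scalar implicature pattern described in the text.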

_{0}, _{1}, and _{1} in this simple RSA model. The crucial phenomenon that can be observed in _{1} calculates a scalar implicature: although utterance _{1} is, in its literal sense, compatible with both _{1} and _{2}, _{1} tends to produce _{1} mostly for _{2}, because when _{1} is observed _{1} tends to use the more useful signal _{1}. Therefore, when hearing _{1} _{1} is more likely to guess _{2}.

(a) Simple RSA model with three possible utterances _{1} calculates a scalar implicature for utterances _{1} and _{2} (_{0}, _{1}, and _{1} respectively. Note that the color indicates the probability of guessing a state given a signal for _{0} and _{1}, and the probability of producing a signal given a state for _{1}. (b) RSA model with a distance-minimizing _{1}. The model displayed in the plot uses a language with three utterances and 20 states. The listener _{1} does not simply guess the signal observed by the speaker by sampling their posterior, but rather attempt to minimize the expected distance between their guess and the speaker’s observation (

In the simple RSA models above, success in communication is binary, solely a function of whether the listener’s guess coincides with the speaker’s observed state. This is plausible in cases where the set of states has no internal structure. However, as discussed above, in the case where a notion of distance is well-defined on the set of states, the listener might not be simply trying to guess the speaker’s observation but rather might strive to minimize the (expected) distance between the state they select and the speaker’s observation.

To model the effects of a well-defined distance on the set of states, we modify $L_1$ so that instead of selecting a state by sampling from their posterior distribution given the signal, they try to minimize the expected distance between their selection and the speaker’s observation:

(10) |

where _{n}, s_{m}_{1} tends to guess points located centrally in the category, after the category has been restricted by scalar implicature.

The modification to the basic RSA model above is an implementation of the first mechanism discussed in Section 3. The second mechanism concerns the way that the comparison set depends on the speaker’s utterance.
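The distance-minimizing step of the first mechanism can be sketched as follows. The functional form is an assumption of this sketch, not a quotation of the model: each candidate guess is scored by its expected absolute distance to the true state under the listener’s posterior, and guesses are then chosen with a soft-min whose strength is a hypothetical parameter `rho`.

```python
import math

def distance_minimizing_choice(posterior, rho=1.0):
    """Turn a posterior over numeric states into guess probabilities.

    Each candidate guess g is scored by its expected absolute distance to the
    true state; guesses are then soft-min distributed with strength rho.
    The soft-min form and the parameter name rho are assumptions of this sketch.
    """
    states = list(posterior)
    exp_dist = {g: sum(p * abs(g - s) for s, p in posterior.items()) for g in states}
    weights = {g: math.exp(-rho * d) for g, d in exp_dist.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}

# Hypothetical posterior: uniform over states 11..20 on a 0..20 scale.
posterior = {s: (0.1 if s >= 11 else 0.0) for s in range(21)}
guess_probs = distance_minimizing_choice(posterior, rho=2.0)
most_likely = max(guess_probs, key=guess_probs.get)
print(most_likely)  # one of the two central states, 15 or 16
```

In this toy case the listener’s guesses concentrate around the centre of the category instead of being spread uniformly over it; larger values of `rho` sharpen that concentration.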

In the basic RSA framework, the sets of possible utterances considered by the pragmatic speaker and the pragmatic listener are identical. However, according to the structural account of alternatives discussed above, the set of utterances considered by the listener depends on the utterance they receive. For instance, if the speaker utters ‘101’, the listener will consider all alternative utterances at most at a similar level of granularity as 101, such as 91 and 100. However, if the speaker utters ‘100’, the listener in the model considers a set of alternatives containing, e.g., only 90 and 100, but not 101.
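The numeral example can be made concrete with a toy function. The two-level notion of granularity used here (multiples of ten versus all other integers) is a simplifying assumption introduced only for illustration, and the names `granularity` and `alternatives` are hypothetical.

```python
def granularity(n):
    # Toy two-level notion of granularity (an assumption of this sketch):
    # multiples of ten are coarse (10), all other integers are fine (1).
    return 10 if n % 10 == 0 else 1

def alternatives(uttered, candidates):
    """Candidates at the same or a coarser granularity than the uttered numeral."""
    return [c for c in candidates if granularity(c) >= granularity(uttered)]

candidates = [90, 91, 100, 101]
print(alternatives(101, candidates))  # [90, 91, 100, 101]
print(alternatives(100, candidates))  # [90, 100]
```

The alternative set is thus a function of the received utterance: ‘101’ brings fine-grained competitors with it, while ‘100’ does not.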

To model this, we introduce a speaker $S_2$. $S_2$, much like $S_1$, tends to select the signal that minimizes the listener’s surprise for the real state given the signal. However, the set of alternative utterances considered by $L_1$ is not independent of the signal received by $L_1$. Instead, the set of alternative utterances considered by $L_1$ (and therefore by the lower levels $S_1$ and $L_0$) depends on the utterance $L_1$ receives. $S_2$ takes this into account when selecting an utterance, calculating for each utterance the utility of the utterance given the set of alternatives that $L_1$ considers when receiving that utterance. So, while $L_1$ does not reason about $S_2$, $S_2$ selects not only an utterance but also the set of alternatives that comes with it.

Consider now for illustration the case of ‘some’, ‘all’, and ‘some but not all’ discussed above, with $S_2$ deciding which signal to produce given that they observed a 100% state or a state < 100% (and > 0%). Being a rational speaker, $S_2$ produces signals that maximize the probability that $L_1$ attributes to the true state. Since $L_1$ is a rational listener, the probability attributed to each state given a signal depends on the set of alternatives to that signal. According to the structural account of alternatives, the set of alternatives is a function of the received signal. So if $S_2$ sends ‘some but not all’ (SBNA), $L_1$ will run pragmatic reasoning on the set of utterances {‘All’, ‘Some’, SBNA} (bottom row of plots). If $S_2$ utters ‘some’, $L_1$ will reason only with {‘All’, ‘Some’} according to the structural account of alternatives, and therefore calculate the implicature from ‘some’ to SBNA (top row of plots). $S_2$ will tend to produce the utterance that is most useful for a hypothetical $L_1$ who reasons about a set of alternatives which itself depends on $S_2$’s utterance.

Structural account of alternatives with the simple example of ‘all’, ‘some’, and ‘some but not all’ (SBNA). Since 0% would be black in all plots, it is implicitly excluded from the scale for ease of visualization. Lighter colors indicate higher probability. After receiving an utterance u, $L_1$ constructs a set of conceptual alternatives specific to u. For instance, upon hearing ‘all’, $L_1$ runs pragmatic inference with the set of conceptual alternatives {all, some}, which does not include ‘some but not all’. Therefore, for each utterance u, $S_2$ calculates the utility of u for $L_1$ relativized to u’s comparison set as calculated by $L_1$, rather than considering a fixed set of alternatives. For visualization purposes, ‘all’ and ‘some’ as considered by $S_2$ are represented together in the top row as they share the same set of alternatives.

This picture of alternatives is, in many respects, a simplification. For instance, it is likely that the listener is uncertain about which set of alternatives ought to be considered in the context. More complex discussions of issues related to granularity and alternatives can be found in the literature, see e.g. Bastiaanse (

In sum, the model presented in this section will apply whenever (1) the listener is trying to minimise the distance between their guess and the speaker’s observation, and (2) different terms induce different sets of alternatives. Crucially, the model applies even if two expressions with different sets of alternatives are truth-conditionally equivalent.
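The second mechanism, a speaker who evaluates each utterance against the alternative set that the utterance itself induces, can be sketched with the ‘all’, ‘some’, ‘some but not all’ example. The sketch makes several assumptions not fixed by the text: a uniform prior, a rationality parameter of 1, no costs, two states named `partial` and `full`, and a listener who samples from the posterior (the distance-minimizing step is left out). All function names are illustrative.

```python
def normalize(w):
    total = sum(w.values())
    return {k: v / total for k, v in w.items()}

def l0(u, states, meaning):
    # Literal listener with a uniform prior over states (assumed).
    return normalize({s: 1.0 if s in meaning[u] else 0.0 for s in states})

def s1(s, alts, states, meaning):
    # Pragmatic speaker restricted to the alternative set `alts`.
    return normalize({u: l0(u, states, meaning)[s] for u in alts})

def l1(u, alts, states, meaning):
    # Pragmatic listener reasoning with the alternatives induced by u.
    return normalize({s: s1(s, alts, states, meaning)[u] for s in states})

def s2(s, producible, alternatives, states, meaning):
    # Each utterance is scored with the alternative set that it itself induces.
    return normalize({u: l1(u, alternatives[u], states, meaning)[s] for u in producible})

states = ["partial", "full"]  # more than 0% but less than 100%, and 100%
meaning = {"all": {"full"}, "some": {"partial", "full"}, "sbna": {"partial"}}
alternatives = {
    "all": ["all", "some"],
    "some": ["all", "some"],
    "sbna": ["all", "some", "sbna"],
}

s2_full = s2("full", list(meaning), alternatives, states, meaning)
s2_partial = s2("partial", list(meaning), alternatives, states, meaning)
print(s2_full)
print(s2_partial)
```

In this toy setting the speaker prefers ‘all’ for the 100% state and ‘some but not all’ for the partial state, while ‘some’ remains a viable choice for the partial state because the listener strengthens it by implicature.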

In this section, we have formalized the two mechanisms discussed in Section 3 within the RSA framework. The resulting model is summarized in natural language in

Structure of RSA model with distance-minimizing listener and structural account of conceptual alternatives. The set of alternative utterances considered by $L_1$ is not fixed but depends on the received utterance. Moreover, $L_0$ and $L_1$ do not simply guess a state based on their posterior probability given the received signal but rather tend to guess a state that is expected to be close to the speaker’s observation.

In what follows, we will model communication with quantifiers by applying the RSA model described above to the following simple referential communication task, modelled after Pezzelle et al. (

The set of possible meanings includes Aristotelian quantifiers ‘all’, ‘none’, and ‘some’, and some minimal set of alternatives for ‘more than half’ (

Meaning of each signal in the model. MT=’more than’, LT=’less than’.

Utterance | Structure | Extension |
---|---|---|
All (∀) | | {1} |
Most | | (1/2, 1] |
None (¬∃) | | {0} |
Some (∃) | | (0, 1] |
MT a half (> 1/2) | | (1/2, 1] |
MT one third (> 1/3) | | (1/3, 1] |
MT two thirds (> 2/3) | | (2/3, 1] |
LT a half (< 1/2) | | [0, 1/2) |
LT one third (< 1/3) | | [0, 1/3) |
LT two thirds (< 2/3) | | [0, 2/3) |

We make two additional assumptions about the set of meanings. First, we exclude the meaning expressed by ‘not all’, namely |

As in the modified RSA model presented above, the alternatives considered by the pragmatic listener depend on the speaker’s utterance. For instance, if the speaker uttered ‘some’ the listener would consider a set of alternatives containing ‘all’ but not ‘more than two thirds’, while if the speaker uttered ‘more than one third’ both ‘all’ and ‘more than two thirds’ would be possible options for the listener. In the present case, the utterances above can be divided into two groups, the first containing ‘all’, ‘most’, ‘none’, and ‘some’, and the second containing the remaining utterances. Each utterance in the first group contains all other utterances in that group as alternatives, and none of the utterances in the second group. Each of the utterances in the second group contains all utterances in its set of alternatives.
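The extensions and the two groups of alternatives can be written down directly. The following sketch encodes the extensions listed in the table of meanings as predicates on a proportion and the grouping described in this paragraph; the dictionary and function names are illustrative, and including each utterance in its own comparison set is an assumption of this sketch.

```python
from fractions import Fraction as F

# Extensions as predicates on a proportion p in [0, 1].
EXTENSION = {
    "all": lambda p: p == 1,
    "most": lambda p: p > F(1, 2),
    "none": lambda p: p == 0,
    "some": lambda p: p > 0,
    "more than half": lambda p: p > F(1, 2),
    "more than one third": lambda p: p > F(1, 3),
    "more than two thirds": lambda p: p > F(2, 3),
    "less than half": lambda p: p < F(1, 2),
    "less than one third": lambda p: p < F(1, 3),
    "less than two thirds": lambda p: p < F(2, 3),
}

GROUP_1 = ["all", "most", "none", "some"]

def alternatives(u):
    # First group: only the first group. Second group: every utterance.
    return list(GROUP_1) if u in GROUP_1 else list(EXTENSION)

# 'most' and 'more than half' agree on every proportion k/20 ...
proportions = [F(k, 20) for k in range(21)]
same_extension = all(
    EXTENSION["most"](p) == EXTENSION["more than half"](p) for p in proportions
)
# ... but they induce different comparison sets.
print(same_extension, len(alternatives("most")), len(alternatives("more than half")))
```

The point of the sketch is that truth-conditional equivalence and equivalence of alternative sets come apart: the two expressions pass the first check and fail the second.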

To isolate the effects of the account of alternatives discussed above from the consequences of utterance cost, we assume that signals have no cost. Moreover, to keep the results as simple as possible, $S_2$ can only produce ‘all’, ‘most’, ‘none’, ‘some’, ‘more than a half’, and ‘less than a half’, rather than the full set of alternatives in

The results of the model are shown in the accompanying figure. $L_0$ guesses uniformly within the categories expressed by each signal considered by $S_2$. $L_0$ treats ‘most’ and ‘more than half’ identically, guessing uniformly among the states between 51 and 100. Finally, $L_0$ selects the maximum for ‘all’ and the minimum for ‘none’.

(a) The plots shows the results with |_{0}, _{1}, and _{1} are shown on the same plot, they implicitly have different comparison sets. (b) The plots shows the results with |_{0} is not shown as it is identical to

With _{1}, the set of alternatives for each signal matters (second plot from top in _{1}, their upper bounds are different as a consequence of the different ways that the respective set of alternatives cover the scale. ‘More than half’ implicates less than two thirds, and therefore tends not to be used for proportions higher than two thirds, while most only implicates ‘not all’. Note that while the six signals are plotted together in _{1} does not suffice to explain the difference between ‘most’ and ‘more than half’.

$L_1$ tends to pick the central point in the categories as produced by $S_1$ (third plot from the top). As a result, $L_1$ tends to guess points closer to the middle of the scale for ‘more than half’ than for ‘most’, because the former is produced by $S_1$ for a range of proportions closer to the scale’s midpoint. Finally, the pragmatic speaker $S_2$ tends to pick ‘more than half’ for proportions closer to the midpoint of the scale than ‘most’ (bottom plot).

The results in

In the previous sections, we explained the difference in the typical proportions conveyed by ‘most’ and ‘more than half’. We implemented the proposed account in an RSA production model and showed that this model can qualitatively produce the observed effect. This section presents the results of a quantifier production experiment and analyses how well the RSA model can fit them quantitatively.

The experiment is based on the ‘grounded task’ in Pezzelle et al. (

Each participant completed 340 rounds, each round consisting of three screens. The first screen, which lasted 500ms, only contained a fixation cross. The second screen, which lasted one second, showed objects arranged in a grid, with possibly empty slots. The objects were a mixture of one type of animal and one type of artifact, the exact types varying across pictures. Each image contained between 3 and 20 (inclusive) objects. Finally, the third screen showed a grid of nine quantifiers: ‘most’, ‘more than half’, ‘all’, ‘half’, ‘many’, ‘none’, ‘less than half’, ‘few’, ‘some’ (the choice of quantifiers is the only difference in design to Pezzelle et al. (

Descriptive statistics for each signal in the experiment. This table matches

Quantifier | (a) resp | (b) % targ | (c) n targ | (d) n non-targ | (e) n total |
---|---|---|---|---|---|
None | 1186 | 0.06 (0.2) | 0.65 (2.53) | 10.88 (5.27) | 11.53 (4.92) |
Few | 3784 | 0.23 (0.14) | 2.41 (1.75) | 9.2 (5.0) | 11.62 (5.46) |
Less than half | 2647 | 0.36 (0.13) | 4.2 (2.35) | 7.72 (4.0) | 11.92 (5.12) |
Some | 1459 | 0.37 (0.19) | 4.56 (2.79) | 8.5 (4.56) | 13.06 (5.07) |
Half | 1421 | 0.49 (0.1) | 5.69 (2.7) | 5.97 (2.92) | 11.67 (4.96) |
More than half | 2762 | 0.63 (0.13) | 7.27 (3.82) | 4.13 (2.48) | 11.4 (5.0) |
Many | 1449 | 0.7 (0.17) | 9.82 (4.24) | 4.22 (2.87) | 14.04 (4.66) |
Most | 3517 | 0.77 (0.14) | 9.67 (4.81) | 2.81 (2.28) | 12.48 (5.43) |
All | 1155 | 0.95 (0.19) | 10.92 (5.33) | 0.6 (2.6) | 11.52 (5.02) |

Kernel density estimation of data aggregated across participants for various subsets of the data. (a) All data. (b) Data where both the number of targets (animals) in the picture and the number of non-targets (artifacts) were greater than 3. (c) Data with 3 or fewer target objects (animals). Y-axis for ‘None’ is scaled down for visualization purposes. (d) Data with 3 or fewer non-target objects (artifacts).

The quantifiers which were not in Pezzelle et al. (

Graded van Benthem triangle. The plot shows, for each combination of total number of objects (targets + non-targets) and number of target objects, the counts of usages of each quantifier aggregated across participants. The completely white squares are combinations that none of the participants saw. The black squares are observations for which the quantifier was never produced, and squares become lighter with increasing number of uses. A specific proportion can be imagined as a straight line starting from (0,0) (outside of each plot). The bottom right plot is an illustration of the proportion 0.5. As expected for proportional quantifiers, the upper and lower bounds for each quantifier roughly follow straight lines with various inclinations.
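The counts underlying a graded van Benthem triangle can be computed as in the following sketch. The observations are invented purely for illustration and are not experimental data; `triangle_counts` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical observations: (total objects, target objects, chosen quantifier).
observations = [
    (10, 8, "most"),
    (10, 8, "most"),
    (10, 6, "more than half"),
    (12, 6, "half"),
    (10, 8, "more than half"),
]

def triangle_counts(observations, quantifier):
    """Count uses of `quantifier` for each (total, targets) cell."""
    return Counter(
        (total, targets) for total, targets, q in observations if q == quantifier
    )

most_counts = triangle_counts(observations, "most")
print(most_counts[(10, 8)])  # 2
```

Each key is one square of the triangle; cells that never occur for a quantifier simply have a count of zero.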

The crucial result is that the difference between ‘most’ and ‘more than half’ observed by Solt (

In the model-based approach to statistics we use in this section, a model of the situation is given that, based on unobserved parameter values and prior distributions over them, predicts the probability of each possible observation. Bayes’ theorem, plus various approximation algorithms, is then used to go from the observed data to a posterior distribution over the unobserved variables. Multiple models can then be compared in terms of how well they are supported by the data.

In our case, each data point consists of three observations: the total number of objects seen by the participant, the number of animals, and the quantifier chosen by the participant. We assume that the participant chooses a quantifier based on the RSA described in Section 5, extended with the signals in

Meaning of signals added to the ones in

Utterance | Structure | Extension |
---|---|---|
Half (1/2) | | {0.5} (but see Appendix A) |
Many | | (0.4, 1] (with |
Few (¬∃) | | [0, 0.2) (with |
MT one fourth (> 1/4) | | (1/4, 1] |
MT three fourths (> 3/4) | | (3/4, 1] |
LT one fourth (< 1/4) | | [0, 1/4) |
LT three fourths (< 3/4) | | [0, 3/4) |
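The fractional thresholds appearing in the two tables of meanings correspond to the fractions that can be built from numbers up to 3 and up to 4 respectively. A small sketch makes this explicit; the function name `proper_fractions` is illustrative.

```python
from fractions import Fraction

def proper_fractions(max_number):
    """Distinct fractions strictly between 0 and 1 built from integers up to max_number."""
    found = {Fraction(a, b) for b in range(2, max_number + 1) for a in range(1, b)}
    return sorted(found)

print(proper_fractions(3))  # the fractions 1/3, 1/2 and 2/3
extra = set(proper_fractions(4)) - set(proper_fractions(3))
print(sorted(extra))        # the fractions 1/4 and 3/4
```

Moving from 3 to 4 therefore adds exactly the thresholds one fourth and three fourths, since 2/4 reduces to 1/2.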

We compare two such RSA models, with and without the structural account of conceptual alternatives (

DAG for the Bayesian model. See Section 7.2 for definitions of functions RSA and _{E}

The production model presented above consisted of an RSA model with 10 signals, 6 of which could be produced by the speaker and 4 of which were only implicitly considered as alternatives. In order to use the RSA model developed in Section 4 to fit the experimental data, the language has to be slightly enriched to include the signals in the experiment. The signals included in the model are the ones in

While in the model the participants only consider producing the available signals, they imagine a listener who considers a broader range of signals, including some that the participants themselves cannot produce. We assume that the participant is not assuming that the listener knows what signals are available for the participant to produce. For instance, in a similar experiment that only contained ‘some’ and ‘none’, we can imagine that the participant would produce ‘some’ but feel uneasy about the choice upon observing a screen where all the objects were animals. This assumption may fail, but then Solt (

Based on the discussion of conceptual alternatives above, we retain two groups of alternatives as follows: ‘All’, ‘Most’, ‘None’, ‘Some’, ‘Half’, ‘Many’, and ‘Few’ are all alternatives of each other, and do not have any of the other signals as alternatives. The remaining signals are all alternatives of each other and of the signals in the first group. For instance, using the notation of Katzir (

The semantics of quantity words ‘many’ and ‘few’ is currently being debated (

In addition to specifying the alternatives set for each quantifier, the model requires a specification of their literal meanings. Most of the quantifiers we included in the experiment have a default interpretation in the literature, and we have discussed the case of ‘few’ and ‘many’. The meaning of ‘half’ is also made non-trivial by the discreteness of the stimuli. We discuss ‘half’ in more detail in Appendix A.

Up to this point, we have considered the production behaviour of a rational RSA agent. However, the behaviour of real participants will not perfectly conform to the RSA model. First, there will be systematic error from aspects of quantifier usage that the RSA model does not capture. Second, there will be noise coming from participants pressing the wrong button or not paying attention. To account mainly for the latter kind of error, we add production noise to the model. The production noise introduces a third production noise parameter

where _{E}_{p}_{p}_{s}_{s}_{p}, ρ_{p}, a_{s}, b_{s}_{E}_{p}, ρ_{p}, a_{s}, b_{s}_{p}

The hierarchical model (displayed as a Bayesian directed acyclic graph in _{k,j}_{E}_{k}, ρ_{k}, a_{S[k,j]}, b_{S[k,j]}_{k}

The middle level of the hierarchy is the level of the individual participants _{k}_{k}_{k}

The top-level is the population level, from which the participant-level parameters are sampled. The top-level contains six distributions: a distribution over the (1)

Overall, the generative model is as follows. First, an

The prior values for the population-level distributions can be seen in

Prior predictive simulations. (a) Marginal distribution of prior samples for various model parameters. (b) 95% HPD interval for $S_2$’s behaviour for each signal. (c) 95% HPD interval for predicted participant behaviour, i.e. RSA agent with added noise. The main effect of adding noise is an increased probability of producing signals outside of the usual range of usage of the quantifier. Note that prior predictions of noisy production behaviour include nearly uniform languages, which appear to be close to 0 because uniform production probabilities are small for each state. Therefore, despite (b) and (c) looking similar, they include substantially different predicted production behaviours. (d) and (e) plot the same information as (b) and (c) respectively, for the model without the structural account of conceptual alternatives. Crucially, in (d) and (e) ‘most’ and ‘more than half’ are used identically.

Comparison of prior and posterior marginal distributions for the two models. For the posterior distributions, only the 95% HPD interval is shown.
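For readers unfamiliar with HPD intervals, the following sketch shows one common way to estimate a highest posterior density interval from samples, namely as the shortest window containing the requested share of sorted samples. It is a generic illustration with made-up numbers, not the routine used to produce the figures, and `hpd_interval` is a hypothetical name.

```python
import math

def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    # Slide a window of k consecutive sorted samples and keep the narrowest one.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]

# Made-up samples with one extreme outlier.
samples = list(range(100)) + [1000]
low, high = hpd_interval(samples)
print(low, high)  # 0 95
```

Unlike an equal-tailed interval, the shortest window is free to exclude a long tail on one side, as the outlier in the example shows.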

In addition to the model with the structural account of conceptual alternatives described above, we fit a model without the structural account of conceptual alternatives, where each signal has all and only the other signals seen by the participants as alternatives. The 95% HPD intervals for the marginalized prior production probabilities are shown in

It is worth noting the way that the noise mechanism affects the estimation for both models. If the noise parameter for a participant is high, the participant’s behaviour will depend less on their

This section described how we embed the RSA model within a hierarchical Bayesian model whose hidden parameters can be fitted to experimental data. We discussed two models which can be compared, with and without the structural account of alternatives. In the next section, we present the results of fitting for the two models and their comparison.
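One simple way to picture the production noise is as a mixture between the RSA speaker’s production probabilities and a uniform choice among the available signals. The mixture form and the parameter name `epsilon` are assumptions of this sketch; the text only states that noise raises the probability of signals outside their usual range and that high noise yields nearly uniform behaviour.

```python
def add_production_noise(probs, epsilon):
    """Mix a production distribution with a uniform one (assumed noise form)."""
    n = len(probs)
    return {u: (1 - epsilon) * p + epsilon / n for u, p in probs.items()}

# Hypothetical noiseless production probabilities for one state.
rsa_probs = {"most": 1.0, "more than half": 0.0}
noisy = add_production_noise(rsa_probs, epsilon=0.1)
print(noisy)
```

With `epsilon` close to 1 the output approaches a uniform distribution regardless of the RSA parameters, so highly noisy participants carry little information about those parameters.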

We fit the models with the Python library PyMC3 (

For both of the models, the posterior distributions of all the population-level parameters are substantially more precise than the prior distributions, indicating that the data contained substantial information about the underlying parameters (

We compare the models with and without the structural account of alternatives with the Watanabe-Akaike Information Criterion (WAIC) (

Comparison of the model with both mechanisms (‘Both’) and the model without structural alternatives (‘W/o sa’) with the WAIC. The plot is on the deviance scale, where smaller values indicate a better fit. The empty circles are the WAIC values for the two models, with their associated standard deviations shown as black bars. The black dots show the in-sample deviance of each model. Finally, the triangle and its error bar show the standard error of the difference between the WAIC of the top ranked model, i.e. the one with both mechanisms, and the WAIC of the model without structural alternatives. The main result in the plot is that the model with both mechanisms performs much better than the model without the structural account of alternatives.
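The quantities in this comparison can be computed from the pointwise log-likelihood of each observation under each posterior sample. The sketch below uses the standard WAIC definition on the deviance scale and is meant only as an illustration: it is not the library routine used for the reported values, and the function names and the input layout `log_lik[s][i]` are assumptions.

```python
import math
from statistics import variance


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def pointwise_waic(log_lik):
    """WAIC contribution of each observation, on the deviance scale.

    log_lik[s][i] is the log-likelihood of observation i under posterior
    sample s (at least two posterior samples are required).
    """
    contributions = []
    for i in range(len(log_lik[0])):
        column = [row[i] for row in log_lik]
        lppd_i = log_mean_exp(column)   # log pointwise predictive density
        p_waic_i = variance(column)     # effective-parameter penalty
        contributions.append(-2.0 * (lppd_i - p_waic_i))
    return contributions


def waic(log_lik):
    """Return (WAIC, standard error); smaller WAIC indicates better fit."""
    pw = pointwise_waic(log_lik)
    return sum(pw), math.sqrt(len(pw) * variance(pw))


def waic_difference(log_lik_a, log_lik_b):
    """Return (WAIC_a - WAIC_b, standard error of the difference).

    Both models must be evaluated on the same observations, in the same order.
    """
    diffs = [a - b for a, b in
             zip(pointwise_waic(log_lik_a), pointwise_waic(log_lik_b))]
    return sum(diffs), math.sqrt(len(diffs) * variance(diffs))
```

The standard error of the difference is computed from the paired pointwise differences rather than from the two separate standard errors, because both models are scored on the same observations.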

A crucial question about the two models is how well they predict the participants’ behaviour with respect to each of the signals.

Distribution of differences of posterior signal-wise deviance, marginalized across participants. Positive values indicate better in-sample fit for the model with both mechanisms. The model with the structural account of alternatives performed better for all signals except ‘Many’ and ‘Half’.

Posterior predictive simulations (without production error) for a new participant, for a total of 20 objects. We predict a new participant in this plot by sampling a set of individual-level parameters from the population-level distributions defined by each posterior sample and then calculating S_{2}. The left column of plots shows the 95% HPD interval of production probabilities, for the model with both mechanisms and for the model without the structural account of alternatives. Adding production error does not make a substantial visual difference in the plots, as the predicted production error is generally low. The right plots show the distribution of the means of the population-level distributions for the plots on the left.
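The procedure for predicting a new participant can be sketched as follows. This is an illustration under strong simplifying assumptions, not the code used for the figure: population-level distributions are taken to be normal and summarised by a (mean, standard deviation) pair, and `toy_production` is a hypothetical softmax stand-in for the pragmatic speaker of the RSA model.

```python
import math
import random


def simulate_new_participant(posterior_samples, production_fn, seed=0):
    """Draw one new participant per posterior sample and predict production.

    posterior_samples: list of dicts mapping a parameter name to the
        (mean, sd) of its population-level distribution in that sample.
        Normal population-level distributions are an assumption here.
    production_fn: maps a dict of individual-level parameters to a list
        of production probabilities.
    """
    rng = random.Random(seed)
    predictions = []
    for population in posterior_samples:
        # Fresh individual-level parameters for every posterior sample.
        individual = {name: rng.gauss(mu, sd)
                      for name, (mu, sd) in population.items()}
        predictions.append(production_fn(individual))
    return predictions


def toy_production(params, utilities=(0.0, 1.0, 2.0)):
    """Hypothetical stand-in for the speaker: softmax with rationality alpha."""
    weights = [math.exp(params["alpha"] * u) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    posterior = [{"alpha": (1.0, 0.5)}, {"alpha": (2.0, 0.1)}]
    for probs in simulate_new_participant(posterior, toy_production):
        print([round(p, 3) for p in probs])
```

Because a new set of individual-level parameters is drawn for each posterior sample, the spread of the resulting predictions reflects both posterior uncertainty about the population and variation between participants.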

This section presented two Bayesian hierarchical models encoding two minimally different pictures of how quantifiers are produced, with and without the structural account of alternatives. In sum, model comparison lends strong support to the model which includes the structural account of conceptual alternatives over a minimally different model without it. Moreover, the model with structural alternatives has a closer fit to the data not only for the signals we have discussed, ‘most’ and ‘more than half’, but also for most of the other signals, a consequence of the fact that in an RSA model the production distributions for the signals are interdependent.

As discussed above, Solt’s account is concerned with a wider variety of phenomena than ours. As regards the interpretation of and differences between ‘most’ and ‘more than half’, our account and Solt’s have some points of overlap and some important differences. The main point of overlap is that the difference in upper bounds between ‘most’ and ‘more than half’ is explained in both accounts in terms of a difference in the set of alternatives: while ‘more than half’ competes with ‘more than two thirds’ and similar utterances, ‘most’ does not. The reason for this difference in the sets of alternatives is different in the two accounts, namely the structural account of alternatives in our model and a difference in the types of scales in Solt’s account. The explanation for the fact that ‘more than half’ has a lower bound closer to 0.5 than ‘most’ is also different in the two models. While in our model it is due to ‘most’ pragmatically competing with an enriched sense of ‘more than half’, in Solt’s account it follows again from a difference in the scales underlying the two expressions, as explained above in Section 2.

In the previous section, we discussed and compared two RSA models with respect to their ability to fit the experimental data, namely a model with and without the structural account of alternatives. It would have been desirable to directly compare the two models with the account proposed in Solt (

More specifically, in Solt (

This problem was reduced by Solt (

While the ANS plays a crucial role in Solt (

For stimuli within the subitizing range, e.g., those stimuli where ‘None’ applies, the ANS predicts that participants would only produce signals that literally apply to the stimulus. In contrast to the ANS, the distance-minimizing mechanism applies to every proportion and correctly predicts that participants sometimes use ‘All’ and ‘None’ for proportions that are close to, but not at, the extremes. To replace the distance-minimizing listeners with the ANS, an additional mechanism could be added to the generative model so that participants might miss some of the target stimuli when many non-target stimuli are shown (or vice versa), allowing, e.g., for perceptual confusion between 20/20 (target/total) and 19/20. This would allow for misapplication of the literal meaning within the subitizing range, at the cost of further complicating the model. In conclusion, while the distance-minimizing listeners do not play as crucial a role as the structural account of alternatives, and alternatives to them could be developed, we leave such developments for future work.

Another difference between our and Solt (_{1} and _{1}; second, the competition of each quantifier with the other real alternatives at the level of S_{2}. While implicit alternatives like ‘more than 3/4’ might lower the upper bound of ‘more than half’ compared to ‘most’, if ‘most’ were not an available option, ‘more than half’ would be used by S_{2} for higher proportions for lack of a better signal. Our model, therefore, implies that the two expressions pragmatically compete with each other: ‘more than half’ is upper bounded by the lower bound of ‘most’, and vice versa. Since there is no equivalent of ‘most’ for points below 0.5, a prima facie consequence of our account is that ‘more than half’ and ‘less than half’ should not be symmetric. However, the situation is further complicated by ‘some’ competing with ‘less than half’ at the level of S_{2} in a way similar to how ‘more than half’ competes with ‘most’. As can be seen in

Overall, our model offers various advantages over Solt’s account for explaining the proportions for which ‘most’ and ‘more than half’ are used. First, it is a quantitative rather than a verbal model and can be used directly to analyse or predict behaviour as we have done with experimental data above. As discussed, it would not be trivial to expand Solt’s model to make the quantitative predictions that our model makes.

Second, our model offers a unified explanation for the bounds of both ‘most’ and ‘more than half’, in contrast to Solt’s disjunctive account. It is, of course, debatable how rich our assumptions are compared to Solt’s. However, it is worth noting that it is easy to overestimate the number of assumptions we make compared to Solt’s, because our model makes precise quantitative predictions, which requires explicit assumptions. If Solt’s account were to be made precise enough to generate the kind of quantitative predictions our model generates, further assumptions would need to be introduced, among others: a functional form for the ANS approximation, a decision threshold for considering one point greater than another on semi-ordered scales, a specific set of alternatives for ‘more than half’, and a precise mechanism for computing scalar implicatures.

Third, our model is not tailored specifically to the opposition between ‘most’ and ‘more than half’, but can rather fit quantifier production behaviour more generally, as demonstrated above. While we present no empirical observation showing that one theory makes the right prediction while the other makes the wrong one, this is partially because Solt’s account makes no clear prediction for most of the data we gathered, since it does not say anything about the way the other quantifiers we analyse interact with ‘most’ and ‘more than half’. Note that this point is different from the fact that our model is quantitative while Solt’s model is qualitative: even if Solt’s model were made specific enough to make quantitative predictions, in its current form it would still only concern ‘most’ and ‘more than half’.

Despite these advantages of our model, we have discussed only one of the features analysed by Solt (

While traditionally assumed to be truth conditionally equivalent, ‘most’ and ‘more than half’ are typically associated with different proportions. In the most developed explanation of this difference, Solt (

The RSA model we presented can be extended in various directions. First, a similar model could be used to account for the usage of modified numerals, since a contrast similar to the one discussed here can be found, e.g., between ‘more than 100’ and ‘more than 101’, where the typical guessed number for the former utterance is higher than for the latter. More generally, the model presented in this paper applies whenever two synonymous expressions behave pragmatically differently due to their conceptual structure.

Second, in the models we presented we only considered a small set of possible utterances. However, it would be valuable to study the model’s predictions when more alternative utterances are included. One possibility would be to add utterances containing more fine-grained proportions, which would induce a more granular set of alternatives for ‘more than half’. Moreover, Denić & Szymanik (

Third, the hierarchical Bayesian model can also be extended in various ways, e.g., by implementing more alternative accounts of the difference between ‘most’ and ‘more than half’ and comparing them to our account. We leave all these exciting possible developments to future work.

In general, the linguistic consequences of adding the structural account of conceptual alternatives to RSA modelling are those that have been identified by Buccola et al. (

The anonymized data, as well as the code for model fitting in PyMC3 and for analyzing the traces, can be found in the following GitHub repository:

The study was approved by the European Research Council and the University of Amsterdam, Faculty of Humanities Ethics Committee. Informed consent was obtained for each participant prior to the experiment.

In what follows, we assume that ‘A’ and ‘B’ refer to two finite sets

This type of semi-ordered structure has been proposed as a cognitive model of perception of quantities (

There are features for which two individuals might be more or less close to each other, but this does not concern their identity

This communicative setup is called

In what follows, we will focus on the scale of proportions, but what we say can be easily generalized to the scale of integers.

Katzirian alternatives can also be used if ‘more than one half’ is considered instead of ‘more than half’.

For a discussion of the role of granularity in scalar language, see e.g., Cummins et al. (

The lexicalization of the fractional concept 1/2 with the omission of ‘one’ in ‘one half’ has a pragmatic justification: ‘one’ (or ‘a’) is generally superfluous when combined with ‘half’, since except in very rare occasions ‘two halves’ would not be uttered, given the simpler available option ‘one’. This is opposed to every other denominator, which can informatively combine with more than one numerator in a way that is not reducible to other fractions.

Cf. previous work where a distance structure affects the speaker but not the listener, such as Franke (

We apply this modification only to L_{1}, assuming that the attempt to minimize distance is something above and beyond the literal reading of the signals. We leave to future work an investigation of the effects of modifying both listeners.

It would be interesting to explore the consequences of lifting this assumption. We leave this to future work.

Previous work has argued that utterances with different monotonicity profiles do not appear in the same set of alternatives (e.g.

We would like to thank Sonia Ramotowska and Simone Astarita for help with collecting the data, and Milica Denić, Sandro Pezzelle, and the audiences of the Meaning, Language, and Cognition seminar in Amsterdam and the Society for Computation in Linguistics for discussions. Moreover, we would like to thank the anonymous reviewers for their valuable feedback.

The authors have received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP/2007-2013)/ERC Grant Agreement n. STG 716230 CoSaQ.

The authors have no competing interests to declare.