1 Background

The numeric symbol “0” has two distinct guises. First of all, it is a placeholder in the representation of large numbers. For example, in “103”, it indicates that there is nothing in the second column, helping us to distinguish “103” from “13”, “1030”, etc. The other use of “0” is as a number itself, the number between “–1” and “1”. Although quite a few ancient civilisations (but famously not the Ancient Egyptian, Greek and Roman ones) had positional number systems with punctuation marks that act like placeholder “0”, the concept of “0” as a number itself first arose only in the seventh century CE, in India. This concept of a “0” number then very slowly spread worldwide. For instance, it was not until the 17th century that the concept had become common in Europe (e.g. Kaplan 1999).

Compared to other numbers, “0” is thus a relative latecomer. Naturally, this means that the word for the number “0” is similarly a relatively recent addition to the language families of the world. Given its late arrival, we can obviously do very well with a language that lacks a word like “zero”. One would expect that the only thing that changes once such a word is included, is that we can then express the newly formed numerical concept. It allows us to state scientific generalisations dependent on that concept. Newton’s first law of motion (the law of inertia), for instance, could now be stated as in (1).

(1) The acceleration of an object is zero if and only if the resultant force on the object is zero too.

In non-mathematical contexts, it may seem that “zero” brings nothing new to the language. It amounts to simply another way of saying “no”. Take (2):

(2) There are zero emails in my inbox.

Here, “zero” indicates the number of emails in the speaker’s inbox. But, of course, saying that this number is 0 is no different from saying that there are “no” such emails. In non-scientific natural language, it may seem, there is little use for a word for “zero”, given the linguistic possibility of combining existential claims with negation to express the absence of stuff. Perhaps then, “zero” in examples like (2) is a synonymous – maybe, more emphatic – alternative to “no”. In this paper, we will argue that such a characterisation is wrong. “Zero” and “no” are semantically and pragmatically distinct. In particular, we will argue that words for “0” are proper numerals, just like “one” and “thirteen”. Our conclusion will be that the meaning of “zero” is, quite simply, “0”, even in cases like (2). Additionally, we will show that the existence of a zero numeral has profound consequences for linguistic semantics. We will ultimately conclude that the fact that languages allow ascription of zero quantity to an entity provides evidence that linguistic semantics has access to what at first sight may seem like an ontological oddity: an entity with zero quantity. In other words, we will show that studying “zero” can inform us about the underlying semantic ontology of natural language.1

This work is structured as follows. In section 2, we will provide arguments against a quantifier analysis of “zero”. We will conclude that “zero” is a numeral and provide a detailed semantic analysis in sections 3 and 4. In particular, we will give an analysis of the inability of “zero” to license negative polarity items. Section 5 is a discussion of the wider semantic consequences of our analysis, in particular of our proposal to allow for the existence of a zero quantity entity. We conclude by examining a potential alternative approach.

2 “Zero” is not a quantifier

While it is clear that “zero” at times refers to the numerical concept “0”, as in “division by zero”, it is tempting to think that there is a distinct second use of “zero” which is purely quantificational. That is, “zero emails” would simply be synonymous with “no emails”, albeit perhaps a bit more emphatic due to the choice of the more marked (less frequent) “zero” over “no”.

It would be relatively straightforward to account for “zero” in this way if we think of it as a generalised quantifier: a relation between sets. Whereas “no A B” expresses that the intersection between A and B is empty, “zero A B” equivalently expresses that the intersection between A and B has cardinality 0. Given such an account, we would not expect to find any non-pragmatic differences. As we will show next, however, those predictions are wrong.
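To make the equivalence concrete, the two hypothetical generalised quantifiers can be sketched as relations between finite sets. This is a minimal illustration only: the function names `no` and `zero_gq` and the toy domain are assumptions introduced here, not part of the analysis.

```python
from itertools import combinations

def subsets(domain):
    """Every subset of a finite domain, as a Python set."""
    items = sorted(domain)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def no(A, B):
    """'no A B': the intersection of A and B is empty."""
    return A.isdisjoint(B)

def zero_gq(A, B):
    """Hypothetical quantifier 'zero A B': the intersection has cardinality 0."""
    return len(A & B) == 0

DOMAIN = {"e1", "e2", "e3"}  # toy domain, chosen for illustration only

# The two relations agree on every pair of sets over the toy domain.
equivalent = all(
    no(A, B) == zero_gq(A, B)
    for A in subsets(DOMAIN)
    for B in subsets(DOMAIN)
)
```

On such an account the two determiners are truth-conditionally indistinguishable, which is exactly why any semantic difference between them would be unexpected.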

2.1 Distributional differences

If “no” and “zero” are equivalent generalised quantifiers, then we would expect them to have a similar distribution. It turns out, however, that in many respects, “zero” behaves distributionally like a numeral, not like a quantifier. A first indication of this is that, like other numerals, it allows NP ellipsis. In contrast, “no” disallows such ellipsis. Only the full DP “none” can be anaphoric to cars in (3).

(3) a. John owns four cars. Bill owns zero (*ones).
  b. John owns four cars. Bill owns thirteen (*ones).
  c. John owns four cars. Bill owns *no/none.

According to our native speaker informants, measure nouns like “litre” and “metre” combine with numerals, and also with “zero”. Yet, they are infelicitous with “no”:

(4) There are { zero/thirteen/??no } litres of milk in the fridge.

Ratio expressions like “DP per N” only allow a very specific class of DPs, especially numeral ones: “they sold { a hundred/*all } tickets per week”. Our informants tell us that, here too, “zero” pairs with numerals and not with “no”. For instance:

(5) This drink contains { zero/thirteen/*no } grams of sugar per bottle.

Another environment that is suitable for numerals is multiplicatives, as in (6).

(6) John visited his grandmother { zero/thirteen/??no } times.

In terms of distribution, then, “zero” appears to behave like a normal numeral. Although we take this to be suggestive of a non-quantificational nature to “zero”, it could of course in principle be that, although “zero” and “no” are semantically equivalent, they differ starkly in their syntactic requirements. That is, in order to show that a quantificational analysis is on the wrong track, we will have to show that “zero” and “no” differ in important semantic respects. As we will show next, they do.

2.2 Polarity and NPI licensing

“Zero” and “no” are obviously negative expressions. They differ, however, with respect to the nature of their negativity. In subject position, negative quantifiers trigger positive tag questions, just like sentential negation does, as shown in the following contrasts.

(7) John doesn’t love her, does he/*doesn’t he?

(8) John loves her, doesn’t he/*does he?

(9) No students have read my book, have they/*haven’t they?

(10) Most students have read my book, *have they/haven’t they?

“Zero”, however, pairs with positive quantifiers like “most”, not with “no”, as was previously noticed by De Clercq (2011), who gives (11) as an example.2

(11) Zero people love her, *do they/don’t they?

This datum suggests that, in subject position, “zero” contributes a different semantics of negation than “no”. This is confirmed by the inability of “zero” to license NPIs. In (13), we show this for the strong NPI “in years”.

(12) No student has visited me in years.

(13) *Zero students have visited me in years.

The judgements reported in the literature regarding weak NPIs are not totally consistent. However, what is consistently reported is that there is a clear contrast between “no” and “zero” in licensing weak NPIs. For instance, Gajewski (2011) reports the contrast between (14) and (15), giving (15) a question mark.

(14) No student ever said anything.

(15) ?Zero students said anything.

Zeijlstra (2007) thinks the contrast is clearer and gives (16), a totally parallel example, a star:

(16) *Zero students bought any car.

If both “no” and “zero” are equivalent negative quantifiers, it is hard to see how their licensing of tag questions and NPIs can be so different.

In summary, the negative force of “zero” is weaker than that of “no” in the sense that it does not license sentence negation phenomena like positive tag questions and NPIs.3 We take this fact to be sufficient to dismiss an analysis of “zero” as a negative quantifier. In the remainder of this article, we will show that these properties follow once we assume that the semantics of “zero” is parallel to that of (other) numerals.

3 Numeral semantics

Numerals are not quantificational determiners in the classical generalised quantifier sense. In particular, it is often remarked that numerals lack quantificational force, as evidenced by minimal pairs like (17) and (18), from Link (1987).

(17) Three men lifted the piano.

(18) Three men can lift the piano.

While (17) has existential force, (18) is a generic statement about the lifting capacities of groups of three men. The semantics for the numeral “three”, then, should be void of existential force, since that existential force must come from the particular environment that is present in (17) and absent in (18).

Most contemporary semantic analyses of numerals assume that numerals only have indirect quantificational force. We will ultimately propose a semantics of “zero” that is entirely parallel to such existing proposals for other numerals. Before we can do this, we will need to be very clear about the underlying assumptions that exist in the literature.

3.1 A standard account

We start by discussing two prominent non-quantificational approaches to numeral semantics.

3.1.1 The modificational view

On the first analysis, which could be called the modificational view, numerals are like intersective adjectives: they denote properties (for instance, Rothstein 2016). On this view, the semantics of (17) and (18) first of all concerns the intersection between three properties: being men, lifting or being able to lift the piano, and being 3. This means that at their core they involve the open propositions in (19) and (20), respectively.

(19) #x = 3 ∧ *man(x) ∧ *lift-the-piano(x)

(20) #x = 3 ∧ *man(x) ∧ ◊ *lift-the-piano(x)

Here, the cardinality information is given as a property of the variable x: #x = 3 says that x is a group consisting of three atomic entities. If x has this property of being 3-many, then x must obviously be a complex entity, a so-called plural individual. This is why the predicates man and lift-the-piano are lifted to properties of plurals, using the * operator. (See below).

For the respective existential and generic force, external quantifiers are introduced. In the existential case it yields (21).

(21) ∃x[#x = 3 ∧ *man(x) ∧ *lift-the-piano(x)]

We will not have much to say about how this quantificational force comes about, and in what follows, we ignore generic uses completely.

Importantly, (21) represents an at least reading for the numeral: it says that for some group of three it is the case that this group consists of piano-lifting men, but it does not exclude the possibility that more men lifted the piano. This is a welcome prediction, since the distributive reading of “three men lifted the piano” is consistent with more men doing so, as evidenced by the contrast between (22) and (23):4

(22) Three men lifted the piano, if not more.

(23) Exactly three men lifted the piano, #if not more.

This is not to say that “three” is not regularly interpreted to mean exactly three. One would have to posit some additional mechanism, like that of exhaustification (see below), to strengthen (21) into such a reading.

The modificational view derives the meaning in (21) by analysing numerals as denoting sets of equally sized individuals. For instance:

(24) 〚three〛 = λx.#x = 3

This says that “three” denotes the set of all complex entities that consist of exactly 3 atoms. Just like the combination of a noun and an (intersective) adjective like “yellow” is interpreted as the intersection between their two extensions, numeral-noun combinations are interpreted via intersection, too.

(25) 〚three men〛 = λx.#x = 3 ∧ *man(x)

According to (25), “three men” is a set-denoting indefinite. It is subject to whatever further compositional operations other indefinites are subject to, like gaining existential force from elsewhere.

3.1.2 The degree type view

On a closely related view, numerals do not express properties, but they rather directly express cardinalities (Hackl 2000). For instance:

(26) 〚three〛 = 3

The connection between the noun phrase and the numeral is mediated by a silent counting quantifier “MANY”. We will follow Hackl (2000) and many others in assuming that MANY simultaneously allows the numeral to express the cardinality of something and introduces existential force.5

(27) 〚MANY〛 = λn.λA.λB.∃x[#x = n ∧ *A(x) ∧ *B(x)]

This says that silent MANY expresses a determiner that takes a number and two properties and returns the proposition that there exists a group of that number that has both properties. For instance, “three men lifted the piano” is analysed as [[[three MANY] men] lifted the piano], resulting in:

(28) ∃x[#x = 3 ∧ *man(x) ∧ *lift-the-piano(x)]

This result is exactly the same as what we arrived at via the modificational account.
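The convergence of the two views can be illustrated with a small executable model. The sketch below makes several simplifying assumptions that are not in the text: atoms are strings, pluralities are frozensets of atoms, predicates hold of atoms only (so only the distributive reading is modelled), and all function and variable names are illustrative.

```python
from itertools import combinations

def star(Z):
    """Link's *: all sums of non-empty subsets of a set of atoms Z."""
    items = sorted(Z)
    return {frozenset(c) for r in range(1, len(items) + 1) for c in combinations(items, r)}

def three(x):
    """Modificational view, as in (24): the property of consisting of 3 atoms."""
    return len(x) == 3

def modificational(numeral, A, B):
    """Intersect the numeral with *A as in (25), then close existentially as in (21)."""
    star_B = star(B)
    numeral_noun = {x for x in star(A) if numeral(x)}
    return any(x in star_B for x in numeral_noun)

def many(n, A, B):
    """Degree view, as in (27): there is an x with #x = n, *A(x) and *B(x)."""
    star_B = star(B)
    return any(len(x) == n and x in star_B for x in star(A))

men = {"al", "bo", "cy", "dan"}
lifters = {"al", "bo", "cy", "dan", "eve"}  # four men (and Eve) lifted the piano

# Both views yield the same "at least" truth value for "three men lifted the piano".
same_result = modificational(three, men, lifters) == many(3, men, lifters)
```

In this toy scenario four men lifted the piano, and both functions still return true for “three”, which is the at least reading discussed above.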

3.2 Numerals and plurality

Because in these views the cardinality information provided by a numeral is thought to be the property of an entity (as in #x = n), the semantics of numerals is intrinsically related to semantic plurality. Clearly, singular entities cannot have the property “thirteen”, only plural entities, i.e. so-called pluralities, can. Because our argument will be that “zero” is important to semantic plurality, it is worthwhile to be a bit more specific about the assumptions on semantic plurality.6

Pluralities are made of atoms. In particular, if a and b are two atomic entities, then there exists a plural entity ab, the plurality that consists of nothing but a and b, or, the sum of a and b. In general, for any set of entities X, there exists an entity ⊔X whose parts are the elements of X as well as their parts, while nothing else is part of that individual. So, ⊔{john, mary} is john ⊔ mary and ⊔{john ⊔ mary, sue, ann} is john ⊔ mary ⊔ sue ⊔ ann.

Following Link (1983), we write * for the operator that allows us to map any set to the set of all pluralities that can be built from the elements of this set.

(29) *Z = {⊔X | X ⊆ Z & X ≠ ∅}.

This operation is illustrated in Figure 1. The arcs between the nodes correspond to inclusion, ⊏, when read from bottom to top. So, a ⊏ ab, and ab ⊏ abc.7

Figure 1: The set of pluralities *{a, b, c, d}.

The operation * is essential to explain how predicates that are incompatible with group action can nevertheless combine with plural arguments. For instance, breathing is a property of atoms only and so, one would expect that its extension only contains atoms, this in contrast to collective predicates like “to meet”. However, if “to breathe” is true only of atoms, how come we can truthfully state that “John and Mary are breathing”? The answer is that in this case the predicate is pluralised using *. If B is the set of breathing atoms, and if that set includes both John and Mary, then *B will include john ⊔ mary. “John and Mary are breathing” is true if and only if the plurality john ⊔ mary is indeed a member of *B.

Once we have plural individuals in this way, we can also express numerical properties. In Figure 1, there are four layers. The bottom layer is the layer of atoms, entities of cardinality 1. The layer above that has the pluralities of cardinality 2, and so forth.
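The * operation in (29) and the layering by cardinality can be sketched as follows. This is an illustrative model under stated assumptions: atoms are strings, a plurality is a frozenset of atoms, sum is set union, and the names `star`, `pluralities` and `layer_sizes` are introduced here for exposition.

```python
from itertools import combinations

def star(Z):
    """(29): the sums of all non-empty subsets of a set of atoms Z."""
    items = sorted(Z)
    return {frozenset(c) for r in range(1, len(items) + 1) for c in combinations(items, r)}

pluralities = star({"a", "b", "c", "d"})

# Number of pluralities at each cardinality, from the atoms (1) up to the top (4).
layer_sizes = [sum(1 for x in pluralities if len(x) == n) for n in range(1, 5)]

# Parthood corresponds to proper inclusion of the underlying atom sets.
a_part_of_ab = frozenset({"a"}) < frozenset({"a", "b"})

# "John and Mary are breathing": the sum of John and Mary is in *B.
breathers = {"john", "mary"}
john_and_mary_breathe = frozenset({"john", "mary"}) in star(breathers)
```

Because the empty subset is skipped, no element of `pluralities` has cardinality 0, which is the semi-lattice property the discussion relies on.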

The definition we provided for * is the one commonly assumed in the literature. It explicitly excludes the sum of the empty set. Formally, this makes the resulting structure a so-called semi-lattice.

The sum of the empty set is an individual that has no proper parts. It is the bottom element in a full lattice.8 We will write this element as ⊥ and, so, ⊔∅ = ⊥. Correspondingly, we could have suggested a way of forming pluralities from a set Z that includes the sum of the empty subset of Z. We will write this operation as ×:

(30) ×Z = {⊔X | X ⊆ Z}

This operation yields the full lattice in Figure 2, which is exactly the same as the semi-lattice in Figure 1 except that it has this extra element, which turns out to be a proper part of any other entity in the structure. Crucially for our argument, while the semi-lattice in Figure 1 only allows us to express cardinalities of 1 or more, the full lattice of Figure 2 contains an entity of cardinality 0 as well.

Figure 2: The set of pluralities ×{a, b, c, d}.
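For comparison, the × operation in (30) can be sketched in the same style. As before, this is only an illustration: pluralities are modelled as frozensets of atoms, the bottom element ⊥ as the empty frozenset, and the names are assumptions made here.

```python
from itertools import combinations

def cross(Z):
    """(30): the sums of all subsets of a set of atoms Z, the empty subset included."""
    items = sorted(Z)
    return {frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)}

BOTTOM = frozenset()  # models the bottom element, the sum of the empty set

full_lattice = cross({"a", "b", "c", "d"})

# The bottom element is a proper part of every other entity in the structure.
bottom_below_all = all(BOTTOM < x for x in full_lattice if x != BOTTOM)

# Exactly one entity has cardinality 0.
zero_cardinality = [x for x in full_lattice if len(x) == 0]
```

The only difference from the semi-lattice is the single extra element, yet it is this element that makes a cardinality of 0 expressible.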

We are not the first to contemplate adding the bottom element to the domain of entities. Landman (2011) and Buccola & Spector (2016) have entertained and discussed a similar move. For most scholars, however, the choice between a semi-lattice domain and one in the shape of a full lattice is purely cosmetic: since the bottom element has no obvious use, it is easier to do without it. As Landman (1991) explains: “In a full [lattice] we have this 0 element. In all the important definitions, we want to apply our concepts to singular or plural individuals, excluding 0. This means that we have to add exclusion clauses (x ≠ 0) to all of them. Assuming from the start that 0 is not there, will make the definitions simpler and more readable.” (Landman 1991: 302).

A straightforward illustration of the consequences of including ⊥ comes from bare plurals. Intuitively, a sentence like (31) should receive an analysis along the lines of (32).

(31) There are typos in the text.

(32) ∃x[*typo(x) ∧ *in-the-text(x)]

This simply says that the text is not without typos. But what would happen if we applied × instead of *, as in (33)?

(33) ∃x[×typo(x) ∧ ×in-the-text(x)]

It turns out that this is a tautology. The reason for this is that for any predicate P, it is true that ×P(⊥). This is because the empty set is a subset of the extension of P no matter what that extension is. Consequently, ⊥, the sum of ∅, is a member of the extension of ×P, no matter what P denotes. It follows that there will always be an x such that ×typo(x) ∧ ×in-the-text(x), just take x = ⊥.
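The tautology can be checked by brute force over a small domain. The sketch below assumes that predicates are extensions over atoms and that pluralities are frozensets; the three-element domain and all names are illustrative choices, not part of the proposal.

```python
from itertools import combinations

def powerset(atoms):
    """All subsets of a set of atoms, as frozensets."""
    items = sorted(atoms)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def star(Z):
    return {x for x in powerset(Z) if x}   # non-empty sums only

def cross(Z):
    return set(powerset(Z))                # includes the bottom element

def exists_star(P, Q):
    """(32): some x satisfies both *P and *Q."""
    return bool(star(P) & star(Q))

def exists_cross(P, Q):
    """(33): some x satisfies both ×P and ×Q."""
    return bool(cross(P) & cross(Q))

DOMAIN = {"d1", "d2", "d3"}
extensions = powerset(DOMAIN)  # every possible extension for a predicate

# With ×, the existential statement holds for every choice of P and Q.
cross_is_tautology = all(exists_cross(P, Q) for P in extensions for Q in extensions)

# With *, it holds exactly when P and Q share at least one atom.
star_tracks_overlap = all(
    exists_star(P, Q) == bool(P & Q) for P in extensions for Q in extensions
)
```

The witness for the ×-version is always the empty frozenset, mirroring the step "just take x = ⊥".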

3.3 Numeral “zero”

If we want to maintain that “zero” is a numeral, a significant dilemma emerges. On the one hand, if we maintain the dominant choice in the literature of adopting *, we have no hope of doing justice to “zero”, since it expresses 0 cardinality, a concept that is not defined in the semi-lattice obtained by applying *. On the other hand, if we go against the grain and adopt ×, then it turns out that sentences with “zero” inherit the problem we just observed for bare plurals: we wrongly predict them to be tautological (and, as we will see below, other problematic cases of the same kind can be found).

We illustrate this dilemma using the degree denoting account of numerals. On such an account, we would interpret “zero” as in (34).

(34) 〚zero〛 = 0

In combination with MANY, it yields the determiner meaning in (35):

(35) λA.λB.∃x[#x = 0 ∧ *A(x) ∧ *B(x)]

Given this, “zero students passed the test” is assigned the truth-conditions in (36).

(36) ∃x[#x = 0 ∧ *student(x) ∧ *pass-the-test(x)]

This is a contradiction. The reason is that the following two requirements made by (36) clash. First of all, the only values for x that could satisfy pluralized predicates like *student(x) and *pass-the-test(x) are values from a semi-lattice. Second, the only value for x that could satisfy #x = 0 is ⊥. Since ⊥ is never part of the semi-lattices formed by the *-operation, these two requirements are irreconcilable.

This is, of course, an unwelcome result. We can very well imagine what it is like for it to be true that zero students passed. Using a full lattice, moreover, the result is not much better.

(37) ∃x[#x = 0 ∧ ×student(x) ∧ ×pass-the-test(x)]

No matter what the extensions of student and pass-the-test are, the full lattices formed by applying × to them are always going to contain ⊥. So, (37) is always true: just take x = ⊥.

Note that the modificational view, on which “zero” would denote the property of having 0 cardinality (λx.#x = 0), gives exactly the same result – it would also assign (36) or (37) to “zero students passed the test”.9
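Both horns of the dilemma can be verified mechanically on a toy model. In this sketch the pluralisation operator is passed to MANY as a parameter so that (36) and (37) can be compared directly; that parameterisation, the domain and the names are assumptions made for illustration.

```python
from itertools import combinations

def powerset(atoms):
    """All subsets of a set of atoms, as frozensets."""
    items = sorted(atoms)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def star(Z):
    return {x for x in powerset(Z) if x}   # semi-lattice: no bottom element

def cross(Z):
    return set(powerset(Z))                # full lattice: bottom element included

def many(n, A, B, pluralise):
    """(27)/(35) with a chosen pluralisation: some x has #x = n and is in both."""
    return any(len(x) == n for x in pluralise(A) & pluralise(B))

DOMAIN = {"d1", "d2", "d3"}
extensions = powerset(DOMAIN)

# (36): with *, "zero A B" is false for every choice of A and B.
contradiction_with_star = not any(
    many(0, A, B, star) for A in extensions for B in extensions
)

# (37): with ×, "zero A B" is true for every choice of A and B.
tautology_with_cross = all(
    many(0, A, B, cross) for A in extensions for B in extensions
)
```

Neither outcome depends on how many students passed, which is what makes both options inadequate as they stand.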

While in section 2 we presented arguments that “zero” is not a quantifier, we now have seen that giving “zero” a numeral semantics creates significant problems. As we will show next, however, once we take exactly readings of numerals into account and derive them from at least readings, things start falling into place.

4 Proposal for “zero”

As we will argue now, the issues for a numeral analysis of “zero” raised above can be resolved once we take into account the fact that numerals alternate between at least and exactly readings, and once we allow the exactly reading to be derived from the at least one. In fact, in this section we will show that the numeral analysis predicts precisely how the polarity of “zero” differs from that of “no”.

4.1 Tautological semantics and exhaustification

Recall that on an at least reading account of numerals (both on the modificational and the degree-denoting account), (38) can be given the semantics in (39).

(38) Zero MANY students passed the test.

(39) ∃x[#x = 0 ∧ ×student(x) ∧ ×pass-the-test(x)]

As we explained, as it stands this is a trivial statement: it is true irrespective of how many students passed, since any property ×P holds of the bottom element. This means it is an at least meaning, compatible with there being a y s.t. #y > 0 & ×student(y) & ×pass-the-test(y). And, so, in other words, (39) says that zero or more students passed the test.

Like with other numerals, however, an exactly meaning can be derived from the at least meaning by exhaustification. Here, and in what follows, we will assume that this meaning comes about via a silent operator EXH, defined along the lines of Chierchia (2004), Chierchia et al. (2013) and Fox & Spector (2018).10

Since at sentence-level “zero” means “zero or more”, other numerals offer stronger statements (“one or more”, “two or more” etc.). Exhaustification denies all such stronger statements (the meaning component added by exhaustification is the second conjunct):

(40) 〚EXH Zero students passed〛 = ∃x[#x = 0 ∧ ×student(x) ∧ ×pass-the-test(x)] ∧
                                                        ¬∃y[#y > 0 ∧ ×student(y) ∧ ×pass-the-test(y)]

This now states that there are zero or more students that passed the test, but that there are not more than 0. In other words, the number of passing students is exactly 0: everyone failed.
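The effect of exhaustification in (40) can be sketched on a toy model. The function `exh` below is a deliberately simplified stand-in for EXH: it negates only the numeral alternatives with a higher cardinality, and it is generalised from 0 to an arbitrary n as an assumption of this sketch. The names and the example sets are likewise illustrative.

```python
from itertools import combinations

def cross(Z):
    """×Z: sums of all subsets of a set of atoms Z, bottom element included."""
    items = sorted(Z)
    return {frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)}

def at_least(n, A, B):
    """Unexhaustified numeral meaning: some x has #x = n, ×A(x) and ×B(x)."""
    return any(len(x) == n for x in cross(A) & cross(B))

def exh(n, A, B):
    """Simplified EXH: the at least meaning holds and no larger witness exists."""
    no_bigger_witness = not any(len(y) > n for y in cross(A) & cross(B))
    return at_least(n, A, B) and no_bigger_witness

students = {"s1", "s2", "s3"}

trivially_true = at_least(0, students, {"s1", "t1"})  # true although s1 passed
nobody_passed = exh(0, students, {"t1"})              # no student among the passers
somebody_passed = exh(0, students, {"s1", "t1"})      # s1 passed, so "exactly 0" fails
```

Under these assumptions the exhaustified meaning is true just in case the two extensions share no atoms, that is, when exactly zero students passed.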

Unlike other numerals, “zero” invokes exhaustification obligatorily. This is for purely pragmatic reasons. Quite simply, statements with “zero” are semantically defective without exhaustification. On our syntactic view on exhaustification, interpretation of a sentence with “zero” always yields two meanings, a defective one derived without EXH and a non-defective one with EXH. That latter meaning will always be the only one to surface (provided it is not itself defective for independent reasons).

In contrast to other scalar implicatures, the “not more than 0” component of exhaustified “zero” does not disappear in embedded contexts. While (41) strongly implicates that John does not take both sugar and milk, i.e. that he takes just milk or just sugar, (42) does not mean that nobody takes just sugar or just milk. Instead, it means the stronger nobody takes either. This is standard behaviour for scalar terms. Weak scalar terms, like disjunction, trigger implicatures to make them more informative. Yet, in certain embedded positions, like in the scope of negation, for instance, their weak semantics ends up being very strong because of the scale reversal introduced by the higher operator. Weak scalar terms are strong in downward entailing contexts and this is why they fail to trigger implicatures in such contexts.

(41) John takes sugar or milk.

(42) Nobody takes sugar or milk.

(43) Nobody read zero books.

The example in (43) differs from (42) in that the exactly implicature is still in place, even though “zero” is in the scope of a downward entailing operator. This is simply because in contrast to other scalar terms, the semantics of “zero” is not more informative in downward entailing contexts. The negation of a tautology is just as defective as the tautology itself.

Crucially, having exhaustification rescue the defective semantics of “zero” is not the same as assigning “zero” lexical semantics that amounts to “exactly zero”. As we will see in the next section, detaching the “exactly” (EXH) component from the word “zero” plays an essential role in our analysis of the polarity of “zero”.

4.2 “Zero” and negative polarity items

As we observed in section 2, “zero” appears not to be able to license negative polarity items, in contrast to “no”. As we will explain in this section, adopting an at least semantics for “zero” and other numerals provides a way to approach this contrast. Under the view we adopted above, “zero” differs semantically from “no” in that its negative effect comes about via exhaustification. As we will show, this difference can indeed account for the differences in NPI licensing. Let us first introduce our assumptions.

Traditionally, NPIs are thought to be licensed in downward entailing (DE) environments (Ladusaw 1979). We follow Gajewski (2011) in assuming that an NPI is licensed when the environment it occurs in is DE. Assuming an exhaustification operator as we have done above, this licensing condition has two parts:

(44) Two licensing conditions for NPIs (Gajewski 2011)
        Given a structure [αEXH [β … [γ NPI ]…]]:
                   Condition 1: the environment γ is DE in β
                   Condition 2: the environment γ is DE in α

These two conditions are needed to account for differences between weak NPIs (for instance, “any”) and strong NPIs (for instance, “either”, “in years”).

(45) a. Weak NPIs are subject to Condition 1.
  b. Strong NPIs are subject to both Condition 1 and Condition 2.

To illustrate, consider the case of “few”. “Few” licenses weak NPIs, as in (46), but not strong ones, (47).

(46) No/Few students read any books by Auster.

(47) No/*Few students have visited me in years.

“Few” is downward entailing in its second argument. If it is true that “few” students read City of Glass, then it will also be true that “few” students read City of Glass twice. For any statement “few A B” there is a stronger alternative statement “no A B”, though. This means that the exhaustified meaning of “few A B” will make it false that “no A B” and thus true that “some A B”. In other words, after exhaustification “few” is no longer downward entailing. If nobody read City of Glass, then “few students read City of Glass” is true on the non-exhaustified and false on the exhaustified reading. In terms of Gajewski’s licensing conditions this means the following (assuming no polarity reversal intervenes between “few” and the NPI):

(48) [αEXH [β Few NP … [γ NPI ]…]]
  a.    γ is DE in β
  b.    γ is not DE in α

Given the distinction in licensing conditions between weak and strong NPIs stated in (45), this setup now correctly predicts that “few” licenses weak, but not strong NPIs.

If we apply this approach to “zero”, we are not immediately going to be successful. Given our at least + EXH semantics, “zero” creates the following entailment patterns. (Again we are assuming nothing affecting polarity intervenes between “zero” and the NPI.)

(49) [αEXH [β Zero NP … [γ NPI ]…]]
  a.    γ is DE in α
  b.    γ is DE in β

“Zero” is clearly downward entailing after exhaustification. If exactly 0 students read City of Glass, then exactly 0 students read it twice. The non-exhaustified version is a bit trickier. On our approach, “zero” has a defective semantics. Any statement of the form “zero A B” is – without exhaustification – a tautology. In other words, “zero A B” is a tautology, but “zero A C” is also a tautology, irrespective of our choices of A, B, and C. Let ⊤ be the tautological proposition, the proposition that is always true. Obviously, ⊤ entails ⊤. Now choose B and C such that the extension of C is a proper subset of that of B. Since “zero A B” and “zero A C” both express ⊤, they entail each other. Since the extension of C is contained in that of B, “zero” must be both downward entailing and upward entailing.

Given the situation in (49), we wrongly predict “zero” to license both kinds of NPIs. It is well-known, however, that the requirement of being in a downward entailing environment is at times too lax. In quite a few places in the literature, it is suggested that not only should NPI environments be downward entailing, they should also not be upward entailing (Progovac 1993; Lahiri 1998; Gajewski & Hsieh 2014; Barker 2017). One illustration of this comes from singular definite descriptions. Since definite descriptions are presuppositional, the relevant notion of entailment we need is so-called Strawson entailment (Von Fintel 1999). A premise Strawson-entails a conclusion, if the premise together with the presuppositions of the conclusion entail the conclusion. Given such a definition, the restrictor argument of a singular definite description is Strawson DE, as illustrated in (50).

(50) The student read City of Glass. (premise)
  There is a unique and salient happy student. (presupposition of conclusion)
  The happy student read City of Glass. (conclusion)

Given the above setup, it would now be predicted that definite descriptions license (at least weak) NPIs. They do not:

(51) *The student that attended any class read City of Glass.

However, the restrictor argument is not just downward but also upward entailing (Lahiri 1998).

(52) The happy student read City of Glass. (premise)
  There is a unique and salient student. (presupposition of conclusion)
  The student read City of Glass. (conclusion)

Given such observations, we can revise our earlier statement of NPI licensing conditions, replacing downward entailingness by non-trivial downward entailingness (NTDE), i.e. being downward, and not upward entailing.

(53) Two licensing conditions for NPIs (revised)
        Given a structure [αEXH [β … [γ NPI ]…]]:
                  Condition 1: the environment γ is non-trivially DE in β
                  Condition 2: the environment γ is non-trivially DE in α

If we now return to “zero”, we see the following:

(54) [αEXH [β Zero NP … [γ NPI ]…]]
  a.     γ is NTDE in α
  b.     γ is DE but not NTDE in β

In other words, the case of “zero” is exactly the opposite of the case of “few”. Whereas the latter meets condition 1 but not condition 2, “zero” meets condition 2 but not condition 1. Since both kinds of NPIs are subject to condition 1, we now correctly predict that “zero” licenses no NPIs of any kind.
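The entailment profile in (54) can be checked by brute force on a toy model. The sketch below simplifies in several ways that are assumptions of the illustration rather than of the analysis: entailment is tested only in the second argument of “zero”, for one fixed restrictor, over a three-element domain, and EXH is reduced to the negation of a larger witness.

```python
from itertools import combinations

def powerset(atoms):
    """All subsets of a set of atoms, as frozensets."""
    items = sorted(atoms)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def zero_beta(A, B):
    """Unexhaustified 'zero A B' (constituent β): the bottom element is a witness."""
    return any(len(x) == 0 for x in set(powerset(A)) & set(powerset(B)))

def zero_alpha(A, B):
    """Exhaustified 'zero A B' (constituent α): additionally, no non-empty witness."""
    shared = set(powerset(A)) & set(powerset(B))
    return zero_beta(A, B) and not any(len(y) > 0 for y in shared)

def is_de(f, domain):
    """Downward entailing: f(B) guarantees f(C) for every C contained in B."""
    return all((not f(B)) or f(C) for B in powerset(domain) for C in powerset(B))

def is_ue(f, domain):
    """Upward entailing: f(C) guarantees f(B) for every B containing C."""
    return all((not f(C)) or f(B) for B in powerset(domain) for C in powerset(B))

def is_ntde(f, domain):
    """Non-trivially DE: downward entailing and not upward entailing."""
    return is_de(f, domain) and not is_ue(f, domain)

DOMAIN = frozenset({"s1", "s2", "t"})
STUDENTS = frozenset({"s1", "s2"})

def beta(B):
    return zero_beta(STUDENTS, B)

def alpha(B):
    return zero_alpha(STUDENTS, B)

profile = {
    "beta_DE": is_de(beta, DOMAIN),
    "beta_NTDE": is_ntde(beta, DOMAIN),
    "alpha_NTDE": is_ntde(alpha, DOMAIN),
}
```

In this model the unexhaustified environment comes out as both DE and UE, hence not NTDE, while the exhaustified one is NTDE, matching the pattern stated in (54).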

Our account of the lack of NPI licensing by “zero” is dependent on our assumption that numerals come with an at least semantics, which is turned into an exactly meaning by the additional operation of exhaustification. If we had assumed that the exactly meaning is basic, then we would have had no account of the NPI licensing contrast between “zero” and “no”, since “zero” is non-trivially downward entailing on that reading.

In other words, two crucial ingredients have allowed us to explain the polarity profile of “zero”: (i) the at least semantics of numerals; (ii) the inclusion of ⊥ in the domain of entities. The first of these is relatively uncontroversial; the second is more controversial. For that reason, we now turn to the consequences of that second assumption.11

5 The bottom entity and triviality

With the assumption of the existence of a 0-quantity bottom entity, the at least semantics of “zero” becomes trivial. As we argued above, the observed polarity behaviour of “zero” follows from how this triviality is overcome. Even though, semantically, statements with “zero” are tautological, the scalar inferences they generate are not.

However, whereas we argued that triviality is a core part of how “zero” works, the inclusion of ⊥ leads to triviality much more generally. We need to now make the case that our proposal does not cause spurious triviality outside the domain of 0 numerals.

The first case to consider is that of bare plurals. If we simply interpret them as existential statements, using ∃, then triviality emerges. For instance, (55) when interpreted as (56) is predicted to be a tautology.

(55) There are typos in the manuscript.

(56) ∃x[×typo(x) ∧ ×in-the-manuscript(x)]

Since ⊥ is in the extension of any predicate pluralised with ×, any statement of the form ∃x[×P(x)] will be true in any model. That is, the analysis should deliver not (56), but rather (57).

(57) ∃x[#x > 0 ∧ ×typo(x) ∧ ×in-the-manuscript(x)]

We do not think this is a particularly serious problem. Since × renders ∃ no longer truly existential, we simply need a new, properly existential quantifier, combining ∃ with non-emptiness. Call this operator E, defined as in (58). The form in (59) is now the proper analysis of (55).

(58) Ex[φ] :⟺ ∃x[#x > 0 ∧ φ]

(59) Ex[×typo(x) ∧ ×in-the-manuscript(x)]
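The contrast between plain ∃ and the operator E can be made concrete in a small finite model. The following is only an illustrative sketch under simplifying assumptions: entities are modelled as sets of atoms, ⊥ as the empty set, both predicates are distributive, and the names `BOTTOM`, `times`, `exists` and `E` are hypothetical helpers introduced here, not part of the proposal.

```python
from itertools import combinations

BOTTOM = frozenset()  # assumption: the 0-quantity bottom entity is the empty sum


def times(atoms):
    """×-pluralisation of a distributive predicate: all sums of its atoms, plus ⊥."""
    atoms = sorted(atoms)
    return {frozenset(c) for n in range(len(atoms) + 1)
            for c in combinations(atoms, n)}


def exists(p, q):
    """Plain ∃x[×P(x) ∧ ×Q(x)], as in (56)."""
    return len(times(p) & times(q)) > 0


def E(p, q):
    """Ex[×P(x) ∧ ×Q(x)], as in (58)-(59): ∃ combined with #x > 0."""
    return any(len(x) > 0 for x in times(p) & times(q))


typos = {"typo1", "typo2"}
print(exists(typos, set()))    # ⊥ alone verifies the plain existential
print(E(typos, set()))         # E requires a non-empty witness
print(E(typos, {"typo1"}))
```

In a model where no typo is in the manuscript, `exists` still finds ⊥ as a witness, whereas `E` does not; this is the sense in which (56) is trivial and (59) is contingent.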

Importantly, whatever is responsible for introducing the existential entailment represented in (59) will have to be different from the mechanism that introduces existential quantificational force for numerals. Obviously, if we were to use E as the existential force of “zero”, statements with “zero” would be predicted to be unsalvageable contradictions. If we want to maintain the assumption that there is only one kind of existential closure for all indefinite-like DPs, including numeral-noun combinations and bare plurals, then we would have to assume that the latter come with an empty determiner that contributes the exclusion of the bottom element.

It is important to point out, however, that it is not the case that ⊥ is semantically excluded from the interpretation of all expressions, except for numerals. A plural definite description “the X” will refer to ⊥ in worlds in which the extension of X is empty. So, (60) is predicted to be true in a situation without Australian students.

(60) The Australian students left the room.

This surely seems like an odd prediction to make and it is therefore tempting to conclude that the definite article must somehow semantically exclude the empty entity. Landman (2011), however, argues that in these cases ⊥ is excluded pragmatically via domain restriction. The definite description in a sentence like (60) is analysed along the lines of (61), where A is the set of Australians, S is the set of students and C is the contextual set of entities.

(61) that unique x ∈ ×A ∩ ×S ∩ C such that there is no y in the same set such that x is a part of y

Landman now suggests that C may or may not include the bottom entity. The interpretation of (60) is only non-trivial in case it does not. A pragmatic principle may help steer clear of tautological readings. Landman proposes the following maxim (Landman 2011: 14):

(62) Avoid Triviality: A contingent statement is better than a trivial one.

If the avoidance of triviality is pragmatic in nature, then we should be able to observe trivial readings in certain contexts. Landman mentions a few cases like that. In (63) (Landman 2011: ex. 2), he describes the relevant context as “Suppose I stand trial for fraud, and I say [(63a)] to the judge, but add sotto voce [(63b)] to you:”12

(63) a. Your honor, the persons who have come to me during 2004 with a winning lottery ticket have gotten a prize.
  b. Fortunately, I was on a polar expedition the whole year.

Note in particular the contrast with (64) (Landman 2011: ex. 9), which is fully infelicitous, as is to be expected on the assumption that the singular restricts the domain to atoms and, thus, semantically excludes the bottom entity.

(64) a. Your honor, the person who has come to me during 2004 with a winning lottery ticket has gotten a prize.
  b. Fortunately, I was on a polar expedition the whole year.

Inclusion of the bottom element in the denotation of plural predicates has an effect on one more class of expressions: downward monotone degree quantifiers – expressions like “fewer than ten”.

These expressions are often thought to denote sets of degree intervals (e.g. Kennedy (2015), who proposes (65); see also section 6 below). For instance, 〚fewer than ten〛 takes a property of numbers as its argument and states that there is no number ≥ 10 that has this property:

(65) 〚fewer than ten〛 = λP⟨d,t⟩.max(P) < 10

(66) 〚Fewer than ten MANY students passed the test〛 = max(λi.∃x[#x = i ∧ ×student(x) ∧ ×pass-the-test(x)]) < 10

Something that has not received much attention (but see Buccola & Spector 2016) is that unless we assume the bottom element, “fewer than ten” will not be downward monotone – the sentence “Fewer than ten students passed the test” will fail to come out true in a situation where no students passed the test. In such a situation, the bottom element will be the only element in the intersection of ×student and ×pass-the-test. Then, the interval defined in (66) will contain one number – namely, 0. 0 is indeed smaller than 10, and thus the sentence will come out true under the ⊥ assumption, as desired. Without ⊥ (i.e., using * instead of ×), the situation with no students passing the test will not make the sentence in (66) true and will make “fewer than ten” non-downward-monotone – not supporting downward scalar inferences.
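The reasoning about (66) can be checked mechanically. The sketch below assumes distributive predicates, so that the cardinalities of sums in the intersection of the two pluralised predicates run from 0 (with ×) or from 1 (with *) up to the number of students who passed. The function names and the use of `None` for an undefined maximum are illustrative assumptions.

```python
def cardinalities(students, passers, with_bottom):
    """The set λi.∃x[#x = i ∧ student(x) ∧ pass-the-test(x)] from (66).

    with_bottom=True models ×-pluralisation (⊥ included, so 0 is always in the set);
    with_bottom=False models *-pluralisation (no ⊥).
    """
    both = set(students) & set(passers)
    lowest = 0 if with_bottom else 1
    return set(range(lowest, len(both) + 1))


def fewer_than(n, degrees):
    """(65): max(P) < n. Returns None when max is undefined (empty set)."""
    if not degrees:
        return None
    return max(degrees) < n


students = {"s%d" % i for i in range(20)}
nobody = set()

print(fewer_than(10, cardinalities(students, nobody, with_bottom=True)))
print(fewer_than(10, cardinalities(students, nobody, with_bottom=False)))
```

With ⊥ the sentence comes out true when nobody passed; without ⊥ the maximum is taken over the empty set and no truth value is delivered.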

The same reasoning holds for all downward monotone degree quantifiers without existential entailment – with or without existential implicatures: “fewer than”, “at most”, “few”, etc.

As we have argued, the inclusion of ⊥ in the domain of entities, although unorthodox, is not as problematic as it looked at first sight. In fact, it may be that analyses of numerals and modified numerals cannot do without this assumption.

We have assumed that ⊥ enters into the semantics via semantic pluralisation. All our examples were distributive, however, so there is a clear alternative way of introducing the bottom element: not via pluralisation, but rather via some overt distributivity operator.13 As Buccola & Spector (2016) show, the exact mechanism behind the creation of the full lattice is a worthwhile topic for future research. They observe that collective predicates, unlike distributive ones, come with an existential entailment, and that there is therefore a contrast between examples with modified numerals such as (67a), with distributive predicates, and those like (67b), with collective predicates:

(67) a. Fewer than 100 students passed the test.
  b. Fewer than 100 soldiers surrounded the castle.

According to them, only the latter, but not the former, entails that at least one (non-empty) individual satisfies the VP predicate. Distributive predicates thus seem to have ⊥ in their denotation, while collective predicates seem to lack it.

If this is indeed the case, the following dilemma emerges: either a bottom element is, as we suggested in this paper, an inherent part of the denotation of all plural predicates (formed by ×) and is somehow removed from the denotation of collective ones, or plural predicates systematically lack ⊥ in their denotation (and are thus formed by *), and it is then added by a distributivity operator that is distinct from the pluralisation operator.

To resolve this dilemma, one has to look for cases that require the presence of the bottom element but do not involve distributive predicates. “Zero” can provide exactly the case needed. If sentences with “zero” and collective predicates are well-formed and can be judged true in a scenario where no (group) individual satisfies the collective predicate, this could be seen as an argument against introducing ⊥ by the distributivity operator:

(68) Zero soldiers surrounded the castle.

Unfortunately, the judgements concerning such cases are not clear to us, and, for that reason, we leave this dilemma for future work.

6 An alternative analysis?

Before concluding, we explore a potential alternative to our proposal. This alternative relies on an analysis of bare numerals that is substantially different from the non-quantificational ones discussed in section 3.

In their standard implementations, the modificational and degree-type views on bare numerals we discussed above give an at least semantics to numerals and derive the exactly reading with the help of additional mechanisms (most often, exhaustivity). An alternative, however, is to treat the exactly reading of numerals as basic and have some mechanism derive the at least reading. We now turn to such an analysis.

Kennedy’s so-called de-Fregean account of bare numerals (Kennedy 2015) is an extension of the analysis of quantifiers like “fewer than 10” that we discussed above. As in the degree-type view, Kennedy’s approach makes use of the counting quantifier MANY (Hackl 2000), but rather than having numerals denote numbers directly, the numerals are taken to denote quantifiers over numbers. So, while on this approach cardinalities are of type d of degrees, numerals express meanings of type ⟨⟨d,t⟩,t⟩. On this account, “three” expresses a property of sets of cardinalities: it is true only of those sets of numbers that have 3 as their maximum.

(69) 〚three〛 = λP.max(P) = 3

Since MANY takes a number as its argument and numerals provide a degree quantifier, the numeral will have to raise, leaving behind a trace of type d. For our running example, “Three men lifted the piano”, we get the structure in (70) and the interpretation in (71).

(70) [ three [ λi [ [ [ t_i MANY ] men ] lifted the piano ] ] ]

(71) max(λi.∃x[#x = i ∧ *men(x) ∧ *lift-the-piano(x)]) = 3

Here, the maximality operator looks at a set that collects those numbers i such that there are at least i men lifting the piano. In a world in which exactly three men (each) lifted the piano, this set is going to be {1, 2, 3}. The maximum will then be 3 and the sentence is correctly predicted to be true. In case there are more men lifting the piano, the maximum will be higher and the sentence is predicted to be false. This means that (71) is the exactly reading of the numeral.

Under Kennedy’s analysis, at least readings are derived from the exactly readings. The derivation involves type shifting the degree quantifier λP.max(P) = 3 to the degree denoting 3. From there, the at least reading is derived straightforwardly in combination with MANY, just like it was in the account that takes the degree meaning as basic.

With all this in mind, we will now explore what a theory of numeral “zero” would look like in an approach like Kennedy’s.

6.1 De-Fregean “zero”

On the Kennedy (2015) view, the sentence “Zero students passed the test” would yield (72) or (73).

(72) max(λi.∃x[#x = i ∧ *student(x) ∧ *pass-the-test(x)]) = 0

(73) max(λi.∃x[#x = i ∧ ×student(x) ∧ ×pass-the-test(x)]) = 0

In order to see what these formulas express, we first need to understand which sets are described by λi.∃x[#x = i ∧ *student(x) ∧ *pass-the-test(x)] and λi.∃x[#x = i ∧ ×student(x) ∧ ×pass-the-test(x)]. In any world with at least some students passing the test, the former lambda term will describe the set {1, …, n}, where n is the number of students passing the test. The latter lambda term, the scope of the max operator in (73), adds 0 to that set: {0, …, n}. In both cases, the maximum of the set is clearly not 0, and so both (72) and (73) correctly predict that “Zero students passed the test” is false whenever some students passed.

In a world with no students passing the test, things are different. The lambda term in (72) now yields the empty set. Since there is no student who passed the test, there is no x that satisfies the *-pluralised predicates, and so there is no number that corresponds to the cardinality of such an x. However, the lambda term in (73) does not describe the empty set in a world in which no students pass the test. This is because there is an x satisfying the ×-pluralised predicates, namely x = ⊥, and so there is exactly one number in this set, namely 0. In other words, in such a world, (72) and (73) correspond to the propositions max(∅) = 0 and max({0}) = 0, respectively.

Clearly, then, (73) yields the correct truth-conditions. What about (72)? Under quite standard assumptions, we would have to conclude that this semantic analysis fails. The reason is that max(X) is normally defined as returning the unique element in X such that no other element exceeds it on the relevant scale. Taking the maximum of the empty set, as we would need to do for (72), is simply undefined.

However, one could alter the definition of max by stipulating that the maximum of the empty set is the bottom element of the scale used by the operator, 0 in this case.14 If we do, then (72) and (73) become equivalent, since now max(∅) = max({0}) = 0.
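The effect of the two definitions of max on (72) and (73) can be illustrated as follows. This is a sketch only: the degree sets are computed directly from the number of students who passed (assuming distributive predicates), `None` stands in for undefinedness, and `max_standard`, `max_stipulated` and `zero` are hypothetical names.

```python
def degrees(n_passers, with_bottom):
    """Degree set in the scope of max: {1..n} for * as in (72), {0..n} for × as in (73)."""
    lowest = 0 if with_bottom else 1
    return set(range(lowest, n_passers + 1))


def max_standard(ds):
    """Standard max: undefined (None here) on the empty set."""
    return max(ds) if ds else None


def max_stipulated(ds):
    """Altered max: the empty set is mapped to the bottom of the scale, 0."""
    return max(ds) if ds else 0


def zero(ds, maximum):
    """De-Fregean 'zero' = λP.max(P) = 0; None if max(P) is undefined."""
    m = maximum(ds)
    return None if m is None else m == 0


# World in which no student passed the test:
print(zero(degrees(0, with_bottom=False), max_standard))    # (72), standard max
print(zero(degrees(0, with_bottom=True), max_standard))     # (73), standard max
print(zero(degrees(0, with_bottom=False), max_stipulated))  # (72), stipulated max
```

Under the stipulated definition the *-based and ×-based degree sets yield the same verdict in every world, which is the equivalence of (72) and (73) noted above.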

Importantly, the option of stipulating that max(∅) = 0 opens up a route to talk about zero cardinalities without assuming the existence of ⊥. The fact that (72) and (73) are now equivalent means that we have found a semantics of numeral “zero” that does not depend on extending our ontology with the bottom element. Does this mean that a de-Fregean analysis is a contender for a full, yet ontologically light theory of numeral “zero”?

To answer this, we will look at two cases: the scopal ambiguity (or lack thereof) of sentences with “zero” and the polarity data discussed above.

6.2 “Zero” and scope

If, as our own proposal has it, the exactly reading of “zero” comes about via exhaustification, then we could expect to see the effects of the exhaustification operator engaging in scope relations with other operators. Take (74), for example.

(74) The company has to fire zero employees.

On our analysis, we expect this to have two possible logical forms, (75a) and (75b).

(75) a. [ ◻ [ EXH [ the company fires zero employees ] ] ]
  b. [ EXH [ ◻ [ the company fires zero employees ] ] ]

These correspond to two truth-conditionally different readings: (75a) says that the company is not allowed to fire anyone, while (75b) says that the company does not need to fire anyone (this kind of reading is often referred to as the “split scope” reading for reasons beyond our immediate concern, although see Jacobs 1980; Rullmann 1995; De Swart 2000; Penka & Zeijlstra 2005; Abels & Martí 2010; Penka 2011). Both readings are attested, so our proposal seems to make a valuable prediction here.

The de-Fregean analysis is no different, however. Since on that account “zero” is a degree quantifier, it can take scope at different sites. As with our own proposal, again two logical forms are possible for (74):

(76) a. [ ◻ [ zero [ the company fire t MANY employees ] ] ]
  b. [ zero [ ◻ [ the company fire t MANY employees ] ] ]

The meanings expressed by (76a) and (76b) correspond exactly to those of (75a) and (75b), respectively. It turns out then that it is going to be very hard to distinguish between the de-Fregean account and ours on the basis of scope matters: the readings generated by the scope flexibility of the exhaustivity operator are exactly the same as those created by the different QR landing sites that degree quantifier “zero” may inhabit.
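The equivalence of the two pairs of logical forms can be verified in a toy modal model. The sketch makes several assumptions that are not part of either analysis: worlds are identified with the number of employees fired in them, the scale is cut off at a small constant `TOP`, the set of accessible worlds is non-empty, and the degree sets include 0 (the × variant, or equivalently the stipulated max). All function names are hypothetical.

```python
TOP = 5  # assumption: a small finite scale of numbers of fired employees


def at_least(n):
    """'The company fires at least n employees', evaluated in world w (a number)."""
    return lambda w: w >= n


def box(p, accessible):
    """◻p: p holds in every accessible world."""
    return all(p(w) for w in accessible)


def exh_under_box(accessible):
    """(75a): [ ◻ [ EXH [ ... zero ... ] ] ]"""
    def exh_zero(w):
        return at_least(0)(w) and not any(at_least(m)(w) for m in range(1, TOP + 1))
    return box(exh_zero, accessible)


def exh_over_box(accessible):
    """(75b): [ EXH [ ◻ [ ... zero ... ] ] ]"""
    return box(at_least(0), accessible) and not any(
        box(at_least(m), accessible) for m in range(1, TOP + 1))


def zero_under_box(accessible):
    """(76a): [ ◻ [ zero [ ... ] ] ], with zero = λP.max(P) = 0."""
    return box(lambda w: max(i for i in range(TOP + 1) if w >= i) == 0, accessible)


def zero_over_box(accessible):
    """(76b): [ zero [ ◻ [ ... ] ] ]"""
    return max(i for i in range(TOP + 1) if box(at_least(i), accessible)) == 0


for acc in ({0}, {0, 2}, {1, 2}):
    print(sorted(acc), exh_under_box(acc), zero_under_box(acc),
          exh_over_box(acc), zero_over_box(acc))
```

When firing is permitted but not required (accessible worlds {0, 2}), only the wide-scope forms (75b) and (76b) are true; when at least one firing is required ({1, 2}), all four forms are false.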

The situation is different when it comes to sentences with non-modal quantifiers.15 In a context with a nominal quantifier, such as “every student”, our proposal predicts two logical forms to be available, quite like with the modal quantifier:

(77) Every student read zero books.
  a. [ every student [ EXH [ t read zero books ] ] ]
  b. [ EXH [ every student [ t read zero books ] ] ]

The latter LF, (77b), amounts to a weak, “split-scope” reading according to which it is not the case that every student read one book or more. It should therefore be compatible with a situation in which one of the students read a non-zero number of books. Contrary to this prediction of our analysis, the sentence in (77) does not have this reading, as shown by the unavailability of a continuation like the one in (78):

(78) Every student read zero books. #Mary (even) read four.

The lack of the reading corresponding to the LF in (77b) thus poses a challenge for our proposal. Kennedy’s (2015) analysis, by contrast, predicts the unavailability of this reading. Two candidate LFs for sentence (77) under the de-Fregean theory would be the following, depending on the landing site of the QRed degree quantifier “zero”:

(79) a. [ [ every student ]_1 [ zero_2 [ t_1 read t_2 MANY books ] ] ]
  b. [ zero_2 [ [ every student ]_1 [ t_1 read t_2 MANY books ] ] ]

But only one of these logical forms is predicted to be viable – the one to be filtered out is (79b), corresponding to the unattested “split scope” reading. (79b) is ill-formed according to what is known as the Heim-Kennedy generalization (Kennedy 1997; Heim 2000), which states that nominal quantifiers can never intervene between a degree quantifier and its trace:

(80) *[ D⟨dt,t⟩ … Q⟨et,t⟩ … t_d ]

It is easy to see that (79b) realizes the prohibited scheme (80): it involves a degree quantifier (“zero”) QRed over a nominal quantifier (“every student”). Therefore, in an analysis in which “zero” is a degree quantifier and scope ambiguities are derived via QR of “zero” to different positions, the asymmetry between modal and nominal quantifiers can be traced back to (80). Kennedy (2015) is one such theory, while our proposal is not.

As it stands, things are as follows. We assumed a lower-bounded semantics for numerals, which allowed us to explain the lack of NPI licensing for “zero” in terms of its indirectly derived negative meaning, namely via exhaustification. That same mechanism now poses a problem for our theory, since we have not provided sufficient constraints on exhaustification to prevent over-generation. We will have nothing to say on how to solve this issue, but instead we will point out a dilemma. While the at least semantics for “zero” over-generates readings, it provides a neat explanation for the polarity data. The de-Fregean, doubly bounded, theory of numerals, on the other hand, has a salient remedy against over-generation via the Heim-Kennedy generalization.16 As we will explain next, the degree quantifier approach to “zero” has little to no hope of providing an explanation for the NPI data.

6.3 Polarity revisited

As we observed in section 2.2, “zero” doesn’t seem to be grammatically negative in the same way as, for example, negative indefinites are. Unlike “no”, “zero” in subject position doesn’t license positive tag questions, doesn’t trigger negative inversion and – the case we focus on – doesn’t license NPIs (we repeat examples (13) and (14)):

(81) No student has visited me in years.

(82) *Zero students have visited me in years.

In section 4.2 we argued that adopting an at least semantics for numerals, including “zero”, allows us to account for this lack of NPI licensing.

Assuming that the structure of sentences with “zero” involves EXH attached to the structure corresponding to the at least reading, we formulated NPI licensing conditions for this configuration (following Gajewski 2011):

(83) Two licensing conditions for NPIs
     Given a structure [α EXH [β … [γ NPI ] … ]]:
                Condition 1: the environment γ is non-trivially DE in β
                Condition 2: the environment γ is non-trivially DE in α

All NPIs are subject to condition 1 – domain β has to be non-trivially downward-entailing. Environments with “zero” before exhaustification don’t satisfy this condition – they are both upward- and downward-entailing. Thus, NPIs are not predicted to be licensed – as desired.
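The monotonicity claims behind Condition 1 can be checked by brute force in a small model. In the sketch below, an environment is modelled as a function from VP extensions (sets of atoms) to truth values, with the NP fixed. Following the text, an environment counts as non-trivially DE if it is downward-entailing but not also upward-entailing. The domain, the predicate names and the helper functions are illustrative assumptions.

```python
from itertools import combinations

DOMAIN = {"ann", "bo", "cy"}   # hypothetical atoms
STUDENTS = {"ann", "bo"}       # hypothetical extension of 'student'


def subsets(s):
    """All subsets of s; also serves as ×-pluralisation, with frozenset() as ⊥."""
    s = sorted(s)
    return [frozenset(c) for n in range(len(s) + 1) for c in combinations(s, n)]


def zero_at_least(vp):
    """Domain β: ∃x[#x ≥ 0 ∧ ×student(x) ∧ ×VP(x)]. ⊥ always verifies it."""
    return any(len(x) >= 0 for x in subsets(STUDENTS & vp))


def zero_exh(vp):
    """Domain α: the at least meaning plus negation of the stronger alternatives."""
    return zero_at_least(vp) and not any(len(x) >= 1 for x in subsets(STUDENTS & vp))


def is_de(f):
    """Downward-entailing: truth is preserved from a VP extension to its subsets."""
    return all(f(a) for b in subsets(DOMAIN) for a in subsets(b) if f(b))


def is_ue(f):
    """Upward-entailing: truth is preserved from a VP extension to its supersets."""
    return all(f(b) for b in subsets(DOMAIN) for a in subsets(b) if f(a))


def is_ntde(f):
    """Non-trivially DE: DE without also being UE."""
    return is_de(f) and not is_ue(f)


print(is_de(zero_at_least), is_ntde(zero_at_least))  # β: DE, but only trivially
print(is_de(zero_exh), is_ntde(zero_exh))            # α: non-trivially DE
```

The unexhaustified environment is constant across all VP extensions and so fails Condition 1, while the exhaustified one satisfies Condition 2, matching (54).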

The difference between exactly and at least analyses of numerals is that the LFs generated by the former crucially lack the structural point corresponding to the split between domains α and β in (83). A doubly bounded semantics will not guarantee the presence of a β-environment that violates Condition 1. This means that exactly analyses of “zero” will always predict NPI licensing. This prediction is independent of further assumptions (the exact definition of maximality or presence or absence of the bottom element).

The degree quantifier analysis (Kennedy 2015) is an exactly analysis. This is fatal for capturing the NPI facts: what gets the data right is the absence of a specific configuration from the language inventory, namely one in which numerals (including “zero”) have an exactly semantics lexically. As long as this option is available, NPIs are incorrectly predicted to be licensed, whether there is an additional at least + EXH option or not. The polarity data are deeply problematic for the de-Fregean analysis.

There is no clear way for the de-Fregean analysis to account for the NPI data. The only potential way to get numeral “zero” to satisfy the conditions above is to detach maximality from the numeral after all and treat it as a kind of exhaustification operator. The numeral itself would be a degree quantifier with an at least semantics.

This raises a number of concerns. First and foremost, deriving at least readings as basic via at least degree quantification would undermine the very idea behind degree quantifier semantics for numerals. If numerals are at least degree quantifiers (and there is no other way to derive at least readings), the exclusively exactly readings of numerals in non-existential contexts will be left unaccounted for.17 An additional mechanism deriving in situ readings would be needed for these cases – this would be the at least type-shift. But if there are two distinct mechanisms for deriving at least readings, a number of new questions arise as to how to constrain each of these derivations both in terms of interaction with their syntactic and semantic environment and, potentially, in terms of their competition with each other, to avoid over-generation. Finally, whether NPI licensing conditions would end up directly applicable to this hypothesised new structure without further stipulation also depends on the details of the implementation, which is beyond the scope of this paper.

Summing up, polarity facts, as we argue in this section, are fundamentally problematic for a degree quantifier analysis of “zero” – but fall out naturally under our analysis.

7 Conclusion

We have conducted the first in-depth study of the semantics of “zero”. As we have hinted at in several places above, the semantic literature has occasionally touched upon the relevance of “zero” to matters of negation and polarity licensing. We have built on some of the observations already present in the literature, and have offered a predictive account of these that is fully conservative in the sense that we give “zero” a numeral semantics, just like other number words. In addition, we have shown that “zero” is not just relevant to matters of negation, but also to plurality and, in particular, to assumptions about semantic ontology.