Fox (1998; 2000: 113–137) introduces Rule H, which requires that bound pronouns be bound as locally as possible. Fox exploits Rule H in his analysis of three classes of phenomena:

  1. Strong Crossover.

  2. The ban on using co-binding to “sneak around” Condition B (together with certain exceptional cases).

  3. The Dahl paradigm (Dahl 1973, 1974), and a number of related restrictions on the interpretation of pronouns in elided VPs.

The focus of this paper is Fox’s analysis of class 3 phenomena in terms of Rule H. This analysis faces two main problems. The first, pointed out by Heim (2008), is that the analysis relies on a form of the Parallelism constraint on VP ellipsis that lacks independent motivation.1 The second problem, raised by Roelofsen (2011), is that Rule H cannot account for certain quantificational variations on the Dahl paradigm, nor for the apparent availability of co-binding in certain configurations.

I show that Fox’s analysis can be tweaked to solve the preceding problems. The key component of my analysis is the hypothesis that Rule H acts as a filter on Focus Semantic Values (FSVs). Given this hypothesis, Parallelism can be replaced by an independently-motivated contrast constraint on VP ellipsis of the type proposed in Rooth (1992). The paper is organized as follows. Section 2 gives some background on Rule H. Section 3 outlines a number of problems relating to the Parallelism constraint on VP ellipsis and introduces my proposed replacement for Parallelism. Section 4 explains how this new constraint on VP ellipsis accounts for the Dahl paradigm. Sections 5–7 extend the analysis to various additional binding and ellipsis phenomena. Section 8 presents a presuppositional reformulation of Rule H designed to deal with certain co-binding structures that are problematic for Fox’s analysis. Throughout the text I assume that focus alternatives are derived via syntactic substitution. The Appendix shows how this assumption can be dispensed with.

I assume that binding and coreference dependencies are represented along the lines proposed by Heim & Kratzer (1998). Each DP starts out with a freely-assigned index. When a DP moves, a λ-node is adjoined immediately below the landing site and an arbitrary index is chosen for the λ-node and the trace. If the λ-node c-commands and is non-vacuously coindexed with a pronoun, then the pronoun is bound as a variable. A DP must move in order to bind a pronoun as a variable via a λ-node. This movement may be QR, or A-movement from the VP-internal subject position to Spec,TP.


Fox (2000: 115) defines Rule H as follows:

(1) Rule H
  A pronoun A can be bound by an antecedent B only if there is no closer potential antecedent C such that it is possible to bind A by C and get the same interpretation.
  (C is closer if B c-commands C and C c-commands A.)

As is evident from (1), evaluation of Rule H proceeds via the construction of a competitor LF where binding is more local than in the original. The principal effect of Rule H is to block the following configurations:2

(2) Co-binding
(3) Binding across a coreferential expression

The co-binding LF in (2) is blocked by the interpretatively equivalent transitive binding LF in (4). The LF in (3) is blocked by the interpretatively equivalent LF in (5):

(4) Transitive binding
(5)

Fox assumes that VP ellipsis is constrained by Parallelism, which is the disjunction of Referential Parallelism and Structural Parallelism. A referential pronoun in an elided VP satisfies Referential Parallelism iff it refers to the same individual as the corresponding pronoun in the antecedent VP. A bound pronoun in an elided VP satisfies Structural Parallelism iff it is bound in a manner structurally parallel to its counterpart in the antecedent VP. There is some question as to how both Referential and Structural Parallelism might be made more precise. However, since my analysis will not make use of either constraint, and since the application of these constraints is clear enough in the cases at hand, I will not attempt to elaborate them any further.


Dahl (1973; 1974) observes that the interpretation of the elided VP in (6) is restricted in a surprising way. When both pronouns in the first conjunct are anteceded by John, the pronouns in the elided VP may receive either strict or sloppy readings. However, the second pronoun may receive a sloppy reading only if the first does also:

(6) John knows he loves his mother and BillF does too.
(7) John knows John loves John’s mother and
  a.   strict-strict
      … Bill knows John loves John’s mother.
  b.   sloppy-sloppy
      … Bill knows Bill loves Bill’s mother.
  c.   sloppy-strict
      … Bill knows Bill loves John’s mother.
  d.   strict-sloppy
    *… Bill knows John loves Bill’s mother.

The key observation underlying Fox’s analysis of the Dahl paradigm is that each of readings (7a)–(7c) can be derived without using non-local binding. That is, none of the binding dependencies in (8) crosses a closer potential antecedent:

(8) a. John1 knows he1 loves his1 mother, and (7a)
    BillF does [know he1 loves his1 mother] too  
  b. (7b)
  c. (7c)

Since Rule H is triggered by the presence of non-local binding configurations, it is clearly not violated in any of the LFs in (8). In contrast, (7d) can only be derived using non-local binding, as in (9)–(10). Closer potential antecedents are shown in bold:

(9) (*Parallelism)
(10) (*Rule H)

Rule H must be satisfied for each conjunct of (9) and (10). In (9), non-local binding of [his3] by [Bill] does not give rise to a violation of Rule H, since replacing [his3] with a variable bound by the closer potential antecedent [he1] yields a distinct interpretation for the second conjunct. This binding dependency must satisfy Structural Parallelism, but it is not matched by a structurally parallel binding dependency in the first conjunct. Structural Parallelism is satisfied in (10), but the first conjunct violates Rule H, since replacing [his2] with a variable bound by the closer potential antecedent [he1] yields the same interpretation. Thus, it is impossible to derive reading (7d) without violating either Parallelism or Rule H.

Fox discusses a number of other ellipsis phenomena where the pattern of available interpretations is correctly predicted by Rule H. I will not give an exhaustive summary here, since my revised theory retains Rule H, and I do not propose any significant modification to Fox’s analysis of these data.


A typical Strong Crossover configuration is shown in (11):

(11)

The co-binding configuration in (11) is blocked by Rule H. Binding t1 by he1 yields (12), which has the same interpretation as (11). This example illustrates the point that Rule H applies even if the competing LF is one that could not be the output of a licit syntactic derivation.

(12)

Is Rule H the sole and sufficient principle required to account for SCO effects? This seems to be what Fox has in mind, given his suggestion (p. 124, fn.14) that Rule H can also account for Condition C violations involving proper names. Much depends on background assumptions regarding the nature of c-command-sensitive non-coreference effects, and regarding the relation between Weak and Strong Crossover. I return to these issues briefly in section 8.


Assume the following formulation of Condition B:

(13) A pronoun cannot be semantically bound3 by a local c-commanding antecedent.

It is easy to ‘sneak around’ this formulation of Condition B using co-binding. There is no Condition B violation in (14), for example, because [he1] does not semantically bind [him1]:

(14) *Every boy [λ1 [t1 said he1 loves him1]]

The co-binding configuration in (14) does, however, violate Rule H. Rule H thus makes it possible to retain an attractively simple formulation of Condition B in terms of semantic binding. This contrasts with the rather complex formulation of Condition B proposed in Heim (1998) in light of (14) and related data.

If (13) is defined in terms of syntactic binding (c-command plus co-indexation) rather than semantic binding, then (14) is blocked directly by Condition B. However, more complex examples such as (15) can be constructed where the pronoun is not coindexed with its local antecedent (Bach & Partee 1980):

(15)

Rule H blocks binding of [him2] by the first instance of [he1] (since the second instance of [he1] is a closer potential antecedent).

Heim (1998) argues that there are certain exceptional situations in which co-binding can in fact be used to obviate Condition B. In particular, this is possible when co-binding yields an interpretation that could not be derived using transitive binding. If Condition B is defined, as in (13), in terms of semantic binding, then Rule H predicts precisely this generalization. Rule H forces transitive binding in preference to co-binding only when both yield the same interpretation.4


Fox’s disjunctive Parallelism constraint does not follow from independently motivated constraints on VP ellipsis. In particular, constraints on VP ellipsis stemming from the proposals in Rooth (1985; 1992) do not enforce a strict structural parallelism requirement on bound pronouns within elided VPs. The licensing constraint that Rooth proposes on VP ellipsis can be formulated as in (17), following Heim (1997).5 There are many proposals in the literature regarding exactly how Focus Semantic Values should be defined. I assume the definition in terms of syntactic substitution given in (16).

(16) Focus Semantic Value (FSV)
  The Focus Semantic Value of a constituent φ for an assignment g, written FSVg (φ), is the set of ⟦φ′⟧g such that ⟦φ′⟧g is defined and φ′ can be obtained from φ by replacing all its focused subconstituents with unfocused constituents of the same semantic type.
(17) Rooth-Style Contrast Constraint (RSCC)
  For ellipsis of a VP φ to be licensed in an utterance context C, there must be a constituent φ′ containing φ, and an antecedent constituent ψ, such that for all assignments g extending Cg, ⟦ψ⟧g is contained in FSVg (φ′).
  (Cg is the assignment determined by the utterance context C.)

The following discourse illustrates the application of (16)–(17).

(18)   Cg = {1 ↦ John, …}
  a. He1 smokes.
  b. [Mary]F does [VP smoke] too.

Ellipsis of the VP in (18b) must satisfy RSCC. To see that it does, choose φ′ = (18b) and ψ = (18a). The FSV of (18b) has the following members:

(19) FSVg ((18b)) =
  ⟦[Mary] does [VP smoke]⟧g
  ⟦[John] does [VP smoke]⟧g
  ⟦[Jane] does [VP smoke]⟧g

The assignment Cg given by the utterance context includes 1 ↦ John, so RSCC considers only assignments g such that g(1) = John. For any such assignment, ⟦[John] does [VP smoke]⟧g is equal to ⟦(18a)⟧g. As ⟦[John] does [VP smoke]⟧g is a member of FSVg ((18b)), RSCC is satisfied and VP ellipsis in (18b) is licensed.
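The check just described can be sketched in a few lines of Python. This is only an illustrative toy model, not part of the analysis: propositions are coded as (individual, predicate) pairs, and the names `DOMAIN`, `extensions`, `he1`, `fsv_18b` and `rscc` are hypothetical. The domain of three individuals is assumed, following (19).

```python
from itertools import product

DOMAIN = ("John", "Mary", "Jane")  # assumed domain of individuals, as in (19)

def extensions(cg, indices):
    """Yield every assignment over `indices` that agrees with the context assignment cg."""
    free = [i for i in indices if i not in cg]
    for values in product(DOMAIN, repeat=len(free)):
        yield {**cg, **dict(zip(free, values))}

def he1(predicate):
    """Meaning of 'He1 <predicate>s' as a function from assignments to propositions."""
    return lambda g: (g[1], predicate)

def fsv_18b(g):
    """(16) applied to (18b): substitute each individual for the focused subject [Mary]F."""
    return {(x, "smoke") for x in DOMAIN}

def rscc(antecedent, fsv, cg, indices=(1, 2)):
    """(17): for all g extending cg, the antecedent's value must be a member of the FSV."""
    return all(antecedent(g) in fsv(g) for g in extensions(cg, indices))

CG = {1: "John"}
licensed = rscc(he1("smoke"), fsv_18b, CG)    # antecedent (18a)
unlicensed = rscc(he1("drink"), fsv_18b, CG)  # a hypothetical non-matching antecedent
```

With (18a) as antecedent the check succeeds. With the invented antecedent 'He1 drinks' it fails, since no substitution for [Mary]F changes the predicate.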

RSCC must be interpreted in conjunction with Heim’s (1997) ban on meaningless coindexing.6 Without such a constraint, arbitrary choices of indexation that have no effect on interpretation become relevant when RSCC is computed. For example, the ungrammatical instance of VP ellipsis in (20c) is not licensed with respect to α in (20b), but is licensed with respect to α in (20a):

(20) a. No boy [λ1 [t1 said [α Jane likes him1]]]…
  b. No boy [λ2 [t2 said [α Jane likes him2]]]…
  c. … and in fact # [Mary]F does [VP like him1].

The ban on meaningless coindexing blocks the LF formed by joining (20c) to (20a), thus ensuring that VP ellipsis is not incorrectly licensed.

Fox relies on Parallelism to block LFs such as (21). The first conjunct of (21) uses transitive binding. The second conjunct, in violation of Parallelism, has [his] bound long-distance by [Bill]:

(21) (*Parallelism)

The assignment Cg given by the utterance context includes 1 ↦ John. Thus, under all assignments that extend Cg, the first pronoun in the second conjunct is coreferential with [John] and the second is bound by [BillF]. Once [BillF] is replaced by its alternative [John], the first and second conjuncts have, for all assignments that extend Cg, the same semantic value (the proposition ‘John knows John loves John’s mother’). RSCC is therefore satisfied. This result is empirically significant as (21) does not violate Rule H, and Fox relies on Structural Parallelism to block it.

Why exactly does Fox’s analysis of the Dahl paradigm require the Parallelism constraint? The essential reason is the following. It is only in the first conjunct of (7) that there is any possibility of local and non-local binding giving rise to the same interpretation (thereby triggering a violation of Rule H); and yet it is in the second conjunct that non-local binding must be blocked in order to rule out the unattested interpretation (7d). The role of Parallelism is to ‘translate’ the ban on non-local binding in the first conjunct over to the second conjunct. If there were a means to block non-local binding directly in the second conjunct, then Parallelism would no longer be required.

Given a broadly Roothean theory of focus and ellipsis licensing, it is in fact possible to arrange for all the theoretical action to take place in the second conjunct of (7). The precise pattern of binding dependencies in the first conjunct is then irrelevant — except insofar as it affects the proposition expressed. The key idea is to replace Parallelism with RSCC while imposing a further constraint in the definition of Focus Semantic Value. The revised definition of FSV is as follows:

(22) Strict Focus Semantic Value (SFSV)
  The SFSV of a constituent φ for an assignment g, written SFSVg(φ), is the set of ⟦φ′⟧g such that φ′ does not violate Rule H for g, and φ′ can be obtained from φ by replacing its focused subconstituents with constituents of the same semantic type.
(23) A constituent φ violates Rule H for an assignment g iff there are A, B, C within φ such that A is a pronoun bound by B across the closer potential antecedent C, and the LF φ′ derived by binding A by C is such that ⟦φ⟧g = ⟦φ′⟧g.

I will use the abbreviation ‘RSCC+FSV’ to refer to RSCC when interpreted in relation to the definition of Focus Semantic Value in (16), and ‘RSCC+SFSV’ to refer to RSCC when interpreted in relation to the definition of Strict Focus Semantic Value in (22). The conjunction of Rule H and RSCC+SFSV allows certain structural mismatches between elided and antecedent VPs that are not permitted by Parallelism. However, the requirement that the members of a constituent’s Focus Semantic Value derive from structures that have maximally local binding has the consequence that there are certain configurations where Parallelism is satisfied and yet RSCC+SFSV is violated.


The LF in (9), repeated here as (24), yields the unattested reading (7d) of the Dahl paradigm. This LF satisfies Rule H but violates Parallelism:

(24) (*Parallelism)

RSCC+FSV is satisfied in (24), since [John] can substitute for [Bill] and α is therefore contained in the FSV of β. The conjunction of Rule H with RSCC+FSV thus fails to block the unattested reading (7d). If, however, SFSVs are used instead of FSVs, (24) is ruled out. To see this, note that in order for ellipsis to be licensed in (24), the following proposition must be a member of the SFSV of β:

(25) John knows John loves John’s mother.

A sentence denoting this proposition can be derived from β by replacing [BillF] with [John], yielding the structure in (26). However, this structure violates Rule H under all assignments, since its competitor (27) has the same interpretation for any given assignment that maps 1 to John:

(26) John [λ3 [t3 does [know he1 loves his3 mother] too]]
(27) John [λ4 [t4 does know he1 [λ5 [t5 loves his5 mother]] too]]

Are there any other LFs that derive reading (7d) and also satisfy Rule H and RSCC+SFSV? The first pronoun in the second conjunct of (24) must refer to John if reading (7d) is to be derived. The second pronoun can therefore pick out Bill either by referring to Bill or by being bound by [Bill] (since being bound by the first pronoun would cause it to pick out John instead). Whatever the pattern of binding/coreference in the first conjunct, RSCC is violated if [his] is coreferential with [Bill], since none of the members of the FSV of the second conjunct then has John’s mother as the object of love. Thus, to derive reading (7d), the first pronoun in the second conjunct must refer to John and the second must be bound by [Bill]. And as we have just seen, the SFSV of the second conjunct will then lack the member necessary to license ellipsis given RSCC+SFSV.

Readings (7a)–(7c) can all be derived without using non-local binding (either in the original LF or in the LFs used to derive members of the relevant FSVs). Replacing FSV with SFSV therefore makes no difference for these examples.
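The reasoning of this section can be replayed in a small Python sketch. It is a toy model under stated assumptions, not a definitive implementation: propositions of the form ‘x knows y loves z’s mother’ are coded as triples, a referential pronoun is coded directly by its referent rather than by an index, and the labels `"subj"` (bound by the matrix subject) and `"p1"` (bound by the first pronoun) as well as all function names are hypothetical.

```python
DOMAIN = ("John", "Bill", "Mary")  # assumed alternatives to [Bill]F

def interpret(subject, p1, p2):
    """Return (knower, lover, owner of the mother) for the second conjunct.
    p1 is "subj" or a name; p2 is "subj", "p1" or a name."""
    v1 = subject if p1 == "subj" else p1
    v2 = subject if p2 == "subj" else (v1 if p2 == "p1" else p2)
    return (subject, v1, v2)

def violates_rule_h(subject, p1, p2):
    """The only non-local dependency in this toy LF is p2 bound by the subject
    across p1; by (23) it is a violation iff rebinding p2 to p1 changes nothing."""
    if p2 != "subj":
        return False
    return interpret(subject, p1, "p1") == interpret(subject, p1, "subj")

def fsv(p1, p2):
    """(16): one value per alternative to the focused subject."""
    return {interpret(x, p1, p2) for x in DOMAIN}

def sfsv(p1, p2):
    """(22): as fsv, minus alternatives whose LF violates Rule H."""
    return {interpret(x, p1, p2) for x in DOMAIN
            if not violates_rule_h(x, p1, p2)}

ANTECEDENT = ("John", "John", "John")  # the first conjunct of (6)
READINGS = {
    "strict-strict": ("John", "John"),
    "sloppy-sloppy": ("subj", "p1"),
    "sloppy-strict": ("subj", "John"),
    "strict-sloppy": ("John", "subj"),
}

def licensed(p1, p2, values):
    """Rule H holds of the actual LF (subject Bill) and RSCC is met."""
    return not violates_rule_h("Bill", p1, p2) and ANTECEDENT in values(p1, p2)

with_fsv = {name: licensed(p1, p2, fsv) for name, (p1, p2) in READINGS.items()}
with_sfsv = {name: licensed(p1, p2, sfsv) for name, (p1, p2) in READINGS.items()}
```

In this model `with_fsv` licenses all four readings, whereas `with_sfsv` licenses every reading except strict-sloppy, matching the pattern in (7).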


Roelofsen (2011) discusses a variation on the original Dahl paradigm where the referential subject DPs are replaced by pronouns bound by a higher quantifier. The pattern of available and unavailable readings remains abstractly the same, as illustrated in (28)–(29):

(28) Every worker says he knows how he broke his tools, and that the boss does too.
(29) a.   sloppy-sloppy
      … the boss knows how the boss broke the boss’s tools.
  b.   strict-strict
      … the boss knows how the worker broke the worker’s tools.
  c.   sloppy-strict
      … the boss knows how the boss broke the worker’s tools.
  d.   strict-sloppy
    *… the boss knows how the worker broke the boss’s tools.

Roelofsen notes that Fox’s analysis of the original Dahl paradigm does not extend successfully to the embedded Dahl paradigm. The only reading predicted to be available is the sloppy-sloppy reading in (29a). This reading can be derived in accord with Rule H and Parallelism using transitive binding throughout:

(30)

None of the other readings in (29) can be generated without violating at least one of Rule H and Parallelism. To derive the strict-strict reading (29b) while respecting Parallelism, it is necessary to have the first and second pronouns in the first conjunct bound by every worker, but this gives rise to a co-binding configuration that violates Rule H. For example, the LF in (31) is blocked by its competitor (32), which violates Parallelism:

(31) *Rule H
(32) *Parallelism

Similarly, to derive reading (29c) while respecting Parallelism, [every worker] must bind the first and third pronouns in the first conjunct; and to derive reading (29d) while respecting Parallelism, [every worker] must bind the first and second pronouns in the first conjunct. In both cases, a co-binding configuration is created giving rise to a violation of Rule H.

What if we replace Parallelism with RSCC+FSV? This option is explored in Roelofsen (2011). Roelofsen points out that without further constraints, RSCC+FSV allows all of the readings (29a)–(29d) to be derived from LFs that comply with Rule H.7 RSCC+FSV is therefore too lax. Does RSCC+SFSV fare any better? I will now show that it does. The desired result is for the LFs in (33)–(35) to be allowed and for the LF in (36) to be blocked:

(33) sloppy-sloppy
(34) strict-strict
(35) sloppy-strict
(36) *strict-sloppy

All of the LFs in (33)–(36) have the property that for each assignment g, when [the boss]F is replaced by a DP X such that ⟦X⟧g = g(1), a constituent [X … knows …] is derived that has the same semantic value for g as α in the first conjunct. Modulo interference from Rule H, RSCC is therefore satisfied in all four LFs. Before considering the filtering effect of Rule H on SFSVs, let us first verify that each of (33)–(36) satisfies Rule H. It is clear that Rule H is satisfied in (33) since it contains no instances of non-local binding. In each instance of non-local binding in (34) and (35), the binder is [every worker]. In (34), binding [he1] in β by the closer potential antecedent [the boss] would clearly give rise to a distinct interpretation (the sloppy-sloppy interpretation), so this binding dependency does not violate Rule H. In (35) we can either bind [his1] in β by [the boss] or by [he4]. Each option again gives rise to a distinct interpretation (the sloppy-sloppy interpretation in both cases), so that there is no violation of Rule H. There are two instances of non-local binding in (36). Binding [he1] in β by the closer potential antecedent [the boss] would give rise to a distinct interpretation, as would binding [his4] in β by the closer potential antecedent [he1].

Let us now consider the SFSV of β. As there are no instances of non-local binding within β in (33)–(35), Rule H does not winnow the SFSV of β in (33)–(35). Ellipsis in (36) is licensed by RSCC+SFSV only if for each assignment g, one of the members of SFSVg (β) denotes the proposition ‘The worker [=g(1)] knows how he broke his tools.’ For each g, this proposition can be derived from β by replacing [the boss]F with some X such that ⟦X⟧g = g(1):

(37) X [λ4 does [t4 know how he1 broke his4 tools]]

The Rule H competitor for (37) is (38):

(38) X [λ4 does [t4 know how he1 [λ5 [t5 broke his5 tools]]]]

Since (37) and (38) have the same interpretation for all assignments, (37) violates Rule H for all assignments. Thus for all g, ⟦(37)⟧g ∉ SFSVg (β). As a result, ellipsis is not licensed in (36), correctly predicting the absence of reading (29d).

An anonymous reviewer points out that the preceding analysis of (33)–(36) is incompatible with the ‘Have Local Binding!’ (HLB) constraint of Büring (2005), reproduced in (39). Büring intends this constraint to replace the conjunction of Rule H and Rule I.8

(39) Have Local Binding! (HLB)
  For any two NPs α and β, if α could semantically bind β (i.e. if it c-commands β and β is not semantically bound in α’s c-command domain already), α must semantically bind β, unless that changes the interpretation.

HLB is violated in the syntactic structure underlying the member of the FSV of β required to license ellipsis in (35). This is the structure derived using [he1] as the alternative to [the boss]F:

(40) [he1] [λ4 does [t4 know how he4 broke his1 tools]]

HLB is violated in (40) because [he4] is a potential binder for [his1], and binding [his1] by [he4] yields the same interpretation. This example shows that HLB is a stronger constraint than the conjunction of Rule H and Rule I. Rule H is not violated within (40) because [his1] is not bound within (40). Rule I is not violated by [he4] and [his1] in (40) because Rule I regulates coreference, and [he4] is bound within (40).9 Fortunately, the additional strength of HLB is of no consequence with regard to the phenomena Büring discusses. That is, all of his arguments still go through if Rule H and Rule I are separate constraints — as his exposition initially assumes.


My analysis of the Dahl paradigm is similar in spirit to that of Kehler & Büring (2008). K&B introduce the ‘Be Bound or Be Disjoint’ constraint in (41). This constraint resembles Rule H insofar as it blocks co-binding and binding across a coreferential expression.

(41) Be Bound Or Be Disjoint (BBOBD) Kehler & Büring (2008)
  If a pronoun p is free in the c-command domain of a (non-Wh) DP α, p bears a presupposition of disjointness with α (unless α binds p).

K&B’s analysis makes reference not only to the structures containing the antecedent and elided VPs, but also to an additional syntactic structure, the ‘Question Under Discussion’ (QUD). Informally, a QUD for a given discourse is a question that its participants are concerned to ask or answer. As syntactic structures, QUDs are subject to BBOBD.

Consider the discourse in (42). The LF (42b) yields a strict-strict reading of the pronouns in the elided VP. Ellipsis is licensed in (42b) iff there is a BBOBD-respecting QUD to which both (42a) and (42b) are answers. There is in fact such a QUD — (43) — so ellipsis is licensed.

(42) a. John [λ2 [t2 thinks he2 [λ3 [t3 loves his3 wife]]]]
  b. Bill does [VP think John [λ4 [t4 loves his4 wife] too]]
(43) Who thinks John [λ1 [t1 loves his1 wife]]

A different QUD is required to license ellipsis for the forbidden strict-sloppy reading:

(44) a. John [λ2 [t2 thinks he2 [λ3 [t3 loves his3 wife]]]]
  b. Bill [λ4 [t4 does [VP think John loves his4 wife] too]]
(45) Who [λ1 [t1 thinks John loves his1 wife]]

The QUD in (45) is ‘Who is such that they think John loves their wife?’ BBOBD adds the presupposition that [his1] is disjoint in reference from [John]. As a result, (44a) is not an appropriate answer to (45). Thus, (45) is not a QUD to which both (44a) and (44b) are answers, and ellipsis in (44b) is not licensed.

In the examples K&B consider, it is clear enough what the question under discussion is. However, in cases such as the embedded Dahl paradigm, where the elided VP and its antecedent contain pronouns bound by the same higher quantifier, it is less clear what the QUD responsible for licensing ellipsis should be. In the case of (28), for example, is there a single QUD which receives multiple answers (one for each worker in the domain), or is there a different QUD for each worker? To deal adequately with these examples, a significant amount of technical work would have to be done to clarify the notion of a QUD and explain how exactly the QUD is derived from the context together with the other syntactic structures present. An advantage of the present theory is that it makes reference only to the syntactic structure of the sentence itself (modulo substitution of focus alternatives), and relies on concepts from Rooth’s theory of focus that have already been thoroughly investigated.


A number of analyses of the Dahl paradigm tie it to VP ellipsis. For example, the analysis of Schlenker (2005: 33–37) crucially depends on there being two independent pairs of pronouns, one in the antecedent VP and one in the elided VP. Kehler & Büring (2008) make the important point that the same pattern of available and unavailable interpretations also shows up in non-ellipsis contexts:

(46) Mary [only [VP told JohnF that he loves his mother]].
  a.   John is the only x s.t. Mary told x that x loves x’s mother.
  b.   John is the only x s.t. Mary told x that John loves John’s mother.
  c.   John is the only x s.t. Mary told x that x loves John’s mother.
  d. *John is the only x s.t. Mary told x that John loves x’s mother.

My analysis can be extended to the paradigm in (46). The starting point is Rooth’s (1992) analysis of adverbial only. Consider reading (46d). This reading requires [his] to be bound by [JohnF] and [he] to be interpreted as a referential pronoun. An LF along these lines is shown in (47):

(47) Mary [[only C] [[VP told [John1]F [λ2 [t2 that he1 loves his2 mother]]] ∼ C]]
(48) ⟦only⟧ = λC⟨⟨e,st⟩,t⟩ λP⟨e,st⟩ λxe . ∀Q⟨e,st⟩ [Q ∈ C ∧ Q(x) → P = Q]
(49) P⟨e,st⟩ = Q⟨e,st⟩ iff ∀x,w [P(x)(w) = Q(x)(w)]

The property denoted by the matrix VP (‘told John that John loves John’s mother’) is not a member of the SFSV of the matrix VP, since when [John] is taken as the alternative to [John]F, binding of [his] by [John] across a pronoun coreferential with John violates Rule H. Thus, this property is not a member of C when the denotation of (47) is computed via application of the denotation of only in (48). The denotation of (47) is shown in (50):

(50) ∀Q⟨e,st⟩ [Q ∈ C ∧ Q(Mary) → P = Q]
  where P = λxe λw . x told John that John loves John’s mother in w

As P is not a member of C, no member of C can equal P. It follows that (50) is true iff there is no property in C which holds of Mary. Rooth’s ∼ operator introduces a presupposition:10

(51) φ ∼ Γ, evaluated for an assignment g, presupposes that
  i. Γ is a subset of SFSVg(φ),
  ii. SFSVg(φ) contains ⟦φ⟧g, and
  iii. SFSVg(φ) contains an element distinct from ⟦φ⟧g.

Thus, (47) presupposes that C contains the property ‘told John that John loves John’s mother’ (via ii), but also that C does not contain this property (via i, where Γ in this instance is C). The inaccessibility of reading (46d) then follows from (47) having an unsatisfiable presupposition.
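The clash can be made concrete with a small Python sketch of the three clauses in (51). This is an illustrative toy under stated assumptions: the VP property ‘told x that y loves z’s mother’ is coded as a triple (addressee, lover, owner), [he1] is taken to refer to John, and the names `DOMAIN`, `vp_value`, `sfsv` and `tilde_ok` are hypothetical.

```python
DOMAIN = ("John", "Bill", "Sue")  # assumed alternatives to [John]F

def vp_value(x, his_bound):
    """Code the VP value as (addressee, lover, owner of the mother).
    [he1] refers to John; [his] is bound by the addressee if his_bound,
    and otherwise picks out John."""
    return (x, "John", x if his_bound else "John")

def sfsv(his_bound):
    """(22): drop each alternative whose LF binds [his] across [he1]
    without affecting the interpretation."""
    keep = set()
    for x in DOMAIN:
        rule_h_violation = his_bound and vp_value(x, True) == vp_value(x, False)
        if not rule_h_violation:
            keep.add(vp_value(x, his_bound))
    return keep

def tilde_ok(gamma, alternatives, value):
    """The three clauses of (51) for a given choice of Γ."""
    return (gamma <= alternatives
            and value in alternatives
            and any(m != value for m in alternatives))

# Reading (46d): [his] bound by [John]F across coreferential [he1].
value_d = vp_value("John", True)
ok_d = tilde_ok(sfsv(True), sfsv(True), value_d)

# Reading (46b): both pronouns referential.
value_b = vp_value("John", False)
ok_b = tilde_ok(sfsv(False), sfsv(False), value_b)
```

In this model the ordinary value of the VP under reading (46d) is filtered out of the SFSV, so the presupposition fails whatever Γ is chosen, while for (46b) it can be satisfied.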

The available readings of (52) follow the same pattern as those of (46). If the LF constituency of these examples is as shown in (53a), the analysis in section 7 carries over without modification. If, however, the constituency is as shown in (53b), then that analysis does not carry over.

(52) Only JohnF said he loves his mother.
(53) a. Only [JohnF said he loves his mother].
  b. [Only JohnF] said he loves his mother.

The unavailable reading corresponding to (46d) can be derived from the following LF:

(54) [Only [John1]F] [λ2 [t2 said he1 loves his2 mother]]

Here, [only [John1]F] QRs and binds a type e trace. An ordinary two-place denotation for only suffices for interpretation. Roughly, (54) asserts that for every member x of SFSVg ([JohnF]), if ⟦[λ2 [t2 said he1 loves his2 mother]]⟧g(x) = 1, then x = ⟦[[John1]F]⟧g = John. On the present analysis the only SFSV computed is the SFSV of [John]F. As a result there is no opportunity for Rule H to winnow the set of alternatives, so that the unattested reading in (54) is let through. I return to this issue at the end of the next section.


Roelofsen (2011) points to the problem posed for Rule H by examples like (55):11

(55) Every student said he loved his essay.
  [No student]F said [the teacher]F did.
  ‘No student said the teacher loved the student’s essay.’

We are interested in readings of the first sentence where the pronouns are bound (directly or indirectly) by [every student]. Ellipsis is then licensed only if the denotation of (56) is a member of the FSV of the second sentence. The LF in (56) can be derived by replacing [no student]F with [every student] and [the teacher]F with [he1]. However, (56) is a co-binding structure that violates Rule H. If LFs that violate Rule H cannot contribute to FSVs, then ellipsis in (55) is not licensed.

(56)

On the face of it, the only way to ensure that ellipsis is licensed in (55) is to replace Rule H with a constraint (Free Variable Economy) that freely permits co-binding. Fox’s account of Heim’s (1998) exceptional co-binding examples is thereby lost — a price that may be worth paying if these examples succumb to some other analysis.12 Roelofsen (2011) takes this tack, replacing Rule H with a constraint that freely permits co-binding while still blocking binding over a coreferential expression.

This section develops an analysis of (55) that requires only a slight relaxation of the restrictions imposed on co-binding by Rule H. Significantly, this relaxation does not let in LFs such as (14) or (15). The starting point is K&B’s idea of formulating the BBOBD constraint in presuppositional terms. Recall that BBOBD annotates certain LFs with disjointness presuppositions. We can similarly define the variant of Rule H in (57). This principle adds to each lambda phrase the following presupposition: for each argument x that the lambda phrase is applied to, the resulting value differs from the result of applying the lambda phrase’s local binding competitor to x. The presupposition is introduced via a syntactic operator ⚬. Like Rooth’s ∼ operator, the ⚬ operator introduces a presupposition without otherwise altering the denotation of its adjoint phrase.13

(57) Presuppositional Rule H (replaces Rule H)
  If a phrase φ of the form [λi …] has the local binding competitor ψ, then adjoin ⚬v to φ, assigning the variable v the value λg . ⟦ψ⟧g.
(58) ⟦⚬⟧g(v)(f) = λx : f(x) ≠ v(g)(x) . f(x)
(59) Local binding competitor
  A constituent of the form [λi … C … Bi], where ellipses indicate c-command, has a local binding competitor [λi … C [λj [tj … Bj]]]
  (where B is a pronoun and j does not appear elsewhere in the structure).

The application of (57) is illustrated in (60), a case of illicit non-local binding over a coreferential pronoun:

        where v ↦ λg . ⟦[λ2 [t2 said he1 [λ3 [t3 loves his3 mother]]]]⟧g
  ⟦[⚬v …]⟧g =
        λx : ⟦[λ2 [t2 said he1 loves his2 mother]]⟧g(x) ≠ v(g)(x)
              .⟦[λ2 [t2 said he1 loves his2 mother]]⟧g(x)

The presupposition introduced by the ⚬ operator is satisfied iff x ≠ John (i.e. iff [he] does not refer to John). In this way, what was formerly a direct violation of Rule H becomes an instance of presupposition failure.
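The computation in (60) can be made concrete with a small Python sketch. It is only an illustration under simplifying assumptions that are not part of the analysis: propositions are modelled as nested tuples (so that identity of tuples stands in for identity of propositions), presupposition failure is modelled as `None`, and the names `circ`, `phrase` and `competitor` are hypothetical. The competitor is read as an LF in which [he1] locally binds [his3].

```python
from typing import Callable, Dict, Hashable, Optional

Assignment = Dict[int, str]        # index -> individual (illustrative)
Prop = Hashable                    # stand-in for a proposition
Pred = Callable[[str], Prop]       # denotation of a lambda phrase


def circ(g: Assignment,
         v: Callable[[Assignment], Pred],
         f: Pred) -> Callable[[str], Optional[Prop]]:
    """Sketch of (58): f restricted to arguments x with f(x) != v(g)(x)."""
    def restricted(x: str) -> Optional[Prop]:
        if f(x) == v(g)(x):
            return None            # presupposition failure, modelled as None
        return f(x)
    return restricted


def phrase(g: Assignment) -> Pred:
    # [λ2 [t2 said he1 loves his2 mother]]
    return lambda x: ("said", x, ("loves", g[1], ("mother-of", x)))


def competitor(g: Assignment) -> Pred:
    # Local binding competitor: [he1] binds the possessive pronoun.
    return lambda x: ("said", x, ("loves", g[1], ("mother-of", g[1])))


g = {1: "John"}                    # assumption: [he1] refers to John
h = circ(g, competitor, phrase(g))

print(h("John"))  # None
print(h("Bill"))  # ('said', 'Bill', ('loves', 'John', ('mother-of', 'Bill')))
```

On this encoding the restricted function is undefined exactly when its argument is the individual that [he1] refers to, which is the configuration that Rule H originally excluded outright.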

Let us now consider the interaction of Presuppositional Rule H with focus. If an LF’s presuppositions are not satisfied then, depending on the theoretical treatment of presuppositions, either its denotation is undefined or it denotes some special indeterminate value. In the former case the LF’s denotation cannot be included in an FSV. In the latter case, its denotation may be included in an FSV, but will not match any putative antecedent or attested interpretation. Thus, Presuppositional Rule H acts as a filter on FSVs. It is no longer necessary to explicitly filter out Rule-H-violating members of FSVs.

An important question now arises. Following substitution of alternatives for focused phrases, does Presuppositional Rule H apply anew to each LF underlying a member of an FSV? That is, is the value of the variable associated with each ⚬ operator updated following substitution? Or is it left unaltered? This makes no difference in the examples we have seen prior to this section, where any focused constituents always lie outside the lambda phrase of interest. In (61), for example, replacing [Bill]F with another DP does not alter the lambda phrase. But consider (62), the LF required to license ellipsis in the second sentence of (55). Here, [the teacher]F is within the lambda phrase. For ellipsis to be licensed in (55), the FSV of (62) must contain the denotation of (63). The LF in (63) is derived from (62) by replacing [no student]F with [every student] and [the teacher]F with [he1]. If Presuppositional Rule H applies anew in (63), reassigning v, then the presupposition of the lambda phrase is derived from the value of v shown in (63b). If, on the other hand, the value of v remains unaltered, as shown in (63a), then the presupposition of the lambda phrase remains the same as in (62).

(61) ⟦[⚬v …]⟧g =
        λx : ⟦[λ2 [t2 said he1 did love his2 essay]]⟧g(x) ≠ v(g)(x)
              . ⟦[λ2 [t2 said he1 did love his2 essay]]⟧g(x)
  where v ↦ λg . ⟦[λ2 [t2 said he1 [λ3 [t3 loves his3 mother]]]]⟧g
(62) ⟦[⚬v …]⟧g =
        λx : ⟦[λ1 [t1 said [TT]F did love his1 essay]]⟧g(x) ≠ v(g)(x)
              . ⟦[λ1 [t1 said [TT]F did love his1 essay]]⟧g(x)
  where v ↦ λg . ⟦[λ1 [t1 said [TT]F [λ2 [t2 did love his2 essay]]]]⟧g
(63) ⟦[⚬v …]⟧g =
        λx : ⟦[λ1 [t1 said he1 did love his1 essay]]⟧g(x) ≠ v(g)(x)
              . ⟦[λ1 [t1 said he1 did love his1 essay]]⟧g(x)
  a. v ↦ λg . ⟦[λ1 [t1 said [TT]F [λ2 [t2 did love his2 essay]]]]⟧g
  b. v ↦ λg . ⟦[λ1 [t1 said he1 [λ2 [t2 did love his2 essay]]]]⟧g

If v is valued as in (63a), then the presupposition introduced by ⚬ in (63) is satisfied for every student x, so that ⟦(63)⟧ is admitted to the FSV of (62), and ellipsis is licensed in (55) — as desired. Conversely, if v is valued as in (63b), then the presupposition introduced by ⚬ is not satisfied for any value of x, ⟦(63)⟧ is not admitted to the FSV of (62), and ellipsis is not licensed in (55).14 On empirical grounds, I therefore hypothesize that Presuppositional Rule H does not apply anew to each focus alternative. Rather, in the computation of an FSV, presuppositions introduced by Presuppositional Rule H are left unaltered after alternatives are substituted for focused phrases.
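The contrast between (63a) and (63b) can be checked mechanically with a minimal Python sketch. The encoding is an assumption made purely for illustration: propositions are nested tuples, presupposition failure is `None`, the domain `STUDENTS` and the individual `TEACHER` are invented, and the helper names are hypothetical. Since every pronoun in these lambda phrases is bound, the assignment g is omitted.

```python
def circ(v, f):
    """Sketch of the ⚬ operator: f restricted to x with f(x) != v(x)."""
    def restricted(x):
        return None if f(x) == v(x) else f(x)   # None models failure
    return restricted


TEACHER = "the teacher"            # hypothetical individual
STUDENTS = ["Ann", "Bob", "Cy"]    # hypothetical domain of students


def said_love_essay(sayer, lover, owner):
    # Tuple stand-in for "sayer said lover did love owner's essay".
    return ("said", sayer, ("love", lover, ("essay-of", owner)))


# Lambda phrase of (63): [λ1 [t1 said he1 did love his1 essay]]
phrase_63 = lambda x: said_love_essay(x, x, x)

# (63a): v keeps the value fixed in (62), where [TT]F binds the possessive.
v_frozen = lambda x: said_love_essay(x, TEACHER, TEACHER)

# (63b): v recomputed after substitution, so [he1] binds the possessive.
v_recomputed = lambda x: said_love_essay(x, x, x)

frozen = circ(v_frozen, phrase_63)
recomputed = circ(v_recomputed, phrase_63)

print(all(frozen(x) is not None for x in STUDENTS))      # True
print(all(recomputed(x) is None for x in STUDENTS))      # True
```

With the frozen value the alternative is defined for every student and so survives in the FSV; with the recomputed value it is undefined throughout, mirroring the identity stated in note 14.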

To summarize, we have seen that Presuppositional Rule H makes it possible to relax restrictions on co-binding in just the required instances. These are instances where the presence of a focused constituent within a lambda phrase causes one of the focus alternatives to the lambda phrase to be filtered out by Rule H.

At this point it is worth noting that Fox’s motivation for restricting co-binding comes from the SCO and Condition B phenomena mentioned in sections 2.2–2.3. Fox’s analysis of the Dahl paradigm would still go through if co-binding were freely permitted — and the analysis of (55) would be more straightforward. In light of these observations, Roelofsen (2011) defines a variant of Rule H, Free Variable Economy. This constraint blocks binding across a coreferential expression but not co-binding. One could easily replace my derivatives of Rule H with parallel derivatives of Free Variable Economy. The cost of doing so would be the loss of Fox’s account of the aforementioned Condition B and SCO phenomena. I leave it as an open question whether this is a price worth paying.15

There is one further application of Presuppositional Rule H. Recall (52), the problematic case from section 7, with the constituency indicated in (53b). The candidate LF for the missing strict-sloppy reading is shown in (64), following application of Presuppositional Rule H.

(64) *[Only [John1]F] [⚬v [λ2 [t2 said he1 loves his2 mother]]]
    where v ↦ λg . ⟦[λ2 [t2 said he1 [λ3 [t3 loves his3 mother]]]]⟧g

The prejacent of (64) is derived by applying ⟦[⚬v …]⟧g to ⟦[John1]F⟧g:

(65) Assertive component of prejacent of (64):
              John said John loves John’s mother.
  Presupposition of prejacent of (64):
              {w | J said J loves J’s mother in w}
                          ≠ {w | J said J loves J’s mother in w}

The presupposition introduced by ⚬ is not satisfied in this case. Presupposition failure in the prejacent generally gives rise to deviance. In (66), for example, the gender presupposition introduced by the pronoun is not satisfied in the prejacent:

(66) #Only this man knows her stuff.
    Prejacent: #This man knows her stuff.

The inaccessibility of the strict-sloppy reading of (64) may therefore be a consequence of presupposition failure in its prejacent.


If Rule H acts as a filter on focus semantic values, then Fox’s analysis of the Dahl paradigm can be recast without appeal to his disjunctive Parallelism constraint. Since problems with Parallelism have led a number of authors to reject Fox’s analysis, this is a welcome result. No additional assumptions are required to handle the embedded Dahl paradigm (which Fox’s analysis fails to account for). The revised analysis also offers some insight into non-ellipsis variants of the Dahl paradigm, which are unexpected on many alternative analyses.

An important unresolved question is whether Fox was on the right track in formulating Rule H so as to tightly restrict co-binding. It seems clear that co-binding is not quite as restricted as Fox assumed. But is it possible to let in the co-binding structures empirically required to license certain instances of VP ellipsis without also letting in the structures that threaten to complicate the formulation of Condition B? I have argued in section 8 that co-binding can be constrained in exactly this way. The key to doing so is a presuppositional formulation of Rule H.


The additional file for this article can be found as follows:


Appendix to Remarks on Rule H. DOI: https://doi.org/10.5334/gjgl.333.s1


  1. For critical commentary on Fox’s formulation of Parallelism see e.g. Heim (2008); Roelofsen (2010); Reinhart (2006). The principal objections in the literature relate to Fox’s disjunctive definition of Parallelism in terms of separate Referential Parallelism and Structural Parallelism constraints. Fox’s analysis of the Dahl paradigm does not in itself require this disjunctive definition, since Structural Parallelism is sufficient to license all of the available readings (on the assumption that each pronoun in the antecedent VP can be interpreted as either a referential or a bound pronoun). However, certain other phenomena, such as the ability of a single antecedent VP to license both strict and sloppy ellipsis (Fiengo & May 1994, 169–171; Fox 2000, 117), suggest that not all instances of VP ellipsis satisfy Structural Parallelism. Thus an additional licensing mechanism, in the form of Referential Parallelism, appears to be required. A Rooth-style contrast constraint on VP ellipsis, which is the basis of the present analysis, straightforwardly accounts for the ability of a single antecedent VP to license both strict and sloppy ellipsis.
  2. I annotate LFs with links to make it easier to discern patterns of binding dependencies. These links never convey any information that the LFs themselves do not.
  3. A semantically binds B iff A and B are in the configuration [A [λi … Bi …]] and the indicated co-indexation is non-vacuous.
  4. If Condition B restricts only semantic binding then it can also be sneaked around using coreference. This paper takes no position on how to solve this problem. One possible solution is the addition of a constraint on the use of coreference such as Grodzinsky & Reinhart’s (1993) Rule I. Büring (2005) proposes to collapse Rule H and Rule I into a single ‘Have Local Binding!’ (HLB) constraint. See section 5 for discussion of HLB. See also Heim (1998), Reinhart (2006), Heim (2008), Reuland (2010), Roelofsen (2010) for different perspectives on the correct formulation of Condition B and pertinent economy conditions (if any).
  5. The constraint in (17) is not the only constraint on VP ellipsis according to Rooth. There is also a matching constraint on the syntactic form and lexical content of the antecedent and elided VPs.
  6. No Meaningless Coindexing: If an LF contains an occurrence of a variable v that is bound by a node α, then all occurrences of v in this LF must be bound by the same node α. See Roelofsen (2011) for commentary on this constraint in the context of the Dahl paradigm.
  7. Roelofsen makes this point with regard to his proposed replacement for Rule H, Free Variable Economy, but the point extends to Rule H.
  8. Rule I: NP A cannot corefer with NP B if replacing A with C, C a variable A-bound by B, yields an indistinguishable interpretation (Grodzinsky & Reinhart 1993: 88).
  9. Reinhart and Grodzinsky do not define coreference precisely, but coreference, in the tradition of Reinhart (1983) and subsequent work, is a relation between two referential expressions, and hence not a relation that a bound variable can enter into. There may be a Rule I violation in (40) in virtue of [he1] being in a configuration to bind [his1]. However, binding [his1] by [he1] would give rise to a co-binding LF that violates Rule H. One could hypothesize that Rule I considers only those alternative LFs that do not violate other economy conditions.
  10. I have modified the original definition in Rooth (1992) by replacing ‘Focus Semantic Value’ with ‘SFSV’, by numbering the three clauses for ease of reference, and by adding the reference to an assignment.
  11. Roelofsen uses examples where the matrix verb in the second sentence differs from the matrix verb in the first sentence. On the Roothean account of VP ellipsis licensing assumed here, the matrix verb in the second conjunct would also have to be focused for ellipsis to be licensed in Roelofsen’s examples. Since this additional focus is irrelevant to present concerns, I use examples where the matrix verb is identical in both sentences. From Roelofsen’s point of view, what these examples show is that a co-binding structure must be available in the first sentence of (55).
  12. See e.g. Heim (2008).
  13. The definition in (57) doesn’t handle the case where a phrase has multiple local binding competitors. In this case, an operator ⚬vn is adjoined for each local binding competitor ψn (the order of adjunction being immaterial). For example, if a phrase φ has two local binding competitors ψ1 and ψ2, then the output of Presuppositional Rule H is [⚬v2 [⚬v1 φ]], with each vn assigned the value λg . ⟦ψn⟧g. There are many ways to cash out assignment of a value to the variable argument of ⚬. For example, one could use indexed variables v1 … vn, define ⟦vκ⟧g = g(κ), and have (57) update Cg with the assignment κ ↦ λg . ⟦ψn⟧g.
  14. ∀g ⟦[λ1 [t1 said he1 did love his1 essay]]⟧g = ⟦[λ1 [t1 said he1 [λ2 [t2 did love his2 essay]]]]⟧g.
  15. For critical comments on Fox’s analysis of SCO, see Roelofsen (2008), who argues that SCO and WCO should have a unified analysis. The literature on exceptional co-binding has ballooned in the decades since Heim (1998) (circulated as a working paper in 1993). There has been a degree of controversy over the judgments for these cases, and for the related examples involving coreference from Reinhart (1983). On this point see e.g. Schlenker (2005); Grodzinsky & Sharvit (2007); Heim (2008); Jacobson (2008); Roelofsen (2010). An empirical study of some of the relevant judgments can be found in McKillen (2016).


Three anonymous reviewers made detailed and insightful comments that have led to many substantial improvements. An earlier version of this work was presented at the Syntax Reading Group at University College London in February 2016. I would like to thank the audience for their comments and questions, in particular Klaus Abels, Annabel Cormack, Patrick Elliott, Hans van der Koot and Yasutada Sudo. I would also like to thank Kyle Johnson for comments on early versions of the idea presented here.


The author has no competing interests to declare.


Bach, Emmon & Barbara H. Partee. 1980. Anaphora and semantic structure. In Jody Kreiman & Almerindo E. Ojeda (eds.), Papers from the parasession on pronouns and anaphora. 1–28. Chicago, Illinois: University of Chicago.

Büring, Daniel. 2005. Bound to bind. Linguistic Inquiry 36(2). 259–274. DOI:  http://doi.org/10.1162/0024389053710684

Dahl, Östen. 1973. On so-called sloppy identity. Synthese 26. 81–112. DOI:  http://doi.org/10.1007/BF00869757

Dahl, Östen. 1974. How to open a sentence: Abstraction in natural language. In Logical grammar reports, vol. 12. University of Göteborg.

Fiengo, Robert & Robert May. 1994. Indices and identity. Cambridge, Massachusetts: MIT Press.

Fox, Danny. 1998. Locality in variable binding. In Pilar Barbosa, Danny Fox, Paul Hagstrom, Martha McGinnis & David Pesetsky (eds.), Is the best good enough?, 129–156. Cambridge, Massachusetts: MIT Press.

Fox, Danny. 2000. Economy and semantic interpretation. Cambridge, Massachusetts: MIT Press.

Grodzinsky, Yosef & Yael Sharvit. 2007. Coreference and self-ascription. Unpublished ms., McGill University.

Grodzinsky, Yosef & Tanya Reinhart. 1993. The innateness of binding and coreference. Linguistic Inquiry 24(1). 69–102.

Heim, Irene. 1997. Predicates or formulas? Evidence from ellipsis. In Aaron Lawson & Enn Cho (eds.), Proceedings of SALT VII. 197–221. Cornell University, CLC Publications. DOI:  http://doi.org/10.3765/salt.v7i0.2793

Heim, Irene. 1998. Anaphora and semantic interpretation: A reinterpretation of Reinhart’s approach. In Orin Percus & Uli Sauerland (eds.), The interpretive tract (25), 205–246. Cambridge, Massachusetts: MIT, Department of Linguistics.

Heim, Irene. 2008. Forks in the road to Rule I. In Anisha Schardl, Martin Walkow & Muhammad Abdurrahman (eds.), Proceedings of NELS 38. CreateSpace. DOI:  http://doi.org/10.1007/BF03225207

Heim, Irene & Angelika Kratzer. 1998. Semantics in generative grammar. Malden, MA: Blackwell.

Jacobson, Pauline. 2008. Direct compositionality and variable-free semantics: the case of Antecedent Contained Deletion. In Kyle Johnson (ed.), Topics in ellipsis, 30–68. Cambridge, United Kingdom: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511487033.003

Kehler, Andrew & Daniel Büring. 2008. Be bound or be disjoint! In Anisha Schardl, Martin Walkow & Muhammad Abdurrahman (eds.), Proceedings of NELS 38. 487–501. CreateSpace.

McKillen, Alanah. 2016. On the interpretation of reflexive pronouns. Montréal: McGill University dissertation.

Reinhart, Tanya. 1983. Coreference and bound anaphora: A restatement of the anaphora questions. Linguistics and Philosophy 6. 47–88. DOI:  http://doi.org/10.1007/BF00868090

Reinhart, Tanya. 2006. Interface strategies. Cambridge, Massachusetts: MIT Press. DOI:  http://doi.org/10.7551/mitpress/3846.001.0001

Reuland, Eric. 2010. Minimal vs. not so minimal pronouns. In Martin Everaert, Tom Lentz, Hannah de Mulder, Øystein Nilsen & Arjen Zondervan (eds.), The linguistics enterprise: From knowledge of language to knowledge of linguistics, 257–281. John Benjamins.

Roelofsen, Floris. 2008. Anaphora resolved. Amsterdam: University of Amsterdam dissertation.

Roelofsen, Floris. 2010. Condition B effects in two simple steps. Natural Language Semantics 18(2). 115–140. DOI:  http://doi.org/10.1007/s11050-009-9049-3

Roelofsen, Floris. 2011. Free Variable Economy. Linguistic Inquiry 42. 682–697. DOI:  http://doi.org/10.1162/LING_a_00066

Rooth, Mats. 1985. Association with focus. Amherst, MA: University of Massachusetts dissertation.

Rooth, Mats. 1992. A theory of focus interpretation. Natural Language Semantics 1(1). 75–116. DOI:  http://doi.org/10.1007/BF02342617

Schlenker, Philippe. 2005. Non-redundancy: Towards a semantic reinterpretation of Binding Theory. Natural Language Semantics 13(1). 1–92. DOI:  http://doi.org/10.1007/s11050-004-2440-1