1 Introduction

1.1 Overview

This paper is largely concerned with better understanding the structure and interpretation of an extremely common but understudied why-question in English, for which I will use the descriptive term Why-VP. A naturally occurring example of Why-VP is in (1).

(1) Why take Structure of Japanese?

Why-VP can also take the form seen in (2), in which not intervenes between why and the verb phrase:

(2) Why not take the bus today?

What is immediately striking about such structures as (1) and (2) is their reduced quality – there are no auxiliaries, no Subject-Aux Inversion (SAI), no tense marking, nor is there an overt subject.1 Importantly, questions of this form only occur with why in English.

(3) *Who/*What/*Where/*When/*How/*How come/Why leave?2

A finite why-question, a term I will adopt to draw a comparison throughout, is one that resembles (4) and is part of a larger class of wh-questions, where the choice of tense does not constrain the availability of other wh-phrases:

(4) Why did you take Structure of Japanese?
(5) a. Who/What did you take today?
  b. How/When/Where did you take Structure of Japanese?3

The interpretation of (1), (2), and all examples of the same type is consistently and crucially modal, best paraphrased with should or would – it will generally be odd to paraphrase (1) with (4), which conveys past tense. The departure in form and interpretation from finite why-questions suggests that the syntax of (1) and (2) should also be distinct from that which is generally assumed for (4).

Despite the frequency with which Why-VP occurs in natural speech and the striking differences which set it apart from finite wh-questions, the analytic and theoretical issues raised by the construction have not been much investigated. In this paper I show that the core properties of the structure emerge in a routine way from standard compositional mechanisms (syntactic and semantic) that are well established and well understood.

1.2 Methodology

The observations that drive my analysis and conclusions derive from a corpus of naturally occurring examples found in the New York Times portion of the Gigaword corpus (Graff & Cieri 2003). That corpus in turn emerged as a by-product of the work of the Santa Cruz Ellipsis Project (The Implicit Content of Sluicing, NSF Project, #1451819). The project uncovered 252 examples of Why-VP (more or less by accident), and since each instance includes its full discourse context (preceding and following), the corpus allows us to investigate the extent to which, and the ways in which, Why-VP is sensitive to that context. This fact will be especially crucial when we consider the interaction with ellipsis in Section 3. It is also crucial when we consider the illocutionary force of Why-VP – one of the central themes of previous work on the topic.

Examples from this corpus are labeled ‘NYT’ plus a unique numerical identifier. This corpus, in its entirety, will be provided as a supplementary file at the end of the paper. Other naturally occurring examples were found via informal Google searches, whose links will be provided in the footnotes. Note that some of these links may not be stable. This corpus provides the empirical foundation for the investigation, but I have not hesitated to supplement that data with other found examples and also with informal acceptability studies of the kind traditionally used in syntactic work.

2 The phenomenon: Previous work and core observations

The limited body of work on Why-VP treats it as an eccentric construction – one in which the syntactic frame is linked conventionally with its observed properties. These discussions have been concerned with its apparently directive force and with its reduced syntax. I review those accounts in this section with a view to showing that the properties for which they try to account are less eccentric and less idiomatic than they assume. That conclusion will then set the scene for my proposals about the syntactic, semantic and pragmatic composition of Why-VP structures in Sections 4, 5, and 6.

2.1 Early approaches

Early discussions of Why-VP largely focused on how it fits into a theory of speech acts (Sadock 1974; Green 1975; Searle 1975) and on questions having to do with the interaction between illocutionary force and syntactic structure. Such discussions were initiated by Gordon & Lakoff (1971), who endeavored to link the pragmatic function of Why-VP to its syntax, arguing that a transformation they called TENSE-DELETION must be sensitive to the conversational import of an utterance. They want to relate the following two examples by way of this transformation:

(6) a. Why do you paint your house purple?
  b. Why paint your house purple?

They claim that (6a) is ambiguous in its conversational import between conveying an actual request for information and conveying the suggestion that one should not paint their house purple. However, (6b) is unambiguous in necessarily conveying a suggestion. If the why-question is in the form seen in (6a) and is used to make a suggestion, then TENSE-DELETION can apply and derive (6b) from (6a), whereby do you is deleted.

Gordon & Lakoff’s analysis is able to account for the interesting restriction below, observed by Johnson (1975):

(7) a.   Why do you have big feet?
  b. *Why have big feet?
  c.   Why do you seem so tall?
  d. *Why seem so tall?
  e.   Why do you resemble your father?
  f. *Why resemble your father?

If Why-VP necessarily makes a suggestion, as Gordon & Lakoff claim, then verbs like those above, which do not assign an agent theta-role to the subject position, will be ruled out, as their underlying representations are unlikely to have the conversational import of a suggestion. In other words, it would be pragmatically odd to suggest a course of action of which the subject cannot be an agent.

However, Johnson observes that theta-role assignment may not be the determining factor here; verb choice alone does not contribute this restriction in Why-VP. The same verb will be (un)acceptable depending on some independent factor. Consider the following:

(8) a. #Why suffer the insults?
  b.   Why suffer the insults silently?4

The manner adverb in (8b) allows us to construe the addressee as having control over a particular part of the event; thus Why-VP here is completely acceptable, as opposed to (8a), which seems degraded.5

Searle (1975) discusses the speech act status of Why-VP independently of these discussions, and he too acknowledges that this pseudo-agentive restriction exists. He draws a comparison between Why-VP and imperative structures, which he maintains are similar in both interpretation and form, but he holds that Why-VP should not itself be taken to be inherently imperative: it retains a literal question meaning in addition to having a suggestive or imperative force. Under Searle’s view, Why-VP structures are actually ambiguous and can be used as real information-seeking questions, contra Gordon & Lakoff, whose analysis depends on Why-VP having the conversational import of a suggestion.

An interesting point which comes to bear on the discussion above is the status of the subject in Why-VP. Johnson observes that the implicit subject of Why-VP is not necessarily a 2nd person pronoun, noting contexts in which something more generic, like anyone, is a better interpretation for the subject:

(9) Context: I read in the paper this morning that the mayor was busted at the skin flicks last night.
  a.   I can understand a lot of things but, why go to the skin flicks?
  b. #Why do/would you go to the skin flicks?
  c.   Why would anyone/one go to the skin flicks?

Further, reviewers of Johnson’s paper suggest that 3rd and 1st person interpretations are possible given the acceptability of reflexives with these features in Why-VP:

(10) a.   Why not get ourselves a nice farm?
  b.   I could go to a demonstration, but why get myself arrested?
  c. ?They/He/She could skip class, but why get themselves/himself/herself in trouble?
  d. ?The computer could throw the chess game, but why get itself a bad reputation?

While Johnson is unsure about the judgments for (10c) and (10d), we will see in Section 4 naturally occurring data from the corpus which suggests that certain contexts do license third person human reflexives, and all seem quite natural in those contexts.6 This observation casts doubt on the claim that Why-VP structures are always used as suggestions – if they were, we might expect them always to be addressee-oriented, or to have the interpretation of a 2nd person pronoun. Given (10) above, this does not seem to be the case.

Johnson posits, however, that the real issue with TENSE-DELETION is that the presuppositional content of finite why-questions does not hold in Why-VP. Consider:

(11) a. Why are you kind to your husband?
  b. Why be kind to your husband?

In (11a), it is presupposed that you are kind to your husband, while in (11b), no such presupposition is imposed. If (11b) is to be derived from (11a) then they should have identical interpretation and should show the same presuppositional content.

Johnson (1975) ultimately proposes an amendment to Gordon & Lakoff: that Why-VP structures be derived from modal why-questions, particularly from why-would or why-should structures, rather than from Why-do-you questions as in (6a). This move is warranted, according to Johnson, both by the interpretation of Why-VP questions and by the fact that it solves the problem outlined above: the modal question no longer projects the presupposition. Asking why should you be kind to your husband doesn’t presuppose that the addressee should be, or has the duty to be, kind to their husband. The analysis I develop below will try to incorporate these important observations about modality while avoiding appeal to the disjunctive would or should and also making a connection with other expressions of modality in wh-infinitival contexts.

2.2 Why-VP as expressing a rhetorical question

Bhatt (1998), Duffley & Enns (1996), and Francez (2017) return to the rhetorical or suggestive force of Why-VP utterances first noted by Gordon & Lakoff (1971). They converge on noting certain preferences concerning the use of Why-VP, i.e., that Why-VP favors a rhetorical interpretation; however, I will provide naturally occurring data that show this is merely a preference rather than a hard constraint on interpretation. I will ultimately argue that we will better capture the observations from this line of work by way of mechanisms that are only indirectly sensitive to syntactic and semantic structure.

2.2.1 Bhatt (1998)

Bhatt (1998) argues that the rhetorical interpretation of Why-VP is essential and further that it can explain two phenomena. The first is that so-called “positive” Why-VP structures (in the absence of a negative marker) license Negative Polarity Items (NPIs), and the second is that why is restricted to a matrix clause interpretation in a bi-clausal structure, a restriction which does not hold for non-rhetorical why-questions. Bhatt claims that Why-VP structures are obligatorily rhetorical, observing that they prototypically express a command with the opposite polarity, rather than ask a question that maintains the observed polarity. For example, Bhatt will take (12a) to mean (12b):

(12) Bhatt (1998: 11–12)
  a. Why leave?
  b. =Do not leave

Given this observation, he proceeds to give a semantics for Why-VP which effectively analogizes Why-VP to imperatives, with the result that their sole or prototypical use will be to issue commands. These proposals allow an explanation of how Why-VP is able to license NPIs, while Why-not-VP examples do not:

(13) Johnson (1975: 10)
  a.   Why (even) lift a finger?
  b. #Why not (even) lift a finger?

If Why-VP is always interpreted as a rhetorical question, with reversal of the overt polarity, the interpretation of (13a) will contain negation, while the interpretation of (13b) will not. Assuming that NPIs are licensed under semantic or pragmatic conditions, as Bhatt argues, the strong NPI lift a finger should be licensed in (13a) and should not be licensed in (13b).7

Secondly, Bhatt observes an interpretive restriction for Why-VP that isn’t observed in finite why-questions. He considers the following example:

(14) Bhatt (1998: 12)
  Why say that Bill was fired?
  = ¬∃ R should.say(PRO, fired(bill), R)
  ≠ ¬∃ R should.say(PRO, fired(bill, R))

The observation here is that (14) can only be interpreted such that why modifies the saying event in the matrix clause.8 Example (14) cannot assert that there is no reason Bill was fired such that PRO should say it, but it can assert there is no reason PRO should say that Bill was fired.

Bhatt attributes the unavailability of a low construal of why to what he calls the “rhetorical island effect,” observing that both Why-VP utterances and other rhetorical wh-questions seem to be unambiguous in precisely the same way, where typical wh-questions are not. Consider first the ambiguity of the non-rhetorical example below:

(15) Bhatt (1998: 12)
  Why should Ermo say that Bill was fired?
  = for what R should.say(ermo, fired(bill), R)
  = for what R should.say(ermo, fired(bill, R))

Now if we consider a rhetorical interpretation, the possibility of a low construal for why goes away:

(16) Bhatt (1998: 3)
  Why should John (ever) say that Fritz was fired?
  = ¬ ∃ R should.say(j, fired(f), R)
  ≠ ¬ ∃ R should.say(j, fired(f, R))

Bhatt’s analysis of Why-VP as obligatorily rhetorical thus allows him to unify (16) and (14).

Further, if Why-VP is necessarily rhetorical, the expectation will be that why in Why-VP should never have a low construal. This is the right prediction – as we have already seen, why is always interpreted in the matrix clause in Why-VP structures which include a complement clause. On this account, then, there is an essential connection between the forced interpretation as a rhetorical question and the impossibility of a low (embedded clause) construal for why. This means that if Why-VP can be used to express a genuine information-seeking question, as is claimed in Searle (1975), Bhatt would expect that a low construal for why should be possible, since the question is no longer rhetorical.

In fact, it seems that Searle was right to claim that Why-VP can be used as a genuine (information-seeking) question. Consider the following naturally occurring data from our corpus and beyond:

(17) (NYT 113879)
  “We really need to get measured yearly, because our bodies shift around so much,” Mitro said.
  Why worry about getting the right-size bra?
  One reason is health-related.
(18) (NYT 175174)
  “I’ve never had a boss do that.” Nor does the typical CEO work standing up as Fish does at a chest-high desk.
  Why stand?
  “It’s about energy,” said the 6-foot, 2-inch, 180-pound Fish, as he excused himself.
(19) Context: On a flyer in a clinic
  Why get vaccinated?
  Vaccination can protect older adults (and some children and younger adults) from pneumococcal disease. Pneumococcal disease is caused by bacteria that can spread from person to person through close contact. It can cause ear infections…
(20) Context: On the LSA website
  Why major in linguistics?9
  … Linguistics is a major that gives you insight into one of the most intriguing aspects of human knowledge and behavior. Majoring in linguistics means that you will learn about many aspects of human language, including sounds …
(21) ESTRIN: … Just as the U.S. has its embassy to Israel, the U.S. had its own independent mission of diplomats who met with Palestinian leaders and officials and reported back to Washington on what was happening in the Palestinian territories.
  KELLY: So why do this move now? Why close the mission to the Palestinians?10
  ESTRIN: This is the Trump administration continuing to change the symmetry in how it deals with the Israelis and Palestinians.

In (17), it seems wrong to interpret the speaker as saying do not worry about getting the right-size bra, and in (18) it seems equally wrong to interpret the writer as saying do not stand. Similarly in (19), the clinic is certainly not suggesting that we not get vaccinated, nor is the LSA in (20) trying to dissuade people from becoming linguistics majors (quite the opposite, actually). And finally, example (21) is perhaps the clearest: two Why-VP clauses occur in an interview format, a social context in which question-answer pairs are expected, and the use of Why-VP is completely natural, something we wouldn’t expect if Why-VP clauses obligatorily had a rhetorical or directive force as Bhatt suggests. The continuations of the discourse in each example above suggest also that the speaker takes the position that there is a genuine wondering, one which, importantly, matches the observed polarity of the Why-VP utterance.11

It is therefore too strong to analyze Why-VP as obligatorily rhetorical.12 This weakens the argument that the restriction to the matrix clause construal of why is a reflex of being a rhetorical question. If the restriction that why be interpreted in the matrix clause persists even when Why-VP is interpreted as a genuine question, then Bhatt’s attributing the observed restriction to the “rhetorical island effect” is incorrect. In another naturally occurring Why-VP example found via a Google search, a bi-clausal structure is present, but the overall interpretation does not appear to be rhetorical, at least in the way suggested by Bhatt. Importantly, the why must still be construed with the matrix clause predicate say:

(22) When Andy asked if she was then married when she conceived she said, “I might have been a month pregnant when I got married.” Maybe I’m missing something, but the math still doesn’t add up to me – with these dates, she would NOT have been pregnant before they married, right?
  Why say she could have been? Whatever … moving on.13

Why-VP here is not interpreted as Don’t say she could have been [pregnant]. However, it remains impossible for why to be interpreted in the embedded clause – (22) cannot mean what is the reason R such that she said that she could have been pregnant for R?

This is part of a larger pattern: even when Why-VP structures express genuine information-seeking questions, their other properties persist – the curious pattern of NPI licensing and the impossibility of an embedded construal for why. It would be a misstep, then, to link these properties too closely with the rhetorical question interpretation – we must look elsewhere for an explanation of why they hold. At the same time, however, the ultimate analysis must provide a way of understanding the fact that Why-VP lends itself so easily to the rhetorical question interpretation and the closely linked use as a “negative suggestion.” We return to these challenges in Sections 5 and 6.

2.2.2 Duffley & Enns (1996)

Duffley & Enns assume that the syntax of Why-VP clauses is close to that of an infinitival clause, but that they crucially lack the infinitival morphology in English, to, a fact which has semantic consequences. They challenge previous claims that why + to is not possible in affirmative contexts, bringing examples found in various corpora to light:14

(23) a. Why to ban birthdays.
  b. Radio: How, When, and Why to Use it
  c. Why to vote Yes in the referendum

Despite the existence of why + to in these naturally occurring contexts, they find that why without to (Why-VP) is far more common, and differs in interpretation from the type in (23).

What they hold is that in Why-VP utterances, the absence of to reflects the attitude that the speaker sees no reason for the infinitive event to occur.15 This characterization is broad enough, I think, to account for both the genuine uses and the rhetorical uses. For those we have observed to be genuine information-seeking questions, it may well be that the speaker has assumed their audience’s attitude, and that that audience may see no reason to VP (e.g., you may be wondering… why major in linguistics). Importantly, in that case, the speaker still does not intend to instruct anyone not to major in linguistics but may ask assuming the attitude that there is no apparent reason one should major in linguistics. This characterization also accounts for the rhetorical, or imperative-like uses observed by Bhatt. As an anonymous reviewer points out, when the speaker signals that they see no reason for the infinitive event to occur, it can produce the effect of directing the hearer away from some course of action, in virtue of implying the action to be pointless by calling into question the existence of valid motives for doing it.

However, I will differ from Duffley & Enns by suggesting that the locus of this attitude is a covert modal likely present in the composition of Why-VP rather than the absence of to. Modal paraphrases with should and would appear to have a very similar effect on the pragmatic context and in conveying the attitude expressed aptly by Duffley & Enns. That said, we will see some reason (in Section 6) to think that Why-VP should be identified with infinitival structure more broadly, in roughly the way Duffley & Enns suggest.

2.3 Why-VP as a root phenomenon

One novel observation about the distribution of Why-VP I’d like to make at this point is that Why-VP structures are crucially and stubbornly root questions. They cannot be embedded even under predicates which otherwise routinely select wh-interrogatives:

(24) a. *I (don’t) know why take Structure of Japanese.
  b. *I wonder why take Structure of Japanese.
  c. *I decided why take Structure of Japanese.
  d. *I figured out why take Structure of Japanese.

It is worth noting also that why to is highly marginal in English and, in a basic context, has been consistently judged unacceptable:16

(25) *I don’t know/wonder/decided/figured out why to take Structure of Japanese.17

This is a point with which Shlonsky & Soare (2011) and Barrie (2007) have grappled. Both attribute this restriction to why being merged high in the structure, therefore determining a phrase of a larger and higher type than question-embedding verbs normally select. I will argue that selection plays a role in blocking the Why-VP examples in (24) as well, but the implementation will differ. We will return to this important issue in Section 5.

2.4 Interim summary

The theme that runs through all of these studies is that Why-VP is fundamentally different in both interpretation and structure from finite why-questions. For example, the why of Why-VP does not presuppose its prejacent (Johnson 1975; Francez 2017). Further, Why-VP does not seem to correspond to a past or present interpretation, but is rather modal (Johnson 1975; Bhatt 1998; Duffley & Enns 1996; Francez 2017). Obviously, finite why-questions can freely be in the past or present tense, thus the absence of tense morphology on the verb in Why-VP has an irrealis effect on the interpretation as might be expected.

Unlike finite why-questions, the why of Why-VP can only be interpreted in the matrix clause of a bi-clausal structure (Bhatt 1998; Collins 1991). This property is shared with rhetorical why-questions as Bhatt points out. But we should not take that to be the explanation for the restriction in Why-VP as the restriction remains even when Why-VP is not rhetorical.

It is also observed in Bhatt (1998) that Why-VP licenses NPIs, and counter-intuitively Why-not-VP does not. This is also attributed to the polarity reversal they undergo under his analysis of these as necessarily rhetorical. As we have seen however, this cannot be the full story given that Why-VP is not necessarily rhetorical.

Finally, Johnson (1975) makes the important observation that Why-VP hosts reflexives varying in person, suggesting that the subject isn’t always addressee-oriented in its interpretation; this is an important point for any pragmatic account, especially those which have claimed that Why-VP structures are fixed in their pragmatic import as commands or suggestions (Bhatt 1998; Francez 2017). The very possibility of a reflexive also has implications for the issue of what syntactic structure underlies all of these properties – an issue we return to in detail in Section 4.

3 Against ellipsis

The most recent account of Why-VP clauses is proposed in Yoshida et al. (2015), who accommodate Why-VP as a special case of Why-stripping, argued in turn to be a type of clausal ellipsis parallel to that which was developed for sluicing (Ross 1969), and later for fragment answers (Merchant 2005). Under this view, Why-VP reflects the operation of an ellipsis process which eliminates the subject and tense because they are recoverable on the basis of an antecedent TP. The VP survives because it is focused and has therefore been raised out of the ellipsis-site. In what follows, I argue that ellipsis cannot account for the facts observed in Why-VP. If Why-VP is to be analyzed in terms of clausal ellipsis, the expectations are clear – it should show the kind of deep dependence on discourse context (antecedence requirements and so on) that is the hallmark of ellipsis. The sections below argue that Why-VP shows no such dependence and thus prepare the way for a demonstration that an independent (and in fact much simpler) analysis is warranted.

3.1 Antecedence and given-ness

The weakest possible hypothesis that can be entertained about the licensing of ellipsis is that elided content must be given – roughly, entailed by the discourse context – in a sense that can be made precise in ways explored by Rooth (1992), Schwarzschild (1999), Heim (1997), Takahashi & Fox (2005), Merchant (2001), and others. However, there is considerable evidence that a stricter requirement is imposed – one that requires that the elided material (or core subparts of the elided material) have an antecedent phrase with respect to which it is lexically and structurally parallel. Neither condition holds of Why-VP.

Why-VP was brought to our attention as a consequence of the project The Implicit Content of Sluicing, NSF #1451819 at UC Santa Cruz. A group of undergraduate and graduate student researchers are tasked with identifying a discourse-provided antecedent for a given instance of sluicing. This is generally a straightforward task, where the antecedent is immediately apparent, but root sluices in particular proved to be, on average, more problematic. Why-VP was sometimes identified as a root sluice by the search mechanisms that identified candidate sluices for annotation. Upon encountering and discussing these examples, annotators unanimously agreed that it was impossible to identify an antecedent for Why-VP. To look for an antecedent for Why-VP essentially means to identify the missing subject, but since there is rarely a VP parallel to the VP of Why-VP, figuring out what would count as an antecedent proved to be nearly impossible.

Yoshida et al. (2015) largely base their argument that various Why-XP fragments are an instance of clausal (TP) ellipsis on observed connectivity effects between a provided context and the material they observe to be missing in said fragments. They propose that why is base generated in spec-CP, above a focus position (FocP), to which the non-wh remnant is ultimately raised. Everything below the focus position that hosts the non-wh remnant is elided. Their analysis primarily draws on examples of the type Why-DP. Their proposal is schematized in (26):

(26) Speaker A: John is eating natto.
  Speaker B: [CP Why [FocP natto1 [TP John is eating –1 ] ] ] ?

In (26), the elided material (the bracketed TP in Speaker B’s utterance) is recoverable based on the antecedent provided by Speaker A, permitting the utterance to simply be read as why natto? Following Pesetsky (1997), Yoshida et al. adopt a semantic view of the identity condition that warrants clausal ellipsis. They state the following:

(27) Recoverability (Pesetsky 1997: 342)
  A syntactic unit with semantic content must be pronounced
  unless it has a sufficiently local antecedent.

The above is essentially an earlier variant of the semantic condition on clausal ellipsis proposed for Sluicing in Merchant (2001), which requires e-GIVENNESS between an elided clause and its antecedent such that the two mutually entail one another (modulo focus marking).
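For reference, the mutual-entailment requirement can be written out roughly as follows. This is a restatement of the standard formulation of e-GIVENness, not a quotation from Merchant (2001) or from Yoshida et al.; the symbols E (the elided expression), A (its antecedent) and F-clo (focus closure) are shorthand adopted here for illustration.

```latex
% A rough restatement of e-GIVENness; E, A and F-clo are illustrative shorthand.
% Requires amsmath.
\begin{align*}
E \text{ is e-GIVEN} \iff{} & E \text{ has a salient antecedent } A
    \text{ such that, modulo } \exists\text{-type shifting,}\\
  & \text{(i) } A \text{ entails } \mathrm{F\text{-}clo}(E), \text{ and}\\
  & \text{(ii) } E \text{ entails } \mathrm{F\text{-}clo}(A),
\end{align*}
% where F-clo(alpha) is the result of replacing the F-marked parts of alpha
% with existentially bound variables of the appropriate type.
```

On this formulation, clause (i) alone corresponds to the weaker given-ness requirement, and clause (ii) adds the reverse entailment that makes the condition mutual.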

Yoshida et al. argue that in (26), natto is focus marked (F-marked) and must therefore raise out of TP which is then elided. Although the focus in their paper is mainly on cases such as (26) in which the remnant is nominal, the operations illustrated in (26) are in fact meant to apply to any syntactic constituent (below TP, presumably) that may appear with why, including AP, PP, and VP.

An illustration of this line of analysis for Why-VP is presented in (28):

(28) A: Richard is shopping at New Leaf.
  B: [CP Why [FocP shop at New Leaf [TP Richard is –VP ] ] ] ?

Under this view, the tense and the subject in Why-VP are recoverable based on A’s utterance. However, Why-VP is not dependent on the apparent antecedent in (28A) in the same way that Why natto in (26B) is dependent on its antecedent in (26A). In (28B) the subject of shop is not obligatorily interpreted as Richard, nor is the progressive be necessarily preserved in the interpretation of Why-VP. A possible reading is paraphrased in (29):

(29) Why shop at New Leaf?
  = Why would anyone shop at New Leaf?

It is possible and often preferred for the subject in Why-VP to be interpreted as impersonal with universal force, rather than as referring to a specific individual (Richard in (28)) provided in an apparent antecedent. The present tense and imperfective aspect provided in the apparent antecedent are discarded in favor of a modal interpretation akin to would in this context. Crucial aspects of the interpretation, then, seem to be entirely independent of any potential antecedent.

More to the point, in naturally occurring examples of Why-VP, there is rarely a discernible or potential antecedent at all, contrary to the constructed exchanges above. Consider the attested examples presented in (30) and (31), for example, with their discourse context:

(30) (NYT 15830)
  Friends are usually the first ears to hear about an affair, said Fred Mayfield, a marriage and family therapist in Overland Park. “But I tell people to not tell their friends,” he said.
  “Why take the chance that your spouse will hear about it from someone else?”
(31) (NYT 153221)
  “Rents are lower and floor space is easier to obtain than ever before,” said Michiko Shimizu, president of SAI, a marketing consulting firm.
  “Their cost performance is much better, so why not open a few more shops?”

In some abstract sense, the subject of Why-VP in (30) is related to the mention of people in the provided context. However, there is really no discernible syntactic parallelism between people not to tell their friends in the context, and the VP of Why-VP take the chance that…, nor do these seem to stand in the kind of oppositional relation to each other that would warrant a contrastive focus interpretation, as would be predicted by the Yoshida et al. account.

The presence of possessive your in the example of (30) (bound by the implicit subject) indicates also that the nominal people from the previous context does not function as an antecedent for the implicit subject of Why-VP. Consider the oddness of (32):

(32) *Why would people take the chance that your spouse will hear about it from someone else?

Similarly, in (31) the subject seems pragmatically related to their in the clause immediately before the instance of Why-VP. But there is surely no way that Why-VP can be syntactically derived from their cost performance is much better as an antecedent.

Finally, and most strikingly, Why-VP occurs freely with no preceding discourse context at all. Here, repeated, are naturally occurring examples found on flyers:

(33) a. Why take Structure of Japanese?
  b. Why get vaccinated?

In the absence of any preceding discourse at all, both examples of (33) are well-formed and interpretable. This is not the case for structures generally assumed to be derived via ellipsis, like Sluicing or VP Ellipsis:

(34) a. #I don’t know why. Sluicing
  b. #They have. VPE

In sum, Why-VP (both in terms of well-formedness and in terms of interpretation) seems to function independently of any discourse context in a way that is entirely uncharacteristic of ellipsis.18

3.2 Against VP-fronting

To be maintained, the ellipsis analysis must also hold that the VP of Why-VP undergoes movement (VP-Fronting) in order to escape from the ellipsis site. I consider that possibility here. Consider (35):

(35) Yoshida et al. (2015: 55)
  A: John1 says Bill2 criticized all the members in the team.
  B: Why CRITICIZE (EVEN) HIMSELF*1/ok2 when there is someone else to criticize?

Under the ellipsis analysis it will be understood as in (36):

(36) Why [VP criticize (even) himself] John says Bill past [VP t ] ?

In (35), as in (36), the reflexive inside VP may not be bound by the matrix subject John, even though the VP which contains it will, by hypothesis, have moved through the specifier position of the clausal complement, a position in which the reflexive it contains will be commanded by the matrix subject John. An apparently analogous restriction holds of VP Preposing in the absence of ellipsis, as shown by Huang (1993) and illustrated in (37):

(37) *Criticize himselfj Johnj said Donald wouldn’t.

If the failure in (37) is to be attributed to movement of VP, the apparently analogous observation for (36) might suggest that VP preposing also plays a role in the derivation of Why-VP structures. Such an argument is developed, briefly and tentatively, by Yoshida et al.

But there is a much more straightforward way of understanding these facts – by assuming simply that the VP of Why-VP structures does not raise at all; the VP of (36), for instance, never moves through the specifier of VP and is therefore never in a sufficiently local relation with the matrix subject to be bound by it.

Independent evidence that there is no VP fronting in Why-VP comes from the lack of island amelioration effects, which Sluicing and certain Why-Stripping (Why-DP) cases exhibit. Yoshida et al. show that there is such an effect in genuine cases of Why-Stripping, where, for instance, what would otherwise be violations of the constraint banning extraction from relative clauses are ameliorated, as seen clearly in (38):

(38) A: They want to hire someone who speaks a Balkan language.
  B: Why a Balkan language?

(38B) is interpreted as Why do they want to hire someone who speaks a Balkan language? It does not mean Why does someone speak a Balkan language? However, in Why-VP, the interpretation is more restricted, and the interpretation which would require movement out of the island is not available:

(39) A: They hired someone who wrote a paper about a Balkan language.
  B: Why write a paper about a Balkan language?

In the exchange above, the only possible interpretation of (39B) is Why would someone write a paper about a Balkan language? An interpretation in which the island is ameliorated is impossible: Why would they hire someone who wrote a paper about a Balkan language? (39B) supports only the reading that we would expect it to have if it involves neither ellipsis nor movement of the VP.

On this understanding, we expect the results of both (35) and (39). There is no syntactic structure that the VP raises through, so there is no intermediate scope position in (35) from which the VP could possibly reconstruct. Further, there is no island out of which the VP remnant is raising, so we would not expect the meaning of the VP to include the structure outside of the island domain, i.e. the relative clause. The “correlate” in (35A) and (39A), then, is an illusion; the VPs match by pure coincidence, not by necessity.

3.3 Weir’s amendment

A revision to the Yoshida et al. analysis was proposed in Weir (2014), who argues that FocP, which would host the VP remnant in its specifier, selects VoiceP rather than TP and that VoiceP is the target of deletion rather than TP.

Following Hacquard (2006), Weir assumes that circumstantial modality and deontic modality are generated within vP, presumably below Tense and Aspect. He notes examples in which the context licenses circumstantial interpretations which are preserved in Why-Stripping and where epistemic modality is present in the previous context, but not interpreted in Why-Stripping. I provide the deontic case in (40a), the circumstantial in (40b), and the epistemic in (40c):

(40) Weir (2014: 9)
  a.     I understand why John can (/is allowed to) access the guest account.
        Why the superuser account, though?
  b.     I understand why nuclear fission can produce heat.
        Why light, though?
  c.     [Detectives’ conference – debating possible hypotheses]
        I understand why Salander might be in Stockholm (a witness saw someone answering to her description).
    ??Why in Oslo, though? (OK: ‘why might Salander be in Oslo?’)

Weir addresses this issue by positing that no T layer exists in Why-Stripping, which means epistemic modals that scope above T are also eliminated, therefore predicting the contrasts above. An analysis like Weir’s would have the advantage of narrowing the possibility space for the modal interpretation, restricting it to root modals, which seems to be a step in the right direction. Note that in Why-VP epistemic interpretations are impossible:

(41) A: He might introduce Jenny to his parents next week.
  B: Why introduce your girlfriend to your parents so early in the relationship?

The modal in Why-VP is most saliently interpreted as would in (41), which is not epistemic. Thus, Weir’s amendment comes closer to understanding the interpretive properties of Why-VP, but inherits most of the difficulties already raised for Yoshida et al. (2015) since it is equally committed to the ellipsis component.

Additionally, the ability interpretation of can in (40a) and (40b) suggests that Why-Stripping fragments permit a wider range of modal interpretations than are observed in Why-VP, as Why-VP is never interpreted as involving the modal force of can. We will see in Section 6 that we can narrow the availability of particular modal interpretations even further by adopting an analysis which treats Why-VP as having the covert modal found in infinitival clauses.

Weir’s observations, however, are very important in that they contribute a new insight – namely that Why-VP and other kinds of truncated Why-XP questions are parallel in never being interpreted as involving epistemic modal flavors. Full and finite why-questions exhibit no such restriction. Notice, though, that Weir’s insight in no way depends on the ellipsis component of the analysis developed by Yoshida et al., nor on its appeal to VP-movement. I will in fact argue below that the real mysteries in Why-VP structures center on modality and its syntactic expression, not on their clausal syntax, which is more or less routine.

4 The structural subject

The elements in Why-VP that are apparently missing (the subject and tense) can default to very particular interpretations, in ways which, as we have just seen, are not at all dependent on the availability of an antecedent. Given that these elements are semantically autonomous in this sense, we might wonder whether they also exhibit a parallel syntactic autonomy. Put another way, is there evidence that there is syntactic structure between why and VP?

If it were the case that why were simply an adverbial adjoined to the VP, we would expect that no syntactically active subject should be detectable in Why-VP questions, since there would be no A-position which would host such a subject. In what follows, I provide evidence from binding and from raising that suggests there is an A-position in Why-VP which acts as a landing site for the structural subject, which, from there, participates in binding dependencies of a familiar kind. The evidence for the presence of a structural subject is of the same kind as the evidence which suggests the same conclusion for infinitival clauses and is just as strong.

4.1 Binding

We will be concerned here mainly with Principle A, which we will understand in the simplified and conventional form seen in (42):

(42) PRINCIPLE A: an anaphor must be bound within the minimal clause that contains it, where the requirement of binding is met if the anaphor is coindexed with a nominal which c-commands it.

There are at least two kinds of anaphors that are restricted by Principle A: reflexives and reciprocals. There is evidence that both may occur in Why-VP constructions. Since Principle A is evidently satisfied in such configurations, we must conclude that Why-VP structures contain an appropriate local binder – one which c-commands the object anaphors we will soon observe.
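The logic of this argument can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the tree encoding, the function names, and the two candidate structures (one with a silent subject, labeled PRO here, in the specifier of a functional head F, and one in which why is simply adjoined to a bare VP) are hypothetical constructs introduced for exposition, and the whole tree is treated as the minimal clause. It checks the simplified Principle A in (42) by searching for a coindexed nominal that c-commands the anaphor.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass(eq=False)
class Node:
    """A node in a toy constituency tree (hypothetical representation)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    index: Optional[str] = None  # referential index, used for nominals only
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        # Record the mother of each daughter so we can walk upward later.
        for child in self.children:
            child.parent = self


def dominates(a: Node, b: Node) -> bool:
    """True if a properly dominates b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a: Node, b: Node) -> bool:
    """Sisterhood-based c-command: a sister of a is b or dominates b."""
    if a.parent is None:
        return False
    return any(
        s is not a and (s is b or dominates(s, b))
        for s in a.parent.children
    )


def nodes(root: Node) -> Iterator[Node]:
    yield root
    for child in root.children:
        yield from nodes(child)


def satisfies_principle_a(clause: Node, anaphor: Node) -> bool:
    """Simplified Principle A as in (42); `clause` is assumed to be minimal."""
    return any(
        n is not anaphor
        and n.index is not None
        and n.index == anaphor.index
        and c_commands(n, anaphor)
        for n in nodes(clause)
    )


def why_vp_with_subject() -> Tuple[Node, Node]:
    # Assumed structure: [WhyP why [FP PRO_i [F' F [VP deny yourself_i]]]]
    anaphor = Node("yourself", index="i")
    vp = Node("VP", [Node("deny"), anaphor])
    fp = Node("FP", [Node("PRO", index="i"), Node("F'", [Node("F"), vp])])
    return Node("WhyP", [Node("why"), fp]), anaphor


def why_vp_bare_adjunct() -> Tuple[Node, Node]:
    # Assumed structure: [VP why [VP deny yourself_i]], with no subject position
    anaphor = Node("yourself", index="i")
    vp = Node("VP", [Node("deny"), anaphor])
    return Node("VP", [Node("why"), vp]), anaphor


with_subject, anaphor_1 = why_vp_with_subject()
bare_adjunct, anaphor_2 = why_vp_bare_adjunct()
print(satisfies_principle_a(with_subject, anaphor_1))
print(satisfies_principle_a(bare_adjunct, anaphor_2))
```

On these assumptions, only the structure containing a silent subject supplies a local c-commanding binder, which is the pattern the argument in this section relies on.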

4.1.1 Reflexives

The binding properties of the silent subject of Why-VP constructions raise a number of interesting and difficult questions. For now, we merely want to demonstrate that anaphors normally subject to Principle A appear freely in Why-VP. If this is the case, we should conclude that there is a binder in such structures, a silent one that serves as an appropriate antecedent for the anaphor. I present here several naturally occurring examples of the relevant type from the NYT corpus:

(43) (NYT 10771)
  Macdonald had no ill effects from two months of radiation and six months of chemotherapy. Many patients don’t. But he still hasn’t looked at his scars.
  I figure, a professional examines me at least once a month, why make myself unhappy?
(44) (NYT 4714)
  We prefer the status quo, he said. “We prefer to stay single.”
  Why get engaged if engagement is equivalent to becoming a local government and making ourselves slaves?
(45) (NYT 86225)
  Willingness to be open about such matters can be viewed as signs of confidence, self-assurance, honesty, and self-insight.
  Why deny yourself the opportunity to demonstrate these characteristics during the hiring process?
(46) (NYT 91134)
  What’s so hard to get about Hillary choosing New York over Washington? The air is fresher. Her New York house, while grand, is almost cozy next to the White House.
  Why not excuse herself early?
(47) (NYT 26671)
  Why subject himself to listening to Shawn Michaels stand over him saying, “As far as Mike Tyson is concerned, you do anything to Shawn Michaels, I’ll knock your teeth out … and by the look of things, you can’t afford it?”
  Why subject himself to Stuttering John?
(48) In fact, my deprived guy went on just one family vacation during his childhood. That’s it. One trip in 18 years.
  It’s understandable if money had been tight, but they had the resources, so why deny themselves?19

The naturalness of such examples strongly suggests the existence of a structural subject in Why-VP.

There is, of course, a confound. It is well known that there are cases in English in which anaphors are well-formed in the absence of local binders. The following observations come from a range of work; (49a) is adapted from Pollard & Sag (1992), (49b) from Kuno (1987), and (49c) from Reinhart & Reuland (1993).

(49) a. Billi said that [the rain had damaged pictures of himselfi].
  b. In heri opinion, physicists like herselfi are rare.
  c. Maxi boasted that [the queen invited Lucie and himselfi for a drink].

The above demonstrate well-formed apparent violations of Principle A. In (49a) and (49c), the anaphor is not in the same clause as its apparent binder, and in (49b), the binder of the anaphor does not command it. These are logophoric or exempt uses of reflexive pronouns – first identified by Pollard & Sag (1992) and by Reinhart & Reuland (1993). If the anaphors of (43)–(48) can be construed as logophoric, the inference that there must be a structural subject to act as the binder no longer holds, since logophoric reflexives may be discourse-bound.

However, as pointed out by an anonymous reviewer, there is a way to avoid this confound. A signature property of logophoric interpretations is that the familiar pattern of complementarity between reflexive and non-reflexive pronouns no longer holds when the reflexive is logophoric. When a logophoric reflexive can be used, a non-reflexive pronoun is also possible, as seen in (50):

(50) a. Billi said that [the rain had damaged pictures of himi].
  b. In heri opinion, physicists like heri are rare.
  c. Maxi boasted that [the queen invited Lucie and himi for a drink].

Therefore if the reflexives of (43)–(48) were logophoric, our expectation would be that we could substitute non-reflexive pronouns for the reflexives while preserving well-formedness and (at least to a first approximation) the interpretation. This doesn’t seem to be the case:

(51) a. Why not excuse her early?
  b. Why subject him to Stuttering John?
  c. Why deny them?

According to my own intuitions and those of the anonymous reviewer, there is a strong disjoint reference effect in the examples of (51), indicating that complementarity between reflexive and non-reflexive pronouns holds in these contexts, and therefore that the examples in (43)–(48) do in fact show what we initially took them to show. These are true reflexives subject to Principle A of the binding theory and their well-formedness signals the presence of a structural subject by way of which the reflexives satisfy Principle A. The examples of (51) suggest exactly the same conclusion – the structural subject in these cases is the binder which is too local and which therefore forces a violation of Condition B. That is, we have exactly the outcomes we predict given the hypothesis that Why-VP structures must contain a structural subject.

We introduce these observations first because it is important to demonstrate that the possibility of binding from the implicit subject of Why-VP does in fact exist, and second because an interesting generalization emerges from this data, one which was first noted in Johnson (1975) and which is now grounded in naturally occurring data. The φ-features of the anaphors in (43)–(48) vary in gender, person, and number. The implicit subject of Why-VP therefore should not be taken to be necessarily arbitrary or unspecified. Rather it seems to be the case that aspects of the discourse context can influence the interpretation of the subject – such that it can have the kind of specific reference associated with first or second person pronouns, for example – properties which can then manifest themselves on the form of the reflexive. I leave a more precise characterization of the implicit subject of Why-VP for future work. For now, we are interested only in establishing its syntactic presence.

4.1.2 Reciprocals

We can add to the evidence from binding that there is a syntactically active subject in Why-VP by also looking at reciprocals. Reciprocals in English seem to be more strictly subject to the Principle A condition (it is either impossible or more difficult to coerce them into logophoric interpretations). They can thus provide additional evidence for the presence of a structural subject.

They cannot occur in the contexts that license the logophoric reflexives of (49):

(52) a. *The parents said that [the rain had damaged pictures of each other].
  b. *In their opinion, physicists like each other are rare.
  c. *They boasted that [the queen invited Lucie and each other for a drink].

There are no examples with each other in the NYT corpus, but an informal Google search confirms that it is possible to get reciprocals inside the Why-VP fragment:

(53) a. We want to be ready for anything in life, but why insult each other in trying to destroy our maturity?20
  b. If we cannot use modern tools to fight primitive thinking, why insult each other? Why hate each other?21
  c. Why not eat each other?22

The fact that reciprocal anaphors do not in general permit logophoric binding means that examples such as (53) add to evidence for a structural subject in Why-VP clauses.

4.2 Control

Why-VP fragments can subsume structures that are in general analyzed as involving Control. That is, they may contain infinitival clauses whose silent subjects are of the type usually analyzed as controlled – bound by a c-commanding element. Consider, then, examples such as (54)–(56):

(54) (NYT 102405)
  For now, the NFL is willing to wait and see what develops.
  Why make a decision before — having a chance to look at all the alternatives?
(55) (NYT 24662)
  I don’t understand what’s accomplished by a gratuitous insult? Does a bad situation really need another act of unkindness to set it right? When someone does something you don’t like, why not approach the situation with the aim of being helpful, and if that fails, then get even?
  Why choose — to add more negativity to the world?
(56) (NYT 36987)
  Dodgers rookie Hideo Nomo is almost sure to start Tuesday night’s All-Star Game, after Atlanta Braves ace Greg Maddux suffered a groin strain Thursday that almost certainly will keep him from pitching in the midsummer classic in Arlington, Texas.
  Why not take a couple extra days _ to recover?

In (54), there is a clausal adjunct headed by before which does not realize a subject but is nevertheless well-formed. The subject of having is interpreted as a variable bound by whatever is in the subject position of the matrix clause. The intuitive interpretation of (55) of course is, why would x choose (for x) to add more negativity to the world? That is, the subject of the lower clause is referentially dependent on the subject of choose in exactly the way familiar from other cases of control. But if obligatory control of the sort seen in (55) and (56) is always dependent on the presence of a structural binder which commands the embedded subject (see Landau 2013 for extensive arguments in favor of this position) then there must be such a structural binder in (55)–(56).

Once again, the generalization that emerges is that there is a structural subject capable of being an appropriate binder for embedded or controlled PRO.

4.3 Passive

If we accept standard analyses, the passive construction in English is derived by A-movement, where a DP complement to V is raised to the closest commanding specifier of T position. In passive structures, no external argument is projected (at least in what is assumed to be its canonical spec-vP position), and the internal argument of the verb is therefore a legitimate goal for a T probe, and is raised to its specifier position.

Passive is possible in Why-VP fragments, which suggests two things about the structure: one, there exists a voice head (we’ll assume it’s little-v) in these constructions (responsible for the appearance of passive morphology on the main verb), and two that there is enough structure to ensure that the internal argument of the verb can be promoted to a higher “subject position.” Consider the naturally occurring example below:23

(57) (NYT 180878)
  “I look at it this way: What do we have to lose?”
  “Why be intimidated?”

In Why-VP the argument which raises is silent, but in all other respects the syntactic mechanisms at work here seem to be just those at work in full clauses, implying not only that we need a voice (i.e., little-v) head as part of what we have been calling informally VP, but also that we need a projection like T above Voice within Why-VP – whose morphosyntactic properties are responsible for attracting the internal argument to its specifier position, just as in infinitival full clauses.24

4.4 Quantifier float

Floated all implies the presence of a structural subject, either an overt DP which has been raised or a PRO which has been raised (Baltin 1995; Bobaljik 2003). Quantifier float is an available option in Why-VP structures (most clearly in a particular sub-type, as we will see); since the subject of Why-VP is always null, we must be dealing with cases in which the host which “floats” the quantifier is null PRO.

First let’s consider floated quantifiers licensed in certain infinitival contexts:

(58) Baltin (1995: 211, 222)
  a. To all have been doing that would have been inconvenient.
  b. I persuaded the men all to resign.
  c. The men promised me all to resign.

Why-not-VP structures, also, seem to be quite natural with floated quantifiers:

(59) a. Everyone needs a getaway, why not all go together?
  b. Why not all get together and make it happen?

However, Why-VP and Why-not-VP seem to come apart at this point. Floated quantifiers appear to be more marked in Why-VP in the absence of negation. I was unable to find any fully convincing naturally occurring examples of Why-all-VP and my own intuition is that the relevant examples are at best awkward, though it has been suggested to me that (60c) sounds slightly better than the others:

(60) a. ?? Why all choose that framework to work in?
  b. ?? Why all upset each other?
  c.   ? Why all leave together?

A puzzle remains, then, concerning this contrast.25 But the evidence from Why-not-VP cases at least seems clear and from them I tentatively conclude that quantifier float is in principle available in Why-VP. There is reason to believe that PRO is only licensed in the subject position (Spec, T) of nonfinite clauses (Radford 2004). To the extent that such premises hold and to the extent that the control clauses in (58) and the Why-VP examples above are well-formed, the observations again point to the existence of a structural subject position – specifier of non-finite T specifically – and provide evidence that that position is occupied by PRO.

4.5 Summary

What has been established so far is that Why-VP structures are surprisingly routine as far as the syntax of subjecthood is concerned. There is evidence that they contain a subject position which can host binders for reflexives, reciprocals, and PRO, and which can be the target position for A-movement processes (notably passive and raising) which standardly target the specifier of T position. We conclude from all of this that there must be a null functional head above vP which hosts the subject of Why-VP in its specifier. These conclusions suggest a structure such as that in (61):


I will remain agnostic for now about the category that hosts the null subject in Why-VP, using the label F above to refer to it, but I will note that its properties must be at least close to those usually attributed to the tense head of infinitival clauses.

With this much established, the next task is to decide how why fits into the structure in (61). Traditional analyses of why and other wh-words take them to be phrasal and hold that they undergo A’-movement to the specifier of C (or one of the heads defining the C-domain in frameworks which assume a more articulated structure). However, we will see in what follows that the why of Why-VP is different in a number of ways from the why of unreduced clauses, and that the facts therefore require a different analysis of why in the contexts we are concerned with here. We must also find a way to understand that the presence and the properties of that null functional head (F above) are contingent on the appearance of why. These and related questions are addressed in Section 5.

5 Why as a selecting head

It is not traditionally assumed that why should act as a selector or a head. The standard analysis of why, like other wh-elements, is that it is phrasal and that it perhaps raises for interpretive purposes to a specifier in the left-periphery. However, the literature is replete with evidence that why is distinct from other wh-phrases cross-linguistically. It is not controversial to assume that why is externally merged higher than other wh-elements; in particular, there is accumulating evidence that why can be merged initially somewhere in the left-periphery.

Building on earlier work, Rizzi (2001) argues that why should be externally merged (at least as an option) in the left-periphery. More specifically, Rizzi (2001) argues that in Italian why is base-generated in IntP, which is a position in the left-periphery above FocP, while other wh-elements are initially merged lower in the verbal domain and raise to IntP eventually. Building upon this initial proposal, Shlonsky & Soare (2011) argue that actually some instances of why, specifically those that elicit a response pertaining to reason, are generated slightly below IntP in ReasonP, a position still higher than the initial merge position for other wh-elements.

Bromberger (1992) establishes that why is the only interrogative phrase that can be answered differently depending on the phrase which is phonologically focused in its domain. He takes this as evidence that why takes scope outside of focus operators, while other wh-adverbials like when and where are introduced below the focus operator, raise above focus operators, and bind a trace. These strands of analysis converge on the same conclusion: why need not bind a mid-sentence trace, and may therefore be first merged higher than other wh-elements.

Chapman & Kučerová (2016) suggest that ultimately why is ambiguous semantically and syntactically. The reason reading of why reflects a derivation in which why is first merged higher – in ReasonP – while the purpose reading reflects a derivation in which it is first merged in the verbal domain. In Why-VP both kinds of answers and interpretations appear to be possible:

(62) Why go to the grocery store at 10pm?
  A: To avoid the crowd. Purpose
  B: Because I like shopping at night. Reason

For Chapman & Kučerová, the above would indicate that why in Why-VP is ambiguous between being first merged in the left-periphery and being first merged in the verbal domain. However, as we have already seen, there is an interesting locality constraint on why in Why-VP structures (Collins 1991; Bhatt 1998), which would suggest that why in Why-VP is always first merged high (despite having both a purpose and reason interpretation):

(63) Why say that John likes Susan?

As we have seen, in cases like (63), why can only be construed as modifying the matrix saying event description. In other contexts, however, why can be construed in either the matrix or the embedded clause of a bi-clausal structure with both purpose and reason interpretations as in (64):

(64) Why did she say that John likes Susan?

Why can be construed in (64) with the embedded liking event or with the matrix saying event.

In general, why also partakes in inducing islandhood:

(65) *What time did you wonder why John would see Spiderman?

This suggests that why can and does occupy an intermediate specifier position, thereby blocking off further movement out of the clause below it. It is therefore unclear what would stop it from being construed in an embedded clause in a Why-VP structure such as (63). Some mechanism is required to restrict why to always being first merged in the matrix clause of a bi-clausal Why-VP structure. How should we impose this restriction?

If we were to posit a null head in the left periphery of Why-VP, something like Rizzi’s IntP, in whose specifier why appears, what would block an alternative derivational path on which why is first merged as an adverbial to an embedded VP and then raised to the root specifier of IntP? It is hard to see what would block such a derivation on standard assumptions. But the option is absolutely unavailable in Why-VP. Although several researchers have noted that why is special syntactically in having the option of being merged high, any account that analyzes why as a phrase in the specifier position of a related head will be hard pressed to block a derivation in which why raises from an embedded position (an option evidently freely available in other contexts), thus incorrectly predicting low construals of why in Why-VP.

An alternative would be to posit that why in Why-VP at least is a head. Collins (1991) observes that how come is subject to the same locality constraint, in that it is unambiguously interpreted in the matrix clause of a bi-clausal structure. Collins takes this in combination with the lack of T to C movement in how come questions to argue that how come is a head rather than a phrase. I will argue for a similar analysis of why in Why-VP here.

Why is a lexical item (monomorphemic) and therefore has the potential to bear a selectional feature. If we assume Bare Phrase Structure (Chomsky 1995), once we say that why can select, we expect it to project its label, thus making Why-VP a WhyP.

In the framework of Bare Phrase Structure, maximality is a relational property determined in the course of syntactic structure building. If a lexical item combines with some syntactic object α forming β:


There are two ways of interpreting the resulting structure syntactically. If L selects α, L labels β (is its head) and is minimal. If α selects L, α labels β (is its head) and L is maximal. Thus if why in the lexicon of English may optionally select the null functional head whose existence we have been arguing for, we expect it to take part in syntactic composition in two very different ways.
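The two labeling outcomes just described can be sketched as a small procedure. This is a simplified illustration and not a full implementation of Bare Phrase Structure: the class names, the `selects` attribute, and the `merge` function are hypothetical, and the sketch assumes that whenever the lexical item does not select its sister, the sister is the selector and projects.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class LexicalItem:
    name: str
    selects: Optional[str] = None  # label of the category selected, if any


@dataclass(frozen=True)
class SyntacticObject:
    label: str
    daughters: Tuple["Constituent", ...] = ()


Constituent = Union[LexicalItem, SyntacticObject]


def label_of(obj: Constituent) -> str:
    return obj.name if isinstance(obj, LexicalItem) else obj.label


def merge(item: LexicalItem, alpha: Constituent) -> Tuple[SyntacticObject, str]:
    """Combine a lexical item with alpha and report the item's status.

    If the item selects alpha, the item labels the result and is minimal.
    Otherwise (simplifying assumption) alpha labels the result and the
    item counts as maximal.
    """
    if item.selects is not None and item.selects == label_of(alpha):
        return SyntacticObject(item.name, (item, alpha)), "minimal"
    return SyntacticObject(label_of(alpha), (item, alpha)), "maximal"


# Hypothetical lexical entries: one why that selects F, one that does not.
selecting_why = LexicalItem("why", selects="F")
plain_why = LexicalItem("why")

f_projection = SyntacticObject("F", (LexicalItem("F"), SyntacticObject("v")))
c_projection = SyntacticObject("C")

why_p, status_selecting = merge(selecting_why, f_projection)
c_p, status_plain = merge(plain_why, c_projection)
print(why_p.label, status_selecting)
print(c_p.label, status_plain)
```

Under these assumptions the same form why yields either a projection labeled by why itself, when it bears the selectional feature, or a maximal element inside a projection labeled by its sister, when it does not.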

Along one derivational path, it will select null F and label the result, producing in effect a WhyP. This is Why-VP. Along a different derivational path, it will lack the crucial selectional feature, enter the derivation in the familiar way, be defined as maximal (not being a selector), and potentially undergo phrasal movement to an A-bar position.

On this view, then, what is crucial in determining that English has the Why-VP construction whose properties we are exploring is that the lexicon of English includes an instance of why which happens to select the null functional head (F in (61)) with the relevant interpretive properties. Note that it will not do to maintain the conventional assumption that why appears in the specifier of null C as a phrase in these cases, since that C is presumably indistinguishable from the C that hosts phrasal wh-elements like for what reason. Such phrases do not of course support Why-VP:

(67) *For what reason take Structure of Japanese?

Analyzing the why of Why-VP as minimal will allow us to understand the quirky part of the construction (that the bare VP is restricted to the presence of why), and also why there is no Why-VP interpretation consistent with a structure in which why is first merged in an embedded clause. This approach traces the existence of Why-VP in English to a lexical property of an element of its functional vocabulary (that why may, but need not, select F) and it leads us to analyze the WhyP option as in (68):


A prediction this analysis makes is that we should expect adverbial elements which are phrasal, like exactly, to be unable to attach to why in Why-VP. For many speakers, the following is reported as marked:26

(69) a. ?? Why exactly/Exactly why shop at 10pm?
  b. ?? Why exactly/Exactly why take Structure of Japanese?

In finite why-questions, exactly is perfectly acceptable:

(70) a. Why exactly/Exactly why would you shop at 10pm?
  b. Why exactly/Exactly why should you take Structure of Japanese?
  c. I don’t know why exactly he would shop at 10pm.
  d. I don’t know why exactly he should take Structure of Japanese.

It is clear, I think, that the examples in (69) are more degraded than those in (70), indicating that it is right to think of why as a minimal projection rather than maximal. Why-VP does permit the hell or on earth to occur between why and the vP, which, according to Merchant (2002), is consistent with an analysis on which why is analyzed as a head here. First consider:

(71) a. Why the hell change a winning formula that worked so well against Liverpool at the weekend?27
  b. Why on earth sell her jewels?28

In the right context, on earth and the hell seem to be quite natural, while why-exactly-VP seems to be impossible. I foresee variation in these judgments, but examples like (71) seem to be consistently rated as good, while there is some inconsistency with respect to why-exactly-VP. This seems to be the trend among the small sample of peers I have surveyed, at least. Regardless, Merchant (2002) observes that modifiers like exactly only attach to phrases (see Zyman 2018 for arguments for the same conclusion), while on earth and the hell modify heads. The following examples are noted:

(72) Merchant (2002: 14)
  a. Who the hell was he talking to?
  b. What the hell was he talking about?
  c. When/where/why the hell was he talking?
(73) Merchant (2002: 14)
  a. *What book the hell was he reading!?
  b. *What kind the hell of a doctor is she, anyhow!?
  c. *What kind of a doctor the hell is she, anyhow!?
(74) a. *Which exactly train did they take?
  b.   (Exactly) Which train (exactly) did they take?

Assuming that Merchant (2002) has the right syntactic characterization of the above modifiers, then the relative unavailability of exactly in Why-VP and the availability of the hell in Why-VP would suggest that the characterization of why as a head in Why-VP is the right one.

A related issue (raised by an anonymous reviewer) has to do with the modifier else which is in general unproblematic as a modifier of why:

(75) a. Why else would they be so intransigent?
  b. Why else should we care?

Its behavior in Why-VP does not seem to be consistent across speakers:

(76) a. ?Why else do that?
  b. ?Why else leave your phone at home?

There are some speakers for whom the above are acceptable, and some for whom they are marked. Naturally occurring examples are easy to find:

(77) a. Why else oppose a woman’s right to choose birth control?29
  b. Why else demand a change in direction?30
  c. Why else require every student seeking an Adderall prescription to call their parents and see a mental health professional?31

Such variation has also been reported for the exactly cases in (69), but initial investigation suggests that else is more widely available in Why-VP than are exactly and precisely. Examples like (77) are easy to find online; examples with exactly are not.32 If that conclusion survives more careful investigation, it suggests a connection with another difference between else and exactly. It is routinely possible to strand exactly in wh-questions:

(78) a. Where do you want to go exactly?
  b. Why do you want to do that exactly?

But for the vast majority of speakers in English, else is not similarly strandable:

(79) a. *Where do you want to go else?
  b. *Why do you want to do that else?

Zyman (2018) develops an analysis of stranding exactly which depends crucially on its status as a phrasal modifier. If else is not similarly phrasal, we understand the impossibility of stranding in (79). If it is a head adjoined to a head in a compound-like structure, we would make a link between its failure to strand and its relative acceptability in Why-VP (see (77) above).

All of these suggestions are tentative and much remains to be done in the area, but the idea that why in Why-VP is a syntactic head seems to provide at least a promising starting point for that investigation.

A final and welcome outcome in assuming Why-VP is a WhyP is that we can understand its root character – its resistance to selection. Recall that Why-VP structures cannot be embedded:

(80) a. *I don’t know why take Structure of Japanese.
  b. *I wondered why take Structure of Japanese.

Verbs like wonder and know will select a CP, but Why-VP is not a CP and as such, it will not be selected. Thus this is analogous to accounts that have suggested that why + to is impossible in embedded contexts because why is merged relatively high and therefore within structures which are larger than those selected by the predicates which generally embed questions (Shlonsky & Soare 2011; Barrie 2007).

In addition and crucially, these proposals predict one of the most curious of the properties of Why-VP – the locality constraint by which why must be interpreted in the matrix clause, rather than in an embedded clause. This is one of the most mysterious properties of the construction in the context of more familiar analyses of clause-initial why (in terms of phrasal movement from a lower position). In the present context, by contrast, it is an inevitable property of the construction. Why in (68) has no occurrence in an environment lower than the matrix environment. It follows that in an example like (63) there will be no mechanism by which it could be interpreted in a position lower than the matrix environment.

Finally, we also now have a way of thinking about the problem highlighted at the end of Section 4 – how we make the appearance of the clause structure schematized in (68) above contingent on the appearance of the lexical item why. This now reduces to a very familiar kind of head to head selection (see (68)). In the structure we have argued for here why is minimal and enters into a selection relation with the head F of (68). The relationship between these two elements is now the central fact about Why-VP and the one that determines the cluster of properties (syntactic and semantic) which we have been investigating here. We should now expect that many of the crucial properties of Why-VP will in fact be properties of F. As we will see in the final section, the relevant semantic properties are far from trivial and in exploring those properties, I will point out many parallels between the modal interpretations characteristic of Why-VP and those which have been observed for embedded wh-infinitivals in Bhatt (1999/2008). With an understanding of those properties in place, we will be in a position to draw together many of the threads of the discussion so far.

6 The characteristic modal of Why-VP

If it is right to conclude, as I have argued, that the core properties of Why-VP structures emerge as expected consequences of well-established routines of syntactic and semantic composition, we can turn to what now seems to be the core puzzle concerning Why-VP – the character and origin of the characteristic modal force it expresses. This will entail engaging with some very difficult issues, involving, in particular, the interaction between modality and aspect. This discussion will reach no final conclusion, but my hope is that it will be useful as an initial exploration of an issue that has many ramifications.

6.1 Interpretive properties

We start with the vagueness of the interpretation of the modal in Why-VP, which, intuitively, seems to be a close semantic correlate to the overt modals should and would. It is unclear, at present, what contextual factors or what kind of semantic composition governs the evocation of the interpretation for either of these modals in Why-VP – but given the same context, some speakers report should and some, would as the best paraphrase for the following example:

(81) (NYT 100526)
  “… There are 28 drivers. It’s just huge. Even with sponsorship from Chevrolet and Target, it’s a costly proposition.”
  The big question: Why do it?
  “I love it and I can,” she says.

Because both should and would are futurate and have universal force, it may be difficult to find contexts in which only one interpretation is felicitous. Additionally, the semantic composition of the modal may be underspecified, allowing both interpretations rather freely. These are matters to explore in future work.

It is clear, though, that context plays an important role in determining which paraphrase emerges as most appropriate. Take, for example, the initial example presented in the paper, repeated here:

(82) Why take Structure of Japanese?

This naturally occurring example above occurs with no preceding linguistic context; it was found on a poster advertising why one should take this class. Accordingly, the best paraphrase here seems to be should. However, we can manipulate the context such that would becomes a more obvious paraphrase:

(83) Context: A student goes to their undergraduate advisor and says, “I’m thinking about enrolling in Structure of Japanese next quarter, but I’m afraid it’s going to be too much work.” The advisor says, “Well, I see here that you’ve already completed all five of the electives you need to graduate, so why take Structure of Japanese?”

While would certainly seems like the best paraphrase of the Why-VP clause above, a paraphrase with should does not seem out of the question. A complete analysis of this modal will have to provide an understanding of this vagueness. For now, we can say that some contexts seem to favor certain interpretations of the modal in Why-VP.

Evidence that there is a connection between the interpretation of the covert modal in Why-VP and the overt modals we have just been discussing comes from their interaction with NPIs in why-questions. Recall that Bhatt (1998) observes NPIs are licensed in Why-VP:33

(84) a. Why [do a damn thing about it NPI]?
  b. Why [lift a finger NPI]?
  c. Why finish [any NPI] of my homework?

In finite why-questions only certain modals, namely, should and would, appear to license NPIs, which are exactly the modals which seem to figure in the interpretation of Why-VP.34 Consider:

(85) a.   Why should we talk anymore/do a damn thing about it?
  b.   Why would we talk anymore/do a damn thing about it?
  c. *Why could we talk anymore/do a damn thing about it?
  d. *Why can we talk anymore/do a damn thing about it?
  e. ?Why must we talk anymore/do a damn thing about it?35
  f. *Why may we talk anymore/do a damn thing about it?
  g. *Why will we talk anymore/do a damn thing about it?
  h. *Why might we talk anymore/do a damn thing about it?

If the interaction between NPIs and modals above is a semantic one, a constraint on interpretation, as seems likely, then we should presumably conclude that the semantics of the covert modal of Why-VP structures is close to, or identical to, the semantics of would and should, given that Why-VP also licenses NPIs. Bhatt (1998) attributes the NPI-licensing possibilities in Why-VP to its rhetorical nature, which is cashed out via a semantic composition specific to Why-VP, in which the provided positive polarity is flipped to be negative. However, given the observations in (85), we can see that no such technology is needed – we can attribute the NPI-licensing properties of Why-VP to its modal character, whose interpretation we have already identified as a close approximation of should and would.

6.2 Bhatt (1999/2008): Covert modality in non-finite contexts

We turn now to the syntactic characteristics of Why-VP that should lead us to expect this modal interpretation. There are parallels between the modal interpretation of Why-VP and the covert modal identified in Bhatt (1999/2008) as characteristic of embedded wh-infinitivals – exploring these will likely be a good starting point to understanding the composition of the covert modal in Why-VP. It’s also clear that Why-VP clauses should be identified with non-finite clauses more broadly – there is evidence of a structural subject that is obligatorily silent, much like the subject of non-finite clauses, PRO. The VP is obligatorily tenseless and uninflected, with an irrealis interpretation, all of which are properties associated with non-finite clauses.

Bhatt observes that non-finite wh-clauses (wh-infinitivals) are always modal, with an interpretation that varies between should, would, and could. The following examples are adapted from that discussion.

(86) a. Zach knows [who to talk to at the party].
  b. Charlie knows [where to get gas].
  c. Haley decided [where to get gas].

The interpretation varies depending on various factors, including choice of wh-phrases, choice of embedding predicate, and certain elements of the discourse context. Noting in particular that the modal interpretation can vary in force (though not in flavor) between a necessity interpretation (should or would) and a possibility interpretation (can/could), Bhatt wants to maintain that a single modal operator occurs in these kinds of clauses.

The covert modal that Bhatt proposes is one that is interpreted relative to a circumstantial modal base, thus ruling out the possibility of an epistemic interpretation. The force of the modal is one that is underlyingly existential with a deontic ordering source, though it might be better described as a teleological modal in present terms. The operator postulated draws on the denotation provided in Kratzer (1991) for a possibility deontic modal, except that a restriction is added that we only consider worlds in which the matrix subject has achieved the goal set by the context. It’s the existential force of the modal that is responsible for the could interpretation, and it is the added restriction that is responsible for the should interpretation. Which part of the denotation is relevant depends crucially on the context.

For example, in (86a) should may emerge as the best paraphrase. This is due partially to the properties of the wh-phrase who but is also due to some knowledge about Zach’s goals given in context. If we know that Zach’s goal is to become popular, then in (86a), knowing who to talk to means knowing who it is necessary to talk to in order to achieve that goal. Bhatt’s restriction that we only consider worlds in which Zach meets his goals means that we only consider worlds in which he has also talked to a specific group of people.

In (86b), depending on the context, we will either get should or could. First imagine a context in which Charlie’s goal is simply to get gas. The wh-infinitival and the goal, in this case, match – Bhatt’s restriction is trivially satisfied – we only consider worlds in which Charlie gets gas. Any and all locations in which Charlie gets gas will satisfy his goal. When the restriction is trivially satisfied, the underlying existential force will take effect and we observe a can or could interpretation. Now, consider a context such that Charlie’s goal is to get gas from some ethically commendable source – there is now additional content in the goal that is not in the wh-infinitival – the restriction is no longer trivially satisfied. His goal will not be met in getting gas from any filling station he might happen on. In that context, it seems intuitively correct, that we understand should as the best paraphrase.
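The two contexts for (86b) can be made concrete with a toy model. The sketch below is only an illustration of the informal description above, not Bhatt's formal denotation: each accessible world is reduced to the station at which Charlie gets gas, and the names Station, STATIONS, answers, restriction_is_trivial and the two goal predicates are hypothetical, introduced purely for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Station:
    name: str
    ethical: bool  # assumed toy property standing in for "ethically commendable source"


# Toy domain: each accessible world is identified with the station
# at which Charlie gets gas in that world.
STATIONS = [Station("A", True), Station("B", False), Station("C", True)]


def goal_get_gas(world):
    # The goal matches the wh-infinitival: every gas-getting world satisfies it.
    return True


def goal_ethical_gas(world):
    # The goal adds content that is not in the wh-infinitival.
    return world.ethical


def answers(goal):
    """Locations x such that SOME goal-satisfying world has Charlie getting gas at x."""
    return {w.name for w in STATIONS if goal(w)}


def restriction_is_trivial(goal):
    # The restriction is trivial when it excludes no world in the toy domain.
    return all(goal(w) for w in STATIONS)
```

With the first goal the restriction excludes nothing, so every station counts as an answer, which corresponds to the could paraphrase. With the second goal the restriction removes station B, and the surviving answers are the ones Charlie must choose among to meet his goal, which corresponds to the should paraphrase.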

In (86c), the effect of the embedding predicate decide likely has to do with invoking the futurate interpretation of would – though not much more than this is said by Bhatt. The core insight though is that the interpretation of the modal depends heavily on the context of use.

Why-VP certainly has in common with embedded wh-infinitivals both the necessity interpretations should and would.36 It will be difficult to fully import Bhatt’s modal to Why-VP, however, given that much of his analysis derives from his examination of contexts in which Why-VP may not appear – the complement position of factive know. Why-VP as we have seen, can never be embedded. Furthermore, it’s not clear how would is derived in his system, a central element of the modal meaning characteristic of Why-VP. However, that embedded wh-infinitivals and Why-VP have very similar interpretive effects should not be taken to be a coincidence, especially given that Why-VP independently exhibits many properties associated with infinitival clauses. It seems natural, now, to formally characterize Why-VP as non-finite – a characteristic I will localize as a feature on the head I call F. This will have the right syntactic effects – we will expect there to be a structural subject PRO in its specifier and a bare vP as its complement. In addition, we will predict a modal interpretation that is variable in precisely the way we observe (to the exclusion of could). The analysis proposed for Why-VP contributes to a larger generalization: non-finite wh-clauses are always interpreted as modal. We should now understand Why-VP to have the following structure:


Our proposals, however, will depart from Bhatt’s on the question of where the modal properties of the construction are localized. Bhatt assumes that the modal in embedded wh-infinitivals is hosted at C[+wh, –fin], correctly predicting that wh-infinitivals will always be modal. I have argued here that Why-VP is not a CP, rather, it’s a WhyP. It does not seem right that the semantics for why and the semantics for the modal be introduced by the same head; we would be forced to claim, for instance, that the semantics of why in Why-VP is different from its semantics in other (finite) contexts. This seems wrong. Instead, I’d like to propose that the modality comes from within the clause itself, specifically at F. The why of Why-VP as a head will select this modal; thus, the cluster of properties we have been exploring falls out from this syntactic relation.

The precise composition of that modal is still an open question, but it must be related to that which was identified by Bhatt. Positing the existence of this modal in Why-VP gives us a way of understanding the intuition that underlies many of the earliest investigations of Why-VP. The oft noted rhetorical nature of Why-VP is a property of being a why-question that contains a modal akin to should and would. Note that finite why-questions with these modals are able to have the directive force associated with Why-VP:

(88) a. Why would you leave the party now (the band is about to go on…)?
  b. Why leave the party now (the band is about to go on…)?
(89) a. Why should I vote for a Republican (their party’s a mess these days…)?
  b. Why vote for a Republican (their party’s a mess these days…)?

The attitude in the modal why-questions above is essentially that which was observed by Duffley & Enns (1996) to be characteristic of Why-VP: the speaker conveys that they see no reason that the prejacent of the modal should occur. This has the effect of conveying that one should not take the course of action offered in the prejacent, given that there is no good reason to do so.

We should promptly say, though, that this is not the exclusive function of these modals in why-questions. These modals can occur in why-questions and express a genuine wondering about the reasons someone should or would take some course of action – still here, the speaker sees no reason to VP, but they ask precisely because they see no reason to VP, not because they intend to instruct their addressee – see (90) and (91) below:

(90) a. Why would they make this move now?
  b. Why make this move now?
(91) a. Why should we sign up for Structure of Japanese?
  b. Why sign up for Structure of Japanese?

The modal why-questions above need not be interpreted as making a suggestion (though they could be) – they can simply ask a question pertaining to reason and necessity; the same goes for Why-VP. We can conclude from this, then, that it is the modal, argued to be present (and covert) in Why-VP, that links the rhetorical and non-rhetorical uses. The precise syntactic characterization of that modal and its compositional properties are certainly worth exploring further, as there do seem to be certain syntactic and semantic configurations that the covert modal of Why-VP cannot participate in that are available to its overt modal counterparts, issues I will explore in the following section.

6.3 Aspectual puzzles

The covert modal of Why-VP and its overt counterparts come apart in surprising and interesting ways, which will be important to identify here for future work on this construction and hopefully, for work on covert modality in general. The covert modal of Why-VP appears to be far more limited than its overt counterparts in the range of auxiliaries that it can co-occur with. In particular, aspectual auxiliaries appear to be wholly excluded in Why-VP:37

(92) a. *Why have finished your homework by 5pm?
  b. *Why have made chicken soup?
(93) a. *Why be finishing your homework?
  b. *Why be making chicken soup?

However, in finite why-questions with should and would, no such restriction exists:

(94) a. Why should/would you have finished your homework by 5pm?
  b. Why should/would you have made chicken soup?
(95) a. Why should/would you be finishing your homework?
  b. Why should/would you be making chicken soup?

One reason I have remained hesitant to adopt an analysis where what I have called F is really a variant of non-finite T is that T is generally associated with the presence of aspectual markers; given that they are rejected in Why-VP, it’s not clear that what is generally associated with a tense head should exist in Why-VP.38 Furthermore, as we will discuss below, it is not clear that the semantic properties associated with T, tense, should exist in Why-VP, particularly because have does not seem to be able to contribute a past temporal perspective. Such an effect is generally possible for priority modals in the context of have.

An analysis of the covert modal in Why-VP, its compositional properties and structural representation, will need to account for this puzzling fact. It will not do to simply assert that the covert modal of Why-VP is a silent version of the operator associated with the lexical items should and would. Modals are associated in general with futurate or irrealis interpretation, so much so that it is common to assume that the future orientation is built into the denotation of modal operators. However, that does not mean that aspects of their temporal properties can’t be shifted to the past with have. This is most famously explored in Condoravdi (2002), and more recently by Rullmann & Matthewson (2018).

Condoravdi distinguishes between temporal orientation and temporal perspective, which are crucially distinct both in their syntactic and semantic representations. Temporal perspective is contributed by T, while temporal orientation is contributed either by an aspectual operator scoping below the modal, or is built into the composition of the modal itself. Semantically, the temporal perspective refers to the evaluation time of the modal with respect to the utterance time, while temporal orientation refers to the event of the modal’s prejacent with respect to the utterance time.

Condoravdi argues that depending on the flavor of the modal, auxiliary have will affect these temporal dimensions differently. For example, have in the context of an epistemic modal will shift the temporal orientation to the past (but not the perspective), while for metaphysical modals, have shifts the perspective to the past.39 In the basic case, a modal utterance without have, the future orientation we observe is part of the modal’s denotation. Thus, there are two ways temporal orientation manifests under this view: either the denotation of the modal contributes it or an aspectual auxiliary (e.g. have), scoping below the modal, contributes it.
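The contrast just described is often rendered as a difference in the relative scope of the perfect and the modal. The schema below is a rough sketch on that reading of Condoravdi's proposal, not a quotation of her formal system; the labels PRES, PERF, MODAL and p (the prejacent) are notational assumptions made here.

```latex
% Epistemic modal: have (PERF) scopes below the modal.
% Temporal orientation is past; temporal perspective stays present.
\mathrm{PRES}\big(\mathrm{MODAL}\big(\mathrm{PERF}(p)\big)\big)

% Metaphysical modal: have (PERF) scopes above the modal.
% Temporal perspective is shifted to the past.
\mathrm{PRES}\big(\mathrm{PERF}\big(\mathrm{MODAL}(p)\big)\big)
```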

On the other hand, Rullmann & Matthewson (2018) argue that modals uniformly get their temporal orientation from an aspectual operator that scopes below the modal. In English, the non-past (future/present) aspectual operator is covert, while the past aspectual operator is overt, realized as have.

Let us assume that the modal of Why-VP is of a teleological flavor, essentially as was proposed by Bhatt for wh-infinitivals. While certain modal flavors are incompatible with a past orientation, e.g., metaphysical modals and pure circumstantial modals, Thomas (2014) argues that teleological modals are not incompatible with a past orientation. If that’s true, the modal of Why-VP then should not be semantically incompatible with have.

It may be tempting at this point to attribute the unavailability of have in Why-VP to the directive force associated with its use. It is infelicitous to suggest or request in the present that a course of action take place in the past. As an anonymous reviewer points out, these aspectual restrictions are reminiscent of imperatives. However, champions of this view will need to contend with the genuine questions identified here – there should be no reason that one cannot ask a question about the past (see (94) above).40

We might have recourse to a syntactic analysis along the lines suggested by Rullmann & Matthewson (2018) – the modal of Why-VP selects only the covert future/present aspectual operator. This would predict that Why-VP should be able to have both present and future orientation, provided we have the right lexical aspect (stative for present and eventive for future). This prediction is borne out:

(96) a.   Why be upset right now? Present
  b.   Why leave for the Olympics tomorrow? Future
  c. *Why leave for the Olympics yesterday?41 Past
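The selection-based suggestion can be restated as a small decision procedure. The sketch below is a deliberately coarse illustration rather than an analysis: it hard-codes the pairing of lexical aspect and orientation stated above (stative with present, eventive with future) and treats perfect have as the overt past operator that the covert modal does not select. The function names and string labels are hypothetical, and a temporal adverb is simply compared against the predicted orientation; nothing here is meant to settle whether statives also allow a future orientation.

```python
from typing import Optional

# Assumed pairing, taken from the proviso in the text:
# the covert non-past operator yields present with statives, future with eventives.
NONPAST_ORIENTATION = {"stative": "present", "eventive": "future"}


def predicted_orientation(lexical_aspect: str, perfect_have: bool = False) -> Optional[str]:
    """Return the predicted temporal orientation, or None if the structure is ruled out."""
    if perfect_have:
        # Overt past operator 'have' is not selected by the covert modal.
        return None
    return NONPAST_ORIENTATION[lexical_aspect]


def acceptable(lexical_aspect: str, adverb_time: str, perfect_have: bool = False) -> bool:
    """A Why-VP is predicted acceptable if the adverb matches the available orientation."""
    orientation = predicted_orientation(lexical_aspect, perfect_have)
    return orientation is not None and orientation == adverb_time
```

On these assumptions, the sketch returns True for the configurations in (96a) and (96b), and False for (96c) and for the perfect cases in (92).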

An anonymous reviewer suggests a close alternative: the modal of Why-VP selects vP directly. On this view, the passive auxiliary is ruled in (presumably hosted at v), and higher auxiliaries, have and progressive be are ruled out.42 While it is clear that we have the theoretical tools to capture this observation, we cannot yet understand why the covert modal must be deficient in this way.

There seem to be restrictions on the interpretation of the covert modal that simply don’t hold for its overt lexical counterparts. This general finding is not a novel one – Hackl & Nissenbaum (2012) find several contexts in which the covert modal posited for infinitival relatives (e.g., many things for us to do) is restricted to one particular modal force, despite being variable in force in other contexts. They consider infinitival relatives in the context of strong and weak determiners – when the head of the relative clause is weak, should and could interpretations are possible, while for strong determiners, the covert modal can only be interpreted as should. However, finite relative clauses show no such restriction – there is no ban on the overt modal could occurring in the context of a strong determiner. This is very puzzling – the covert modal can in principle express existential force, but seems to be blocked from doing so in certain contexts.

There is likely something about being syntactically non-finite and its interaction with the surrounding context that limits the interpretive possibilities. It seems to me worth exploring further the commonalities among these covert modals and where the source of these interpretive restrictions might lie – are they underspecified in their denotation in a way that overt modals are not? And if so, in precisely what way? Or does the syntactic frame in which they occur contribute these deficiencies? What would that mean for the structural decomposition of modality? And ultimately, can these covert modals be unified in their composition in some way? These are questions I leave open, but I hope to have established that they are very much worth probing further, especially as the empirical domain of covert modality expands.

7 Conclusion

In this paper, I’ve shown that the core properties of Why-VP, a very productive and under-studied construction, emerge in a routine way from standard compositional mechanisms (syntactic and semantic) that are well-established and well-understood. The analysis I have developed makes no appeal to any conventional or idiosyncratic linkage between the syntactic frame and the various interpretive properties that have been discussed.

The status of why in Why-VP as a head that bears a crucial selectional feature, a compositional possibility granted by the theory of Bare Phrase Structure, will allow us to understand the following properties: how why but not other or larger expressions with the same interpretation combine with this bare VP, the local construal of why, and its root distribution as a reflex of the projection of why, a clause which embedding verbs don’t select. It is why that selects a null functional head, with properties akin to non-finite T, in that it hosts the silent subject PRO, accounting for the routine subject properties for binding and A-movement Why-VP exhibits. This non-finite functional projection will also be responsible for the modal semantics of Why-VP and we make a link with the modal semantics of wh-infinitivals (similar in many respects) by assuming commonalities of structure and interpretation between the two structures.

The distribution of NPIs in Why-VP remains puzzling given that the licensing of NPIs in wh-questions is poorly understood, but we know independently that one of the factors is modality, and the overt modals which license NPIs in finite why-questions are the ones whose interpretation closely parallels that of the covert modal characteristic of Why-VP.

We are now in a position to return to earlier discussions of Why-VP and ask: why have so many been tempted to say that Why-VP is necessarily rhetorical? Because why-questions with overt modals whose interpretations correspond to the modal force of Why-VP are also very often rhetorical (e.g., why would you support an Independent? or why should I leave the house?). Why does Why-VP often have the flavor of a suggestion or a mild imperative? Because its semantics lends itself to a rhetorical use and the rhetorical use of Why-VP communicates that there is no reason why one would, should, or could VP. But the perlocutionary effect of communicating that is in turn going to very often be: don’t VP. Crucially, there is no syntactic reflection of these effects (contra Gordon & Lakoff 1971; Bhatt 1998): the crucial atoms for syntactic and semantic composition are just the familiar why, v, and V, and an infinitival head whose properties define an A-position specifier.

At this point the analysis of the ultimate source of modal force in Why-VP structures remains in an inconclusive and unsatisfying state. However, what I hope to have established in this paper is that once we better understand that in its syntax and semantics Why-VP is a species of infinitival wh-question, and that its syntactic composition is not so exotic, we are then in a better position to engage the important, interesting, and still open questions that it actually raises – about modality, how modal meanings are composed and how they are expressed syntactically.

Additional File

The additional file for this article can be found as follows:

Supplementary file

Why-VP occurrences with their discourse context from the New York Times Gigaword corpus. DOI: https://doi.org/10.5334/gjgl.870.s1


  1. We will see in Section 4.3 that passive auxiliaries are possible. However, in Section 6.3, we will see that the auxiliaries posited to be in higher positions like perfect have or progressive be are not possible.
  2. Duffley & Enns (1996) note somewhat old corpus data that suggest that at one point how-VP was in more regular use and appears to convey, on the part of the speaker, a similar attitude to that which we observe for Why-VP:
    (i) How leave her there?
    (ii) How tell her that it would have been an outrage, a sin, to continue as her lover?
  3. How come is slightly different because there is no T to C movement in these kinds of questions, but the past tense form is still fine: How come you took Structure of Japanese?
  4. The same pattern is also found to be true of PRO in the complement of certain embedding verbs (Lasnik 1992).
  5. An anonymous reviewer points out a naturally occurring example that does not require a manner adjunct:
    (i) (https://www.businessinsider.com/why-obama-may-pass-on-reelection-2011-8) Why suffer the degradation of humiliating defeat?
    Despite having no manner adjunct, the above is well-formed. This is not at odds with the contrast observed in (8). We should expect that if the condition on the subject is merely that it be construed as having agency, rather than a thematic restriction, then there should certainly be contexts that license such readings for the addressee, and this example certainly seems to be one of them.
  6. (10c) is arguably different however, and in fact, is probably unacceptable. This conclusion may be strengthened by the observation that there is no example parallel to (10d) in the corpus. This is, however, as we predict, given the pseudo-agentive restriction noted earlier for Why-VP. Inanimate objects are not good agents in the absence of personification.
  7. Francez (2017) also observes that for Why-not-VP, NPIs are not possible. He attributes this to the special status of negation in these clauses, arguing that they are external to the clause (TP), and therefore cannot syntactically license the NPI.
  8. Collins (1991) also observes this interpretive restriction. No explanation is offered on his part, but it is suggested that if why is adjoined to the matrix VP, it is not a head and therefore cannot properly govern a trace inside the embedded clause. This will not hold up in current theories, where phrasal why should be able to move through spec-CP of the embedded clause up to matrix clause CP; thus an embedded clause interpretation should and would not be ruled out.
  9. Web page title: linguisticsociety.org/content/why-major-linguistics. [^]
  10. NPR interview: npr.org/templates/transcript/transcript.php?storyId=700173416. [^]
  11. We will see in the next section (2.2.2) that these genuine uses may still have an underlying speaker bias or attitude; one which conveys that they see no reason one should or would VP, but that bias does not necessitate that the speaker directs or intends to direct their addressee, contra Bhatt. [^]
  12. It is important to clarify that by rhetorical we only mean to refer to Bhatt’s characterization of Why-VP as imperative-like in suggesting to the addressee an action of the opposite polarity. There are other senses of the word rhetorical, which may be used to describe examples like (19) and (20), as such uses employ a rhetorical strategy to try to convince someone to get vaccinated or to major in linguistics. Crucially, though, they do not suggest that one not get vaccinated, or not major in linguistics. [^]
  13. From article: thestir.cafemom.com/tv/116575/real_housewives_of_atlanta_reunion. [^]
  14. Duffley & Enns (1996) cite the examples in (23) as such: (23a) is from an article (Time 1992: 25), (23b) from a book title (Tolleris 1946), and (23c) is from The Globe and Mail (1922: A22). [^]
  15. A reviewer notes that need and dare, when used in conjunction with a bare VP, have a similar interpretive component. Consider:
    (i) Need/Dare he leave?
    In the above, the speaker calls into question the existence of need, similar to Why-VP in that reasons for performing an event are suggested to be non-existent. There are certainly semantic and syntactic parallels here that would be worth investigating in the future. However, such an investigation is outside the scope of the paper. I direct the reader to Duffley (1994) and Duffley & Larrivée (1998) for further discussion of such parallels. [^]
  16. This is not a general restriction for why cross-linguistically. Jedrzejowski (2014) argues that Polish allows embedded wh-infinitivals, including with why. [^]
  17. See (23) above to show that there are some naturally occurring examples of these in root clauses. A reviewer points out another naturally occurring example where why-to is embedded with an intervening not (CBS Prime Time News, March 188, 1994):
    (i) I think everyone should do it, I don’t see any reason why not to do it.
    It seems very rare to have why and to directly adjacent to one another, barring those root uses earlier. It is difficult to dispute, I think, that why + to is rather unproductive in English. [^]
  18. We take no position on other kinds of Why-XP clauses by Yoshida et al., for which an ellipsis account may well be correct. [^]
  19. Example (48) was found in the article: https://www.forbes.com/sites/learnvest/2012/11/30/im-a-spender-my-husbands-a-saver-how-we-make-it-work/#5925c1e143a3. [^]
  20. From the book: How Would You Like A Bite of This Fruit, by Rodney Votion. Pg. 135. [^]
  21. From article: nation.co.ke/news/Stop-hate-speech-on-social-media-Uhuru-Kenyatta-tells-Kenyans/1056-2411156-12qvnvb/index.html. [^]
  22. From blog: cracked.com/blog/6-things-that-never-make-sense-about-zombie-movies/. [^]
  23. Example (57) could be analyzed as adjectival. However, it would be possible to modify (57) with a by-phrase indicating a verbal passive: Why be intimidated by the bullies? [^]
  24. It should be noted that other instances of raising to subject also apply, as expected, within Why-VP:
    (i) Why seem so glum?
    (ii) Why be so hard to please?
  25. An anonymous reviewer adds to the mystery by noting that even if Why-VP is smaller than a full finite clause, akin to a small clause, there is no reason it should not have a stranding position for all given the acceptability of all in those kinds of clauses: I made the students all leave. [^]
  26. A reviewer suggests that the oddness of (69) could be accounted for on pragmatic grounds. If the function of Why-VP is such that the speaker communicates that they see no reason for the event denoted in the VP to be carried out, and if the function of exactly is to suggest that the speaker assumes that there are reasons to carry out the event, then when the two are put together, they convey conflicting messages on the part of the speaker. This seems dubious however. An account along these lines would need to say something more about why there are speakers for whom the examples of (69) are acceptable. Additionally, it is not completely clear that exactly has the purported effect noted by the reviewer. In evaluating the root questions in (70), the speaker still seems to be presupposing that there is no good reason for people to shop at 10pm, despite using exactly. Finally, we have seen that Why-VP can express genuine questions – still, the modification possibilities with exactly remain. [^]
  27. From the article: https://www.ontheforecheck.com/2017/4/16/15318230/heres-what-worked-shutting-out-the-blackhawks. [^]
  28. From the book: The Great Mistake by Mary Roberts Rinehart. [^]
  29. From the book: Hippie Dictionary: A Cultural Encyclopedia of the 1960s and 1970s, by John Bassett McCleary. Page 23. [^]
  30. From article: https://www.ft.com/content/08a13850-865a-11e6-8897-2359a58ac7a5. [^]
  31. From blog: https://reason.com/2013/06/20/watch-out-college-kids-sen-schumer-wants. [^]
  32. An anonymous reviewer finds the following naturally occurring example:
    (i) (bad-fannibals-again.wikia.com/wiki/About_CleoLinda)
      It is curious to me as Cleo herself is based out of the US and lives in Alabama so why exactly get a book published in another country?
    This is not well-formed for me, but as stated earlier, there are speakers who accept these. My sense is that they are less frequently accepted compared to why-else, and less frequent overall. [^]
  33. Bhatt also observes that Why-not-VP mysteriously does not license NPIs. It is not yet clear what the syntactic representation of not in such structures is, but the fact that the simple presence of not does not license NPIs indicates that it should not be thought of as typical clausal negation (see Hofmann 2018 for work on the syntax of Why-not). An anonymous reviewer suggests that the distribution of NPIs in negated finite modal why-questions might be revealing about the position and scopal properties of negation in Why-not-VP. This is clearly an investigation which would be worth pursuing. It will be complicated, however, by a number of difficult confounds. For one, NPIs are licensed in finite why-questions even in the absence of negation, under conditions that are poorly understood but which seem to depend at least in part on whether or not there is a modal of a particular sub-type, as can be seen in (85) below. Secondly, the rhetorical interpretation associated with these modal utterances in (85) seems also to have a hand in licensing the NPI – the rhetorical interpretation associated with Why-VP is not available for Why-not-VP, and this may also contribute to the impossibility of NPIs. So in observing that NPIs are impossible in Why-not-VP, it’s going to be very hard to disentangle which factor is at play – the relative scope of negation with respect to the NPI, the interpretation of the implicit modal, or whether the relevant rhetorical reading is felicitous or not. [^]
  34. The distribution of NPIs in wh-interrogatives remains poorly understood (see especially Guerzoni & Sharvit 2007 on maximality), but the effect of the choice of modal, as seen in (85), seems, as an empirical matter, beyond dispute. [^]
  35. Must seems to be marginally better than the other modals, which could be because priority flavors of must are compositionally very similar to should. [^]
  36. Johnson (1975) notes that could might even be a possible interpretation in Why-VP, observing the following oddity in (iii) below:
    (i)     Speaker A:     Why not just get a drink out of the machine?
          Speaker B:     I don’t have any change.
    (ii)     Speaker A:     Why couldn’t you just get a drink out of the machine?
          Speaker B:     I don’t have any change.
    (iii) ??Speaker A: Why shouldn’t you get a drink out of the machine?
          Speaker B:     I don’t have any change.
    Recalling that Johnson crucially assumes Why-VP to be derived from finite why-questions with overt modals, the fact that it is marked as questionable to begin the discourse with Speaker A’s utterance in (iii) suggests that it isn’t the right modal to assume is underlying Why-VP in (i). Since could seems more natural in (ii), we might think that could is also a possible interpretation for Why-VP. It seems to me that would is also a possible paraphrase of (i), and would be a fine way to begin the discourse. An anonymous reviewer finds why don’t you just get a drink out of the machine to be the most natural paraphrase of (i) above. I direct the reader to Francez (2017), who explores the connection between Why-not-VP and why-don’t-you clauses, which he terms “suggesterrogatives.” [^]
  37. A reviewer finds naturally occurring data suggesting that have is possible in Why-VP:
    (i) (COHA Corpus)
      But why have given us empty symbols? Why not a little fact?
    (ii) (forums.androidcentral.com/samsung-galaxy-note-8/879716-cant-assign-tones-contacts-anymore.html)
      And if I can just do it that simply, why have taken it out to begin with? Numb skulls.
    (iii) (loveletterstoinanimateobjects.tumblr.com)
      Why have gotten so pushy lately? You were here right after Halloween this year and you were wondering why I was ignoring you.
    It’s not clear to me nor to others whether these are actually well-formed. It would be a welcome outcome if they were found to be well-formed as this would make Why-VP less mysterious and closer in interpretive possibilities to its finite modal paraphrases, which can occur with the perfect auxiliary have. [^]
  38. Note that the incompatibility with have and be in (94) and (95) cannot be a property of the infinitival status of the clause, given that infinitival clauses are able to host these aspectual items:
    (i) I want/need/would like to have finished my homework by 5pm.
    (ii) I want/need/would like to be making dinner by then.
  39. Condoravdi appeals to the diversity condition to ensure that metaphysical modals do not get a past temporal orientation with have. The diversity condition is a felicity condition on modal bases that mandates that the modal base be diverse with respect to the modal’s prejacent p – that is, the modal base contains both p and not p. A metaphysical modal requires that the facts of the world up to the evaluation time of the modal be settled – have cannot contribute a past orientation for this modal base because the diversity condition will require both p and not p to be in the modal base, which is, in turn, incompatible with the requirements of the metaphysical modal base, i.e., p must be settled. If p is settled, and we are evaluating the status of p in the past, then the diversity condition will not be satisfied. However, have can shift the temporal perspective of a metaphysical modal to the past, from which point the outcome of p was not yet settled; thus, the diversity condition is fulfilled. Thomas (2014) argues that the diversity condition is irrelevant to the modal base of priority flavors – being diverse with respect to the prejacent does not matter for its felicity. If this is true, we cannot use the diversity condition to understand the impossibility of past orientation in Why-VP. [^]
  40. Thomas (2014) notes that deontic must has a similar restriction – deontic must cannot have a past orientation, while other deontic modals (e.g., should) can. Ninan (2005) argues that must, unlike other priority modals, has a performative dimension, which may prevent a past orientation – one cannot impose an order on someone to do something in the past. Ninan wonders whether this view is tenable given that there do seem to be non-performative uses of must and the ban on the past orientation remains. [^]
  41. A reviewer finds that yesterday seems to be possible in the following Why-VP example:
    (i) 6 Nations was in Feb so why only announce this yesterday?
    This is certainly better than (96c), but I’d argue that this comes out of an interesting use of temporal only, which seems to help orient this in the past. Without only, it is my intuition that the above example becomes more marked: 6 Nations was in Feb so why announce this yesterday? If it is indeed possible to orient Why-VP in the past, it becomes more and more like its modal paraphrase, a sign that we are on the right track. [^]
  42. A reviewer points out a similar pattern in bare infinitives identified by Takezawa (1984):
    (i) Tina saw Bill be devoured by a ghoul.
    (ii) *Tina saw Bill have entertained the class./*Tina saw Bill be entertaining the class.
    Takezawa observes a whole class of predicates like see (e.g., make, let, hear, smell, rather than) that take these bare infinitives and none of them allow perfective have. It may be true that the same type of clause is selected by why in Why-VP. Crucially, however, the bare infinitives above do not seem to have the modal quality observed for Why-VP. [^]


Thanks are due first to Jim McCloskey, whose feedback was invaluable at each stage of the project, particularly as I made my way through many drafts of this paper. Thanks also to three anonymous reviewers whose comments and suggestions improved the paper a great deal. As members of my MA committee, Pranav Anand and Jorge Hankamer provided key insights that led to new developments in the project. This work benefited immensely from conversations that happened at the University of Michigan, S-Circle at UCSC, CLS 54, and S-lab at UMD – in particular, with Donka Farkas, Itamar Francez, Valentine Hacquard, Jeff Lidz, Jason Merchant, Acrisio Pires, Deniz Rudin, and Alexander Williams. The research reported here was supported, in part, by NSF Award #1451819 to the University of California Santa Cruz (Pranav Anand PI, Daniel Hardt and James McCloskey co-PI’s).

Funding Information

The research reported here was supported, in part, by NSF Award #1451819 to the University of California Santa Cruz (Pranav Anand PI, Daniel Hardt and James McCloskey co-PI’s).

Competing Interests

The author has no competing interests to declare.


Baltin, Mark R. 1995. Floating quantifiers, PRO, and predication. Linguistic Inquiry, 199–248.

Barrie, Michael. 2007. Control and wh-infinitivals. In New horizons in the analysis of control and raising, 263–279. Springer. DOI:  http://doi.org/10.1007/978-1-4020-6176-9_12

Bhatt, Rajesh. 1998. Argument-adjunct asymmetries in rhetorical questions. Handout for a talk presented at NELS29 at the University of Delaware.

Bhatt, Rajesh. 1999/2008. Covert modality in non-finite contexts. Philadelphia, PA: University of Pennsylvania dissertation. Revised version published by de Gruyter, 2008. DOI:  http://doi.org/10.1515/9783110197341

Bobaljik, Jonathan. 2003. Floating quantifiers: Handle with care. In Lisa Cheng & Rint Sybesma (eds.), The second glot international state-of-the-article book: The latest in linguistics, 107–148. Berlin: Mouton de Gruyter. DOI:  http://doi.org/10.1515/9783110890952.107

Bromberger, Sylvain. 1992. On what we know we don’t know: Explanation, theory, linguistics, and how questions shape them. Chicago, IL: University of Chicago Press.

Chapman, Cassandra & Ivona Kučerová. 2016. Structural and semantic ambiguity of why questions: An overlooked case of weak islands in English. In Proceedings of the Linguistic Society of America 1. 1–15. DOI:  http://doi.org/10.3765/plsa.v1i0.3713

Chomsky, Noam. 1995. Bare phrase structure. In Hector Campos & Paula Kempchinsky (eds.), Evolution and revolution in linguistic theory: Studies in honor of Carlos P. Otero, 51–109. Washington, DC: Georgetown University Press.

Collins, Chris. 1991. Why and how come. In Lisa L. S. Cheng & Hamida Demirdache (eds.), More papers on wh-movement: MIT working papers in linguistics 15. 31–45. Cambridge, MA: MITWPL.

Condoravdi, Cleo. 2002. Temporal interpretation of modals: Modals for the present and for the past. In David Beaver, Stefan Kaufmann, Brady Clark & Luis Casillas (eds.), The Construction of Meaning, 59–88. Stanford, CA: CSLI Publications.

Duffley, Patrick J. 1994. Need and dare: The black sheep of the modal family. Lingua 94. 213–243. DOI:  http://doi.org/10.1016/0024-3841(94)90010-8

Duffley, Patrick J. & Peter Enns. 1996. Wh-words and the infinitive in English. Lingua 98(4). 221–242. DOI:  http://doi.org/10.1016/0024-3841(95)00028-3

Duffley, Patrick J. & Pierre Larrivée. 1998. Need, dare and negative polarity. Linguistic Analysis 28. 1–19.

Francez, Itamar. 2017. Suggesterrogatives. Questioning Speech Acts Workshop, Konstanz. Handout.

Gordon, D. & G. Lakoff. 1971. Conversational postulates. In Proceedings of the 7th Annual Meeting of the Chicago Linguistic Society, 63–84. Chicago, IL: Chicago Linguistic Society, University of Chicago.

Graff, David & Christopher Cieri. 2003. English Gigaword LDC2003T05. Web Download. Philadelphia: Linguistic Data Consortium.

Green, Georgia M. 1975. How to get people to do things with words: The whimperative question. In Jerry Morgan & Peter Cole (eds.), Syntax and semantics 3: Speech acts, 107–41. New York, NY: Academic Press.

Guerzoni, Elena & Yael Sharvit. 2007. A question of strength: on NPIs in interrogative clauses. Linguistics and Philosophy 30(3). 361–391. DOI:  http://doi.org/10.1007/s10988-007-9014-x

Hackl, Martin & Jon Nissenbaum. 2012. A modal ambiguity in for-infinitival relative clauses. Natural language semantics 20(1). 59–81. DOI:  http://doi.org/10.1007/s11050-011-9075-9

Hacquard, Valentine. 2006. Aspects of modality. Cambridge, MA: Massachusetts Institute of Technology dissertation.

Heim, Irene. 1997. Predicates or formulas? Evidence from ellipsis. In Aaron Lawson (ed.), SALT VII, Proceedings from Semantics and Linguistic Theory VII, 197–221. Ithaca, NY: Cornell University, CLC Publications. DOI:  http://doi.org/10.3765/salt.v7i0.2793

Hofmann, Lisa. 2018. Why not: Polarity ellipsis and negative concord. Manuscript, Department of Linguistics, University of California Santa Cruz.

Huang, C.-T. James. 1993. Reconstruction and the structure of VP: Some theoretical consequences. Linguistic Inquiry. 103–138.

Jedrzejowski, Łukasz. 2014. Again on why. But why? In Cassandra Chapman, Olena Kit & Ivona Kučerová (eds.), Formal Approaches to Slavic Linguistics: The McMaster meeting 2013, 22. 151–169. Ann Arbor, MI: Michigan Slavic Publications.

Johnson, David E. 1975. Why delete tense? Linguistic Inquiry 6(3). 481–489.

Kratzer, Angelika. 1991. Modality. In A. von Stechow & D. Wunderlich (eds.), Semantics: An international handbook of contemporary research. Berlin: de Gruyter.

Kuno, Susumu. 1987. Functional syntax: Anaphora, discourse and empathy. Chicago, IL: University of Chicago Press.

Landau, Idan. 2013. Control in generative grammar: A research companion. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9781139061858

Lasnik, Howard. 1992. Two notes on control and binding. In R. Larson, S. Iatridou, U. Lahiri & J. Higginbotham (eds.), Control and grammar, 235–251. Netherlands: Springer. DOI:  http://doi.org/10.1007/978-94-015-7959-9_7

Merchant, Jason. 2001. The syntax of silence: Sluicing, islands, and the theory of ellipsis. Oxford: Oxford University Press.

Merchant, Jason. 2002. Swiping in Germanic. In Werner Abraham & C. Jan-Wouter Zwart (eds.), Studies in comparative Germanic syntax, 289–315. Amsterdam: John Benjamins. DOI:  http://doi.org/10.1075/la.53.18mer

Merchant, Jason. 2005. Fragments and ellipsis. Linguistics and Philosophy 27(6). 661–738. DOI:  http://doi.org/10.1007/s10988-005-7378-3

Ninan, D. 2005. Two puzzles about deontic necessity. MIT Working Papers in Linguistics (New work in modality) 51. 149–78.

Pesetsky, David. 1997. Some optimality principles of sentence pronunciation. In Pilar Barbosa, Danny Fox, Paul Hagstrom, Martha McGinnis & David Pesetsky (eds.), Is the best good enough, 337–383. Cambridge, MA: MIT Press.

Pollard, Carl & Ivan A. Sag. 1992. Anaphors in English and the scope of binding theory. Linguistic Inquiry 23(2). 261–303.

Radford, Andrew. 2004. Minimalist syntax: Exploring the structure of English. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511811319

Reinhart, Tanya & Eric Reuland. 1993. Reflexivity. Linguistic Inquiry 24(4). 657–720.

Rizzi, Luigi. 2001. On the position “int(errogative)” in the left periphery of the clause. In Guglielmo Cinque & Giampaolo Salvi (eds.), Current studies in Italian syntax: Essays offered to Lorenzo Renzi, vol. 59 (North Holland Linguistic Series: Linguistic Variations), 287–296. Leiden: Brill.

Rooth, Mats. 1992. Ellipsis redundancy and reduction redundancy. In Steve Berman & Arild Hestvik (eds.), Proceedings from the Stuttgart Ellipsis Workshop, vol. 340 (Arbeitspapiere des Sonderforschungsbereichs), 1–26. Heidelberg: Universität Stuttgart.

Ross, John R. 1969. Guess who. In Robert I. Binnick, Alice Davison, Georgia M. Green & Jerry L. Morgan (eds.), Proceedings of the 5th Annual Meeting of the Chicago Linguistic Society, 252–286. Chicago, IL: Chicago Linguistic Society, University of Chicago.

Rullmann, Hotze & Lisa Matthewson. 2018. Towards a theory of modal-temporal interaction. Language 94. 281–331. DOI:  http://doi.org/10.1353/lan.2018.0018

Sadock, J. 1974. Toward a linguistic theory of speech acts. New York, NY: Academic Press.

Schwarzschild, Roger. 1999. GIVENness, AVOIDF and other constraints on the placement of accent. Natural Language Semantics 7. 141–177. DOI:  http://doi.org/10.1023/A:1008370902407

Searle, John. 1975. Indirect speech acts. Syntax & Semantics, 3: Speech Acts, 59–82. DOI:  http://doi.org/10.1017/CBO9780511609213.004

Shlonsky, Ur & Gabriela Soare. 2011. Where’s why? Linguistic Inquiry 42(4). 651–669. DOI:  http://doi.org/10.1162/LING_a_00064

Takahashi, Shoichi & Danny Fox. 2005. MaxElide and the re-binding problem. In Aaron Lawson (ed.), SALT XV, proceedings from Semantics and Linguistic Theory XV, 223–240. Ithaca, NY: Cornell University, CLC Publications. DOI:  http://doi.org/10.3765/salt.v15i0.3095

Takezawa, Koichi. 1984. Perfective have and the bar notation. Linguistic Inquiry, 675–687.

Thomas, Guillaume. 2014. Circumstantial modality and the diversity condition. In Proceedings of Sinn und Bedeutung 18.

Weir, Andrew. 2014. Why-stripping targets voice phrase. In Proceedings of NELS 43. 235–248. Amherst, MA: GLSA.

Yoshida, Masaya, Chizuru Nakao & Iván Ortega-Santos. 2015. The syntax of Why-stripping. Natural Language & Linguistic Theory 33(1). 323–370. DOI:  http://doi.org/10.1007/s11049-014-9253-9

Zyman, Erik. 2018. Phase-constrained obligatory late adjunction. Under review. Available at: http://people.ucsc.edu/~ezyman/publications.html.