1 Introduction

Negation is a fervently-debated phenomenon within formal syntactic enquiry, as it is a universal property of language that is, at the same time, highly variable in terms of how it is expressed (Mazzon 2004: 94). The variability of negation also makes it an intriguing object of study for variationist sociolinguists, given their interest in the factors that condition speakers’ choices between linguistic forms that convey the same meaning (Tagliamonte 2006). Formal syntactic theory and variationist sociolinguistics are often depicted as somewhat incompatible “opposites”, because of their respective focus on linguistic competence and general linguistic principles on the one hand, versus performance and variability on the other (Wilson and Henry 1998). However, recent studies have advocated bridging the gap between the two in analysing morpho-syntactic variation and change (Wilson and Henry 1998; Adger and Smith 2005; 2010; Cornips and Corrigan 2005a; b; Adger and Trousdale 2007; Buchstaller et al. 2013; Burnett et al. under review), with Barbiers (2005: 235) suggesting that “it is the task of sociolinguists to describe and explain the patterns of variation that occur within a linguistic community given the theoretical limits of this variation uncovered by generative linguistics [emphasis mine]”. This paper takes such an approach and argues that integrating formal theory into variationist analysis of morpho-syntactic variation allows for: (i) more careful delimitation of the linguistic variable and its contexts, taking into consideration the constraints of the grammar; (ii) theoretically-informed decision-making as to the inclusion and exclusion of tokens; and (iii) using production data to test hypotheses that can elucidate how variants are derived from the grammar. Adopting this approach, my investigation focuses specifically on the alternation between not-negation (1), no-negation (2) and negative concord (3) in English.1

(1) Not-negation
  I didn’t see anybody.
(2) No-negation
  I saw nobody.
(3) Negative concord2
  I didn’t see nobody.

Negative concord is among the most-studied morpho-syntactic phenomena in variationist sociolinguistics (e.g. Labov 1972a; Smith 2001; Anderwald 2002; Kortmann and Szmrecsanyi 2004; Anderwald 2005; Szmrecsanyi 2013), perhaps because of its ubiquity across non-standard Englishes worldwide (Chambers 2012). Such studies typically examine the presence versus absence of negative concord, with little (if any) attention paid to whether not- or no-negation is used instead. Yet, as (1)–(3) show, there are contexts in which all three forms are semantically equivalent and are therefore variants of a single variable, i.e. “alternative ways of ‘saying the same thing’” (Labov 1972b: 94). Other scholars have investigated only variation between not-negation and no-negation, sometimes because they analysed Standard English, which does not have negative concord (Tottie 1991a; b; Varela Pérez 2014), or because negative concord was infrequent in their data (Harvey 2013; Burnett et al. under review). Although Childs et al. (2015; under review) set out to analyse all three variants in their corpus-based sociolinguistic comparison of the variation in North East England, Yorkshire and Ontario, Canada, the infrequency of negative concord meant that it could not form a major part of their study.

This paper presents the arguments for analysing the variation as consisting of three variants and outlines two syntactic accounts of how the variants are derived. Under Account 1, the three variants have the same underlying structure featuring a negative marker/operator in NegP with which the n-words in no-negation and negative concord agree (by extending Zeijlstra 2004). Account 2 posits a different structure for no-negation, where negation is marked inside the post-verbal indefinite DP and moves to NegP for sentential scope (based on Kayne 1998; Svenonius 2002; Zeijlstra 2011). With appeal to standard assumptions that BE and (optionally) HAVE raise for tense and agreement while lexical verbs do not (Pollock 1989), these two accounts make different predictions about the distribution of variants according to verb type and verb complexity, as explained in Section 2. To ascertain the robustness of the constraints across dialects, these hypotheses are tested in data extracted from informal conversations in corpora from three Northern UK localities: Glasgow, Scotland (Sounds of the City, Stuart-Smith and Timmins 2011–2014), Tyneside, North East England (Diachronic Electronic Corpus of Tyneside English, Corrigan et al. 2010–2012) and Salford, Greater Manchester (Research on Salford English corpus, Pichler 2011–2012).

Using spoken corpora enables the consideration of a further factor that may affect the variation: discourse status, i.e. whether the negative expression relates to a discourse-old proposition, or provides discourse-new information. Discourse status has been identified as contributing to variation in the use of negative markers in Romance languages (Schwenter 2005; 2006; Hansen 2009; Hansen and Visconti 2009) and in English (Tottie 1991b; Wallage 2013; 2015; 2017). In Present-Day English, no-negation is associated with introducing new information, while not-negation is typically used when the proposition is discourse-old – a constraint which has persisted since Early Middle English (Wallage 2015; 2017). As discourse-new information is typically introduced post-verbally (Ward and Birner 2008), Wallage’s finding may suggest that no-negation is syntactically-marked in a post-verbal position. Analysing discourse status will therefore generate additional evidence to establish whether Account 1 (where all variants mark negation syntactically in NegP) or Account 2 (where no-negation is marked in a post-verbal DP) offers a more comprehensive theory of the variation.

2 The syntax of negation with indefinites in English

As Standard English does not allow negative concord, sentences like he didn’t see nothing receive a double negation interpretation (‘he saw something’). Within Jespersen’s Cycle (Jespersen 1917), English is currently in transition from a double negative system towards a negative concord system and is “underlyingly an NC [negative concord] language” (Zeijlstra 2004: 145), as supported by a range of evidence. Firstly, “all languages with a preverbal negative marker are NC [negative concord] languages” (Zeijlstra 2004: 145). English has a pre-verbal negative marker n’t (pre-verbal because it attaches to the finite verb) and thus we expect negative concord to be possible (Zeijlstra 2004: 145). Secondly, negative polarity items (NPIs) of the form any- that occur with not-negation behave similarly to n-words that appear in negative concord in other languages (e.g. French personne), which function as “an indication for the hearer that the expression is negative” (Zeijlstra 2004: 278–279). Thirdly, negative concord is widespread in non-standard Englishes and even speakers who use it near-categorically can style-shift to another variant (Labov 1972a: 806). In contrast, double negation is rare and may require an additional focus operator on the indefinite (Biberauer and Roberts 2011; Blanchette 2013). Negative concord therefore appears to be part of the grammar of English but is not realised in standard varieties because of external standardisation pressures (Weiß 2002: 138; Blanchette 2013).

If negative concord is generated in the syntax of English and it can be semantically-equivalent to not-negation and no-negation (see (1)–(3)), it is conceivable that all three variants have the same structure with only one syntactic negation, in NegP. This is the tenet of Account 1 of the variation. Alternatively, negative indefinites of the form no- could be licensed either by NegP (negative concord) or in the DP (no-negation), which is the basis for Account 2. These accounts are presented in Sections 2.1 and 2.2, then tested empirically in Section 6.

2.1 Account 1

Account 1 is based on Zeijlstra’s (2004) theory which captures the distinction between strict and non-strict negative concord. Non-strict negative concord languages, e.g. Spanish, Italian and most non-standard varieties of English, allow the use of n-words without an additional negative marker (Zeijlstra 2004: 145; Penka 2011: 17). Zeijlstra (2004) proposes that in these languages the negative marker (e.g. English not/n’t) has an interpretable negative feature [iNEG] and the n-words that appear in negative concord (e.g. he didn’t see nobody) are not inherently negative. These n-words have an uninterpretable negative feature [uNEG] which must enter into an Agree relation with the c-commanding negative marker in SpecNegP, as in (4a), for [uNEG] to be deleted (Zeijlstra 2004: 237). On the other hand, strict negative concord requires no-forms to co-occur with a negative marker, e.g. in African American Vernacular English (AAVE, Labov 1972a: 786) and Greek, Hungarian and Slavic languages (Giannakidou 2012: 330). As (4b) shows, the derivation of strict negative concord proceeds in the same way as for non-strict concord except that the negative markers have [uNEG] and are licensed by a c-commanding covert [iNEG] operator (Zeijlstra 2004: 249).

(4) Adapted from Zeijlstra (2004: 258)
  a. Non-strict negative concord b. Strict negative concord
     

Extending Zeijlstra’s (2004) framework to not-negation and no-negation would depict these variants as containing a negative operator in NegP with an underlying indefinite NPI in the predicate. The indefinite NPI is a free variable requiring existential closure (Zeijlstra 2004: 237; Biberauer and Roberts 2011). If the indefinite and the operator agree, a no-form is spelled out (5b–c), but otherwise the default spell-out is the NPI (5a).

(5) Account 1: The three variants3
  a. Not-negation b. No-negation4 c. Negative concord
       

Previous corpus-based investigations have found that no-negation is favoured with BE/HAVE while not-negation is favoured with lexical verbs (Tottie 1991a; b; Varela Pérez 2014; Childs et al. 2015; under review; Wallage 2017). Sometimes this has been attributed to BE/HAVE having higher frequency than individual lexical verbs, making the former more resistant to change and thus less likely to take the historically-newest variant, not-negation (Tottie 1991a; b; Varela Pérez 2014).5

Harvey (2013) alternatively suggests that these verb type effects on the variation could arise because lexical verbs do not raise for tense and agreement (see Pollock 1989), which is pertinent to other morpho-syntactic phenomena such as do-absence (Smith 2000). In an example like you have nobody, Harvey assumes that have moves to I, no is in SpecNegP and body remains low in the DP. In contrast, sentences like you don’t see anybody have do-support and the lexical verb remains in the VP. Harvey (2013) suggests that no-negation is more difficult to derive for the latter because the lexical verb intervenes between the negative marker and the DP. Burnett and Tagliamonte (2016a) and Burnett et al. (under review) similarly appeal to structural factors in accounting for not-/no-negation variation in Toronto English, suggesting that the indefinite can be in two positions: the “higher domain” (NegP and higher) or the “lower domain” (below NegP). The authors categorise cases where the indefinite is embedded in some way (e.g. with a lexical verb, in a PP) as residing in the lower domain, and class other tokens of indefinites as having the potential to be in the higher domain. Where the indefinite had potential to occur in the higher domain, no-negation was near-categorically preferred over not-negation; among indefinites classed as residing in the lower domain, only 6.3% were no-negation (Burnett and Tagliamonte 2016a). Structural adjacency has similarly been found to promote the use of no-negation over not-negation in British varieties of English (Burnett and Tagliamonte 2016b) and to promote the use of single n-words over negative concord in Montréal French (Burnett et al. 2015).

The movement of BE/HAVE versus lexical verbs and structural adjacency between the negator and the indefinite, as discussed above, describe similar kinds of phenomena. Table 1 provides examples of the variants with different verb combinations, indicating whether there is adjacency between the sentential NegP and the indefinite. HAVE is excluded from Table 1 because its raising is optional, meaning it can behave similarly to either BE or lexical verbs depending on the circumstances.

| Verb structure | Example with not-negation | Example with no-negation | Example with negative concord | Structural adjacency between sentential NegP and indefinite? |
| --- | --- | --- | --- | --- |
| BE without additional auxiliaries | It wasn’t any particular amount | It was no particular amount | It wasn’t no particular amount | Yes |
| BE + additional auxiliaries | He wouldn’t be any bother | He would be no bother | He wouldn’t be no bother | No |
| Lexical without additional auxiliaries | He didn’t see anybody | He saw nobody | He didn’t see nobody | No |
| Lexical + additional auxiliaries | He couldn’t see anybody | He could see nobody | He couldn’t see nobody | No |

Table 1: Constraints on the variation.

The aforementioned structural effects on the variation (Harvey 2013; Burnett and Tagliamonte 2016a; b; Burnett et al. under review) can be captured under Account 1 presented in this paper. Account 1 appeals to Agree, a relation that can be disrupted when there is intervening material or greater syntactic distance between a target and controller (Pietsch 2005: 129; Corbett 2006: 235–236; Buchstaller et al. 2013; Childs 2013). Furthermore, my study’s inclusion of a third variant enables an additional prediction to be made about English negative concord. Under Account 1, since no-negation and negative concord are derived through Agree between syntactic negation in NegP and the lower indefinite(s), both variants are expected to pattern alike in their distribution with main verbs which raise to I (BE and optionally HAVE) versus those that do not raise (lexical verbs). Assuming the same mechanism, since constructions with auxiliary verbs feature a main verb (regardless of type) that remains in the VP, these too are hypothesised to have comparatively lower rates of no-negation and negative concord than constructions without auxiliaries.

To take example (6a), BE must raise to I for tense and agreement and the lower copy is deleted at PF. Lexical verbs like see remain in V, shown in (6b), since their tensed forms are selected from the lexicon and their features are checked against those in I only at LF. As saw resides between the operator and the indefinite in (6b) (material not present in (6a)), the Agree relation is expected to be more difficult to obtain in (6b) than (6a).6

(6) Account 1: No-negation with BE and lexical verbs
  a. BE, e.g. You are nothing like your Dad. b. Lexical, e.g. He saw nobody.
     

2.2 Account 2

Account 2 contrasts with Account 1 in that no-negation is derived differently from the other two variants. Under Account 2, no-negation is the result of syntactic negation within the indefinite DP, followed by movement to the sentential NegP projection (Kayne 1998; Svenonius 2002; Zeijlstra 2011).7 These negative DPs could be considered inherently-negative quantifiers (e.g. Haegeman 1995; Watanabe 2004; Wallage 2017: 185) or composed of a negative operator plus an indefinite (e.g. Zeijlstra 2011; Penka 2012; Tubau 2016). Either of these DP structures is tenable for Account 2, and the choice between them does not matter for the purposes of my analysis (see Iatridou and Sichel 2011: 610–12), as the crucial property of no-negation in this account is that negation is marked syntactically within the DP. The ambiguity of constructions such as John would be happy with no job (from Rochemont 1978: 73) follows from Account 2 if we assume that the negative DP moves to NegP for sentential scope under the reading that there is no job with which John would be happy (sentential negation), but does not move under the reading that John would be happy if he did not have a job (constituent negation). Under Account 1, this is not straightforwardly captured and would likely require an additional focus operator as mentioned earlier in relation to double negation (Biberauer and Roberts 2011; Blanchette 2013).

If no-forms have DP-internal negation, how can we account for them appearing in negative concord, where they do not contribute negative meaning? A way of reconciling these facts is to propose that English n-words are ambiguous (Herburger 2001) or have two lexical entries (Déprez 1997; Tubau 2016). In other words, n-words can be inherently negative, as in no-negation, or lack syntactic negation, as in negative concord (Déprez 1997: 119; Tubau 2016). This kind of account is consistent with a language undergoing change from expressing double negation to expressing negative concord (Herburger 2001) which, as previously noted, is underway in English (Zeijlstra 2004: 146). If negative DPs can project their own syntactic negative operator, this would also account for their licensing as elliptical answers (e.g. Q: What did you buy? A: Nothing) and in clause-initial position (e.g. Nothing’s wrong) (Tubau 2016).8

The structure of not-negation, no-negation and negative concord under Account 2 is shown in (7).

(7) Account 2: The three variants
  a. Not-negation b. No-negation c. Negative concord
       

To take an example, in (8a), there is no material between the indefinite and the NegP, since BE has raised. In (8b), the lexical verb saw is in situ. The verb adds to the cost of the movement required to derive no-negation under this account, akin to Holmberg’s Generalisation (Holmberg 1999) whereby object shift in Scandinavian languages is dependent on prior movement of the verb. Indeed, Svenonius (2002) describes the movement of negative DPs in Norwegian in these terms. The variability in English is therefore consistent with cross-linguistic tendencies (see also Burnett and Tagliamonte 2016a; Burnett et al. under review).

(8) Account 2: No-negation with BE and lexical verbs
  a. BE, e.g. You are nothing like your Dad. b. Lexical, e.g. He saw nobody.
     

To summarise, Account 1 and Account 2 make different predictions about the distribution of variants. In Account 1, no-negation and negative concord are derived via Agree in the same manner and are both expected to be dispreferred with lexical (compared to non-lexical) verbs and in constructions with auxiliaries (compared to those without). In Account 2, no-negation involves a negatively-marked DP which moves for sentential scope, while negative concord features an n-word that is not syntactically negative but agrees with a negative marker in a higher NegP. As movement is more costly than Agree (Chomsky 2000: 101–102), under Account 2 no-negation is expected to be dispreferred in the same contexts as in Account 1 (with lexical verbs and in constructions with additional auxiliary verbs). However, unlike Account 1, Account 2 does not predict that no-negation will pattern akin to negative concord, because the two variants are derived by different mechanisms.9

3 Corpora and samples

The hypotheses associated with Accounts 1 and 2 were tested in corpora of English spoken in Glasgow, Tyneside and Salford. These locations, shown in Figure 1, are ideal for comparative analysis: they share similar socio-economic backgrounds as large urban centres and their regional varieties have relatively low prestige (Coupland and Bishop 2007).

Figure 1: Map of localities.10

The corpora, the Glasgow Sounds of the City corpus (Stuart-Smith and Timmins 2011–2014), the Diachronic Electronic Corpus of Tyneside English (Corrigan et al. 2010–2012) and the Research on Salford English corpus (Pichler 2011–2012), contain recordings of informal conversation with native speakers of the dialects. Although the corpora include speakers with a range of backgrounds and ages, an essential part of cross-corpus work is to maximise comparability between datasets (D’Arcy 2011). Speakers were therefore selected from each corpus in a principled way, as shown in Table 2. Only working-class speakers (defined using corpus metadata) were selected, since the Tyneside and Salford corpora contain few or no middle-class speakers and working-class speakers tend to use non-standard variants (e.g. negative concord) to a greater extent (Labov 2006). As the Sounds of the City speakers were aged 13–15 and 40–60 (with no individual ages available), these age ranges were used as a guide for choosing speakers from the other corpora, to form distinct “younger” vs. “older” groups. An exact match with the Glasgow data was not possible because DECTE/RoSE do not include 13–15 year-olds and DECTE has a low percentage of 40–60 year-olds. The age ranges therefore had to be expanded to obtain enough speakers.

| Locality (corpus) | Recording set-up | Demographic | Recording years | Ages | Social class |
| --- | --- | --- | --- | --- | --- |
| Glasgow (Sounds of the City) | Same-sex pairs, without an interviewer | Born, raised and living in the Maryhill area (Stuart-Smith et al. 2007: 230) | 1997, 2003 | 13–15; 40–60 | Working-class |
| Tyneside (DECTE) | Same-sex pairs, with an interviewer | Born, raised and living in Newcastle upon Tyne, Gateshead or North Tyneside | 2007–2011 | 18–25; 43–78 | Working-class |
| Salford (RoSE) | Same-sex pairs, sometimes with an interviewer | Born, raised and living in the metropolitan area of Salford, Greater Manchester11 | 2011–2012 | 17–27; 38–63 | Working-class |

Table 2: Overview of sample demographic.

Since corpora are constructed with different research questions in mind (Tognini-Bonelli 2001: 59), inevitably there are some inconsistencies between datasets. However, as Table 3 shows, the number of speakers is consistently higher than the recommended 5 per cell (Meyerhoff et al. 2015: 22). Although the age ranges differ between communities, there is a clear distinction between the “younger” and “older” groups in each locale, as shown by their average ages (calculable for Tyneside and Salford, where exact ages are known).

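| Locality | Age group | M | F | Total |
| --- | --- | --- | --- | --- |
| Glasgow | Younger, 13–14 | 10 | 10 | 20 |
| | Older, 40–60 | 10 | 10 | 20 |
| | Total | | | 40 |
| Tyneside | Younger, 18–25 (average 20.7) | 12 | 9 | 21 |
| | Older, 43–78 (average 58.8) | 6 | 7 | 13 |
| | Total | | | 34 |
| Salford | Younger, 17–27 (average 21.7) | 6 | 6 | 12 |
| | Older, 38–63 (average 50.8) | 9 | 12 | 21 |
| | Total | | | 33 |

Table 3: Final sample.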
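The arithmetic behind Table 3 can be checked with a minimal sketch such as the following. The counts are transcribed from the table; the names `SAMPLE`, `locality_totals` and `cells_below_minimum` are illustrative assumptions, not part of the original study, and a "cell" is assumed here to mean one locality × age group × sex combination.

```python
# Speaker counts transcribed from Table 3: (locality, age group) -> {sex: n}.
SAMPLE = {
    ("Glasgow", "Younger"): {"M": 10, "F": 10},
    ("Glasgow", "Older"): {"M": 10, "F": 10},
    ("Tyneside", "Younger"): {"M": 12, "F": 9},
    ("Tyneside", "Older"): {"M": 6, "F": 7},
    ("Salford", "Younger"): {"M": 6, "F": 6},
    ("Salford", "Older"): {"M": 9, "F": 12},
}

# Recommended minimum per cell, as cited in the text (Meyerhoff et al. 2015: 22).
MIN_PER_CELL = 5


def locality_totals(sample):
    """Sum the speakers in every cell belonging to each locality."""
    totals = {}
    for (locality, _age_group), by_sex in sample.items():
        totals[locality] = totals.get(locality, 0) + sum(by_sex.values())
    return totals


def cells_below_minimum(sample, minimum=MIN_PER_CELL):
    """List every (locality, age group, sex) cell with fewer than `minimum` speakers."""
    return [
        (locality, age_group, sex)
        for (locality, age_group), by_sex in sample.items()
        for sex, n in by_sex.items()
        if n < minimum
    ]
```

Under these assumptions, `locality_totals(SAMPLE)` reproduces the per-locality totals in Table 3 and `cells_below_minimum(SAMPLE)` returns an empty list, since the smallest cell contains six speakers.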

4 The variable context and data extraction

Not-/no-negation and negative concord as defined earlier require an underlying any- NPI which permits all three variants with semantic equivalence. Constructions with only a constituent negation reading (e.g. John went to the cinema not on Monday but on Tuesday) do not form part of the variable context because the alternative variants are either not licensed or do not have the same meaning. Indefinites must be in the predicate, and not-negation and negative concord feature a negative marker in NegP, namely not, n’t, no’ (an equivalent to not in Glasgow) or a negative auxiliary. The negative auxiliaries include both standard and non-standard forms, with the latter comprising cannit (‘can’t’) and divn’t (‘don’t’) in Tyneside and verbs with -nae (e.g. dinnae) in Glasgow. Non-standard indefinite forms owt (‘anything’) and nowt (‘nothing’), found in the Tyneside and Salford data, were also included. Table 4 shows the canonical forms for each variant which comprise the variable context. Not … ever and never were excluded, because where variation is possible, never was preferred 97–100% of the time in each dataset (see also Tottie 1991b: 109; Varela Pérez 2014: 337).

| Not-negation | No-negation | Negative concord |
| --- | --- | --- |
| not … any | no, none | not … no/none |
| not … anybody | nobody | not … nobody |
| not … anyone | noone | not … noone |
| not … anything | nothing | not … nothing |
| not … anywhere | nowhere | not … nowhere |

Table 4: Forms within the variable context.

Tottie (1991a; b) and Varela Pérez (2014) included a, an and zero determiners in their analyses as equivalent to any (in not-negation) and no (in no-negation), e.g. I didn’t see a car. These are excluded from my variable, following the arguments originally set out in Childs et al. (2015) and Childs (2016) which are summarised in (i)–(iii):

  (i) a/an/ø are neither semantically nor syntactically equivalent to any. Only the latter is an NPI, and it expresses “a kind of extreme non-specificity” (Lyons 1999: 37) or emphatic quality not expressed by the other items (Tottie 1991b: 305; Jackson 1995: 185).
  (ii) Negative concord rarely applies to a and an (Labov 1972a: 806; Cheshire 1982: 66; Smith 2001: 131). While its occurrence (albeit rare) could be deemed evidence that these should be included in the variable context (Howe 2005), Labov (1972a: 810–811) argues that those exceptions arise because any is inserted prior to negative concord applying.
  (iii) No is overwhelmingly considered equivalent to not any (Quirk et al. 1985: 782; Tieken-Boon van Ostade 1997: 188; Anderwald 2002; 2005; Peters 2008; Peters and Funk 2009; Wallage 2015: 214, 2017). Although Tottie (1991b) included a/an/ø in her sample, she observes that when variation between not-negation and no-negation is possible, not-negation sentences tend to have any and no-negation sentences generally correspond to any.

Given (i)–(iii), only any- and no- forms were extracted from the corpora, using AntConc (Anthony 2011). This ensured that all three variants were captured. Orthographic variants were included in the search (e.g. nae, nee) and I checked the correspondence between the audio and transcripts. Tokens outside the variable context were removed, e.g. pre-verbal indefinites (e.g. no one’s there) that have no semantically-equivalent not-negation alternative.
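The extraction step can be illustrated with a minimal sketch. The study itself used AntConc, and the exact search terms are not reported in the text, so the form lists and function name below are assumptions for illustration only. The lists cover the forms in Table 4 plus the non-standard items mentioned in this paper (owt, nowt, nae, nee, naewhere) and are not claimed to be exhaustive.

```python
import re

# Illustrative form lists (an assumption): Table 4 forms plus the
# non-standard and orthographic variants named in the text.
ANY_FORMS = ["any", "anybody", "anyone", "anything", "anywhere", "owt"]
NO_FORMS = [
    "no", "none", "nobody", "no one", "noone", "nothing", "nowhere",
    "nowt", "nae", "nee", "naewhere",
]

# Longest forms first, so that "no one" is tried before "no".
_ALL_FORMS = sorted(ANY_FORMS + NO_FORMS, key=len, reverse=True)
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(form) for form in _ALL_FORMS) + r")\b",
    re.IGNORECASE,
)


def find_candidates(utterance):
    """Return (form, family) pairs for every any-/no- form in an utterance.

    The hits are only candidates: response particles such as "No." and
    tokens outside the variable context still have to be removed by hand.
    """
    hits = []
    for match in PATTERN.finditer(utterance):
        form = match.group(0).lower()
        family = "any" if form in ANY_FORMS else "no"
        hits.append((form, family))
    return hits
```

For instance, `find_candidates("well that doesn't mean nowt")` yields a single no- candidate, while words that merely contain these strings, such as know or not, are skipped because of the word-boundary anchors. A search of this kind deliberately over-generates, which is why the manual exclusions described in this section remain necessary.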

Some token types had to be excluded due to lack of semantic equivalence, as explained in Childs et al. (2015). As negation with indefinites is subject to clause-bound constraints (Zeijlstra 2004: 264), I excluded negative-raising contexts (e.g. think, want) and cross-clausal negation where subtle changes in meaning arise depending on the position of the negative marker in relation to the indefinite, as in (9). General extenders as in (10), where negation has previously been analysed as licensed in a separate clause (Labov 1972a: 806), were also excluded.

(9) a. I don’t think anyone was hurt [P/416, Tyneside]
  b. I think no one was hurt
  c. I don’t think no one was hurt
(10) they hadnae even washed the floor or nothing [NKOF1, Glasgow]

Negated adjectives were excluded since the variants are not semantically equivalent, e.g. (11b) expresses a greater intensity of ‘good’ than (11a).

(11) a. it doesn’t look good for a Christian woman [SG/121, Tyneside]
  b. it looks no good for a Christian woman

Tokens featuring adverbs were also excluded, because the position of the adverb relative to the scope of negation changes the meaning. For example, (12a) is a hedged statement but (12b) “emphasiz[es] the subjective judgement of the importance of the situation involved in the proposition in question” (Paradis 2003: 194). Some adverbs cannot occur in the same syntactic position with not-negation (13b) as compared to no-negation (13a).

(12) a. we’d not really done anything wrong [Helen, Salford]
  b. we’d really done nothing wrong
  c. *we’d done really nothing wrong
(13) a. you pay virtually nothing [B/145, Tyneside]
  b. *you don’t pay virtually anything

Childs et al. (2015; under review) included semantically-equivalent tokens licensed within PPs, e.g. without any and with no, to establish whether these patterned like verbal negation. As the present paper tests hypotheses concerning the syntactic marking of negation rather than other NPI-licensing contexts, PP constructions are excluded from this analysis.

Fixed phrases (14) and utterances with an elided subject (15) were also excluded because of their lack of variability.

(14) well it’s better than nowt [Mary, Salford]
(15) nae point in me going up unless it was a Friday [00-G1-m03, Glasgow]

Following standard variationist sociolinguistic practice (Tagliamonte 2006), ambiguous tokens or those that were used in false starts or direct quotes were also removed.

The final number of tokens per locality is as follows: Glasgow (N = 154); Tyneside (N = 200); Salford (N = 143).

5 Coding

As defined earlier, the dependent variable was coded as not-negation, no-negation or negative concord. In addition to locality (Glasgow, Tyneside or Salford), the tokens were coded for the following factors.

5.1 Verb type

Section 2 detailed how verb type affects the choice of not-negation, no-negation or negative concord and how this can help test the adequacy of Account 1 or 2. Verb type was coded as in (16). “Existentials” (16a) are a construction type rather than a verb type, but were separated from other types of BE (16b) which have lower propensities for no-negation (Tottie 1991a; b; Varela Pérez 2014; Childs et al. 2015; under review). HAVE (16c) and HAVE GOT (16d) were distinguished because the latter may behave as an auxiliary + main verb (Berdan 1980: 388). Main verb DO (16e) and other lexical verbs (16f) were separated in case DO’s alternative function as an auxiliary affects its distribution as a main verb.12

(16) a. Existentials
    there was nothing to do [MS/321, Tyneside]
  b. BE
    it’s naewhere near Easterhouse [4M5, Glasgow]
  c. HAVE
    they didn’t have any positions available [SM/135, Tyneside]
  d. HAVE GOT
    he’s got no money [Amanda, Salford]
  e. DO
    I’m not doing anything wrong [00-G2-m03, Glasgow]
  f. Lexical verbs
    well that doesn’t mean nowt [PM/85, Tyneside]

5.2 Complexity of the verb structure

“Simple” verb structures occur with no-negation more than “complex” verb phrases do (Tottie 1991b: 224; Varela Pérez 2014: 374; Burnett and Tagliamonte 2016a; b; Wallage 2017; Burnett et al. under review). My coding of complexity of the verb structure firstly distinguishes between existentials (17a) and HAVE GOT (17b), for the reasons explained in Section 5.1. A further category comprises the other “simple verbs” (17c), i.e. those that would have “simple present or past tense nonnegated forms” (see Tottie 1991b: 224), containing a main verb with or without do-support. Tottie (1991b: 224) codes a further category of complex sentences (those with “periphrastic structures”). I make additional distinctions within this group between constructions with a non-modal auxiliary (17d) and those with a modal or semi-modal auxiliary (17e). These constructions feature one such auxiliary between the subject and main verb. Within the latter group (17e), the semi-modals comprise five tokens of HAVE GOT TO and BE GOING TO.

(17) a. Existentials
    there’s no respect now [NKOM1, Glasgow]
  b. HAVE GOT
    but really, Salford hasn’t got any city centre, has it? [Paul, Salford]
  c. Simple verbs
    they don’t do anything in return [NKOF4, Glasgow]
  d. With non-modal auxiliary verb
    and then after that I’ve had no trouble [P/416, Tyneside]
  e. With modal or semi-modal auxiliary verb
    I won’t have any credit [Emily, Salford]
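The coding scheme in (17) can be sketched as a simple decision procedure. This is a hedged illustration rather than the procedure actually used: the `Token` record, the lemma lists and the order in which the categories are checked are all assumptions, and the sketch presupposes that each token has already been annotated by hand for its main verb, any auxiliary other than do-support, and whether it is an existential.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative lemma lists (assumptions, not taken from the paper).
MODALS = {"will", "would", "can", "could", "shall", "should", "may", "might", "must"}
SEMI_MODALS = {"have got to", "be going to"}  # the two semi-modals named in the text
NON_MODAL_AUXILIARIES = {"have", "be"}


@dataclass
class Token:
    main_verb: str                   # lemma of the main verb, e.g. "see" or "have got"
    auxiliary: Optional[str] = None  # lemma of an auxiliary other than do-support
    existential: bool = False        # True for there-existentials


def code_complexity(token):
    """Assign one of the five complexity categories illustrated in (17)."""
    if token.existential:
        return "existential"
    if token.main_verb == "have got":
        return "HAVE GOT"
    if token.auxiliary is None:
        # Main verb with or without do-support.
        return "simple"
    if token.auxiliary in MODALS or token.auxiliary in SEMI_MODALS:
        return "modal or semi-modal auxiliary"
    if token.auxiliary in NON_MODAL_AUXILIARIES:
        return "non-modal auxiliary"
    raise ValueError(f"unrecognised auxiliary: {token.auxiliary!r}")
```

On this sketch, (17c) they don’t do anything in return is coded as `Token(main_verb="do")` and falls into the simple category because do-support is not recorded as an auxiliary, whereas (17d) I’ve had no trouble is coded as `Token(main_verb="have", auxiliary="have")` and falls into the non-modal auxiliary category.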

5.3 Discourse status

As noted earlier, discourse status also affects the variation: no-negation is associated with introducing new information and not-negation is typically used in relation to a discourse-old proposition (Wallage 2015; 2017). The tendency for new information to be introduced post-verbally as opposed to pre-verbally (Ward and Birner 2008) suggests that no-negation, as a marker of new information, may be marked in a post-verbal syntactic position. Investigating the distribution of variants according to discourse status will therefore provide further evidence as to whether Account 1 (in which all variants have syntactic negation marked within the sentential NegP) or Account 2 (in which no-negation is marked within an object DP) is better supported.

My tokens were categorised according to the coding schema that Wallage (2013; 2015; 2017) applies to English, developed from investigations of negation in Romance languages (Schwenter 2005; 2006; Hansen 2009; Hansen and Visconti 2009). Tokens belong to one of five categories, of which the first four are “discourse-old”. These are illustrated in the following examples, sometimes situated within a longer extract for context. In these examples, text that is both bold and italicised represents the earlier proposition, while text that is only in bold is the token of the variable which was included in my analysis.

1. Denial of an antecedent proposition: “the negative proposition denies an earlier proposition which was explicitly stated in the discourse” (Wallage 2013: 5)
  Rebecca: Cos I- I’ll get paid won’t I, but (.) I’m gonna get emergency-taxed.
  Amanda: You won’t get nothing this month.
  Rebecca: Will I not?
  Amanda: I don’t think so. When d- when did you start?
  Rebecca: What date are we on today? The 21st?
  Amanda: Yeah.
  Rebecca: 21st, 20th, 18th, 17th on Monday, want it? 16th, 15th, 14th
  Amanda: Ah you might get a week (.) because (.) you- you get paid up un—
  Rebecca: About the 10th, I think.
  Amanda: You might get a week (.) cause you get paid from, eh up to the 17th.
  [Salford]  
2. Repetition of an antecedent proposition: “the negative proposition repeats an earlier proposition which was explicitly stated in the discourse” (Wallage 2013: 5)
  4F6: I’m gonnae go down there on my tod. I don’t know anybody.
  4F5: No. Don’t- don’t dae it!
  4F6: I know, I know. I’ll no dae it. I- I’ve just got to get it out my system.
  4F5: Aye.
  4F6: I’ve got to go and that’s it. I’m going on my own. That’s the reason I’m doing it.
  4F5: Aye.
  4F6: I’m not taking anybody with me.
  [Glasgow]
3. Cancellation of an inference: “the negative proposition cancels an implicature arising out of the preceding discourse” (Wallage 2013: 5)
  4F3: So, you coming to the Christmas lunch?
  4F4: I’ve no’ heard nothing about it yet.
  4F3: Well, it’s on the tenth of December.
  [Glasgow]
4. Assertion of an inference: “the negative proposition explicitly states a proposition which is implied by the preceding discourse” (Wallage 2013: 5)
  PM/85: I’m saying like the main toon (.) it’s all listed buildings you know, they cannit change anything.
  [Tyneside]
5. Discourse-new proposition: the negative proposition “is not identified by an antecedent proposition in the earlier discourse and is not inferentially linked to the preceding discourse” (Wallage 2013: 5)
  Fieldworker: What do you think about the way teenagers today sound?
  JR/456: Teenagers today?
  Fieldworker: When they talk English, what do you think about the way they sound?
  DK/131: There’s no discipline.
  [Tyneside]
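The five-way schema and its collapse into the binary discourse-old versus discourse-new distinction can be sketched as a simple lookup. This is an illustrative sketch only: the identifiers `DiscourseFunction` and `discourse_status` are hypothetical names introduced here, not part of Wallage's schema or of any corpus tool.

```python
from enum import Enum


class DiscourseFunction(Enum):
    """The five categories of the coding schema (hypothetical identifiers)."""
    DENIAL = 1        # denial of an antecedent proposition
    REPETITION = 2    # repetition of an antecedent proposition
    CANCELLATION = 3  # cancellation of an inference
    ASSERTION = 4     # assertion of an inference
    NEW = 5           # discourse-new proposition


def discourse_status(function: DiscourseFunction) -> str:
    """Collapse the five functions into the binary old/new distinction.

    Categories 1-4 are discourse-old; only category 5 is discourse-new.
    """
    if function is DiscourseFunction.NEW:
        return "discourse-new"
    return "discourse-old"


# Two of the illustrative tokens above, coded by function.
coded_tokens = {
    "You won't get nothing this month.": DiscourseFunction.DENIAL,
    "There's no discipline.": DiscourseFunction.NEW,
}

for token, function in coded_tokens.items():
    print(f"{token!r}: {function.name} -> {discourse_status(function)}")
```

Keeping the five-way code on each token and deriving the binary status from it means that both the coarse comparison and the finer comparison by function can be run on the same coding.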

6 Results of quantitative analysis

The relative frequency of not-negation, no-negation and negative concord differs significantly across the communities (χ² = 26.64; d.f. = 4; p < 0.001). As Figure 2 shows, no-negation is most strongly preferred in Tyneside, followed by Glasgow, then Salford. The opposite ranking of localities pertains with respect to their rates of not-negation.

Figure 2 

Overall distribution of variants per locality.

The higher the rate of not-negation, the higher the rate of negative concord. Conversely, the frequencies of no-negation and not-negation do not correlate in this way. These findings are more compatible with Account 2 (over Account 1), in which not-negation and negative concord have the same structure with syntactic negative-marking in NegP, while no-negation has negative-marking in the DP. Constraints on the variation, examined next, will reveal more about whether Account 1 or 2 is the better fit.

6.1 Verb type

The results for verb type in Figure 3 corroborate previous findings (Tottie 1991a; b; Varela Pérez 2014; Childs et al. 2015; under review; Wallage 2017). Existentials exhibit the highest frequency of no-negation; BE and HAVE tend to take no-negation; and lexical verbs tend to take not-negation.

Figure 3 

Distribution of variants according to verb type, per locality.

The fact that no-negation is dispreferred with lexical verbs is expected under both Account 1 and Account 2: lexical verbs reside between the negative operator and the indefinite in the structure, which can disrupt Agree (Account 1) or make movement more costly (Account 2). However, the behaviour of negative concord allows us to distinguish between the two accounts. Although there is some low-frequency use of negative concord with BE/HAVE in Glasgow, negative concord is nevertheless used more often with lexical as opposed to functional verbs in all three locales, just like not-negation – exactly as expected under Account 2. In the BNC, Wallage (2017: 142) similarly found no statistically significant difference in the distributions of not-negation and negative concord according to verb type. These tendencies would be unexpected under Account 1, which predicted that negative concord would behave like no-negation.

HAVE GOT has an uncertain syntactic status as a semi-grammaticalised form (Quinn 2000). If GOT in HAVE GOT is a main verb, one would expect no-negation to be disfavoured in this context under both Accounts 1 and 2. Contrary to expectations, Figure 3 shows that HAVE GOT tends to take no-negation, like HAVE. GOT in HAVE GOT therefore appears to be more transparent to the Agree relation (Account 1) or movement (Account 2) required for no-negation than ordinary lexical verbs are, perhaps because GOT is “semantically void” in this context (Berdan 1980: 388).

6.2 Complexity of the verb structure

The results for complexity of the verb structure corroborate previous observations that constructions with additional auxiliary verbs have a greater propensity to take not-negation (Tottie 1991b: 224; Varela Pérez 2014: 374; Burnett and Tagliamonte 2016a; b; Wallage 2017; Burnett et al. under review). Cross-tabulating complexity with verb type in Table 5 shows that existentials, HAVE GOT and BE rarely co-occur with auxiliaries in the envelope of variation.13 Thus, any effect of additional auxiliaries cannot be established for these verb types. The results for HAVE reveal a strong preference for no-negation when the verb is simple, but a preference for not-negation when there are additional auxiliaries. The results for DO and lexical verbs further corroborate this interpretation, since no-negation is more frequent in simple constructions compared to those with auxiliaries.

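| Verb type and structure | Not-negation % | Not-negation N | No-negation % | No-negation N | Negative concord % | Negative concord N | Total N |
|---|---|---|---|---|---|---|---|
| Existentials | | | | | | | |
| Simple verb | 4.2% | 6 | 95.8% | 138 | 0% | 0 | 144 |
| With non-modal auxiliary | | 0 | | 1 | | 0 | 1 |
| HAVE | | | | | | | |
| Simple verb | 20.8% | 11 | 77.4% | 41 | 1.9% | 1 | 53 |
| With non-modal auxiliary | (55.5%) | 5 | (44.4%) | 4 | (0%) | 0 | 9 |
| With modal/semi-modal | (75%) | 6 | (25%) | 2 | (0%) | 0 | 8 |
| HAVE GOT | | | | | | | |
| Simple verb14 | 15.3% | 11 | 79.2% | 57 | 5.6% | 4 | 72 |
| BE | | | | | | | |
| Simple verb | 19.2% | 5 | 76.9% | 20 | 3.8% | 1 | 26 |
| With non-modal auxiliary | | | | | | | |
| With modal/semi-modal | (50%) | 1 | (50%) | 1 | (0%) | 0 | 2 |
| DO | | | | | | | |
| Simple verb | 58.3% | 14 | 29.2% | 7 | 12.5% | 3 | 24 |
| With non-modal auxiliary | 66.7% | 8 | 8.3% | 1 | 25% | 3 | 12 |
| With modal/semi-modal | 80% | 8 | 0% | 0 | 20% | 2 | 10 |
| Lexical verbs | | | | | | | |
| Simple verb | 54.4% | 37 | 25% | 17 | 20.6% | 14 | 68 |
| With non-modal auxiliary | 75% | 21 | 17.9% | 5 | 7.1% | 2 | 28 |
| With modal/semi-modal | 75% | 30 | 0% | 0 | 25% | 10 | 40 |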

Table 5

Distribution of variants according to the complexity of the verb structure.
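The percentages in Table 5 can be recomputed from the raw counts. The minimal sketch below does so for the lexical-verb rows only; the counts are copied from Table 5, while the names `TABLE5_LEXICAL` and `row_percentages` are hypothetical and introduced purely for illustration.

```python
# Counts per variant (not-negation, no-negation, negative concord),
# copied from the "Lexical verbs" rows of Table 5.
VARIANTS = ("not-negation", "no-negation", "negative concord")
TABLE5_LEXICAL = {
    "Simple verb": (37, 17, 14),
    "With non-modal auxiliary": (21, 5, 2),
    "With modal/semi-modal": (30, 0, 10),
}


def row_percentages(counts):
    """Return each variant's share of the row total, rounded to one decimal."""
    total = sum(counts)
    return {
        variant: round(100 * n / total, 1)
        for variant, n in zip(VARIANTS, counts)
    }


for structure, counts in TABLE5_LEXICAL.items():
    print(f"{structure} (N = {sum(counts)}): {row_percentages(counts)}")
```

Read down the no-negation column, the recomputed shares fall as auxiliaries are added, which is the pattern described for lexical verbs in the text.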

These results for no-negation are consistent with the hypotheses generated from both Account 1 and Account 2. In constructions with auxiliary verbs, the main verb necessarily resides in VP and thus can disrupt Agree (Account 1) or constitute extra material that the DP-internal negation must raise over to reach NegP (Account 2). Under Account 1, since negative concord is derived by the same mechanism, the variant is hypothesised to be dispreferred in constructions with auxiliary verbs. In contrast, Account 2 does not make such a prediction. Negative concord is more frequent with DO when auxiliaries are present, but its frequency amongst other lexical verbs (with and without auxiliaries) is more varied. Data for negative concord is sparse across the cells in Table 5, and thus its distribution here does not conclusively support one account over the other.

6.3 Discourse status

Figure 4 shows that the distribution of variants differs significantly according to discourse status (χ² = 26.80; d.f. = 2; p < 0.001): the propensity to use no-negation is greater when introducing discourse-new information than in relation to a discourse-old proposition. In parallel, the relative frequency of not-negation is higher in discourse-old as opposed to discourse-new contexts. These results corroborate Wallage’s (2015; 2017) findings from both the Penn-Helsinki Parsed Corpus of Early Modern English (PPCEME) and the British National Corpus (BNC). The frequency of negative concord in Figure 4 is only slightly higher in discourse-old contexts.

Figure 4 

Distribution of variants according to the discourse status of the proposition.

Considering the functional specialism of no-negation to express discourse-new information and general linguistic tendencies for new information to be introduced in post-verbal position (Ward and Birner 2008), these findings are most compatible with Account 2 in which no-negation arises from the syntactic marking of negation in the post-verbal DP as opposed to a higher pre-verbal NegP.

As for the patterning of variants according to the five functions outlined in Section 5.3, Figure 5 reveals a significant difference in distribution (χ² = 22.59; d.f. = 6; p < 0.001): not-negation is most frequently used to negate a discourse-old proposition or inference that was positive, i.e. in explicit denials and to cancel inferences. When reiterating an originally-negative proposition or inference, i.e. in repetitions and assertions of inferences, no-negation is more likely. In the spoken BNC, Wallage (2015; 2017) finds that cancellations have the highest rate of not-negation and repetitions have the lowest, though the percentage distinctions between them are small, and pale in comparison to the overarching discourse-old versus discourse-new effect.15

Figure 5 

Distribution of variants in discourse-old contexts according to specific functions.

In Figure 5, negative concord behaves like not-negation in exhibiting the highest frequencies in denials, followed by cancellations, assertions and repetitions. This parallelism between not-negation and negative concord supports Account 2, in which not-negation and negative concord have the same structure while no-negation differs. Similar trends have been observed in the BNC, where the distributions of not-negation and negative concord according to discourse function were not statistically distinct (Wallage 2017: 142). Furthermore, these observations from Figure 5 suggest that not-negation and negative concord may be associated with marking focus, whereby negation c-commands the focused element (the verb) to indicate contrast with what was previously said or implied (see Jackendoff 1972; Wallage 2017: 141).16

As demonstrated in Section 6.1, verb type also affects the choice of variant, leading one to wonder whether these discourse status effects reflect semantic properties of the verbs. Cross-tabulating discourse status with verb type, as in Figure 6, shows that this is not the case: discourse status and verb type have independent effects. Tottie (1991b) observed a similar effect of discourse status on the choice of not-/no-negation in existentials, but my data shows that this holds for all six verb types. Within every verb category, no-negation is more frequent in expressing discourse-new information as opposed to denying or asserting a discourse-old proposition or inference, while the reverse is true for not-negation. The fact that existentials appear in both discourse-old and discourse-new environments in my data runs contrary to claims that existentials categorically introduce new information (Ward and Birner 2008: 164), but is in keeping with the idea that existentials introduce new referents, which are either completely new or already known but brought to speakers’ attention again (Cruschina 2011: 73). The five tokens of not-negation with existentials all occur in discourse-old contexts, as do the six tokens of not-negation with BE, reiterating the association between discourse-old environments and not-negation.

Figure 6 

Distribution of variants according to verb type and discourse status.

Wallage (2013: 6) notes that repetitions may tend to feature the same variant that was used to express the original proposition, which is true in my data. Among repetitions produced by speakers who used more than one variant, when the original expression of a proposition has not-negation or no-negation, the repetition of that proposition features the same variant over 70% of the time. Nevertheless, when “repetitions” are excluded from the discourse-old category, the overall trends in Figure 6 are maintained.

The discourse status effect holds cross-dialectally, with Glasgow, Tyneside and Salford all displaying higher rates of no-negation, and lower rates of not-negation, in discourse-new contexts than in discourse-old contexts, as Figure 7 shows. The data for negative concord becomes sparser when divided in this way, so one cannot draw firm conclusions about its distribution here.

Figure 7 

Distribution of variants according to discourse status, per locality.

As the results so far have demonstrated, several factors affect the choice of not-negation, no-negation and negative concord in Glasgow, Tyneside and Salford: verb type, complexity of the verb structure, and discourse status. The results for verb type and discourse status have more strongly supported the syntactic derivation of variants according to Account 2 over Account 1. Under Account 2, no-negation is derived via negative-marking within the post-verbal DP followed by movement to NegP to receive sentential scope, whereas in Account 1 it arises due to agreement between a covert negative operator in the sentential NegP and the indefinite. However, it is important to establish the relative impact of these factors when they are considered simultaneously, as pursued in Section 6.4.

6.4 Mixed-effects logistic regression

To further test the hypotheses from Accounts 1 and 2, I now conduct mixed-effects logistic regression using the lme4 package (Bates et al. 2015) in R (R Core Team 2014), analysing the following factors: verb type, discourse status, locality and speaker (random). Complexity of the verb structure is excluded because most tokens are with lexical verbs and running a regression with contexts where there is little to no variation would bias the model (Guy 1993: 239). For the same reason, the regression includes only speakers who were variable.17 Other excluded categories, with reasons in brackets, are as follows: existentials (near-categorical no-negation), BE (low frequency per locality) and repetitions (tend to take the same variant used in the expression of the original proposition – see Section 6.3).18 DO and lexical verbs are combined as “lexical verbs” since they patterned similarly in the distributional analysis.

212 tokens remain for the regression: 96 not-negation, 86 no-negation and 30 negative concord.19 The comparatively lower frequency of negative concord meant that when its frequency was cross-tabulated with the independent variables, there were some sparsely populated cells. However, it is desirable to compare the results of a model that has negative concord as the application value with the other two models, to be comprehensive. For these reasons, the regression model for negative concord is less complex than the other two, in that (i) HAVE and HAVE GOT, shown to pattern alike, are combined, and (ii) locality is no longer included as a factor, since differences in the frequency of negative concord per locality were negligible (see Table 6). These decisions enable the investigation of the linguistic factors (verb type and discourse status) and the random effect of speaker on the use of negative concord.

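| | Not-negation | No-negation | Negative concord |
|---|---|---|---|
| Total N | 212 | 212 | 212 |
| R² value20 | 0.456 | 0.536 | 0.038 |
| Speaker (random): standard deviation | 0.7913 | 0 | 1.4 |

Not-negation

| Factor | Level | Estimate | Std. error | p | Sig. | % | N |
|---|---|---|---|---|---|---|---|
| | (Intercept) | –1.2056 | 0.5169 | 0.0197 | * | | |
| Verb type | Ref.: HAVE | | | | | 13.5 | 52 |
| | HAVE GOT | –0.9461 | 0.5992 | 0.1144 | | 31.7 | 41 |
| | Lexical verbs | 1.5300 | 0.4700 | 0.0011 | ** | 63.9 | 119 |
| Discourse status | Ref.: Disc.-new | | | | | 31.1 | 106 |
| | Discourse-old | 1.0603 | 0.3621 | 0.0034 | ** | 59.4 | 106 |
| Locality | Ref.: Glasgow | | | | | 49.2 | 63 |
| | Tyneside | –0.9274 | 0.5165 | 0.0726 | . | 28.4 | 74 |
| | Salford | 0.2522 | 0.5074 | 0.6191 | | 58.7 | 75 |

No-negation

| Factor | Level | Estimate | Std. error | p | Sig. | % | N |
|---|---|---|---|---|---|---|---|
| | (Intercept) | 0.9401 | 0.4542 | 0.0385 | * | | |
| Verb type | Ref.: HAVE | | | | | 78.8 | 52 |
| | HAVE GOT | 0.3496 | 0.5104 | 0.4934 | | 65.9 | 41 |
| | Lexical verbs | –2.7084 | 0.4767 | 1.33e–08 | *** | 15.1 | 119 |
| Discourse status | Ref.: Disc.-new | | | | | 55.7 | 106 |
| | Discourse-old | –1.3087 | 0.3858 | 6.93e–04 | *** | 25.5 | 106 |
| Locality | Ref.: Glasgow | | | | | 34.9 | 63 |
| | Tyneside | 1.5184 | 0.4869 | 0.0018 | ** | 59.5 | 74 |
| | Salford | –0.0830 | 0.4828 | 0.8635 | | 26.7 | 75 |

Negative concord

| Factor | Level | Estimate | Std. error | p | Sig. | % | N |
|---|---|---|---|---|---|---|---|
| | (Intercept) | –3.5669 | 0.6946 | 2.82e–07 | *** | | |
| Verb type | Ref.: HAVE (GOT) | | | | | 5.4 | 93 |
| | Lexical verbs | 1.6549 | 0.5727 | 0.0039 | ** | 21 | 119 |
| Discourse status | Ref.: Disc.-new | | | | | 13.2 | 106 |
| | Discourse-old | –0.1796 | 0.4913 | 0.7143 | | 15.1 | 106 |
| Locality | Ref.: Glasgow | | | | | 15.9 | 63 |
| | Tyneside | N/A | | | | 12.2 | 74 |
| | Salford | N/A | | | | 14.7 | 75 |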

Table 6

Three mixed-effects logistic regression analyses of factors contributing to the choice of not-negation, no-negation and negative concord.

Table 6 shows the results of the three mixed-effects logistic regression analyses to establish the influence of factors on the choice of (i) not-negation over no-negation and negative concord; (ii) no-negation over the other two variants; (iii) negative concord over the other two variants.
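Because the estimates in Table 6 are on the log-odds scale, it can help to convert them into predicted probabilities. The sketch below does this for the not-negation and no-negation models, using fixed effects only: it assumes a speaker whose random intercept is zero and holds locality at its reference level (Glasgow). The estimates are copied from Table 6; the function and dictionary names are hypothetical, and this is an illustration of how to read the coefficients rather than a re-run of the lme4 models.

```python
import math


def inv_logit(log_odds: float) -> float:
    """Convert a log-odds value into a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Fixed-effect estimates (log-odds) copied from Table 6.
NOT_NEGATION = {"intercept": -1.2056, "lexical": 1.5300, "discourse_old": 1.0603}
NO_NEGATION = {"intercept": 0.9401, "lexical": -2.7084, "discourse_old": -1.3087}


def predicted_probability(model, lexical=False, discourse_old=False):
    """Fixed-effects-only prediction for a Glasgow speaker.

    Assumption: the speaker's random intercept is zero. The reference
    levels are HAVE, discourse-new and Glasgow, so with both flags False
    the prediction is simply the inverse logit of the intercept.
    """
    log_odds = model["intercept"]
    if lexical:
        log_odds += model["lexical"]
    if discourse_old:
        log_odds += model["discourse_old"]
    return inv_logit(log_odds)


for label, model in (("not-negation", NOT_NEGATION), ("no-negation", NO_NEGATION)):
    for lexical in (False, True):
        for old in (False, True):
            p = predicted_probability(model, lexical=lexical, discourse_old=old)
            verb = "lexical" if lexical else "HAVE"
            status = "discourse-old" if old else "discourse-new"
            print(f"{label:13s} {verb:8s} {status:14s} {p:.3f}")
```

The two models are separate binary regressions (each variant against the other two combined), so the predicted probabilities for a given context need not sum to one across models.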

As Table 6 shows, verb type has the largest impact on the variation between not-negation, no-negation and negative concord. In all three models, there is a significant distinction between lexical and other verb types. Verb type also exhibits the largest range between the estimates for each level of any factor in the model. The prediction from Account 1 was that lexical verbs would disfavour no-negation and negative concord, since the position of the verb (in the VP) would interfere with the Agree relation required for those two variants. The results from Table 6 contradict this hypothesis. Although not-negation is favoured and no-negation is disfavoured with lexical verbs as expected, contrary to expectations we see that negative concord is favoured with lexical verbs. Under Account 2, no-negation is expected to be disfavoured with lexical verbs because these verbs add to the cost of moving negation out of the object DP to NegP, while negative concord involves Agree and not-negation does not involve agreement or movement of the negator. Account 2 makes no such prediction about the similarity between no-negation and negative concord. The results in Table 6 are therefore more compatible with Account 2 than Account 1. The fact that both not-negation and negative concord involve the marking of syntactic negation in the sentential NegP likely explains their similar distribution here. Furthermore, this is consistent with Wallage’s (2017: 198) conclusion regarding the historical relationship between the two: “[n]egative doubling with not is the antecedent of PDE [Present-Day English] not-negation”. In contrast, he argues that negative spread, i.e. concord between two or more n-words, is the antecedent of no-negation (Wallage 2017: 198).

HAVE and HAVE GOT, as shown in the two models where they could be included separately, are not statistically distinguished. The tendency for HAVE GOT to occur with no-negation is contrary to the predictions of both Accounts 1 and 2 if we assume that HAVE is an auxiliary and GOT is a main verb. As mentioned earlier, this may reflect the unusual status of HAVE GOT as a semi-grammaticalised functional verb (Quinn 2000).

Discourse status patterns in complementary distribution between the first two runs: no-negation is significantly favoured in discourse-new contexts while not-negation is significantly favoured in discourse-old contexts. As already noted, the propensity for no-negation to mark discourse-new information is consistent with Account 2, according to which its distribution reflects a general tendency for new information to be introduced post-verbally (Ward and Birner 2008). As for negative concord, there is no significant difference in its frequency between discourse-old and discourse-new contexts.
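As a rough cross-check on the discourse status effect within the regression subset, the cell counts can be recovered from the percentages and Ns in Table 6 and submitted to a simple Pearson chi-square test. The sketch below is hedged in three ways: the counts are reconstructed by rounding the reported percentages, so they are assumed rather than taken from the raw data; the test ignores the random effect of speaker and the other predictors, so it is no substitute for the mixed-effects models; and it concerns the 212-token regression subset, so its statistic is not expected to match the one reported for Figure 4. The function names are hypothetical.

```python
# Percentages per variant and tokens per context, copied from Table 6.
PERCENTAGES = {
    "discourse-new": {"not": 31.1, "no": 55.7, "concord": 13.2},
    "discourse-old": {"not": 59.4, "no": 25.5, "concord": 15.1},
}
N_PER_CONTEXT = 106


def reconstruct_counts(percentages, n):
    """Recover approximate cell counts by rounding percentage * n / 100."""
    return {variant: round(pct * n / 100) for variant, pct in percentages.items()}


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for a table given as rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


counts = {ctx: reconstruct_counts(p, N_PER_CONTEXT) for ctx, p in PERCENTAGES.items()}
table = [list(counts[ctx].values()) for ctx in ("discourse-new", "discourse-old")]
column_totals = [sum(col) for col in zip(*table)]
statistic, dof = pearson_chi_square(table)

print("Reconstructed counts:", counts)
print("Column totals (not, no, concord):", column_totals)
print(f"chi-square = {statistic:.2f}, d.f. = {dof}")
```

The reconstructed column totals can be checked against the token counts reported for the regression (96 not-negation, 86 no-negation, 30 negative concord), which gives some reassurance that the rounding has recovered the original cells.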

There is also a significant effect of locality in the two models where this could be tested. Tyneside is statistically distinct from the other two communities in terms of its frequency of no-negation, but not with respect to not-negation. Glasgow and Salford are not statistically distinguished in the results of either run. These findings coincide with expectations if no-negation differs structurally from not-negation and negative concord combined, lending additional support to Account 2 over Account 1. Further evidence for this interpretation is that the no-negation run generated stronger levels of significance for all three fixed factors than the not-negation run, i.e. there is a greater statistical differentiation between no and the other variants than between not and the other variants, which reflects a structural difference.

7 Conclusion

This paper set out to integrate formal syntactic theory into a comparative variationist analysis of not-negation, no-negation and negative concord in Glasgow, Tyneside and Salford English. The investigation tested two accounts of the structure and derivation of the variants to assess which best captures the constraints on negation with indefinites as used in speech. Account 1 extended Zeijlstra’s (2004) Agree theory of negative concord to apply to all three variants such that (i) not-negation contains a negative marker in NegP with [iNEG]; (ii) no-negation arises due to Agree between a covert negative operator in NegP that has [iNEG] and a post-verbal indefinite DP with [uNEG]; and (iii) negative concord is the result of Agree between the negative marker with [iNEG] and indefinite DPs with [uNEG]. Under Account 2, not-negation and negative concord are derived in the same way as in Account 1, but no-negation is instead the result of negative-marking within the DP which subsequently moves to the higher NegP for sentential scope (based on Kayne 1998; Svenonius 2002; Zeijlstra 2011).

Several variationist sociolinguistic studies have analysed not-negation and no-negation as a binary variable (Tottie 1991a; b; Harvey 2013; Varela Pérez 2014; Burnett and Tagliamonte 2016a; b; Burnett et al. under review). These studies found that BE/HAVE tend to take no-negation and lexical verbs tend to take not, which Tottie (1991a; b) and Varela Pérez (2014) attribute to the higher frequency of BE/HAVE making these verbs resistant to change and thus conserving no as the older variant. Others suggested that this effect could arise due to structural adjacency between negation and the indefinite (Harvey 2013; Burnett and Tagliamonte 2016a; b; Burnett et al. under review). Accounts 1 and 2 can capture these latter observations, though any consideration of only not-/no-negation could not provide evidence in favour of Account 1 or 2 over the other, since they make the same predictions about the behaviour of these two variants. Crucially, my inclusion of negative concord as a third variant, on syntactic and semantic grounds, enabled direct testing of the two theories. Under Account 1, because the main verb resides between NegP and the indefinite item, it may interfere in the Agree relation required for no-negation and negative concord, given that more complex structures and additional material between operators and targets promote non-agreement more generally (Pietsch 2005: 129; Corbett 2006: 235–236; Buchstaller et al. 2013; Childs 2013). As such, it was hypothesised that no-negation and negative concord would be disfavoured with lexical verbs and constructions with auxiliaries. In the same contexts, Account 2 predicts that only no-negation would be disfavoured, because only the DP-internal no-negation must move over the intervening verb to NegP to receive sentential scope – indeed this is what was found.

The quantitative analyses of spoken corpus data in this paper demonstrated that not-negation and negative concord behave alike with respect to frequency (the higher the rate of not, the higher the rate of concord) and verb type (both are favoured with lexical as opposed to functional verbs). These lines of evidence contradict Account 1 and more strongly support Account 2 of the variation, in which English n-words are marked syntactically for negation DP-internally in cases of no-negation (as well as in pre-verbal position and fragment answers), but not in negative concord (see also Tubau 2016). An additional, independent effect relates to discourse status. While not-negation is favoured with negative expressions relating to a discourse-old proposition, no-negation is favoured when contributing discourse-new information (see also Wallage 2013; 2015; 2017). This effect might at first seem to be outside the syntax, but it is actually consistent with Account 2 in which no-negation is the only variant that is marked within the post-verbal DP. The post-verbal position, where no-negation is marked, is indeed associated with the introduction of new information to the discourse more generally (Ward and Birner 2008). Discourse status was not significant overall for negative concord in the regression analysis, though the distributional analysis showed that both not-negation and negative concord tend to be used for the same sub-functions when ranked, revealing further similarities in their distribution.

As this paper has demonstrated, integrating formal syntactic theory into a quantitative variationist analysis of morpho-syntactic variation in speech assists in defining the confines of the variable and its context of application, deciding what to extract and exclude from corpus data, and formulating and testing hypotheses to evaluate different theoretical accounts of the variability. Probabilistic data from language production has proven to be a rich testing ground for establishing the robustness of competing syntactic theories, both within and across language varieties, providing new insights into the structural relationships between different variants.