1 Introduction

Grammont’s influential Law of Three Consonants (LTC) states that French schwa is obligatorily pronounced in any CC_C sequence (where C stands for any consonant) to avoid three-consonant clusters (Grammont 1914). In later works, schwa presence has been shown to be sensitive to the nature of the consonants involved in the CC_C sequence, at least at the word and phrase levels. However, Grammont’s original formulation of the LTC is still generally considered to describe schwa-zero alternations accurately at the stem level (Dell 1978; Côté 2001). In particular, schwa is generally considered to be obligatorily pronounced in any CC_C sequence within monomorphemic words (e.g. [ʁɡ_ʁ] in marguerite [maʁɡǝʁit] ‘daisy’) or at the boundary between a stem and a derivational suffix (e.g. [ʁd_ʁ] in garde-rie [ɡaʁd-ə-ʁi] ‘Kindergarten’), regardless of the nature of the consonants involved.

The goal of this paper is to test whether the Law of Three Consonants should be relaxed even at the stem level, following Scheer’s (1999) insight that at least some speakers might allow for differential treatments of CC_C sequences in this context. This question is not only relevant to French phonology but also more widely to phonological theory. In phonetically based theories of phonology, phonotactic asymmetries ultimately reflect perceptual or articulatory asymmetries (Ohala 1990; 1992; Steriade 1997; Flemming 2002; Storme 2019). Phonetic explanations have been put forth to account for the role of the consonantal context in French schwa-zero alternations in particular (Côté 2001). Under the default assumption that segmental properties are largely independent from their morphosyntactic context, these phonetically based analyses predict that the same phonotactic asymmetries should be observed across morphosyntactic domains. If French schwa-zero alternations are sensitive to the nature of surrounding consonants at word and phrase levels but never at the stem level, this is potentially problematic for the hypothesis that phonotactic restrictions are phonetically grounded.

The paper presents two studies using judgment data to test whether only cluster size matters at the stem level or whether the segmental make-up of the cluster is also relevant, as reported for word and phrase levels. The two studies focus on the behavior of schwa-zero alternations at the boundary between stems and derivational suffixes (stem-level phonology) and at the boundary between stems and inflectional suffixes (word-level phonology). Each study controls for different effects that could influence schwa-zero alternations beyond the properties of consonant clusters. Study 1 controls for dialectal effects, comparing data from a variety that is more prone to delete schwa (Swiss French) and a variety that is less prone to delete schwa (French from France). Study 2 controls for the effect of stem length, comparing schwa realization in monosyllabic stems and in disyllabic stems. The data and code are available on OSF (https://osf.io/5hvxs/).

Section 2 provides some background on schwa-zero alternations in French, with a special focus on the Law of Three Consonants, and introduces the hypotheses to be tested in the paper. Section 3 presents Study 1. Section 4 presents Study 2. Section 5 implements the two competing analyses of stem-level and word-level phonologies (same vs. different phonotactic constraints across levels) as probabilistic constraint-based grammars and compares their fit to the judgment data collected in Study 1 and Study 2, thus providing a theoretically motivated and quantitative assessment of the two analyses. A constraint-based analysis is used because, as noted by Durand & Laks (2000: 32), constraints provide a very intuitive interpretation of the Law of Three Consonants as caused by a general markedness constraint *CCC banning three-consonant clusters. In addition, several recent theoretical papers have modeled schwa-zero alternations using probabilistic constraint-based grammars (Bayles et al. 2016; Smith & Pater 2020).

2 Background and hypotheses

2.1 The Law of Three Consonants

In French, some morphemes alternate between a form with schwa and a form without schwa. For instance, the noun demande ‘request’ can be realized with a schwa as [dəmãd] or without schwa as [dmãd]. Determining the factors that condition the distribution of schwa-zero alternations has been a central topic in French phonology for more than a century. Table 1 provides a non-exhaustive list of the variables that have been reported to play a role in this alternation along with a non-exhaustive list of the sources that document these effects. This table builds largely but not exclusively on Bürki et al. (2011: 3982–3985).

Table 1

A non-exhaustive list of variables reported to condition schwa-zero alternations in French.

| Variable type | Variable | Source |
|---|---|---|
| Segmental variables | Number of surrounding consonants | Bürki et al. (2011) |
| | Nature of surrounding consonants | Côté (2001); Bürki et al. (2011) |
| Morphological variables | Grammatical function of following suffix | Dell (1978); Côté (2001) |
| Prosodic variables | Position in word | Bürki et al. (2011) |
| | Position with respect to stress | Smith & Pater (2020) |
| | Position with respect to prosodic boundaries | Dell (1977); Côté (2001) |
| | Word position in utterance | Bürki et al. (2011) |
| | Size of prosodic constituent | Côté (2007) |
| | Speech rate | Malécot (1976); Bürki et al. (2011) |
| Lexical variables | Word frequency | Eychenne (2019) |
| | Word length | Léon (1971) |
| | Word identity | Bürki et al. (2011) |
| Speaker variables | French variety | Gess et al. (2012) |
| | Speaker identity | Bürki et al. (2011) |

Among these variables, the consonantal context, and in particular the number of consonants surrounding schwa, received particular attention early on. In his influential treatise on French pronunciation, Grammont (1914: 115–116) states that a preconsonantal schwa is obligatory when preceded by two consonants (CC_C), as illustrated in (1a), but excluded when preceded by a single consonant (C_C), as illustrated in (1b). He calls this generalization the ‘loi des trois consonnes’ (Law of Three Consonants; LTC) and explains it as a strategy to avoid three-consonant clusters. In (1a), the schwa form is preferred because it makes it possible to avoid the three-consonant cluster [tdm]. In (1b), the schwa-less form is preferred in the absence of three-consonant clusters.

(1) Grammont’s Law of Three Consonants (LTC)
  a. Schwa is obligatory in CC_C
    C#C_C [tdm] sept demandes [sɛtdəmãd] ‘seven requests’
  b. Schwa is excluded in C_C
    C_C [dm] la demande [ladmãd] ‘the request’

2.2 Not only size matters, at least at the word and phrase levels

Subsequent works on French schwa have provided a more nuanced view of the LTC. First, the LTC has been found to hold as a gradient rather than a categorical generalization: schwa is not obligatory in CC_C and excluded in C_C but more likely in CC_C overall than in C_C (Bürki et al. 2011; Racine & Andreassen 2012; Côté 2012; Hambye & Simon 2012; Hansen 2012). Second, not only the number but also the nature and order of surrounding consonants have been found to be relevant. C_C sequences with increasing sonority favor the schwa-less form (Bürki et al. 2011). In CC_C, schwa is more likely to be pronounced if the middle consonant is a stop than if it is a fricative (see Côté 2001: 119 and earlier references therein), as illustrated in (2). Also, schwa is more likely to be pronounced if its absence implies that an obstruent-liquid cluster (OL) is not directly followed by a vowel (Dell 1976; 1985; Côté 2001), as illustrated in (3).

(2) CS_C and CF_C sequences behave differently (S=stop, F=fricative) (phrase level)
  a. Schwa is more likely in C#S_C
    C#S_C [tdm] sept demandes [sɛt#d(ə)mãd] ‘seven requests’
  b. Schwa is less likely in C#F_C
    C#F_C [tfn] sept fenêtres [sɛt#f(ə)nɛtʁ] ‘seven windows’
(3) OL_C and CO_L sequences behave differently (O=obstruent, L=liquid) (phrase level)
  a. Schwa is more likely in O#L_C
    O#L_C [kls] chaque leçon [ʃak#l(ə)sɔ̃] ‘each lesson’
  b. Schwa is less likely in C#O_L
    C#O_L [spl] douce pelouse [dus#p(ə)luz] ‘sweet lawn’

This nuanced view of the LTC has been argued to be relevant at the phrase level (e.g. when the three-consonant cluster spans a word boundary, as illustrated in (2) and (3)) and at the word level (e.g. when the three-consonant cluster spans a boundary between a stem and an inflectional suffix). This latter case is illustrated in (4) with the inflectional future/conditional suffix -r- [ʁ]. At the boundary between a stem and this inflectional suffix, schwa is more likely to be pronounced if its absence implies that an obstruent-liquid cluster (OL) is not directly followed by a vowel (Côté 2001: 85). The situation is analogous to what happens at the phrase level (see (3)).

(4) OL_C and CO_L sequences behave differently (O=obstruent, L=liquid) (word level)
  a. Schwa is more likely in OL_-Cinflection
    OL_-Cinflection [ɡlʁ] règlera [ʁɛɡl(ə)-ʁa] ‘adjust-fut.3sg’
  b. Schwa is less likely in CO_-Linflection
    CO_-Linflection [ʁdʁ] gardera [ɡaʁd(ə)-ʁa] ‘keep-fut.3sg’

However, at the stem level, the LTC is still generally considered to hold as a categorical generalization in line with Grammont’s strict interpretation in (1). Three-consonant sequences (CC_C) at the stem level include sequences that are contained within a single morpheme (e.g. [ʁɡ_ʁ] in marguerite ‘daisy’) or span a boundary between a stem and a derivational suffix (e.g. [ʁd_-ʁderivational] in garde-rie ‘Kindergarten’). In this context, schwa is generally reported to be categorically pronounced, regardless of the nature and order of surrounding consonants (Dell 1978; Côté 2001: 85, 109; Côté 2012: 258–259; but see Scheer 1999: 90, discussed further in section 2.4). For instance, schwa is reported by Dell and Côté to be obligatory in both OL_C and CO_L at the boundary between stems and derivational suffixes, as illustrated in (5a) and (5b). This contrasts with the differential treatment of OL_C and CO_L at the phrase and word levels illustrated in (3) and (4).

(5) OL_C and CO_L clusters are reported to behave identically at the stem level
  a. Schwa is reported to be obligatory in OL_-Cderivation
    OL_-Cderivation [ɡlm] règlement [ʁɛɡlə-mã] ‘regulation’
  b. Schwa is reported to be obligatory in CO_-Lderivation
    CO_-Lderivation [ʁdʁ] garderie [ɡaʁdə-ʁi] ‘Kindergarten’

2.3 The hypothesis of strongly distinct phonologies across domains

The common view according to which the nature of consonants involved in CC_C sequences is relevant for schwa-zero alternations in words and phrases but not in stems implies that the phonological grammar may differ quite substantially across morphosyntactic domains. At the word and phrase levels, a set of phonotactic constraints referencing different types of three-consonant clusters (e.g. *OLC, *COL, etc.) would be active, resulting in different patterns of schwa-zero alternations for different types of three-consonant clusters, as illustrated in (2), (3), and (4). At the stem level, only a single phonotactic constraint banning three-consonant clusters would be active (*CCC),1 resulting in a single pattern of schwa-zero alternations for all CC_C sequences, as illustrated in (5).

This paper will test the hypothesis of strongly distinct phonologies across morphosyntactic domains for French by focusing on two specific three-consonant sequences (obstruent-liquid-consonant sequences and liquid-obstruent-liquid sequences) and two specific morphosyntactic contexts (derivation and inflection). Obstruent-liquid-consonant sequences (OL_C) and liquid-obstruent-liquid sequences (LO_L) were chosen because they behave differently in some morphosyntactic domains (see (3) and (4)). Derivational suffixes (as illustrated in (5)) and inflectional suffixes (as illustrated in (4)) were chosen as examples of stem-level and word-level domains, respectively, because both involve suffixation and therefore form a minimal pair for the stem-level vs. word-level distinction. Derivational suffixes attach to stems to form complex stems, as illustrated in (6a). Inflectional suffixes attach to stems to form words, as illustrated in (6b) (Dell 1978: 7–8).

(6) a. Derivation as a stem-formation process: [[ɡarde]stem-rie]stem ‘Kindergarten’
  b. Inflection as a word-formation process: [[ɡarde]stem-ra]word ‘keep-fut.3sg’

The predictions of the hypothesis of strongly distinct phonologies across domains are summarized in (7) for the two relevant consonant sequences (OL_C and LO_L) and the two relevant morphological contexts (derivation and inflection). The predictions are summarized in (7a) at the level of the data (probability distribution of schwa-zero alternations) and in (7b) at the level of the grammar (constraint set in each stratum).

(7) The hypothesis of strongly distinct phonologies across stem and word levels
  a. Data: schwa-zero alternations are sensitive to the nature of consonants at the word level (inflection) but not at the stem level (derivation)
    P(ə|OL_-Cinflection) ≠ P(ə|LO_-Linflection)
    P(ə|OL_-Cderivation) = P(ə|LO_-Lderivation)
  b. Grammar: phonotactic constraints differ at the stem level (derivation) and at the word level (inflection)
    {*OLCinflection, *LOLinflection}
    {*CCCderivation}

In (7a), P(ə|context) refers to the conditional probability of schwa presence given a particular morphophonological context. According to the hypothesis of strongly distinct phonologies across stem and word levels, this probability varies depending on the nature of the consonants involved in the CC_C sequence at the word level (inflection) but it is constant for all CC_C sequences at the stem level (derivation).

The main features of a phonological grammar that can derive these asymmetries are summarized in (7b). Phonotactic constraints are indexed to specific morphosyntactic domains. Constraint indexation is one of the ways to derive morpheme-specific phonology (Pater 2007). At the stem level, a single *CCCderivation constraint penalizes any three-consonant cluster that spans a boundary between a stem and a derivational suffix. Hence it penalizes OLC clusters (e.g. [ɡlm] in règle-ment [ʁɛɡl-mã] ‘regulation’ in (5a)) and LOL clusters (e.g. [ʁdʁ] in garde-rie [ɡaʁd-ʁi] ‘Kindergarten’ in (5b)) equally. At the word level, different consonant clusters are penalized by different constraints. *OLCinflection penalizes an obstruent-liquid-consonant cluster that spans a boundary between a stem and an inflectional suffix (e.g. [ɡlʁ] in règle-ra [ʁɛɡl-ʁa] ‘adjust-fut.3sg’; see (4a)). *LOLinflection penalizes a liquid-obstruent-liquid cluster that spans a boundary between a stem and an inflectional suffix (e.g. [ʁdʁ] in garde-ra [ɡaʁd-ʁa] ‘keep-fut.3sg’; see (4b)).

Table 2 shows how a grammar with the properties in (7b) can in principle derive the type of probability distribution hypothesized in (7a) for schwa-zero alternations. Each of the four relevant morphophonological contexts is shown in the first column. The second column shows the schwa and schwa-less variants for each context. Columns 3 to 6 show how each variant is evaluated by the four constraints in the analysis (1 indicates that the candidate in the corresponding row violates the constraint in the corresponding column). In addition to the three constraints in (7b), a faithfulness constraint Dep(ə) penalizing the epenthesis of schwa was added. This analysis assumes that the schwa-less variant is the underlying form and the schwa variant is derived through epenthesis. This is the classic analysis of French schwa at morpheme boundaries (Dell 1985).2 Weights (w) are used to represent the relative importance of constraints. The constraint weights in Table 2 were chosen for illustration purposes only. The harmony of a candidate (column 7) corresponds to the weighted sum of its constraint violations, as in Harmonic Grammar (Smolensky & Legendre 2006). Probabilities (last column) are calculated from those harmonies using the MaxEnt framework (Goldwater & Johnson 2003; Hayes & Wilson 2008).

Table 2

Illustration of the analysis assuming strongly distinct phonologies at the stem level (derivation) and at the word level (inflection). O = obstruent, L = liquid, C = consonant.

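| Inputs | Outputs | *CCCderivation (w = 7) | *OLCinflection (w = 2) | *LOLinflection (w = 1) | Dep(ə) (w = 1) | Harmony | Prob. |
|---|---|---|---|---|---|---|---|
| OL-Cderiv | OLəC | | | | 1 | 1 | 1.00 |
| | OLC | 1 | | | | 7 | 0 |
| LO-Lderiv | LOəL | | | | 1 | 1 | 1.00 |
| | LOL | 1 | | | | 7 | 0 |
| OL-Cinfl | OLəC | | | | 1 | 1 | 0.73 |
| | OLC | | 1 | | | 2 | 0.27 |
| LO-Linfl | LOəL | | | | 1 | 1 | 0.50 |
| | LOL | | | 1 | | 1 | 0.50 |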

The probabilities in Table 2 follow the predictions in (7a): the probability of using the schwa form (in bold characters) is constant at the stem level (derivation) regardless of the consonants involved in the CC_C sequence whereas this probability varies depending on the nature of the consonants involved at the word level (inflection). Moreover, if the weight of the constraint banning three-consonant clusters at the stem level (derivation) is sufficiently high relative to the weight of the faithfulness constraint penalizing schwa (here the ratio of the corresponding weights is 7/1), then it is possible to derive a near categorical behavior for schwa-zero alternations in this context, with schwa being nearly obligatory (the probability is equal to 1.00 due to rounding). More variability can be derived at the word level (inflection) if the weights for the relevant constraints are not as far apart (here the ratios for the corresponding weights are 2/1 and 1/1).
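The computation behind Table 2 can be sketched in a few lines of Python. This is a minimal illustration, not the paper's own code (the analyses on OSF are in R): the function name `maxent_probs`, the dictionary layout, and the ASCII constraint labels are assumptions introduced here. Harmony is the weighted sum of violations, and each candidate's probability is proportional to exp(-harmony) within its input.

```python
import math

def maxent_probs(candidates, weights):
    """Return MaxEnt probabilities for the candidates of one input.

    candidates maps each output form to {constraint: violation count}.
    Harmony is the weighted sum of violations; P is proportional to exp(-harmony).
    """
    harmonies = {
        form: sum(weights[c] * n for c, n in violations.items())
        for form, violations in candidates.items()
    }
    z = sum(math.exp(-h) for h in harmonies.values())
    return {form: math.exp(-h) / z for form, h in harmonies.items()}

# Illustrative weights from Table 2 (labels are ASCII stand-ins, an assumption).
WEIGHTS_T2 = {"*CCC_der": 7, "*OLC_inf": 2, "*LOL_inf": 1, "Dep": 1}

# One tableau per morphophonological context, with the violations of Table 2.
TABLEAUX_T2 = {
    "OL-C deriv": {"OLəC": {"Dep": 1}, "OLC": {"*CCC_der": 1}},
    "LO-L deriv": {"LOəL": {"Dep": 1}, "LOL": {"*CCC_der": 1}},
    "OL-C infl": {"OLəC": {"Dep": 1}, "OLC": {"*OLC_inf": 1}},
    "LO-L infl": {"LOəL": {"Dep": 1}, "LOL": {"*LOL_inf": 1}},
}

def table2():
    """Compute the probability column of Table 2 for every context."""
    return {ctx: maxent_probs(cands, WEIGHTS_T2) for ctx, cands in TABLEAUX_T2.items()}

if __name__ == "__main__":
    for ctx, probs in table2().items():
        print(ctx, {form: round(p, 2) for form, p in probs.items()})
```

With these weights, the schwa form is near-categorical in both derivational contexts and variable in the two inflectional contexts, matching the rounded values in the table.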

The hypothesis of strongly distinct phonologies across stem and word levels seems to be generally assumed in the literature, at least by Dell (1985) and Côté (2001). One exception is Scheer (1999: 90), who reports differential treatments for three-consonant clusters even in stems for some speakers. However, he does not provide direct empirical evidence for this claim. To the author’s knowledge, the only study that has tested the Law of Three Consonants in stems is Côté (2012): in a corpus study of Quebec French, she found no exception to the LTC under its categorical version (Côté 2012: 258). However, as will be further discussed in section 3.1.1, the corpus used in this study is probably too small to draw strong conclusions regarding the categorical nature of the LTC.

2.4 The hypothesis of weakly distinct phonologies across domains

Alternatively, schwa-zero alternations could be gradient and sensitive to the nature of consonants across levels but closer to categorical at the stem level, at least for some speakers (e.g. Scheer 1999). According to this view, the same phonotactic constraints against various types of three-consonant clusters would be active across levels but their effects would be potentially more subtle at the stem level. The predictions of this hypothesis are summarized in (8), with (8a) focusing on the predictions at the level of the data and (8b) on the predictions at the level of the grammar. The grammar in (8b) uses the same phonotactic constraints at the stem and word levels (contrary to the grammar in (7b)). This means that the probability of the schwa form can vary by consonant cluster both at the stem and word levels, as shown in (8a). But it does not need to vary at both levels if the relevant constraints have the same weights for some speakers.

(8) The hypothesis of weakly distinct phonologies across word and stem levels
  a. Data: schwa-zero alternations may be sensitive to the nature of consonants both at the word level (inflection) and stem level (derivation)
    P(ə|OL_-Cinflection) ≠ P(ə|LO_-Linflection)
    P(ə|OL_-Cderivation) ≠ P(ə|LO_-Lderivation) for some speakers
  b. Grammar: phonotactic constraints are the same at the word level (inflection) and stem level (derivation)
    {*OLCderivation, *LOLderivation}
    {*OLCinflection, *LOLinflection}

Table 3 shows how a grammar with the properties in (8b) and an additional Dep(ə) constraint can predict differential treatments of different CC_C sequences at both word and stem levels, as hypothesized in (8a).

Table 3

Illustration of the analysis assuming weakly distinct phonologies at the stem level (derivation) and at the word level (inflection). O = obstruent, L = liquid, C = consonant.

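| Inputs | Outputs | *OLCder (w = 5) | *LOLder (w = 4) | *OLCinf (w = 2) | *LOLinf (w = 1) | Dep(ə) (w = 1) | Harmony | Prob. |
|---|---|---|---|---|---|---|---|---|
| OL-Cderiv | OLəC | | | | | 1 | 1 | 0.98 |
| | OLC | 1 | | | | | 5 | 0.02 |
| LO-Lderiv | LOəL | | | | | 1 | 1 | 0.95 |
| | LOL | | 1 | | | | 4 | 0.05 |
| OL-Cinfl | OLəC | | | | | 1 | 1 | 0.73 |
| | OLC | | | 1 | | | 2 | 0.27 |
| LO-Linfl | LOəL | | | | | 1 | 1 | 0.50 |
| | LOL | | | | 1 | | 1 | 0.50 |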

The probabilities in Table 3 follow the predictions in (8a): the probability of using the schwa form (in bold characters) varies both at the stem level (derivation) and at the word level (inflection). Moreover, if the weights of the constraints banning OLC clusters (*OLC) and LOL clusters (*LOL) at the stem level are sufficiently high relative to the weight of the faithfulness constraint penalizing schwa (here the ratios of the corresponding weights are 5/1 and 4/1, respectively), then it is possible to derive near categorical behaviors for schwa-zero alternations in these contexts, with schwa being nearly obligatory (probabilities are equal to 0.98 and 0.95, respectively). However, the two morphophonological contexts (OL-Cderivation, LO-Lderivation) are still predicted to behave slightly differently with respect to schwa-zero alternations (the probabilities of the schwa form are distinct although close; 0.98 vs. 0.95), due to the weights of the two relevant phonotactic constraints being different (5 vs. 4). Moreover, the effects of phonotactic constraints are predicted to be more salient at the word level (inflection) if their weights are not as large relative to the weight of the faithfulness constraint. Here the ratios for the corresponding weights are 2/1 and 1/1, to be compared to the larger ratios at the stem level (5/1 and 4/1).

A grammar with the property in (8b) can also derive uniform and near-categorical treatments for OL_C and LO_L sequences at the stem level if the two relevant constraints (*OLCderivation and *LOLderivation) have the same weights and these weights are high (the resulting grammar then behaves like the grammar in Table 2).
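The two scenarios just described can be checked numerically. The sketch below is an illustration under stated assumptions: `p_schwa` is a hypothetical helper for the two-candidate case used in Table 3 (the schwa variant violates only Dep(ə), the schwa-less variant violates one markedness constraint), and the equal stem-level weight of 7 is an assumed value, not one reported in the paper.

```python
import math

def p_schwa(w_markedness, w_dep=1.0):
    """MaxEnt probability of the schwa variant in a two-candidate tableau.

    The schwa variant violates Dep(ə) once; the schwa-less variant violates
    a single markedness constraint with weight w_markedness.
    """
    schwa = math.exp(-w_dep)
    no_schwa = math.exp(-w_markedness)
    return schwa / (schwa + no_schwa)

# Markedness weights from Table 3 (Dep(ə) has weight 1).
TABLE3_WEIGHTS = {"OL-C deriv": 5, "LO-L deriv": 4, "OL-C infl": 2, "LO-L infl": 1}
probs_t3 = {ctx: p_schwa(w) for ctx, w in TABLE3_WEIGHTS.items()}

# Hypothetical speaker whose two stem-level constraints share one high weight
# (7 is an assumed value, chosen to mirror *CCC in Table 2).
EQUAL_WEIGHTS = {"OL-C deriv": 7, "LO-L deriv": 7}
probs_equal = {ctx: p_schwa(w) for ctx, w in EQUAL_WEIGHTS.items()}

if __name__ == "__main__":
    print({ctx: round(p, 2) for ctx, p in probs_t3.items()})
    print({ctx: round(p, 2) for ctx, p in probs_equal.items()})
```

Under the Table 3 weights the two derivational contexts differ slightly, whereas the equal-weight speaker treats them identically, so the same constraint set covers both patterns.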

In other words, this analysis shows that it is theoretically possible to predict more subtle and closer to categorical effects of phonotactic constraints at the stem level than at the word level while still assuming the same phonotactic constraints in both morphosyntactic domains. This type of asymmetry in the effect of phonotactic constraints is actually predicted by current models of probabilistic constraint-based grammars such as MaxEnt. Models of this type are indeed known to produce floor and ceiling effects, due to the sigmoid relationship they imply between harmony and probability (Zuraw & Hayes 2017: 506–510). Moreover, the same analysis can derive uniform treatments of CC_C sequences at the stem level if the relevant constraints have the same weight for some speakers.
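The sigmoid relationship can be made explicit for the two-candidate tableaux of Tables 2 and 3. The derivation below is a sketch under the assumption that the schwa variant violates only Dep(ə) and the schwa-less variant violates a single markedness constraint M; the symbols w_M, w_Dep, and Δ are notation introduced here for illustration.

```latex
\begin{align*}
P(\text{ə}) &= \frac{e^{-w_{\mathrm{Dep}}}}{e^{-w_{\mathrm{Dep}}} + e^{-w_{M}}}
             = \frac{1}{1 + e^{-(w_{M} - w_{\mathrm{Dep}})}}
             = \frac{1}{1 + e^{-\Delta}},
             \qquad \Delta = w_{M} - w_{\mathrm{Dep}} \\[4pt]
\Delta = 0 &\;\Rightarrow\; P = 0.50, \qquad\;\;\, \Delta = 1 \;\Rightarrow\; P \approx 0.73 \\
\Delta = 3 &\;\Rightarrow\; P \approx 0.95, \qquad \Delta = 4 \;\Rightarrow\; P \approx 0.98
\end{align*}
```

A one-unit difference in markedness weight therefore separates two clusters by about 0.23 near the middle of the curve (Δ = 0 vs. 1) but by only about 0.03 near the ceiling (Δ = 3 vs. 4), which is the pattern obtained with the weights of Table 3.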

The difference between the hypotheses of strongly vs. weakly distinct phonologies across stem and word levels ultimately hinges on whether some French speakers allow for differential treatments of CC_C sequences at the stem level. Indeed, both hypotheses can derive uniform treatments of CC_C sequences at the stem level. But only the latter hypothesis (weakly distinct phonologies) can derive differential treatments of CC_C sequences at the stem level.

2.5 Implications for phonetically based theories of phonotactic constraints

The main goal of this paper is to tease apart the two versions of the stem-level vs. word-level divide presented in sections 2.3 and 2.4 for French. This question also has theoretical implications beyond French. As anticipated in Section 1, phonetically based theories of phonotactics hold that phonotactic asymmetries ultimately reflect perceptual and articulatory asymmetries (Ohala 1990; 1992; Steriade 1997; Flemming 2002; Storme 2019). According to these theories, the same phonotactic asymmetries should be reflected across the grammar’s morphosyntactic domains if the same phonetic asymmetries among segments hold across these domains. Under the default assumption that segmental properties are largely independent from morphosyntactic context, these theories of phonotactics are more directly compatible with the hypothesis of weakly distinct phonologies across domains in (8).

This point can be illustrated more concretely by considering explanations that have been proposed for asymmetries among three-consonant clusters in French. The phonotactic constraints that drive French schwa-zero alternations in three-consonant sequences have been argued to ultimately have phonetic motivations (Côté 2001: 137–152). For instance, Côté proposed that schwa is more likely to appear after a medial stop (CS_C) than after a medial fricative (CF_C), as illustrated in (2), because stops have weaker perceptual cues than fricatives to signal place of articulation (Wright 2004; Jun 2004) and therefore are more in need of vocalic support. She also proposed that schwa is more likely to occur in OL_C than in CO_L, as illustrated in (3) and (4), because, in the absence of schwa, OL_C features a local sonority peak (the medial liquid) that does not correspond to a syllable peak (French does not allow syllabic liquids), in violation of the sonority sequencing principle (Clements 1990). This problem is solved if schwa is pronounced in OL_C. By contrast, CO_L already satisfies the sonority sequencing principle in the absence of schwa and therefore schwa presence is not as crucial. In both cases (CF_C vs. CS_C and OL_C vs. CO_L), the explanation refers to phonetically based asymmetries among segments. Because these segmental asymmetries are expected to hold across the grammar’s morphosyntactic domains (e.g. a stop should have weaker internal cues than a fricative regardless of morphosyntactic context), the same phonotactic asymmetries among clusters should be observed across strata, in line with the weak version of the divide between stem-level and word-level phonologies in (8).

2.6 Further hypotheses to be tested

In addition to the strong and weak versions of the stem-level vs. word-level divide, four further hypotheses will be tested in this paper.

2.6.1 Effects of morphosyntactic domains on schwa-zero alternations

First, as already discussed above, schwa is reported to be more likely to appear at the stem level (derivation) than at the word level (inflection) overall (Dell 1978; Côté 2001; 2012).3 The goal here will be to test whether this hypothesis holds across a range of consonant clusters. The hypothesis is summarized in (9a) at the level of the data and in (9b) at the level of the grammar.

(9) Hypothesis about the effect of stem level vs. word level on schwa-zero alternations
  a. Data: schwa is more likely to break any given consonant cluster at the stem level (derivation) than at the word level (inflection)
    P(ə|(C)C_-Cderivation) > P(ə| (C)C_-Cinflection)
  b. Grammar: a given consonant cluster is more penalized at the stem level (derivation) than at the word level (inflection)
    w(*(C)CCderivation) > w(*(C)CCinflection)

If the hypothesis that schwa is less likely at the word level (inflection) than at the stem level (derivation) were confirmed, this would support the general claim that schwa becomes less likely as the morphosyntactic domain widens. Evidence for asymmetries between lower and higher morphosyntactic domains is provided by Dell (1977) and Côté (2001) at the phrase level. For instance, schwa is more likely to occur in [st_d] between a noun and its complement within the same noun phrase, as illustrated in (10a), than at the boundary between two phrases, as illustrated in (10b) (Dell 1977: 151).

(10) a. Schwa is more likely between a noun (N) and a prepositional phrase (PP)
    Elle met [laD list([ə])N d’artistesPP]NP dans sa poche.
    ‘She puts the list of artists in her pocket.’
  b. Schwa is less likely between a noun phrase (NP) and a prepositional phrase (PP)
    Elle [met [la liste d’artist([ə])]NP [dans sa poche]PP]VP.
    ‘She puts the list of artists in her pocket.’

Côté (2001: 129–132) proposed a prosodic characterization for the kind of asymmetries illustrated in (10a) and (10b), with schwa being less likely at the edge of higher prosodic domains due to strengthening and lengthening effects. Strengthening and lengthening of consonants at the edge of high prosodic domains make the presence of schwa less necessary for consonant identification (Côté 2001: 146–151). The prosodic analysis accounts for the asymmetry in (10a) and (10b) under the reasonable assumption that the schwa in (10a) occurs inside a phonological phrase (la liste d’artistes) whereas the schwa in (10b) occurs between two phonological phrases (la liste d’artistes and dans sa poche). However, this analysis does not extend to the asymmetry between stem level and word level, as there is no prosodic boundary below the word level in French. If derivational suffixes are found to favor schwa presence more than inflectional suffixes at the stem-suffix boundary, as hypothesized in (9), this means that domain effects cannot all be reduced to prosody and that there are genuine effects of morphosyntactic domains on schwa-zero alternations, as originally formulated by Dell (1977).

2.6.2 Effects of consonant clusters on schwa-zero alternations

The second hypothesis to be tested concerns the relative markedness of different types of consonant clusters in French. As mentioned before, obstruent-liquid clusters are reported to be more strongly avoided before consonants than before vowels, due to the sonority sequencing principle (Dell 1976; Côté 2001). Moreover, three-consonant clusters are reported to be more strongly avoided than two-consonant clusters (Grammont 1914; Bürki et al. 2011; Smith & Pater 2020). In other words, schwa should be more likely to occur in OL_C than in LO_L and in LO_L than in C_C. The predictions of this hypothesis on cluster markedness are summarized in (11a) at the level of the data and in (11b) at the level of the grammar.

(11) Hypothesis about the effect of cluster type on schwa-zero alternations
  a. Data: schwa is more likely to appear in OL_C than in LO_L, and in LO_L than in C_C
    P(ə|OL_C) > P(ə|LO_L) > P(ə|C_C)
  b. Grammar: OLC clusters are phonotactically more marked than LOL clusters and LOL clusters are more marked than two-consonant clusters
    w(*OLC) > w(*LOL) > w(*CC)

2.6.3 Effects of stem length on schwa-zero alternations

When testing the various effects mentioned above, the paper will also control for additional effects that could play a role in schwa-zero alternations, in particular the length of stems. Earlier studies have indeed shown that schwa is more likely to break a consonant cluster if its presence avoids a stress clash, as illustrated in (12a), than in the absence of stress clash, as illustrated in (12b) (Dell 1985; Smith & Pater 2020). In French, the main stress falls on the last syllable of the word.

(12) a. Schwa is more likely to appear between two stressed syllables
    film russe ['film(ə)'ʁys] ‘Russian movie’
  b. Schwa is less likely to appear if only one adjacent syllable bears stress
    film danois ['film(ə)da'nwa] ‘Danish movie’

French word-initial syllables are usually assumed to bear a secondary stress, corresponding phonetically to a word-initial F0 jump (Vaissière 2002). When a monosyllabic stem bearing word-initial secondary stress is suffixed with a monosyllabic suffix bearing word-final primary stress (e.g. garde-ra [ˌɡaʁd-'ʁa] ‘keep-fut.3sg’), a stress clash is predicted to arise. Schwa can then be epenthesized to avoid this clash (e.g. garde-ra [ˌɡaʁdə-'ʁa] ‘keep-fut.3sg’). When the stem is disyllabic (e.g. regarde-ra [ˌʁəɡaʁd-'ʁa] ‘look-fut.3sg’), no stress clash is expected to arise and therefore there is no prosodic motivation for schwa epenthesis. Hence, schwa should be more likely to appear after monosyllabic stems than after disyllabic stems (assuming suffixes are always monosyllabic), as summarized in (13a). At the grammatical level, a phonotactic constraint is needed to penalize stress clash, as summarized in (13b).

(13) Hypothesis about the effect of stem length on schwa-zero alternations
  a. Data: schwa is more likely to appear after monosyllabic stems than after disyllabic stems
    P(ə|monosyllabic stem) > P(ə|disyllabic stem)
  b. Grammar: a phonotactic constraint penalizes two adjacent stress-bearing syllables
    *Clash
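
The reasoning behind (13) can be sketched as follows, under the simplifying assumptions stated in the text: secondary stress on the word-initial syllable, primary stress on the word-final syllable, and a monosyllabic suffix. The function names and the numeric encoding of stress levels are hypothetical and only serve the illustration.

```python
def stress_pattern(n_stem_syllables, schwa=False):
    """Return a list of stress levels, one per syllable of the suffixed word.

    2 = primary stress (word-final syllable), 1 = secondary stress
    (word-initial syllable), 0 = unstressed. The suffix is assumed to be
    monosyllabic; a realized schwa adds one unstressed syllable before it.
    """
    n = n_stem_syllables + (1 if schwa else 0) + 1  # + 1 for the suffix
    pattern = [0] * n
    pattern[0] = 1    # word-initial secondary stress
    pattern[-1] = 2   # word-final primary stress
    return pattern

def has_clash(pattern):
    """True if two adjacent syllables both bear some degree of stress."""
    return any(a > 0 and b > 0 for a, b in zip(pattern, pattern[1:]))

print(has_clash(stress_pattern(1)))              # garde-ra, no schwa: True
print(has_clash(stress_pattern(1, schwa=True)))  # garde-ra, with schwa: False
print(has_clash(stress_pattern(2)))              # regarde-ra, no schwa: False
```

Only the monosyllabic stem without schwa incurs a violation of *Clash in this sketch, which is why schwa is expected to be more likely in that context.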

2.6.4 Dialectal effects on schwa-zero alternations

Finally, the paper will also control for dialectal effects on schwa-zero alternations by looking at two French varieties with different rates of schwa realization: French from France and Swiss French. Earlier studies have reported that French speakers from France are more likely to pronounce schwas than French speakers from Switzerland overall (Racine 2007; 2008; Racine et al. 2016). The hypothesis is summarized in (14a) at the level of the data and in (14b) at the grammatical level.

(14) Hypothesis about dialectal effects on schwa-zero alternations
  a. Data: schwa is more likely to appear in French from France than in Swiss French
    P(ə|France) > P(ə|Switzerland)
  b. Grammar: the faithfulness constraint penalizing schwa epenthesis has a larger weight in Swiss French than in French from France
    w(Dep(ə)Switzerland) > w(Dep(ə)France)

3 Study 1

Study 1 tests the hypotheses of strongly vs. weakly distinct phonologies across stem and word levels while controlling for dialectal effects (France vs. Switzerland) on schwa-zero alternations. The methods are described in section 3.1. The results are presented in section 3.2 and discussed in section 3.3. The data (study1-data.RData) and R code (study1-code.R) are available on OSF. A preliminary version of this study was published in Storme (2021).

3.1 Methods

3.1.1 Judgment task

The present study uses speakers’ metalinguistic judgments as primary data, following a long tradition in linguistics (Schütze & Sprouse 2013; Schütze 2016; Myers 2017) and in the study of French schwa (Dell 1985; Côté 2001; Racine & Grosjean 2002; Racine 2007; Smith & Pater 2020). However, because many recent studies on French schwa are based on speech corpora instead of judgments (e.g. Bürki et al. 2011; Racine & Andreassen 2012; Côté 2012; Eychenne 2019), the use of judgment data may require some justification. First, corpus data suffer from a problem of data sparsity that is expected to be particularly acute in the present case. Indeed, the hypotheses tested in this study bear on words that feature very specific morphological and phonological properties, namely inflected and derived words with OL_C and LO_L clusters at stem-suffix boundaries. Corpora would probably need to be very large in order to feature enough occurrences of the relevant words. For instance, the corpus of Laurentian French used in Côté (2012) contains 2,530 contexts for schwa but only three instances of CC_C at the boundary between a stem and the inflectional future/conditional suffix -r- (Côté 2012: 258). Moreover, speakers might behave differently with respect to the treatment of CC_C sequences at the stem level, as discussed in section 2.4. Controlling for this type of individual variation requires having multiple occurrences of the relevant morphophonological contexts per speaker. The corpora usually used for phonological research in French (like the corpus Phonologie du Français Contemporain; PFC; Durand et al. 2009) are not large enough to allow for this level of granularity. Finally, to the author’s knowledge, available corpora of French speech do not provide the morphological information necessary to readily test the hypotheses that are central in the present study. In particular, the PFC corpus does not provide information about whether a word-internal schwa is morpheme-internal or at a morpheme boundary (see Racine et al. 2016 for a description of the variables that were coded for schwa in PFC).

The design of the present study was inspired by a previous study by Racine (2007; 2008) that used metalinguistic judgments to estimate the likelihood of schwa and schwa-less word variants in French (see also Racine & Grosjean 2002: 312–313). More specifically, participants were asked to rate how likely they would be to pronounce schwa variants and schwa-less variants for a set of 115 words. The task was slightly different from the task used by Racine. In the present study, the task corresponds to a judgment of relative frequency whereas participants in Racine’s study were asked to rate the absolute frequency of each variant independently. A judgment of relative frequency was used because it makes it possible to directly obtain the information most relevant for the research question of interest, namely the estimated relative frequency of the two variants. In Racine’s work, an extra step is needed to calculate the relative frequency of the two variants from their individual frequencies (see Racine 2007: 127). Following Racine (2007), the judgments were elicited using a seven-point Likert scale, with 1 indicating a categorical preference for the schwa variant (e.g. garderie), 7 indicating a categorical preference for the schwa-less variant (e.g. gard’rie), and 4 indicating no preference for either form. An example is shown in Figure 1.

Figure 1

Judgment task.

3.1.2 Experimental items and fillers

Two variables were manipulated to construct the experimental items: Cluster (with three levels: OL_C, LO_L, C_C) and Morphology (with two levels: derivation, inflection). C_C stands for any two-consonant cluster, LO_L for liquid-obstruent-liquid clusters, and OL_C for obstruent-liquid-consonant clusters. Inflected words all featured the future suffix -r- because this suffix is to the author’s knowledge the only consonant-initial inflectional suffix in French. Inflected words were all presented with a subject pronoun preceding them (e.g. je chanterai ‘I will sing’) to ensure that they were correctly identified as inflected words. Derived words were presented without any additional information (e.g. garderie).4

Four of the six experimental conditions included 15 words whereas the two remaining ones included 14 words.5 There was therefore a total of 88 experimental items in the study. Table 4 illustrates each condition using items that were featured in the stimulus set. 27 filler items were used in addition. The fillers featured schwa in morpheme-internal position.

Table 4

Experimental items.

For each word, the schwa variant was conveyed using the word’s graphic form (e.g. garderie). The graphic form always contains an e corresponding to the schwa phone [ə]. The schwa-less variant was conveyed by replacing the e by the apostrophe (e.g. gard’rie). The order of presentation of the experimental items and fillers was randomized.

3.1.3 Participants

21 Swiss French speakers (8 females, 13 males; recruited among students at a Swiss university) and 34 French speakers from France (27 females, 7 males; recruited online via the CNRS’s platform RISC) participated in the study online, using the LimeSurvey platform (LimeSurvey 2012). The participants provided their informed consent to participate in the research and agreed to make their data available online. No sensitive information about participants was collected.

3.1.4 Data analyses

While it is common practice to analyze ordinal data such as Likert-scale data as metric variables using linear regression, Liddell & Kruschke (2018) show that this can lead to a number of errors, including false alarms (i.e. detecting an effect that is not real), misses (i.e. failure to detect real effects), and even inversions of effects (i.e. the order of the means according to the metric scale is opposite to the true ordering of the means). One of the major problems with this metric approach is that it assumes that the response categories are equidistant, whereas this is not necessarily the case. For instance, in the case of a seven-point Likert scale, the distance between 5 and 6 might be treated differently from the distance between 2 and 3 in the participant’s mind, even though the two distances are both equal to one on a metric scale. To avoid these issues, Liddell & Kruschke (2018) recommend analyzing Likert-scale data using ordinal instead of linear regression models.

In this paper, the judgment data were modeled using the ordinal cumulative model (Bürkner & Vuorre 2019: 78–79). The cumulative model assumes that the observed ordinal response variable derives from the categorization of a latent continuous unobserved variable. In the present study, the ordinal variable is the rating of the preference for the schwa or schwa-less variant along the seven-point scale. The latent variable is the participant’s underlying opinion about the relative frequency of the two variants. To model this categorization in the case of a seven-point Likert scale, the cumulative model assumes that there are six thresholds which partition the latent variable into seven ordered categories (1, 2, …, 6, 7). The model provides estimates both for the different conditions’ means along the latent continuous variable and the position of the six thresholds. The reader is referred to Bürkner & Vuorre (2019) for further details.
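
The categorization step can be illustrated with a minimal sketch, assuming a normally distributed latent variable with unit variance (i.e. a probit link). The threshold values and condition means below are hypothetical and are not estimates from the present data; the function names are likewise introduced only for illustration.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def category_probs(eta, thresholds):
    """Probabilities of the ordinal categories 1..K under a cumulative probit model.

    eta: mean of the latent variable for a given condition.
    thresholds: K-1 increasing cut points on the latent scale.
    """
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    cdf = [normal_cdf(c - eta) for c in cuts]
    # Probability of category k = mass of the latent variable between two cuts
    return [cdf[k + 1] - cdf[k] for k in range(len(cuts) - 1)]

# Six hypothetical thresholds partition the latent scale into seven categories
THRESHOLDS = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0]

for eta in (0.0, 1.0):  # hypothetical condition means
    probs = category_probs(eta, THRESHOLDS)
    print(eta, [round(p, 3) for p in probs])
```

In this sketch, a larger latent mean shifts probability mass toward the higher response categories, without assuming that adjacent categories are equally spaced.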

A Bayesian approach was adopted (rather than a frequentist approach) for inferring the parameters of both the ordinal regression and the probabilistic grammars. This choice was motivated by the fact that Bayesian inference yields outcomes that are intuitive and easy to interpret. In particular, it provides a posterior distribution for all the model’s parameters and combinations of parameter values given the data. This makes it very easy to test any hypothesis about the parameter values and about differences between parameter values. Also, Bayesian approaches virtually always converge to accurate values of the parameters (Liddell & Kruschke 2018).

3.2 Results

3.2.1 Description of analysis

A Bayesian hierarchical ordinal cumulative regression was fit to the seven-point Likert-scale data as a function of dummy-coded factors Morphology (reference level ‘derivation’), Cluster (reference level OL_C), and Origin (reference level ‘France’) and all their interactions, using Stan (Carpenter et al. 2017) and the brms package (Bürkner 2017) in R (R Core Team 2020). The model included the maximal random effect structure justified by the study’s design (Barr et al. 2013), allowing the effects and their interactions to vary by participant (Morphology, Cluster) and by word (Origin).6 The probit link function was used in order to apply a cumulative model assuming the latent variable to be normally distributed (Bürkner & Vuorre 2019: 84). The default priors of the brms package were used. Equal variances were assumed for the unobserved variables that underlie the observed ordinal variable.

Four sampling chains of 4,000 iterations each, with a warm-up period of 2,000 iterations per chain, were run, resulting in a total of 8,000 post-warm-up samples. To avoid initialization at too small or too large values, initial values for the MCMC sampler were set to zero.7

For all relevant parameters, their mean and 95% credibility interval (CI) according to the model’s posterior distribution are reported. In the analysis, the parameters concern the latent unobserved continuous variable corresponding to participants’ opinion about the likelihood of schwa absence. Due to the way the Likert-scale was set up, greater values correspond to a greater likelihood of schwa deletion (according to the participants). For testing hypotheses about the difference Δ between two conditions, Franke & Roettger (2019)’s recommendations were followed. The posterior probability that this difference is larger than zero (Δ > 0) is reported. If this probability is close to 1 and furthermore zero is outside of the posterior 95% CI for Δ, compelling evidence is considered to be provided for the hypothesis that posits the existence of a difference between the relevant conditions.
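
The three quantities reported for each difference Δ can be computed from posterior draws along the following lines. This is a minimal sketch on simulated draws: the two sets of draws are hypothetical, and the use of an equal-tailed interval based on empirical quantiles is an assumption rather than a description of the code used in this study.

```python
import random
import statistics

def summarize_difference(draws_a, draws_b):
    """Posterior mean, 95% interval, and P(delta > 0) for delta = a - b.

    draws_a and draws_b are paired posterior draws (same length, same order).
    The interval is an equal-tailed interval from empirical quantiles.
    """
    deltas = sorted(a - b for a, b in zip(draws_a, draws_b))
    n = len(deltas)
    lower = deltas[int(0.025 * n)]
    upper = deltas[int(0.975 * n) - 1]
    p_positive = sum(d > 0 for d in deltas) / n
    return statistics.mean(deltas), (lower, upper), p_positive

# Hypothetical posterior draws for two condition means on the latent scale
random.seed(1)
draws_cond_a = [random.gauss(1.0, 0.2) for _ in range(8000)]
draws_cond_b = [random.gauss(0.4, 0.2) for _ in range(8000)]

mean_delta, ci, p_pos = summarize_difference(draws_cond_a, draws_cond_b)
print(f"mean = {mean_delta:.2f}, CI = [{ci[0]:.2f}, {ci[1]:.2f}], P(delta > 0) = {p_pos:.2f}")
```

Evidence for a difference would be considered compelling, in the sense used here, when the last value is close to 1 and the interval excludes zero.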

3.2.2 Description of results

Figure 2 shows the posterior distribution (mean and 95% CI) of each response category (1, 2, …, 6, 7) for all cells in the factorial design. This posterior distribution was calculated using Equation 5 in Bürkner & Vuorre (2019: 79). This equation expresses the probability of each response category k as a function of the predictors, their corresponding regression coefficients, and the thresholds τk and τk-1 inferred along the latent continuous variable.

Figure 2

Posterior distribution (mean and 95% CI) of each of the seven response categories as a function of Morphology, Cluster, and Origin.

Differences between clusters in derived words. Participants were found to rate schwa absence as more likely in LO_L than in OL_C in derivation and there is compelling evidence for this difference for participants from both France (𝔼(μFrench, LO_L, der-μFrench, OL_C, der) = 0.54, CI = [0.15, 0.93], P(Δ > 0) = 1) and Switzerland (𝔼(μSwiss, LO_L, der-μSwiss, OL_C, der) = 0.65, CI = [0.25, 1.08], P(Δ > 0) = 1).8 Participants were found to rate schwa absence as more likely in C_C than in LO_L in derivation and there is compelling evidence for this difference for participants from both France (𝔼(μFrench, C_C, der-μFrench, LO_L, der) = 2.49, CI = [2.02, 2.99], P(Δ > 0) = 1) and Switzerland (𝔼(μSwiss, C_C, der-μSwiss, LO_L, der) = 2.28, CI = [1.74, 2.85], P(Δ > 0) = 1).

Controlling for individual differences in the treatment of three-consonant sequences in derived words. As discussed in section 2.4, only some speakers might be driving the general asymmetry between OL_C and LO_L in derived words (stem level). Figure 3 shows how each individual participant in Study 1 treats OL_C and LO_L in derived words. The results confirm the hypothesis of individual variation at the stem level: some participants are more likely to delete schwa in LO_L than in OL_C (participants on the left side of the figure) whereas some participants do not treat the two sequences differently (participants on the right side of the figure).

Figure 3

Posterior distribution (mean and 95% CI) of the difference between LO_L and OL_C in derived words (stem level) by participant. A positive value means that schwa deletion is more likely in LO_L than in OL_C.

Differences between clusters in inflected words. Participants were also found to rate schwa absence as more likely in LO_L than in OL_C in inflection and there is compelling evidence for this difference for participants from both France (𝔼(μFrench, LO_L, inf-μFrench, OL_C, inf) = 1.37, CI = [0.92, 1.78], P(Δ > 0) = 1) and Switzerland (𝔼(μSwiss, LO_L, inf-μSwiss, OL_C, inf) = 1.11, CI = [0.63, 1.57], P(Δ > 0) = 1). Participants were found to rate schwa absence as more likely in C_C than in LO_L in inflected words and there is compelling evidence for this difference for participants from both France (𝔼(μFrench, C_C, inf-μFrench, LO_L, inf) = 1.10, CI = [0.74, 1.45], P(Δ > 0) = 1) and Switzerland (𝔼(μSwiss, C_C, inf-μSwiss, LO_L, inf) = 1.07, CI = [0.67, 1.48], P(Δ > 0) = 1).

It can be concluded that there is sufficient evidence to support the hypothesis that the number and nature of surrounding consonants are relevant in both derived and inflected words, with OL_C being judged overall as more likely to feature schwa than LO_L and LO_L more likely to feature schwa than C_C in both derived and inflected words. However, there is individual variation at the stem level, as some speakers do not treat OL_C and LO_L differently in this context.

Differences between derivation and inflection. For OL_C clusters, participants were found to rate schwa absence as more likely in inflection than in derivation, but there is compelling evidence for this difference only for participants from Switzerland (𝔼(μSwiss, OL_C, inf-μSwiss, OL_C, der) = 0.40, CI = [0.05, 0.74], P(Δ > 0) = 0.99) and not for participants from France (𝔼(μFrench, OL_C, inf-μFrench, OL_C, der) = 0.05, CI = [–0.29, 0.39], P(Δ > 0) = 0.62). For LO_L clusters, participants were found to rate schwa absence as more likely in inflection than in derivation, and there is compelling evidence for this difference for participants from both France (𝔼(μFrench, LO_L, inf-μFrench, LO_L, der) = 0.88, CI = [0.48, 1.28], P(Δ > 0) = 1) and Switzerland (𝔼(μSwiss, LO_L, inf-μSwiss, LO_L, der) = 0.86, CI = [0.43, 1.30], P(Δ > 0) = 1). An unexpected result was obtained for C_C clusters. In this context, participants were found to rate schwa absence as less likely in inflection than in derivation, with compelling evidence from participants from both France (𝔼(μFrench, C_C, inf-μFrench, C_C, der) = –0.51, CI = [–0.81, –0.19], P(Δ > 0) = 0) and Switzerland (𝔼(μSwiss, C_C, inf-μSwiss, C_C, der) = –0.35, CI = [–0.69, 0.01], P(Δ > 0) = 0.03). See section 3.3 for further discussion.

Differences between speakers from France and Switzerland. Speakers from Switzerland systematically rate the schwa-less variant higher than speakers from France, and this holds across all six morphophonological contexts: for OL_C sequences in derived words (𝔼(μSwiss, OL_C, der-μFrench, OL_C, der) = 1.22, CI = [0.62, 1.84], P(Δ > 0) = 1), for LO_L sequences in derived words (𝔼(μSwiss, LO_L, der-μFrench, LO_L, der) = 1.33, CI = [0.71, 1.93], P(Δ > 0) = 1), for C_C sequences in derived words (𝔼(μSwiss, C_C, der-μFrench, C_C, der) = 1.11, CI = [0.40, 1.91], P(Δ > 0) = 1), for OL_C sequences in inflected words (𝔼(μSwiss, OL_C, inf-μFrench, OL_C, inf) = 1.57, CI = [0.96, 2.25], P(Δ > 0) = 1), for LO_L sequences in inflected words (𝔼(μSwiss, LO_L, inf-μFrench, LO_L, inf) = 1.31, CI = [0.57, 2.00], P(Δ > 0) = 1), and for C_C sequences in inflected words (𝔼(μSwiss, C_C, inf-μFrench, C_C, inf) = 1.27, CI = [0.51, 2.08], P(Δ > 0) = 1).

3.3 Discussion

The results of Study 1 are summarized in Table 5 for the six morphophonological contexts, with > indicating a greater estimated likelihood of the schwa variant.

Table 5

Summary of Study 1 results: probability of the schwa variant as a function of Cluster, Morphology, and Origin (France vs. Switzerland).

The results of Study 1 support the hypothesis that phonologies are weakly distinct across stem and word levels (see section 2.4) against the hypothesis that they differ strongly (see section 2.3). At the stem level (derivation), not only the number but also the nature of surrounding consonants matters for schwa-zero alternations, at least for some speakers from both France and Switzerland. This means that Grammont’s Law of Three Consonants should be relaxed not only at the word level but also at the stem level. At the grammatical level, the fact that some speakers show sensitivity to the nature of consonants involved in the CC_C sequence in derived words is problematic for the view that only a single *CCC constraint is active at the stem level (see the modeling study in section 5 for further confirmation). By contrast, the absence of an effect for other speakers is not particularly problematic for the view that different phonotactic constraints (e.g. *OLC, *LOL) are active at the stem level: indeed, the relevant constraints could have the same (or very similar) weights for these speakers.

Furthermore, the relative markedness of clusters was found to be the same in both derived and inflected words, with OLC being more strongly avoided than LOL and LOL being more strongly avoided than CC. This is in line with the hypotheses presented in section 2.6.2. This is also consistent with phonetically based theories of phonotactic constraints that hold that asymmetries between phonotactic constraints ultimately derive from perceptual/articulatory or sonority asymmetries and therefore predict that these phonotactic asymmetries should be consistent across domains.

Three-consonant clusters were found to be more strongly avoided in derived than in inflected words, in line with the hypothesis that phonotactic restrictions are stronger in lower than in higher morphosyntactic domains (see section 2.6.1). This is consistent with the claim that there are genuine effects of morphosyntactic domains on schwa-zero alternations below the word level (Dell 1977).

However, this result did not extend to two-consonant clusters: two-consonant clusters were unexpectedly found to be more strongly avoided in inflected than in derived words, for speakers from both France and Switzerland. This result is unexpected in two ways. First, the literature on French does not report any asymmetry between inflection and derivation in schwa likelihood for two-consonant clusters (e.g. Côté 2001: 85). Second, if any asymmetry were to be observed, one would expect it to go in the opposite direction, with derivation favoring schwa presence more than inflection. Indeed, this is what has been observed for three-consonant clusters in the present study and this is also what is expected under the general hypothesis that schwa is more likely at the boundary of lower morphosyntactic domains (Dell 1977; Côté 2001). A follow-up analysis was run to test whether this could be due to uncontrolled differences in sonority in the stimuli (sonority-increasing C_C sequences favor schwa deletion), but the same results were found.

One possibility to explore in further research would be that derived words are more likely to be analyzed as morphologically simple than inflected words, and even more so when the cluster at the stem-suffix boundary is simple (C_C). Derived words often have less transparent semantics than inflected words, making their morphological structure less salient (Haspelmath & Sims 2010). The presence of a simple consonant cluster (C_C) at the stem-suffix boundary could make the morpheme boundary even less salient, therefore favoring even further a non-decompositional analysis of derived words (Hay 2003). As a result, morpheme-boundary schwas could end up being reanalyzed as being morpheme-internal in (some) derived words. Contrary to morpheme-boundary schwas, morpheme-internal schwas have to be analyzed as underlying, because their distribution is not entirely predictable (e.g. pelouse [p(ə)luz] ‘lawn’ vs. plus [p(*ə)lys] ‘more’). If schwas are sufficiently unlikely in derived words with C_C sequences at the stem-suffix boundary, speakers could then tend to reanalyze these words as featuring a morpheme-internal CC sequence (without any underlying schwa). If a faithfulness constraint specifically penalizes schwa epenthesis in morpheme-internal consonant clusters (e.g. contiguity) and has a larger weight than the general constraint penalizing schwa epenthesis, then schwa could end up being more likely in inflected words with C_C at the stem-suffix boundary than in derived words reanalyzed as featuring a morpheme-internal CC sequence, despite clusters being generally more marked at the stem level than at the word level. Exploring the detailed predictions of this account (in particular whether derived words with C_C are more likely to be analyzed as morphologically simple than inflected words) is left for further research.

The finding that Swiss French speakers are systematically more likely to accept the schwa-less variant than French speakers from France is in line with the conclusions reached by Racine (2007; 2008) based on judgment data and by Racine et al. (2016) (among others) based on production data. Despite these systematic differences, participants from France and Switzerland behaved remarkably similarly with respect to the hypotheses studied in this paper, except for the treatment of OL_C in derived vs. inflected words (see Table 5). This similarity between the two groups of participants is in line with Racine (2007)’s observation that speakers from France and Switzerland differ in their baseline rates of schwa production but otherwise follow the same general principles for schwa-zero alternations.

4 Study 2

Study 2 further tests the hypotheses of strongly vs. weakly distinct phonologies across stem and word levels, specifically controlling for an effect that was not controlled for in Study 1, namely the effect of stem length on schwa-zero alternations. Because French speakers from Switzerland and France were found to behave similarly with respect to the Law of Three Consonants in Study 1, Study 2 focuses on a single variety (Swiss French). The methods are described in section 4.1. The results are presented in section 4.2 and discussed in section 4.3. The data (study2-data.RData) and R code (study2-code.R) are available on OSF (https://osf.io/5hvxs/).

4.1 Methods

4.1.1 Judgment task

The same judgment task was used as in Study 1. The reader is referred to section 3.1.1 for further details.

4.1.2 Experimental items and fillers

Three variables were manipulated to construct the experimental items: Cluster (with two levels: OL_C, LO_L), Morphology (with two levels: derivation, inflection), and Stem Length (with two levels: monosyllabic stem, disyllabic stem). C_C clusters were not considered in this follow-up study because they are not directly relevant to the paper’s main research question. Inflected words were all presented with a subject pronoun preceding them (e.g. je chanterai ‘I will sing’) to ensure that they were correctly identified as inflected words. Whereas derived words were presented without additional information in Study 1, they were preceded by a determiner in Study 2 (e.g. la garderie). This change was implemented to address a reviewer’s concern that the fact that inflected and derived words were presented differently in Study 1 could have affected the results.

Each of the eight experimental conditions included 10 words, for a total of 80 experimental items. Table 6 illustrates each condition using items that were featured in the stimulus set. 55 filler items were used in addition. The fillers featured words with schwa in morpheme-internal position as well as inflected and derived words with C_C sequences.

Table 6

Study 2: experimental items.

As in Study 1, the schwa variant was conveyed using the word’s graphic form (e.g. la garderie). The schwa-less variant was conveyed by replacing the e by the apostrophe (e.g. la gard’rie). The order of presentation of the experimental items and fillers was randomized.

4.1.3 Participants

40 Swiss French speakers (22 females, 18 males; recruited among students at a Swiss university) participated in the study online, using the LimeSurvey platform (LimeSurvey 2012). The participants provided their informed consent to participate in the research and agreed to make their data available online. No sensitive information about participants was collected.

4.1.4 Data analyses

The judgment data were modeled using the ordinal cumulative model (Bürkner & Vuorre 2019: 78–79), as in Study 1. The reader is referred to section 3.1.4 for details.

4.2 Results

4.2.1 Description of analysis

A Bayesian hierarchical ordinal cumulative regression was fit to the seven-point Likert-scale data as a function of dummy-coded factors Morphology (reference level ‘derivation’), Cluster (reference level OL_C), and Stem Length (reference level ‘monosyllabic’) and all their interactions, using Stan (Carpenter et al. 2017) and the brms package (Bürkner 2017) in R (R Core Team 2020). The model included the maximal random effect structure justified by the study’s design (Barr et al. 2013), allowing the effects and their interactions to vary by participant (Morphology, Cluster, Stem Length) and by word (random intercept by word). An alternative model with word frequency as an additional variable was fit to the data, but this variable was not found to have a significant effect on schwa-zero alternations (β = 0.10, 95% CI = [–0.03, 0.22]), as in Study 1. The details of the analysis are exactly the same as in Study 1. The reader is referred to section 3.2.1 for further details.

4.2.2 Description of results

Figure 4 shows the posterior distribution (mean and 95% CI) of each response category (1, 2, …, 6, 7) for all cells in the factorial design.

Figure 4

Posterior distribution (mean and 95% CI) of each of the seven response categories as a function of Morphology, Cluster, and Stem Length.

Differences between clusters in derived words. Participants were found to rate schwa absence as more likely in LO_L than in OL_C in derivation and there is compelling evidence for this difference both in monosyllabic stems (𝔼(μmonosyll, LO_L, der-μmonosyll, OL_C, der) = 0.39, CI = [0.01, 0.75], P(Δ > 0) = 0.98) and disyllabic stems (𝔼(μdisyll, LO_L, der-μdisyll, OL_C, der) = 0.79, CI = [0.41, 1.17], P(Δ > 0) = 1).

Controlling for individual differences in the treatment of three-consonant sequences in derived words. As discussed in section 2.4 and as found in Study 1, only some speakers might be driving the general asymmetry between OL_C and LO_L in derived words (stem level). Figure 5 shows how each individual participant in Study 2 treats OL_C and LO_L in derived words with monosyllabic and disyllabic stems. The results confirm the hypothesis of individual variation at the stem level. Some participants are more likely to delete schwa in LO_L than in OL_C whereas some participants do not treat the two sequences differently. Generally, participants are more likely to treat the two clusters differently with disyllabic stems than with monosyllabic stems. There is also one outlier who is more likely to delete schwa in OL_C than in LO_L (Participant 33).

Figure 5

Posterior distribution (mean and 95% CI) of the difference between LO_L and OL_C in derived words (stem level) by participant and stem length (monosyllabic, disyllabic). A positive value means that schwa deletion is more likely in LO_L than in OL_C.

Differences between clusters in inflected words. Participants were also found to rate schwa absence as more likely in LO_L than in OL_C in inflection and there is compelling evidence for this difference both in monosyllabic stems (𝔼(μmonosyll, LO_L, inf-μmonosyll, OL_C, inf) = 1.29, CI = [0.88, 1.73], P(Δ > 0) = 1) and in disyllabic stems (𝔼(μdisyll, LO_L, inf-μdisyll, OL_C, inf) = 1.34, CI = [0.90, 1.82], P(Δ > 0) = 1).

It can be concluded that there is sufficient evidence to support the hypothesis that the number and nature of surrounding consonants are relevant in both derived and inflected words, with OL_C being judged overall as more likely to feature schwa than LO_L in both derived and inflected words. However, there is individual variation at the stem level, as some speakers do not treat OL_C and LO_L differently in this context, in particular in monosyllabic stems.

Differences between derivation and inflection. For OL_C clusters, participants were not found to rate schwa absence as more likely in inflection than in derivation, either in monosyllabic stems (𝔼(μmonosyll, OL_C, inf-μmonosyll, OL_C, der) = –0.21, CI = [–0.55, 0.16], P(Δ > 0) = 0.12) or in disyllabic stems (𝔼(μdisyll, OL_C, inf-μdisyll, OL_C, der) = –0.02, CI = [–0.38, 0.32], P(Δ > 0) = 0.47). For LO_L clusters, participants were found to rate schwa absence as more likely in inflection than in derivation, and there is compelling evidence for this difference in both monosyllabic stems (𝔼(μmonosyll, LO_L, inf-μmonosyll, LO_L, der) = 0.68, CI = [0.31, 1.04], P(Δ > 0) = 1) and disyllabic stems (𝔼(μdisyll, LO_L, inf-μdisyll, LO_L, der) = 0.53, CI = [0.17, 0.89], P(Δ > 0) = 1).

Differences between monosyllabic stems and disyllabic stems. Participants generally rated the schwa-less variant higher in disyllabic stems than in monosyllabic stems. For OL_C clusters in derived words, there is no compelling evidence for this asymmetry (𝔼(μdisyll, OL_C, der-μmonosyll, OL_C, der) = 0.11, CI = [–0.21, 0.44], P(Δ > 0) = 0.75). However, disyllabic stems were found to favor schwa absence in the three other contexts: in inflected words with OL_C clusters (𝔼(μdisyll, OL_C, inf-μmonosyll, OL_C, inf) = 0.31, CI = [0.00, 0.66], P(Δ > 0) = 0.97), in derived words with LO_L clusters (𝔼(μdisyll, LO_L, der-μmonosyll, LO_L, der) = 0.51, CI = [0.17, 0.86], P(Δ > 0) = 1), and in inflected words with LO_L clusters (𝔼(μdisyll, LO_L, inf-μmonosyll, LO_L, inf) = 0.36, CI = [0.03, 0.70], P(Δ > 0) = 0.98).

4.3 Discussion

The results of Study 2 are summarized in Table 7 for the eight morphophonological contexts, with > indicating a greater estimated likelihood of the schwa variant.

Table 7

Summary of Study 2 results: probability of the schwa variant as a function of Cluster, Morphology, and Stem Length.

Study 2 mainly replicates the results of Study 1. The results indeed support the hypothesis that phonologies are weakly distinct across stem and word levels (see section 2.4) against the hypothesis that they differ strongly (see section 2.3). At the stem level (derivation), not only the number but also the nature of surrounding consonants matters for schwa-zero alternations, at least for some speakers. This further supports the claim that Grammont’s Law of Three Consonants should be relaxed not only at the word level but also at the stem level.

Furthermore, the relative markedness of clusters was found to be the same in both derived and inflected words, with OLC being more strongly avoided than LOL, in line with the hypotheses presented in section 2.6.2 and with the results of Study 1. LOL clusters were found to be more strongly avoided in derived than in inflected words, in line with the hypothesis that phonotactic restrictions are stronger in lower than in higher morphosyntactic domains (see section 2.6.1) and with the results of Study 1. This is consistent with the claim that there are genuine effects of morphosyntactic domains on schwa-zero alternations below the word level (Dell 1977). No effect was observed for OLC clusters though. This might be interpreted as a ceiling effect, as the likelihood of schwa presence was high in this context.

The finding that monosyllabic stems are more likely to feature a schwa at the stem-suffix boundary than disyllabic stems is also consistent with the results of earlier studies showing an avoidance of adjacent prosodically prominent syllables in French (see section 2.6.3).

5 Grammatical modeling

The analysis of the judgment data was also supplemented with a linguistic analysis using probabilistic constraint-based grammars. In this framework, the likelihood of schwa presence/absence in the different experimental conditions can be directly interpreted in terms of constraint weights. This makes it possible to interpret the judgment data in terms of the relative strengths of phonotactic constraints against consonant clusters. In this section, the judgment data were aggregated across participants of a given French variety and across words.

5.1 Methods

For the constraint-based analysis, the response variable (the 7-point Likert scale) was transformed into a binary variable (schwa presence vs. absence). The reason for this transformation is that constraint-based grammars are designed as models of language production rather than as models of metalinguistic judgment. In language production, a form is either produced or not. In judgment data, a form may receive a gradient acceptability judgment, which does not directly translate into a binary choice (unless binary judgments are collected). However, constraint-based grammars may be used, and often are used, to model judgment data (e.g. Boersma & Hayes 2001 on dark and light /l/ in English, Smith & Pater 2020 on schwa-zero alternations in French). If the judgment data are not binary, this requires applying a transformation that binarizes the data (e.g. Boersma & Hayes 2001: 82). In this paper, the following transformation was applied. Words that received ratings strictly above 4 were treated as categorically favoring the schwa-less variant. Words that received ratings strictly below 4 were treated as categorically favoring the schwa variant. Words that received a rating equal to 4 were randomly assigned to one or the other category.
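
The binarization rule just described can be sketched as follows in Python. This is an illustration rather than the script used for the paper: the function names are hypothetical, the example ratings are invented, and the midpoint rating is assumed to be assigned to either category with equal probability (the text only states that the assignment was random).

```python
import random


def binarize_rating(rating, rng):
    """Map a 1-7 rating to True (schwa-less variant) or False (schwa variant)."""
    if not 1 <= rating <= 7:
        raise ValueError("rating must lie on the 7-point scale")
    if rating > 4:
        return True       # strictly above 4: schwa-less variant favored
    if rating < 4:
        return False      # strictly below 4: schwa variant favored
    # Rating equal to 4: random assignment (equal probability is an assumption).
    return rng.random() < 0.5


def count_schwa(ratings, rng):
    """Return (number of schwa responses, number of responses) for one context."""
    schwa_less = [binarize_rating(r, rng) for r in ratings]
    return schwa_less.count(False), len(schwa_less)


rng = random.Random(42)
ratings = [1, 2, 2, 3, 4, 5, 6, 7]   # invented ratings for a single context
n_schwa, n_total = count_schwa(ratings, rng)
print(n_schwa, n_total)
```

Counts of this kind, aggregated per context, are the type of input a binomial model of schwa presence expects.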

For each study (Study 1 and Study 2), two constraint-based grammars were fit to the transformed data aggregated across participants and words, using MaxEnt as grammatical framework (Hayes & Wilson 2008). Two grammars were constructed to represent the hypothesis of weakly distinct phonologies across stem and word levels and the hypothesis of strongly distinct grammars. The first grammar had a different markedness constraint for each of the six cluster-suffix combinations (*OLCinf, *LOLinf, *CCinf, *OLCder, *LOLder, *CCder), allowing for OL_C and LO_L to behave differently in derived words. The second grammar was identical except that it had a single *CCC constraint for derived words, in accordance with the hypothesis that the Law of Three Consonants is categorical in this context (*OLCinf, *LOLinf, *CCinf, *CCCder, *CCder).9

All four grammars (the two grammars in Study 1 and the two grammars in Study 2) also included a faithfulness constraint protecting against schwa epenthesis: Dep(ə). In Study 1, there were two indexed versions of Dep(ə), one for each French variety: Dep(ə)France and Dep(ə)Switzerland (see section 2.6.4). In Study 2, there was a single Dep(ə) constraint (because only Swiss French speakers participated in this study) but an additional *Clash constraint was added in the analysis to account for the difference between monosyllabic and disyllabic stems (see section 2.6.3).
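
To make the link between constraint weights and predicted frequencies concrete, the following Python sketch computes MaxEnt probabilities for two candidates, a schwa variant and a schwa-less variant. The violation profiles are assumptions made for illustration: the schwa variant is taken to violate only Dep(ə) and the schwa-less variant only the relevant cluster constraint. The weights are the posterior means reported later in Table 8a for Swiss French; the function name is hypothetical.

```python
import math


def maxent_probabilities(violations, weights):
    """Return the MaxEnt probability of each candidate.

    violations maps each candidate to a dict {constraint: violation count};
    weights maps each constraint to a non-negative weight.
    """
    # Harmony is the negated weighted sum of violations.
    harmonies = {
        cand: -sum(weights[c] * n for c, n in viols.items())
        for cand, viols in violations.items()
    }
    z = sum(math.exp(h) for h in harmonies.values())
    return {cand: math.exp(h) / z for cand, h in harmonies.items()}


# Posterior means from Table 8a, used here as point estimates.
weights = {"*OLCder": 4.76, "*LOLder": 4.08, "Dep(ə)Switzerland": 3.24}

# Assumed violation profiles for a derived word with an OL_C or LO_L context.
olc = maxent_probabilities(
    {"schwa": {"Dep(ə)Switzerland": 1}, "no_schwa": {"*OLCder": 1}}, weights)
lol = maxent_probabilities(
    {"schwa": {"Dep(ə)Switzerland": 1}, "no_schwa": {"*LOLder": 1}}, weights)

print(round(olc["schwa"], 3), round(lol["schwa"], 3))
```

With two candidates that each violate a single constraint once, the probability of the schwa variant reduces to a logistic function of the difference between the markedness weight and the weight of Dep(ə), which is why the weights can be estimated with a binomial regression.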

The constraint weights of all four grammars were inferred using a Bayesian binomial regression implemented in rjags (Plummer 2016). To help with model convergence, one of the weights was set to a constant value of 1 (the weight of *CCder in Study 1, the weight of Dep(ə) in Study 2). Following Goldwater & Johnson (2003), a Gaussian prior with mean equal to zero was chosen for all other constraint weights. Informally, this prior specifies that zero is the default weight for constraints (which means that the constraint has no effect on the output). The variance of the Gaussian prior was set to 1,000. Three MCMC chains were used with 100,000 samples and a thinning interval of 10 (which means that every 10th value in the chain was kept in the final MCMC sample while all other values were discarded). The first 5,000 samples of each chain were used for burn-in (which means they were also discarded). Convergence of the chains on the posterior distribution was assessed using the Gelman-Rubin statistic: it was very close to 1 for all parameters,10 indicating that the samples were representative of the posterior distribution (Kruschke 2015: 181). The effective sample size for each constraint weight estimated by the model was greater than 10,000, indicating that the MCMC samples were large enough for stable and accurate numerical estimates of the posterior distributions (Kruschke 2015: 184). For model comparison, the deviance information criterion (DIC; Gelman et al. 2013: 172–173) was used.
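
As an illustration of the convergence check, the following Python sketch computes the basic Gelman-Rubin statistic for a single parameter from several chains of equal length. It implements the textbook formula comparing between-chain and within-chain variance; the diagnostic actually computed by the software used for the paper may include additional corrections. The chains below are simulated and the function name is hypothetical.

```python
import random
import statistics


def gelman_rubin(chains):
    """Return the basic potential scale reduction factor for one parameter."""
    n = len(chains[0])
    if any(len(chain) != n for chain in chains):
        raise ValueError("all chains must have the same length")
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average within-chain variance.
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    # B: between-chain variance, scaled by the chain length.
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(7)
# Three simulated chains sampling the same distribution (converged case).
mixed = [[rng.gauss(4.76, 0.15) for _ in range(2000)] for _ in range(3)]
# Three simulated chains stuck around different values (non-converged case).
stuck = [[rng.gauss(mu, 1.0) for _ in range(2000)] for mu in (0.0, 3.0, 6.0)]

print(gelman_rubin(mixed), gelman_rubin(stuck))
```

Values close to 1 indicate that the chains are sampling the same distribution, whereas values well above 1 indicate that they have not mixed.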

5.2 Results for Study 1

The posterior distributions for the constraint weights are shown in Tables 8a and 8b for the grammar that distinguishes three-consonant clusters in derived words and for the grammar that does not, respectively. Figure 6 shows the frequencies that each grammar predicts for the schwa variant in the 12 contexts (6 morphophonological contexts × 2 varieties) against the frequencies attested in Study 1. The grammar with the same phonotactic constraints across word and stem levels (Table 8a, Figure 6a) was found to have a smaller deviance information criterion (Δ = –18.02) than the grammar with distinct phonotactic constraints in the two levels (Table 8b, Figure 6b), indicating that the increase in goodness of fit is worth the added complexity in the grammar. In other words, the data provide evidence for constraints referencing the nature of consonants in CCC clusters even at the stem level.
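
The logic of the DIC comparison can be illustrated with a small Python sketch. It uses the classical definition, DIC = D(θ̄) + 2pD with pD equal to the mean posterior deviance minus the deviance at the posterior mean; the software used for the paper may estimate the complexity penalty differently. The counts and the posterior draws below are invented for illustration and are not the data or samples of the study.

```python
import math
import random


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def binomial_deviance(successes, trials, probs):
    """Return -2 times the binomial log-likelihood of the observed counts."""
    log_lik = 0.0
    for k, n, p in zip(successes, trials, probs):
        log_lik += (math.log(math.comb(n, k))
                    + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return -2.0 * log_lik


def dic(successes, trials, draws, predict):
    """DIC = D(theta_bar) + 2 * p_D, where p_D = mean(D) - D(theta_bar)."""
    mean_dev = sum(binomial_deviance(successes, trials, predict(t))
                   for t in draws) / len(draws)
    theta_bar = [sum(col) / len(draws) for col in zip(*draws)]
    dev_at_mean = binomial_deviance(successes, trials, predict(theta_bar))
    return dev_at_mean + 2.0 * (mean_dev - dev_at_mean)


# Invented counts of schwa responses in two cluster contexts.
successes, trials = [80, 60], [100, 100]

# Invented posterior draws on the logit scale, standing in for MCMC output.
rng = random.Random(0)
draws_split = [(rng.gauss(1.39, 0.25), rng.gauss(0.41, 0.20))
               for _ in range(2000)]
draws_merged = [(rng.gauss(0.85, 0.15),) for _ in range(2000)]

# Grammar with one parameter per context vs. grammar with a shared parameter.
dic_split = dic(successes, trials, draws_split,
                lambda t: [logistic(t[0]), logistic(t[1])])
dic_merged = dic(successes, trials, draws_merged,
                 lambda t: [logistic(t[0]), logistic(t[0])])
delta = dic_split - dic_merged
print(delta)
```

A negative difference means that the grammar with separate parameters fits better even after its extra parameter is penalized, which is the sense in which a negative Δ favors the grammar with cluster-specific constraints in derived words.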

Table 8

Posterior distribution of the constraint weights (mean and 95% CI) in the two grammars.

(a) Grammar with *OLCder and *LOLder

Constraint           Mean    95% CI
*OLCder              4.76    [4.47, 5.06]
*OLCinf              4.48    [4.20, 4.76]
*LOLder              4.08    [3.82, 4.36]
Dep(ə)Switzerland    3.24    [3.02, 3.45]
*LOLinf              2.69    [2.45, 2.92]
Dep(ə)France         1.81    [1.63, 1.99]
*CCinf               1.36    [1.12, 1.59]
*CCder               1.00    (fixed)

(b) Grammar with *CCCder

Constraint           Mean    95% CI
*OLCinf              4.48    [4.20, 4.76]
*CCCder              4.39    [4.16, 4.64]
Dep(ə)Switzerland    3.23    [3.02, 3.45]
*LOLinf              2.69    [2.46, 2.92]
Dep(ə)France         1.81    [1.64, 1.99]
*CCinf               1.36    [1.12, 1.59]
*CCder               1.00    (fixed)

Figure 6

Data vs. predictions for the two grammars. (a) Grammar with *OL_Cder and *LO_Lder. (b) Grammar with *CCCder.

Moreover, most of the grammatical hypotheses presented in section 2.6 are supported by the results of the modeling study. As expected under the hypothesis that *OLC is more marked than *LOL and *LOL more marked than *CC (see section 2.6.2), *OLC was found to have a greater weight than *LOL and *LOL a greater weight than *CC at both stem and word levels, as shown in Table 8a. As expected under the hypothesis that phonotactic constraints are more strongly enforced at the stem level than at the word level (see section 2.6.1), *OLC and *LOL were found to have greater weights in derived than in inflected words. The unexpected result found in section 3.2.2 was replicated at the grammatical level: the weight of *CCinf is larger than the weight of *CCder, meaning that phonotactic constraints referring to two-consonant clusters are more strongly enforced at the word level (see section 3.3 for a potential explanation for this unexpected effect). Finally, as expected (see section 2.6.4), French speakers from Switzerland were found to weigh Dep(ə) higher than French speakers from France.

5.3 Results for Study 2

The posterior distributions for the constraint weights are shown in Tables 9a and 9b for the grammar that distinguishes three-consonant clusters in derived words and for the grammar that does not, respectively. Figure 7 shows the frequencies that each grammar predicts for the schwa variant in the 8 contexts (2 morphological contexts × 2 stem lengths × 2 three-consonant clusters) against the frequencies attested in Study 2. The grammar with the same phonotactic constraints across word and stem levels (Table 9a, Figure 7a) was found to have a smaller deviance information criterion (Δ = –77.36) than the grammar with distinct phonotactic constraints in the two levels (Table 9b, Figure 7b), indicating that the increase in goodness of fit is worth the added complexity in the grammar. In other words, the data provide evidence for constraints referencing the nature of consonants in CCC clusters even at the stem level.

Figure 7

Data vs. predictions for the two grammars. (a) Grammar with *OL_Cder and *LO_Lder. (b) Grammar with *CCCder.

Table 9

Posterior distribution of the constraint weights (mean and 95% CI) in the two grammars.

(a) Grammar with *OLCder and *LOLder

Constraint    Mean    95% CI
*OLCder       2.77    [2.55, 2.99]
*OLCinf       2.63    [2.43, 2.85]
*LOLder       1.63    [1.47, 1.81]
Dep(ə)        1.00    (fixed)
*LOLinf       0.92    [0.76, 1.08]
*Clash        0.42    [0.26, 0.59]

(b) Grammar with *CCCder

Constraint    Mean    95% CI
*OLCinf       2.64    [2.43, 2.86]
*CCCder       2.11    [1.97, 2.25]
Dep(ə)        1.00    (fixed)
*LOLinf       0.92    [0.76, 1.08]
*Clash        0.41    [0.25, 0.58]

Moreover, most of the grammatical hypotheses presented in section 2.6 are supported by the results of the modeling study. *OLC was found to have a greater weight than *LOL, as shown in Table 9a. As expected under the hypothesis that phonotactic constraints are more strongly enforced at the stem level than at the word level (see section 2.6.1), *OLC and *LOL were found to have greater weights in derived than in inflected words. Finally, *Clash was found to have a non-zero weight, meaning that the language does avoid sequences of stressed syllables, as expected (see section 2.6.3).

6 Conclusion

Grammont’s influential Law of Three Consonants (LTC) states that schwa is obligatorily pronounced in CC_C sequences in French to avoid three-consonant clusters. Although the LTC has been shown to depend on the nature and order of consonants in CC_C at the word and phrase levels, Grammont’s categorical formulation is still generally considered as accurate to describe schwa-zero alternations at the stem level. The judgment data collected in the two studies presented in this paper support the hypothesis that not only the number but also the nature of surrounding consonants matters for schwa-zero alternations at the stem level (in derived words), at least for some speakers. This means that Grammont’s Law of Three Consonants should be relaxed not only for word and phrase levels but also at the stem level. Furthermore, the same phonotactic asymmetries were found across levels. This is compatible with theories of phonotactics that hold that phonotactic asymmetries are not arbitrary but rooted in extragrammatical factors such as perception, articulatory effort or sonority.

The results also replicate some earlier findings about schwa-zero alternations. In particular, obstruent-liquid-consonant clusters were found to be more strongly avoided than liquid-obstruent-liquid clusters, in line with findings that French follows the sonority sequencing principle. Schwa was found to be more likely to break a three-consonant cluster than a two-consonant cluster, in line with earlier findings that cluster size matters in schwa-zero alternations. Schwa was found to be more likely to be pronounced in monosyllabic stems than in disyllabic stems, in line with previous findings on clash avoidance in French. Schwa variants of words were found to be generally more acceptable to French speakers from France than to French speakers from Switzerland, in line with earlier findings on the difference between the two varieties. Finally, schwa was generally found to be more likely to be pronounced in derived words (at the stem level) than in inflected words (at the word level), in line with the hypothesis that lower morphosyntactic domains favor schwa presence. This result is particularly interesting because it means that there are genuine morphosyntactic effects on schwa-zero alternations below the word level and therefore that asymmetries among domains cannot all be reduced to prosodic effects. One exception to the generalization that derivation favors schwa presence more than inflection was the case of C_C sequences at the stem-suffix boundary, where the opposite effect was observed (schwa was more likely in inflected than in derived words). This unexpected effect was hypothesized to be due to the greater morphological decomposability of inflected words, with derived words with simple consonant clusters at the stem-suffix boundary being more likely to be reanalyzed as monomorphemic than inflected words. This hypothesis should be explored further in future work.

Finally, the modeling study showed that it is possible to get a very good match to the judgment data using the MaxEnt framework for constraint-based grammars. This is in line with previous research showing that this framework is well adapted to deal with phonological variability in general (e.g. Zuraw & Hayes 2017; Smith & Pater 2020).

Notes

  1. Or equivalently phonotactic constraints referencing different types of three-consonant clusters (e.g. *OLC, *COL, etc.) would have different weights at the phrase and word levels but always exactly the same weights at the stem level.
  2. It would be possible to analyze the schwa variant as underlying and penalize its deletion with a Max(ə) constraint. However, an additional constraint beyond Max(ə) and the three markedness constraints in (7b) would be required to motivate deletion (e.g. *ə), resulting in a slightly more complex analysis (five instead of four constraints).
  3. Note that no such asymmetry between derivation and inflection is reported for two-consonant clusters. For instance, Côté (2001: 85) describes schwa as excluded (or at least unlikely) in C_C in both derived and inflected words, without reporting any difference in schwa likelihood in the two cases (e.g. fruiterie ‘fruit store’ and je gâterai ‘I will spoil’).
  4. In Study 2, the derived words are presented with a determiner to answer a reviewer’s worry that the presence/absence of a clitic before the word might have an effect on the likelihood of the schwa variant.
  5. This difference is due to an error when typing the stimuli in the online platform. However, this error is not problematic because the statistical analysis does not require the same number of observations per condition.
  6. Following a reviewer’s advice, a model including word frequency as a fixed effect and as a by-participant random slope was also fit to the data. The word frequency measure was obtained from Lexique 3.83 (New et al. 2007). It corresponds to the frequency of the word per million occurrences in a corpus of movie subtitles. Following Eychenne (2019), this frequency was log-transformed using the formula log(x+1), where x stands for Lexique 3.83’s word frequency (1 was added to avoid infinite values for words that are not attested in the corpus). However, word frequency did not appear as a significant predictor in this model (β = –0.09, 95% CI = [–0.27, 0.07]). Therefore the simpler model that does not include word frequency was used for hypothesis testing.
  7. This issue is discussed by Paul Bürkner on the Stan forums (https://discourse.mc-stan.org/t/initialization-error-try-specifying-initial-values-reducing-ranges-of-constrained-values-or-reparameterizing-the-model/4401).
  8. Notation 𝔼() is a shorthand for the expectation (mean) of the posterior distribution of interest.
  9. In this paper, markedness hierarchies are set up as scale-partition constraint families and not as stringency constraint families (see Smith & Moreton 2012 for a discussion of these two approaches). In the stringency approach, there would be one markedness constraint banning specific clusters (e.g. *OLC) and a general markedness constraint banning all CCC clusters (*CCC) instead of two specific markedness constraints (*OLC, *LOL). Similarly, in the stringency approach, there would be a morphologically indexed markedness constraint (e.g. *OLCder) and a general markedness constraint that does not depend on morphological domains (*OLC) instead of two morphologically indexed markedness constraints (*OLCder, *OLCinf). Specific constraints were chosen in all cases so as not to bias the analysis in one way or the other (e.g. OLC is not a priori assumed to be more marked than LOL, clusters are not a priori assumed to be more marked in derivation than in inflection). Constraint weights only (and not constraint violations) will determine whether one context is more marked than the other.
  10. The Gelman-Rubin statistic was calculated individually for each parameter and not globally for all parameters because one of the parameters (the weight of *CCder) was set to a constant value of 1 to help with model convergence and a global Gelman-Rubin statistic cannot be computed in this case. See the following post by Martyn Plummer for more details: https://sourceforge.net/p/mcmc-jags/discussion/610037/thread/28cef6e5/.

Acknowledgements

I would like to thank Marie-Hélène Côté and the participants at the Annual Meeting on Phonology 2020 for very helpful discussion and feedback on this project. I am also thankful to the two Glossa reviewers for carefully reading the paper and helping me improve its quality. Finally, I am very grateful to the students at the Université de Lausanne who helped me recruit Swiss participants, in particular Naguy Belkhir, Catherine Maas, Arbelinda Mazreku, Letizia Monti, and Marta Reguero.

Competing Interests

The author has no competing interests to declare.

References

Barr, Dale J. & Levy, Roger & Scheepers, Christoph & Tily, Harry J. 2013. Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language 68(3). 255–278. DOI:  http://doi.org/10.1016/j.jml.2012.11.001

Bayles, Andrew & Kaplan, Aaron & Kaplan, Abby. 2016. Inter- and intra-speaker variation in French schwa. Glossa: a journal of general linguistics 1(1). DOI:  http://doi.org/10.5334/gjgl.54

Boersma, Paul & Hayes, Bruce. 2001. Empirical tests of the gradual learning algorithm. Linguistic Inquiry 32. 45–86. DOI:  http://doi.org/10.1162/002438901554586

Bürki, Audrey & Ernestus, Mirjam & Gendrot, Cédric & Fougeron, Cécile & Frauenfelder, Ulrich Hans. 2011. What affects the presence versus absence of schwa and its duration: A corpus analysis of French connected speech. The Journal of the Acoustical Society of America 130(6). 3980–3991. DOI:  http://doi.org/10.1121/1.3658386

Bürkner, Paul-Christian. 2017. brms: An R package for Bayesian multilevel models using Stan. Journal of Statistical Software 80(1). 1–28. DOI:  http://doi.org/10.18637/jss.v080.i01

Bürkner, Paul-Christian & Vuorre, Matti. 2019. Ordinal regression models in psychology: A tutorial. Advances in Methods and Practices in Psychological Science 2(1). 77–101. DOI:  http://doi.org/10.1177/2515245918823199

Carpenter, Bob & Gelman, Andrew & Hoffman, Matthew D & Lee, Daniel & Goodrich, Ben & Betancourt, Michael & Brubaker, Marcus & Guo, Jiqiang & Li, Peter & Riddell, Allen. 2017. Stan: A probabilistic programming language. Journal of Statistical Software 76(1). DOI:  http://doi.org/10.18637/jss.v076.i01

Clements, George N. 1990. The role of the sonority cycle in core syllabification. Papers in laboratory phonology 1. 283–333. DOI:  http://doi.org/10.1017/CBO9780511627736.017

Côté, Marie-Hélène. 2001. Consonant cluster phonotactics: a perceptual approach. Cambridge, MA: MIT dissertation. DOI:  http://doi.org/10.7282/T3HD7TGR

Côté, Marie-Hélène. 2007. Rhythmic constraints on the distribution of schwa in French. In Camacho, José & Flores-Ferrán, Nydia & Sánchez, Liliana & Déprez, Viviane & Cabrera, María José (eds.), Romance Linguistics 2006, 79–92. Amsterdam: John Benjamins. DOI:  http://doi.org/10.1075/cilt.287.07cot

Côté, Marie-Hélène. 2012. Laurentian French (Quebec). Extra vowels, missing schwas and surprising liaison consonants. In Gess, Randall & Lyche, Chantal & Meisenburg, Trudel (eds.), Phonological variation in French: Illustrations from three continents, 235–274. Amsterdam/Philadelphia: John Benjamins. DOI:  http://doi.org/10.1075/silv.11.13cot

Dell, François. 1976. Schwa précédé d’un groupe obstruante-liquide. Recherches linguistiques Saint-Denis 4. 75–111.

Dell, François. 1977. Paramètres syntaxiques et phonologiques qui favorisent l’épenthèse de schwa en français moderne. In Rohrer, Christian (ed.), Actes du colloque franco-allemand de linguistique théorique, 141–153. Tübingen: Niemeyer. DOI:  http://doi.org/10.1515/9783111681351-008

Dell, François. 1978. Certains corrélats de la distinction entre morphologie dérivationnelle et morphologie flexionnelle dans la phonologie du français. Etudes linguistiques sur les langues romanes. Montreal working papers in Linguistics 10. 1–10.

Dell, François. 1985. Les règles et les sons. Paris: Hermann 2nd edn.

Durand, Jacques & Laks, Bernard. 2000. Relire les phonologues du français: Maurice Grammont et la loi des trois consonnes. Langue française 126. 29–38. DOI:  http://doi.org/10.3406/lfr.2000.4670

Durand, Jacques & Laks, Bernard & Lyche, Chantal. 2009. Le projet PFC (phonologie du français contemporain): une source de données primaires structurées. In Durand, Jacques & Laks, Bernard & Lyche, Chantal (eds.), Phonologie, variation et accents du français, 19–61. Paris: Hermès. https://halshs.archives-ouvertes.fr/halshs-00551002.

Eychenne, Julien. 2019. On the deletion of word-final schwa in Southern French. Phonology 36(3). 355–389. DOI:  http://doi.org/10.1017/S0952675719000198

Flemming, Edward. 2002. Auditory representations in phonology. New York: Routledge. DOI:  http://doi.org/10.4324/9781315054803

Franke, Michael & Roettger, Timo B. 2019. Bayesian regression modeling (for factorial designs): A tutorial. DOI:  http://doi.org/10.31234/osf.io/cdxv3

Gelman, Andrew & Carlin, John B & Stern, Hal S & Dunson, David B & Vehtari, Aki & Rubin, Donald B. 2013. Bayesian data analysis. New York: Chapman and Hall/CRC. DOI:  http://doi.org/10.1201/b16018

Gess, Randall & Lyche, Chantal & Meisenburg, Trudel. 2012. Introduction to phonological variation in French: Illustrations from three continents, 1–19. Amsterdam/Philadelphia: John Benjamins. DOI:  http://doi.org/10.1075/silv.11.01ges

Goldwater, Sharon & Johnson, Mark. 2003. Learning OT constraint rankings using a maximum entropy model. In Spenader, Jennifer & Eriksson, Anders & Dahl, Östen (eds.), Proceedings of the Stockholm Workshop on Variation within Optimality Theory, 111–120. Stockholm: Stockholm University, Department of Linguistics.

Grammont, Maurice. 1914. Traité pratique de prononciation française. Paris: Delagrave.

Hambye, Philippe & Simon, Anne Catherine. 2012. The variation of pronunciation in Belgian French. In Gess, Randall & Lyche, Chantal & Meisenburg, Trudel (eds.), Phonological variation in French: Illustrations from three continents, 129–149. Amsterdam/Philadelphia: John Benjamins. DOI:  http://doi.org/10.1075/silv.11.08ham

Hansen, Anita Berit. 2012. A study of young Parisian speech: Some trends in pronunciation. In Gess, Randall & Lyche, Chantal & Meisenburg, Trudel (eds.), Phonological variation in French: Illustrations from three continents, 151–172. Amsterdam/Philadelphia: John Benjamins. DOI:  http://doi.org/10.1075/silv.11.09han

Haspelmath, Martin & Sims, Andrea. 2010. Understanding morphology. London: Routledge. DOI:  http://doi.org/10.4324/9780203776506

Hay, Jennifer. 2003. Causes and consequences of word structure. Psychology Press. DOI:  http://doi.org/10.4324/9780203495131

Hayes, Bruce & Wilson, Colin. 2008. A maximum entropy model of phonotactics and phonotactic learning. Linguistic Inquiry 39. 379–440. DOI:  http://doi.org/10.1162/ling.2008.39.3.379

Jun, Jongho. 2004. Place assimilation. In Hayes, Bruce & Kirchner, Robert & Steriade, Donca (eds.), Phonetically based phonology, 58–86. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511486401.003

Kruschke, John K. 2015. Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan. Academic Press 2nd edn. DOI:  http://doi.org/10.1016/B978-0-12-405888-0.00001-5

Léon, Pierre R. 1971. Essais de phonostylistique, vol. 4. Montréal: Didier.

Liddell, Torrin M. & Kruschke, John K. 2018. Analyzing ordinal data with metric models: What could possibly go wrong? Journal of Experimental Social Psychology 79. 328–348. DOI:  http://doi.org/10.1016/j.jesp.2018.08.009

LimeSurvey. 2012. LimeSurvey: An open source survey tool. http://www.limesurvey.org.

Malécot, André. 1976. The effect of linguistic and paralinguistic variables on the elision of the French mute-e. Phonetica 33(2). 93–112. DOI:  http://doi.org/10.1159/000259716

Myers, James. 2017. Acceptability judgments. In Oxford Research Encyclopedia of Linguistics, Oxford University Press. DOI:  http://doi.org/10.1093/acrefore/9780199384655.013.333

New, Boris & Brysbaert, Marc & Veronis, Jean & Pallier, Christophe. 2007. The use of film subtitles to estimate word frequencies. Applied Psycholinguistics 28. 661–677. DOI:  http://doi.org/10.1017/S014271640707035X

Ohala, John J. 1990. The phonetics and phonology of aspects of assimilation. Papers in laboratory phonology 1. 258–275. DOI:  http://doi.org/10.1017/CBO9780511627736.014

Ohala, John J. 1992. Alternatives to the sonority hierarchy for explaining segmental sequential constraints. In Ziolkowski, Michael & Noske, Manuela & Deaton, Karen (eds.), The parasession on the syllable in phonetics & phonology, 319–338. Chicago, IL: Chicago Linguistic Society.

Pater, Joe. 2007. The locus of exceptionality: Morpheme-specific phonology as constraint indexation. In Bateman, Leah & O’Keefe, Michael & Reilly, Ehren & Werle, Adam (eds.), University of Massachusetts Occasional Papers in Linguistics 32: Papers in Optimality Theory III. Amherst, MA: GLSA. DOI:  http://doi.org/10.7282/T38C9TB6

Plummer, Martyn. 2016. rjags: Bayesian graphical models using MCMC. https://CRAN.R-project.org/package=rjags. R package version 4-5.

R Core Team. 2020. R: A language and environment for statistical computing. R Foundation for Statistical Computing Vienna, Austria. https://www.R-project.org/.

Racine, Isabelle. 2007. Effacement du schwa dans des mots lexicaux: constitution d’une base de données et analyse comparative. Proceedings of JEL, 125–130.

Racine, Isabelle. 2008. Les effets de l’effacement du schwa sur la production et la perception de la parole en français. Geneva: University of Geneva dissertation. DOI:  http://doi.org/10.13097/archiveouverte/unige:602

Racine, Isabelle & Andreassen, Helene. 2012. A phonological study of a Swiss French variety: data from the canton of Neuchâtel. In Gess, Randall & Lyche, Chantal & Meisenburg, Trudel (eds.), Phonological variation in French: Illustrations from three continents, 211–233. Amsterdam/Philadelphia: John Benjamins. DOI:  http://doi.org/10.1075/silv.11.10rac

Racine, Isabelle & Durand, Jacques & Andreassen, Helene. 2016. PFC, codages et représentations: la question du schwa. Corpus 15. DOI:  http://doi.org/10.4000/corpus.3014

Racine, Isabelle & Grosjean, François. 2002. La production du e caduc facultatif est-elle prévisible? Un début de réponse. Journal of French Language Studies 12(3). 307–326. DOI:  http://doi.org/10.1017/S0959269502000340

Scheer, Tobias. 1999. Aspects de l’alternance schwa-zéro à la lumière de “CVCV”. Recherches linguistiques de Vincennes (28). 87–114. DOI:  http://doi.org/10.4000/rlv.1215

Schütze, Carson T. 2016. The empirical base of linguistics: Grammaticality judgments and linguistic methodology. Berlin: Language Science Press. DOI:  http://doi.org/10.26530/OAPEN_603356

Schütze, Carson T. & Sprouse, Jon. 2013. Judgment data. In Podesva, Robert J. & Sharma, Devyani (eds.), Research methods in linguistics, 27–50. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9781139013734.004

Smith, Brian W & Pater, Joe. 2020. French schwa and gradient cumulativity. Glossa: a journal of general linguistics 5(1). DOI:  http://doi.org/10.5334/gjgl.583

Smith, Jennifer L & Moreton, Elliott. 2012. Sonority variation in Stochastic Optimality Theory: Implications for markedness hierarchies. In Parker, Steve (ed.), The sonority controversy, 167–194. De Gruyter Mouton. DOI:  http://doi.org/10.1515/9783110261523

Smolensky, Paul & Legendre, Géraldine. 2006. The harmonic mind: From neural computation to optimality-theoretic grammar. Cambridge, MA: MIT Press.

Steriade, Donca. 1997. Phonetics in phonology: the case of laryngeal neutralization. Manuscript.

Storme, Benjamin. 2019. Contrast enhancement as motivation for closed syllable laxing and open syllable tensing. Phonology 36. 303–340. DOI:  http://doi.org/10.1017/S0952675719000149

Storme, Benjamin. 2021. Against the law of three consonants in French: Evidence from judgment data. In Bennett, Ryan & Bibbs, Richard & Brinkerhoff, Mykel Loren & Kaplan, Max J. & Rich, Stephanie & Rysling, Amanda & Handel, Nicholas Van & Cavallaro, Maya Wax (eds.), Proceedings of the 2020 Annual Meeting on Phonology. DOI:  http://doi.org/10.3765/amp.v9i0.4892

Vaissière, Jacqueline. 2002. Cross-linguistic prosodic transcription: French vs. English. In Volskaya, Nina & Svetozarova, Natalia & Skrelin, Pavel (eds.), Problems and methods of experimental phonetics. In honour of the 70th anniversary of Pr. L. V. Bondarko, 147–164. Saint Petersburg: Saint Petersburg State University Press. https://halshs.archives-ouvertes.fr/halshs-00316156.

Wright, Richard. 2004. A review of perceptual cues and cue robustness. In Hayes, Bruce & Kirchner, Robert & Steriade, Donca (eds.), Phonetically based phonology, 34–57. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511486401.002

Zuraw, Kie & Hayes, Bruce. 2017. Intersecting constraint families: an argument for Harmonic Grammar. Language 93(3). 497–548. DOI:  http://doi.org/10.1353/lan.2017.0035