Broad interest in probabilistic aspects of language has reignited debates about a potential delineation between the shape of an abstract grammar and patterns of language in use. A central topic in this debate is the relationship between measures capturing aspects of language use, such as word frequency, and patterns of variation. While it has become common practice to attend to frequency measures in studies of linguistic variation, fundamental questions about exactly what linguistic unit’s frequency it is appropriate to measure in each case, and what this implies about the representations or processing mechanisms at play, remain underexplored. In the present study, we compare how three frequency measures account for variance in Coronal Stop Deletion (CSD) based on large-scale corpus data from Philadelphia English: whole-word frequency, stem frequency, and conditional (whole-word/stem) frequency. While there is an effect of all three measures on CSD outcomes in monomorphemes, the effect of conditional frequency is by far the most robust. Furthermore, only conditional frequency has an effect on CSD rates in -

Systematic, intraspeaker linguistic variation has long played a central role in understanding the architecture of the grammar and its relationship to everyday language use. Early variationist sociolinguistic work posited that generative phonological rules (as in, e.g.,

At the same time, this increasing interest in modeling probabilistic aspects of language has renewed longstanding debates about the dividing line, if there is one, between grammar and language use. One of the core empirical phenomena that has fueled these debates is the sensitivity of variation to the frequency with which linguistic elements are used. In everyday experience, language users encounter different sounds, affixes, words, and syntactic structures at different rates. Highly frequent words, for example, may be used and heard many times every day, while infrequent words occur far more rarely. Unsurprisingly, it seems that these dramatic differences in usage frequency may influence many aspects of how people perceive and produce language. For example, generally speaking, the more frequent a word is, the more quickly it will be recognized (

In light of these theoretical consequences of frequency-type effects, it becomes important to address fundamental questions about what frequency

In many psycholinguistic models, for example, the unit of lexical representation is the lemma, which contains the syntactic properties of a word and is shared by morphological relatives with the same root (

In sociolinguistic research, it is more common to see the effect of W

Another possibility is that frequency effects on linguistic variation are actually driven by the predictability of words. Higher frequency words are more predictable, and therefore may be subject to greater compression and reduction (

This paper is an investigation into how these three frequency measures (stem frequency, whole-word frequency, and conditional frequency) relate to a well-studied linguistic variable in English, Coronal Stop Deletion (CSD). CSD is a variable process of consonant cluster reduction, probabilistically conditioned by factors such as rate of speech, phonological context, and morphological structure, that has also been shown to be associated with word frequency (typically whole-word frequency has been used). CSD provides a particularly fertile territory in which to explore these frequency measures (and the mechanisms of perception/production they’re related to) because -

While weighing our three frequency measures’ relative ability to capture variance in the CSD data will allow us to offer some practical methodological suggestions, the more important point we will make is that the different frequency estimates are

Relationship between frequency measures for monomorphemic (left) and

We argue that these results should lead us to understand the different frequency measures as different in kind, capturing different mechanisms that may affect linguistic variation. More specifically, if we adopt the theoretical interpretations that we have already briefly suggested, in which stem frequency approximates variable ease of lexical access, whole word frequency captures the long-term accumulation of reduction in the word form, and conditional frequency is a proxy for the variable predictability of word forms, our finding of a strong and consistent influence of conditional frequency points to an important role for predictability in CSD. However, this interpretation also suggests that there is no one simple effect of “word frequency” that can be expected to have a uniform influence on different phenomena; in other words, our results should not be interpreted as showing that conditional frequency is the “correct” frequency measure to use in the study of variation across the board. Rather, we conclude that the question of how different frequency measures relate to any given phenomenon is an empirical one: different variable phenomena may turn out to be more or less sensitive to the different mechanisms that these measures tap into. As a methodological issue, then, the selection of a frequency measure to use in quantitative analysis ought to be a considered one. More interestingly, though, empirical evidence about what kind of frequency is most closely associated with the use of linguistic variable phenomena of various types can be brought to bear on the question of what factors most heavily influence the production of those variables. This can, in turn, enrich our understanding of the place of variation in grammatical systems and the interaction between grammar and use.

CSD is, in simplest terms, the variable deletion of word-final coronal stops following consonants in English (e.g.,

Researchers have posited a range of competing explanations, both formal and functional, for the morphological conditioning of CSD (see, for example,

All accounts of the grammatical conditioning of CSD are complicated by other morphological categories whose rates of CSD are intermediate between monomorphemes and regular past forms. These include semiweak verbs where past tense is marked both with a vowel change and a coronal stop (e.g.

While CSD is typically described in terms of a discrete distinction between ‘present’ and ‘absent’ realizations, this perspective has been called into question. Temple (

This possible interaction between frequency and morphological structure, and its implications for theories of frequency and variation in form, is a central topic that we probe in this paper. In the following two subsections, we review first the literature on how frequency influences the phonetics and phonology of word forms, and second the literature on how frequency relates to morphological representation and processing.

Frequency, specifically whole-word frequency, is associated with variation in phonetic and phonological form in many cases. In general, frequent whole-words tend to be pronounced faster, and in more lenited or reduced forms, than infrequent whole-words. This is relevant insofar as we conceive of CSD as an example of lenition, and we generally expect phonetic reduction and lenition to be intimately related to duration (

Beyond gradient phonetic properties like duration, there exist a number of variables where the apparent rate of discrete variants

Outlying results notwithstanding, it seems generally true that frequent words are more susceptible to compression and ‘weakening’ of their pronunciations. Explanations for this kind of reduction phenomenon fall into three main theoretical camps (

We now turn to a brief examination of the relationship between frequency and morphological structure, with reference to both sociolinguistic and psycholinguistic results that highlight possible frequency–morphology interactions. As we have mentioned, there is already some reason to believe that frequency and morphological structure interact in how they condition CSD itself. Myers & Guy (

The relevance of morphological structure for word processing has led to more widely recognized interactions in this domain. There is some evidence that morphologically complex words are generally recognised faster than monomorphemic words of equal length and frequency (

In addition to basic frequency/morphology interactions in behavioral reaction times, there is also a growing body of work making inferences about what level of representation is active at a given point in the timecourse of spoken word recognition based on what kind of frequency measure correlates best with neural activity during processing. Specifically, a number of MEG studies find neurological activity to be most strongly correlated with measures of morphological structure, including lemma frequency and the transition probability between stem and suffix, at around 170ms (

Among the studies that do apply this strategy of comparing frequency measures to explore the role of morphological structure in production, one interesting result that has emerged is evidence of ‘paradigmatic enhancement’ effects. As well as the basic effect whereby frequent items are realized (and recognized) faster as a result of their predictability or ease of retrieval, some words with a high frequency compared to morphologically related words within the same paradigm are reinforced and pronounced with

For this paper, data are taken from the Philadelphia Neighborhood Corpus of LING560 Studies (PNC) (

CSD outcomes were hand-coded according to auditory and spectrographic cues. A Praat script

In concrete terms, the goal of this study is to evaluate how different frequency-related measures may be associated with variable CSD. In particular, we are interested in whether it is the frequency of the whole-word, the frequency of some smaller constituent, or indeed the frequency relationship between the whole-word and its component parts, that best predicts CSD outcomes. To that end, we compare how well three different measures, calculated from values in the SUBTLEX-US corpus, account for variance in the CSD variable. These three measures, which we introduced briefly in §1, do not exhaust all possible relationships between the frequency of different strings or units and CSD, but they do capture several distinct perspectives on how frequency measures might be relevant to the variable at hand.

Our first such measure, whole-word frequency, is extracted from the FREQlow values in SUBTLEX-US: the raw number of times that a word appeared in the corpus in lower case. This measure, or a similar one, is the most widely used in linguistics, but it has some quirks. For example, in SUBTLEX-US, as in other corpora, frequency norms are calculated according to orthographic strings. This means that homographs have the same FREQlow value whether or not they are phonologically or morphologically related. However, whole-word frequency essentially approximates the frequency of a surface phonological form. This measure was natural log-transformed and centred with the mean at zero.
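As a minimal sketch of this transformation (using made-up counts rather than real SUBTLEX-US values), the log-transform and centring can be computed as follows:

```python
import math

# Hypothetical FREQlow-style counts for a handful of words
# (illustrative only; not real SUBTLEX-US values).
freqlow = {"mist": 120, "fact": 5400, "hand": 21000}

# Natural log-transform each raw count.
logged = {w: math.log(f) for w, f in freqlow.items()}

# Centre so that the mean of the transformed values is zero.
mean_log = sum(logged.values()) / len(logged)
centred = {w: v - mean_log for w, v in logged.items()}

# By construction, the centred values sum to (approximately) zero,
# and more frequent words retain higher values than less frequent ones.
print(centred)
```

Centring in this way keeps the predictor's scale interpretable in a regression: a coefficient then describes the change in CSD odds relative to a word of average (log) frequency.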

We call our second measure stem frequency.

The third measure is conditional frequency, which is computed from the other two: the whole-word frequency divided by the stem frequency. Quantitatively speaking, conditional frequency is a proportion, bounded by 0 and 1. In other words, conditional frequency approximates the frequency of a particular word among its morphological relatives.
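A one-line sketch of the computation, with hypothetical counts chosen purely for illustration:

```python
# Hypothetical counts (not real corpus values): tokens of the whole word
# "walked" versus the summed tokens of all words sharing the stem "walk"
# (walk, walks, walked, walking, walker, ...).
whole_word_freq = 3200
stem_freq = 21000

# Conditional frequency: the proportion of stem tokens realized as this whole word.
conditional_freq = whole_word_freq / stem_freq

# As a proportion, the value is necessarily bounded by 0 and 1.
assert 0 <= conditional_freq <= 1
print(conditional_freq)
```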

The primary methodology used in this paper is comparison of mixed effects logistic regression models using the

A central goal of this article is to compare multiple measures which are not only arithmetically related, but also attempt to capture similar (if not identical) aspects of how words are represented and processed. Therefore, before assessing the relative contributions of each of these frequency measures to CSD outcomes, we must explore the relationship between them.

As is evident from

Monomorphemes and regular past forms differ in particular in their conditional frequency distributions. While monomorphemes are distributed fairly evenly, the majority of regular past forms have a very low conditional frequency. Given the properties of these word types, this might not be entirely unexpected. By definition, regular past forms are verbal, and implicate a whole paradigm of differently-inflected verb forms whose whole-word frequencies contribute to the stem frequency value. As a result, the regular past form often makes up only a small part of the stem frequency. On the other hand, the monomorphemic class includes words from a number of parts of speech that differ in the types of morphological relatives that occur.

The investigation of correlations between the different frequency measures gives us confidence that it is reasonable to include both conditional frequency and stem frequency as predictors in a single model. Conversely, we should be wary of multicollinearity effects in models with other pairs of frequency measures. For the sake of completeness, we include all possible combinations of frequency measures in our model comparison analysis, but note that some improvements to model fit are likely to be artifacts of the relationship between measures.

In order to probe which frequency measure best captures variance in CSD, we compared a series of logistic regression models predicting CSD outcomes. The baseline model does not contain any frequency measures but does include fixed effects for speech rate, grammatical class, and following segmental context, plus a random intercept for speaker. The subsequent models add all possible combinations of the three frequency measures to this baseline model. We use likelihood ratio tests to assess whether each additional level of complexity (i.e. each additional frequency measure) is warranted as a significant improvement over the nested smaller models.
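The logic of a likelihood ratio test between two nested models can be sketched as follows (the log-likelihoods here are hypothetical; for a single added predictor, the statistic is compared against a chi-square distribution with one degree of freedom):

```python
import math

def lr_test_1df(loglik_small, loglik_big):
    """Likelihood ratio test for nested models differing by one parameter.

    The statistic 2 * (logLik_big - logLik_small) is chi-square distributed
    under the null; for 1 degree of freedom the survival function reduces
    to erfc(sqrt(stat / 2)), so no external stats library is needed.
    """
    stat = 2 * (loglik_big - loglik_small)
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

# Hypothetical log-likelihoods: a baseline model vs. baseline + one frequency measure.
stat, p = lr_test_1df(loglik_small=-1500.0, loglik_big=-1492.0)
print(stat, p < .001)  # an 8-unit log-likelihood gain is highly significant
```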

In addition to likelihood ratio tests, each model's AIC (Akaike Information Criterion), BIC (Bayesian Information Criterion), and log-likelihood statistics were recorded. While the log-likelihood is inevitably improved by adding additional complexity to a model, the AIC and BIC penalize model complexity at the same time as evaluating a model's ability to account for variance. This is especially true of the BIC, whose penalty for additional complexity grows with the log of the number of observations, and which frequently disagrees with the AIC in favour of a simpler model. Together, these information criteria provide the clearest evaluation of these models, indicating in particular where multiple frequency measures do not account for enough variance to justify their inclusion.
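The difference between the two criteria can be made concrete with the standard formulas (AIC = 2k − 2·logLik; BIC = k·ln(n) − 2·logLik), applied here to hypothetical values:

```python
import math

def aic(loglik, k):
    # AIC penalizes each extra parameter by 2, regardless of sample size.
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    # BIC's per-parameter penalty, ln(n), grows with the number of observations.
    return k * math.log(n) - 2 * loglik

# Hypothetical values: a baseline model (5 parameters) and one with an added
# frequency predictor (6 parameters), each fit to 5000 tokens.
ll_base, ll_freq, n = -1500.0, -1497.0, 5000

print(aic(ll_base, 5) > aic(ll_freq, 6))        # True: AIC prefers the bigger model
print(bic(ll_base, 5, n) > bic(ll_freq, 6, n))  # False: BIC prefers the baseline
```

With 5000 observations, ln(n) ≈ 8.5, so BIC demands a log-likelihood gain of over 4 units per added parameter while AIC demands only 1 — exactly the kind of disagreement described above.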

Information criteria reduction from baseline comparing models of full dataset (triangles = most reduced).

Including each of the three frequency measures, individually, yields information criteria statistics that are somewhat reduced compared to the baseline model. This result is reinforced by significant likelihood ratio tests (p < .001) in each case. However, the reduction in both AIC and BIC attained from the addition of conditional frequency far outstrips that of the other measures. In fact, the addition of conditional frequency provides large reductions in both AIC and BIC regardless of any other frequency measures already included in a model. The model comparison also suggests that the combination of stem frequency and whole-word frequency in a single model is a significant improvement over just one of these measures. However, we cannot rule out the possibility that this is an artifact of the strong correlation between these measures causing inflation of their estimated effects. In addition, neither stem nor whole-word frequency significantly improves any model that already includes an effect of conditional frequency. This is demonstrated by likelihood ratio tests (p > .05), and by the fact that these measures do not account for enough additional variance to counteract the penalty for model complexity applied by either the AIC or the BIC.

The initial model comparison results point to a need to reconsider how frequency is accounted for in linguistic variation. In particular, the success of conditional frequency over other measures in terms of accounting for variance suggests that the interplay between word frequency and morphological structure within the lexicon is important and underexplored. Morphological structure is particularly relevant for a variable like coronal stop deletion, since it has repeatedly been reported that coronal stops at the end of monomorphemes are more likely to be deleted than coronal stops that constitute -

Observed CSD outcomes according to each frequency measure and morphological class.

Compared to the results for whole-word and stem frequency, the results for conditional frequency are striking. Here, not only is there an effect for both monomorphemes and regular past forms, but the lines are almost parallel. This helps to explain why conditional frequency was so highly favored by the model comparison for the combined data in

The differences in the effects of the frequency measures between morphological categories are not captured by the regression models we have been discussing, because they do not include any interaction terms targeting the non-independence of frequency and grammatical category. As a result, the best models we've presented so far, which combine data from regular past and monomorphemic words (sum-coded), will compromise between the two. In other words, a frequency measure that might be best for one group of words will be penalized if it is inappropriate for another. This raises questions about the performance of frequency measures within morphological categories, which are not addressed by the models we have presented so far. Therefore, in the following subsections we divide the data by morphological class and test the different frequency predictors within each word type.

We begin by adopting the same method of model comparison as described for the full dataset, implemented over a subset of the data containing only monomorphemes. Once again, all models include fixed effects for speech rate and following segmental context, and a random intercept for speaker, but since all the words are monomorphemic no morphological category predictor is included.

In

Information criteria reduction from baseline comparing models of monomorpheme subset (triangles = most reduced).

In terms of the models that best reduce the information criteria, the results for the monomorpheme models are slightly less straightforward than for the full dataset in that the AIC and BIC disagree. Once again, the BIC is lowest for the model with just conditional frequency in addition to the baseline effects. However, the AIC is lower in the models containing at least one other frequency measure in addition to conditional frequency, and lowest in the model with all three measures. This suggests the other measures do capture enough variance in monomorphemes to outperform the relatively small penalty for additional model complexity that is applied in the computation of AIC. This seems especially true for stem frequency, which significantly improves the fit of every model it is added to according to likelihood ratio tests (p < .05). This includes all models with conditional frequency and/or whole-word frequency already present. In contrast, likelihood ratio tests do not show whole-word frequency to significantly improve models with conditional frequency already present. This is likely due, in large part, to the complete absence of a correlation between conditional and stem frequency for monomorphemes, such that they do not compete to account for the same variance.

Just like for monomorphemes, we conducted the same method of model comparison for the regular past forms alone. Again, all models include fixed effects for speech rate and following segmental context, and a random intercept for speaker. According to

Information criteria reduction from baseline comparing models of complex form subset (triangles = most reduced).

Unsurprisingly, conditional frequency once again introduces a large reduction in both the AIC and BIC of every model it is added to, as well as a significant improvement in terms of likelihood ratio tests (p < .001). Unlike for the full and monomorpheme datasets, not all of the frequency measures improve the baseline model when they are added individually. The addition of whole-word frequency does not account for enough variance to overcome the penalty for model complexity in either the AIC or BIC, and does not significantly improve model fit according to a likelihood ratio test (p>.1). Stem frequency, on the other hand, does marginally reduce the AIC and significantly improve model fit according to a likelihood ratio test (p < .05), but the magnitude of its improvement is still less than the penalty applied by the BIC for introducing additional complexity to the baseline model. Once again, the combination of both whole-word and stem frequency apparently reduces both the AIC and BIC by a fair amount compared to the baseline model. Even though the correlation between whole-word and stem frequency is weaker for regular past forms than for monomorphemes, it is still strong enough that this effect is likely to be an artifact of multicollinearity, especially given how poorly both whole-word and stem frequency perform individually.

Like for the monomorpheme models, the AIC and BIC disagree as to the optimal model for regular past forms. For the third time, the model with conditional frequency alone is favored by the BIC, and additional frequency measures are penalized for unnecessary complexity. However, this time, the AIC is minimized in the model with both conditional and whole-word frequency. This is despite the fact that whole-word frequency performed poorest when it was added to the baseline model individually,

As we have discussed at some length in Section 2, the frequency of a word—or other linguistic unit—is associated with differences in the way it is perceived, produced, or even represented in the mind of the perceiver or producer. As such, even when it is not a study’s primary concern, contemporary studies in various subfields of linguistics take steps to control for some relevant measure of frequency. However, several possible such measures are available, and their different properties are relatively underexplored. Moreover, the complex interplay between the frequency of different sub-lexical units and the morphological structure of words is rarely considered, especially within the quantitative analysis of linguistic variables like CSD.

With respect to these questions, there are two clear results to take away from §4, which this section will discuss in turn. First, whole-word frequency (and to a lesser extent, stem frequency) is a significant predictor of CSD in monomorphemes but not in regular past tense forms. The direction of the effect within monomorphemes is as expected for reduction phenomena in general, with more CSD in higher-frequency whole-words. Second, both monomorphemes and past tense forms are highly sensitive to conditional frequency, again in the direction of more CSD with higher conditional frequency. Conditional frequency therefore has both a stronger and a more pervasive effect on CSD than the more familiar whole-word frequency measure. In the following subsections, we discuss these two results in light of their theoretical implications.

Whole-word frequency and stem (or ‘base’ or ‘lemma’) frequency are the measures of frequency most commonly incorporated into studies in contemporary sociolinguistics and psycholinguistics. For our data, it turns out that these measures are very highly correlated, and correspondingly predict extremely similar patterns of CSD across different subsets of the data. On the assumption that these frequency measures would also correlate this strongly throughout the lexicon (not just for our subset of CSD words), we offer the methodological recommendation that whole-word frequency, which is considerably more straightforward to implement than stem frequency, will be at least as effective as stem frequency for capturing frequency-related variance in other linguistic variables. In other words, for researchers who simply want to incorporate a reasonable frequency control into studies that are primarily aimed at investigating other phenomena, it will not be worth the effort to operationalize a stem frequency measure. With regard to the specific pattern found using these measures, we observe a main effect of whole-word and stem frequency on CSD outcomes for the monomorphemes—coronal stops are more likely to be deleted at the end of frequent monomorphemes than infrequent monomorphemes—but not for regular past forms. An equivalent interaction between morphological category and whole-word frequency has also been reported for other CSD datasets (

A potential avenue for explanation comes from Erker & Guy (

On the other hand, a deficiency of the amplification story is that, at least for CSD, grammatical categories are treated more or less like arbitrary labels for words. In reality, monomorphemes and regular past forms differ in terms of morphological complexity, which may explain what we observe in terms of sensitivity (or lack thereof) to measures of frequency. Morphological complexity has two relevant properties as pertains to frequency. The first is that of informativity: while coronal stops at the end of monomorphemes are often highly predictable and contain no additional disambiguating information about the word, coronal stops at the end of regular past forms constitute a suffix that marks past tense. Moreover, when this suffix is deleted, regular past forms are always homophonous with a present or infinitival form of the verb. These are some of the primary concerns of linguists who ascribe a ‘functional’ motivation to grammatical patterns of CSD, arguing that deletion is avoided in cases where it would eliminate important past tense information (e.g.

The second relevant property of morphological complexity is that it entails pieces (whether independently-represented or emergent from shared phonology and semantics) being shared across words. That is, not only does CSD target an informative suffix when it applies in regular past forms, it targets the

What we have called ‘conditional frequency’ is the proportion of instances of a stem that are realized as a certain whole-word. Unlike for whole-word and stem frequency, we find strong effects of conditional frequency on predicting CSD outcomes in all of our regression models. For regular past forms, conditional frequency corresponds to the decontextualized probability of the -

While -

Our results, that high conditional frequency corresponds to a high rate of coronal stop deletion, conflict with some recent findings of ‘paradigmatic enhancement’ effects. This is the class of results where the most common reflexes of a particular word or morpheme are found to be phonetically reinforced rather than reduced. These effects are framed from both speaker-oriented and passive perspectives. They are commonly interpreted in terms of speakers articulating common reflexes of a morpheme with increased confidence, suggesting an on-line pressure to reduce in cases where the speaker is unconfident. At the same time, speaker confidence itself has been explained as the result of extensive motor practice, allowing these words to be executed with enhanced kinematic skill (

In the case of -

Whether researchers are directly exploring frequency effects or are trying to control for frequency in their statistical model, we must consider which frequency measure to use, how that measure relates to the purported mechanism of its influence, and how it may interact with morphological or other linguistic structure. While whole-word and stem frequency are the most commonly used frequency measures, our model comparisons showed that conditional frequency is a strong predictor of CSD. This result suggests that greater consideration should be given to conditional frequency as a predictor in the study of phonological variation. From a purely methodological perspective, it far outperforms whole-word and stem frequency in terms of accounting for variance in the dataset as a whole as well as in subsets restricted to words of just one grammatical category. Moreover, we see that conditional frequency is, at most, relatively weakly correlated with the other measures in this study (

But the value of giving greater attention to conditional frequency is not purely methodological. We have interpreted conditional probability in terms of the predictability of either an -

In asking how frequency is measured, we are not concerned with comparing different frequency estimates for a single linguistic unit, but rather

While they are categorized in discrete terms, for many of these variables the question of whether they arise in the phonetics or phonology is not settled.

Code available at

This is the usual decision for CSD studies on American English. It has recently been suggested that British English glottal replacement of /t/ blocks CSD (

For example, quasi-gemination across word boundaries makes it very difficult to distinguish between

The true monomorphemes do not noticeably differ from a more traditional ‘monomorphemic’ category in terms of their sensitivity to different frequency measures.

The ‘regular past’ category includes all preterite, perfect, and passive forms featuring an -

Similar measures to our stem frequency measure have been called lemma frequency in previous literature. However, lemma frequency typically only includes inflectionally related words that share a stem. Since we count both inflectionally and derivationally related words that share a stem, we opted for a different name.

The existing literature on CSD gives us no reason to expect that part of speech is an important dimension for the variable.

The full results of model comparison can be found in Appendix A.

Interactions of morphological class with whole-word frequency and stem frequency are fairly significant when they are added to models, but they are always heavily penalized in model comparison.

The additional file for this article can be found as follows:

[[Description]]. DOI:

The authors have no competing interests to declare.