1 Introduction

The phonetic properties of rhotics exist within a spectrum of interrelated articulatory gestures, but without one common unifying characteristic that distinguishes them from other sounds. Therefore, it is not possible to universally classify them based on surface features while also excluding sounds that do not demonstrate rhotic-like phonological behavior (see Section 2). A more promising approach has been to treat rhotics based on combinations of articulatory gestures (Magnuson 2007; Proctor 2011; Sebregts 2014) or their phonological behavior (e.g. Wiese 2001a; b; 2011; Chabot 2019; Youssef 2019; Natvig & Salmons forthcoming). For example, Proctor (2011) shows evidence that rhotics, and liquids generally, instantiate a range of dorsal and coronal articulations. With respect to phonology, however, Chabot (2019) argues for a rhotic definition based on behavior, not features, because their surface forms are often irrelevant to their phonological patterning. This finding is further supported in Youssef’s (2019) analysis of r-sounds’ contrasts, distributions, and phonological behavior across Arabic dialects. In fact, Chabot states that “there may be no way of uniting rhotics via representational models; that is, they cannot be understood outside of the role they play within a system” (2019: 3), leading to the conclusion that “if rhotics are in an arbitrary phonetics/phonology relationship, such a relationship must in principle be possible for all phonology” (Chabot 2019: 18). The question, then, is whether the degree of arbitrariness that rhotics exhibit is attested for other phonemic classes, or whether there are structural motivations for this arbitrary relationship between phonetics and phonology.

Phonetic features for a particular phoneme may be overspecified relative to their corresponding phonological representations (Purnell 2009). However, there are processes that reflect clear correspondences between phonological representations and phonetic properties (e.g. Halle et al. 2000). Laryngeal assimilations are one example: Laryngeal features are shared within the domains of the onset, nucleus, or coda of a syllable (Kehrein & Golston 2004). When adjacent phonemes have conflicting features, i.e. voiced and voiceless or fortis (aspirated) and lenis (plain voiceless), one will take on the feature of the other. The specific feature (voicing or aspiration) that spreads depends on how a language marks the contrasting phonemic classes. For example, Dutch differs from the general Germanic pattern by the active spreading of voicing instead of aspiration (Iverson & Salmons 1995; 2003). Fortis stops in Norwegian, by contrast, trigger both sonorant devoicing and regressive assimilation (Kristoffersen 2000: 77; Allen 2016: 106ff).1 Adjectives that modify neuter nouns are marked by a -t suffix and induce regressive assimilation, as in trygg [¹tʰɾ̥yɡ] ‘safe’ vs. trygt [¹tʰɾ̥ykʰt] ‘safe (NEUT.SG)’.2 Here fortis /tʰ/, both as the onset consonant and the neuter agreement marker, devoices, or spreads its aspiration to, the following /r/ and the preceding [ɡ], respectively.3 In Dutch, on the other hand, the compounding of meet ‘measure’ and band, resulting in mee[db]and ‘tape-measure’, signals voicing activity, where the laryngeal features of /b/ spread to the preceding voiceless /t/ (Iverson & Salmons 2003: 12). The Norwegian and Dutch examples demonstrate two types of laryngeal systems, aspirating and voicing respectively, in the laryngeal realism framework (Iverson & Salmons 1995; Honeybone 2005). The active assimilatory features are encoded phonologically and contrast with an unmarked category.
Namely, the Norwegian fortis series is specified for aspiration features ([spread glottis]) and the lenis series lacks laryngeal features. The configuration for Dutch, then, is reversed. The voiceless stops are unmarked, whereas the voiced ones are specified for voice, or [slack]. From a laryngeal realist perspective, these alternations have a non-arbitrary relationship between the phonetics and the phonology: The phonologically active categories are aspirated in Norwegian and voiced in Dutch.
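The privative logic of these two systems can be illustrated with a minimal Python sketch. It covers only the regressive pattern in the two examples above (trygt and meetband), treats each segment as a bare set of laryngeal features, and uses hypothetical names (`ACTIVE_FEATURE`, `regressive_assimilation`) that are not part of any cited analysis.

```python
from typing import FrozenSet, Tuple

# The phonologically active (specified) laryngeal feature per language,
# following the laryngeal realist description in the text.
ACTIVE_FEATURE = {"Norwegian": "spread glottis", "Dutch": "slack"}

Segment = FrozenSet[str]  # a segment is reduced to its laryngeal features


def regressive_assimilation(
    language: str, first: Segment, second: Segment
) -> Tuple[Segment, Segment]:
    """Spread the active feature from the second consonant to the first.

    Only the marked feature can spread; the unmarked member has nothing
    to contribute, so an unmarked second consonant leaves the first as is.
    """
    active = ACTIVE_FEATURE[language]
    if active in second and active not in first:
        return first | {active}, second
    return first, second


# Norwegian trygg + -t: unmarked lenis /g/ before fortis /t/.
g, t_fortis = frozenset(), frozenset({"spread glottis"})
norwegian = regressive_assimilation("Norwegian", g, t_fortis)

# Dutch meet + band: unmarked voiceless /t/ before voiced /b/.
t_plain, b = frozenset(), frozenset({"slack"})
dutch = regressive_assimilation("Dutch", t_plain, b)
```

In both calls the first consonant acquires the feature of the second, but the feature differs by language, which is the sense in which the active category is aspiration in Norwegian and voicing in Dutch.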

Although the phonetic outputs in Norwegian and Dutch align with their phonological correlates, this is not always the case. Swedish, for example, has a fully voiced lenis series but assimilation patterns similar to Norwegian, where aspiration, not voicing, is actively spread from fortis consonants (Riad 2014: 49). Even though the surface forms of Swedish consonants are exponents of both aspirating and voicing systems, the language’s phonological behavior indicates that it has a fortis/lenis phonological configuration. Because there is no evidence of voicing activity in Swedish (Riad 2014: 49), the full voicing of its lenis stops is not the result of a phonological feature. It is a phonetic property of that natural class that, in this case, enhances its contrast against the fortis set (Iverson & Salmons 2003; Section 3.3). Phonetics does not determine a particular phonological system. Rather, the patterns of phonetic behavior influence the construction of phonologically active feature representations. The composition of these representations reflects what is and what is not arbitrary in any given language. How a system marks contrast results in the substantive and arbitrary categories in the phonology (Avery & Rice 1989; Rice 1999; 2009).

In contrast to laryngeal realism, the laryngeal relativist position argues against a direct relationship between a phonological feature and assimilatory voicing patterns (Cyran 2014). A critique of laryngeal realism is that phonological representations are too strongly biased toward phonetic features: “the presence of full voicing is taken to be an indication of the presence of the element [voice] […] while aspiration leads to the postulation of [spread]” (Cyran 2014: 10). Furthermore, Cyran (2014: 14) argues against privativity, stating that “privative models, when confronted with symmetrical assimilation, must distinguish between a phenomenon which is the result of spreading of the active laryngeal category […] and one which is different in kind.” I agree that there is a clear distinction between the phonetics and the phonology (Section 3.3); I argue that phonological features must be analyzed in terms of contrast and behavior (Section 3.1). However, the discussion of Swedish above attests to the ability of laryngeal realist positions to disambiguate surface features that correspond to substantive phonological representations and those that do not. Finally, I concur that privative analyses require clear parameters to distinguish between active phonological processes on the one hand, and passive phonetic coarticulations on the other. For example, Purnell et al. (2019) examine the symmetrical fronting and backing processes in Old English using privative marking for only front vowels (see Section 3.2). I address this issue further with suggestions for some general characteristics of modular behavior in Section 4.2.

With respect to rhotics, Chabot’s (2019) appeal to a phonological conceptualization that must operationalize their behavior and relationships to other phonemes is sound and duly justified. However, it is precisely these relationships that provide evidence for the (lack of) content of a rhotic’s representations. Whether a phoneme’s representational structure contains or lacks a particular reference to phonetically relevant information is the result of that phoneme’s structural relationship to the other phonemes in a particular inventory (Avery & Rice 1989). To advance the phonological understanding of rhotics, I propose a representational definition for liquids and rhotics as underspecified and unspecified consonantal sonorants, respectively (Section 3.3). The framework for this proposal draws on Modified Contrastive Specification (MCS; Dresher et al. 1994) and a modular organization of the sound system (Purnell & Raimy 2015; Purnell et al. 2019). Phonological representations are based on abstract contrastive relationships between sounds in a given phonemic inventory. I demonstrate that considering liquids and rhotics as a distinct, underspecified class of sonorants accounts for the myriad variants of attested r-sounds without expecting other, more clearly defined phonological categories to behave in a similar fashion. By defining a unique representational class that languages implement for variable rhotic forms, this analysis provides a structural basis for gestural classifications of liquids and r-sounds (e.g. Proctor 2011) and complements phonological accounts based on their phonotactic patterns (Wiese 2001a; Chabot 2019).

This article is structured as follows: I briefly discuss challenges associated with classifying rhotics into a single natural class in Section 2. In Section 3, I present the theoretical models I draw on to propose a unifying definition of rhotic as a phonological class. Next, in Section 4, I elaborate on the implications of this proposal, particularly regarding different types of rhotic inventories and Chabot’s (2019) insights that rhotics are systemically stable both diachronically and synchronically. I conclude in Section 5.

2 Rhotics as a negative phonological class

Attempts to unify rhotics under a phonological category based on shared articulatory or acoustic properties have proven unsuccessful. Lindau (1985: 167) describes rhotics as showing “more of a family resemblance” to each other, with some overlapping common features, but with no single, unifying property. While considering a range of articulatory and acoustic manifestations of rhotics, Ladefoged & Maddieson (1996: 215) consolidate them based on the orthographic criterion that they correspond to sounds usually written with the grapheme <r>. The difficulty in organizing rhotics into a cohesive phonological class is evident from the variants represented in Table 1, where possible r-sounds range in place from labial to glottal and in manner across trills, taps or flaps, fricatives, and approximants, often with voiced and voiceless counterparts as well (Chabot 2019: 13). Additionally, many r-types can occur with pharyngealized variants. For example, alveolar taps/trills, uvulars, or alveolar/retroflex approximants across a range of Arabic dialects can present with pharyngealized and non-pharyngealized forms (Youssef 2019: 5).

Table 1

Summary of rhotic segments (from Chabot 2019: 13).

 | Labio-dental | Dental | Alveolar | Postalveolar | Retroflex | Velar | Uvular | Glottal
Trill | | | r̥ r | | | | ʀ̥ ʀ |
Tap/flap | | | ɾ̥ ɾ | | ɽ | | |
Fricative | | | | ʃ ʒ | ʂ ʐ | x ɣ | χ ʁ | h ɦ
Approximant | ʋ | | ɹ̥ ɹ | | ɻ̊ ɻ | ɰ | |

Rhotics, however, do share gestural features in that they are produced as constellations of coordinated tongue gestures (Browman & Goldstein 1995). Magnuson (2007) analyzes the interrelatedness of these gestures via the dual vocal tract model, which considers sets of independent primary and secondary articulations in the laryngeal/pharyngeal vocal tract and the oral vocal tract. Furthermore, Proctor (2011) shows that laterals and rhotics in Spanish and Russian activate both coronal and dorsal gestures, finding that the “dark/light” allophony in both languages is the result of different dorsal constrictions, not the presence or absence of the gesture. Sebregts (2014: 233) sees the co-occurrence of coronal and dorsal gestures as a property of taps and trills that gave rise to the retroflex or bunched approximants in Dutch, as well as an articulatory link between alveolar and uvular variants. Liquids in general, and rhotics in particular, may initiate coronal and dorsal gestures; rhotics that fall within the dorsal range in Table 1 (the velars, uvulars, and glottals) lack a surface coronal articulation. Finally, the labio-dental approximant [ʋ], as well as lip rounding as a coarticulation for English /r/ (Keyser & Stevens 2006: 51), indicates that labial features are also accessible for rhotic variants. As a class, r-sounds access and activate labial, coronal, and dorsal articulators to various degrees, and with various types of constrictions. I argue in Section 3.3 that these overlapping, co-occurring articulatory properties result from language-specific, sub-phonological processes.

It is clear that r-sounds employ a range of complex articulations and that those articulatory features serve as distinguishing properties of other phonemic categories. However, rhotics display strikingly similar phonological patterns regardless of their particular surface forms (Lindau 1985; Ladefoged & Maddieson 1996: 216; Wiese 2001a; Stausland Johnsen 2012; Chabot 2019; Natvig & Salmons forthcoming). Therefore, sound systems must operate with rhotic categories that license both their phonological behavior and a range of phonetic outputs. As Wiese explains, “If there are at least eight or nine different r-sounds which are identified in a fairly unproblematic manner, it should be possible to state a reasonably precise definition of what an r-sound is. But this is by no means the case” (2001a: 338).

The disunited properties of rhotics have also long been observed in theoretical work. For example, Trubetzkoy observes that they may better be described by their negative content. For German, he states that /r/ “is not a vowel, not a specific obstruent, not a nasal, nor an l” (Trubetzkoy 1969: 73, cited in Dresher 2009: 44). Accordingly, what is and is not an /r/ is a result of its relationships to other members of a phonological inventory, including which other sounds are represented phonemically, and how rhotics show particular phonological behavior relative to those other phonemes. In order to define /r/ in terms of its phonological patterns, Wiese (2001a: 360) argues that phonotactics (immediately after glides on the sonority scale) best unify rhotics as a class. He suggests that their segmental properties are either irrelevant or highly variable (see also Wiese 2011). Chabot develops this perspective by showing that rhotics behave as sonorants irrespective of their phonetic properties (2019: 11). Furthermore, rhotic variants may change over time without invoking changes to the phonological system. Chabot defines these characteristics as “procedural stability” and “diachronic stability,” respectively (2019: 8–11). These synchronic and diachronic properties lead Chabot (2019) to conclude the phonology does not make reference to features that govern the surface pronunciations of rhotics and that the phonetic variants are mediated by language-specific operations, not linguistic universals.

Although rhotics may be produced with a broad range of major articulatory gestures — labial, coronal, dorsal — their phonemic representations overwhelmingly lack reference to all of those features, if any of them. In the following section, I present a representational definition for r-sounds that models their negative phonological content, as well as their robust articulatory properties and phonetic variation.

3 Phonological representation of rhotics and interfaces within the sound system

I operationalize Trubetzkoy’s (1969) characterization that rhotics are an empty phonological class, existing in opposition to other phonemes. In order to model these relationships, I adopt the position that the phonology defines abstract contrastive relationships (Hall 2007; Dresher 2009) and that these contrasts are marked privatively (Iverson & Salmons 1995; Avery & Idsardi 2001). These categories are subsequently converted into pronounceable phonetic forms through a series of (at least three) modular interfaces (Purnell & Raimy 2015; Purnell et al. 2019). Throughout the remainder of this section, I present this model of the sound system as the foundation for a cohesive phonological representation of rhotics that is sensitive to language-specific phonemic inventories.

3.1 Modified Contrastive Specification

A central issue in understanding phonological representations is determining which features are present, and which phonemes in a language’s phonological inventory those features specify. The core position of what Avery & Idsardi (2001: 45) call the “Toronto School of Contrast” is that phonemes’ phonological representations are based on language-specific contrastive relationships (Dresher et al. 1994). Key to this framework is that phonological inventories drive what is specified and underspecified (Avery & Rice 1989). Therefore, it is not a question of whether the phonological system is phonetically grounded (e.g. Hayes et al. 2004) or substance free (e.g. Hale & Reiss 2008), but how systems organize substance and underspecification for contrastive purposes.

Dresher (2009) formalizes a method for assigning features to an inventory, the Successive Division Algorithm (SDA), arguing that phonological features sequentially contrast phonemic categories one feature at a time until all phonemes are distinct. The framework assumes that the phonology operates only on contrastive features, using phonological activity as the primary justification for proposing which features are contrastive for which phonemes (Hall 2007; Dresher 2009). The SDA is as follows in (1):

(1) The Successive Division Algorithm (Dresher 2009: 16)
  a. Begin with no feature specifications: Assume all sounds are allophones of a single phoneme.
  b. If the set is found to consist of more than one contrasting member, select a feature and divide the set into as many subsets as the feature allows for.
  c. Repeat step (b) in each subset: Keep dividing up the inventory into sets, applying successive features in turn, until every set has only one member.
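The steps in (1) can be sketched in Python. This is a minimal illustration under several assumptions that are not part of Dresher’s formulation: features are privative, so each division is binary and only the marked subset records the feature; the ordering of features is supplied by the analyst; and the toy inventory and the names `INVENTORY`, `ORDERING`, and `successive_division` are hypothetical.

```python
from typing import Dict, List, Set

# Toy privative specifications (an illustrative assumption, not the full
# German inventory): each phoneme lists only the features it is marked for.
INVENTORY: Dict[str, Set[str]] = {
    "p": {"obstruent"},
    "t": {"obstruent"},
    "m": {"nasal"},
    "n": {"nasal"},
    "l": {"lateral"},
    "r": set(),
}

# Analyst-supplied ordering: [obstruent] < [nasal] < [lateral].
ORDERING: List[str] = ["obstruent", "nasal", "lateral"]


def successive_division(
    inventory: Dict[str, Set[str]], ordering: List[str]
) -> Dict[str, List[str]]:
    """Return the contrastive features each phoneme receives."""
    # Step (a): begin with no feature specifications.
    specs: Dict[str, List[str]] = {p: [] for p in inventory}

    def divide(subset: List[str], remaining: List[str]) -> None:
        # Step (c) ends when a set has one member (or features run out).
        if len(subset) <= 1 or not remaining:
            return
        feature, rest = remaining[0], remaining[1:]
        marked = [p for p in subset if feature in inventory[p]]
        unmarked = [p for p in subset if feature not in inventory[p]]
        # Step (b): the feature is contrastive only if it splits the set.
        if marked and unmarked:
            for p in marked:
                specs[p].append(feature)
        divide(marked, rest)
        divide(unmarked, rest)

    divide(list(inventory), ordering)
    return specs


specs = successive_division(INVENTORY, ORDERING)
print(specs["l"])  # ['lateral']
print(specs["r"])  # []
```

With only three features the obstruents and the nasals remain undivided among themselves, so the result is a partial hierarchy; the point of the sketch is that /r/ ends up with no recorded feature once every other class has been marked.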

As an example, Dresher (2009: 45) discusses German /r/ based on its relationship to other consonantal phonemes, shown in Figure 1. The first division of the inventory is the obstruent-sonorant opposition, where the solid line divides the stops, affricates, and fricatives above from the nasals and liquids below. Within the sonorant group, nasals are distinguished from the liquids /l, r/, represented by the inclusion of the former within the box. Finally, /l/ is specified as a lateral (circle), making it distinct from /r/.

Figure 1

German consonantal phonemes (from Dresher 2009: 45).

Another way of representing this sequence of contrasts is through a contrastive hierarchy, as in (2), where [obstruent] < [nasal] < [lateral] distinguish the relevant natural classes. Features in parentheses are presented as oppositions to the specified features in Dresher’s (2009: 45) analysis in Figure 1.

(2)

The motivation for ranking [obstruent] < [nasal] < [lateral] derives from Dresher’s effort to embody markedness within phonological structure. Based on Trubetzkoy’s (1969) characterization of German /r/ as having negative phonological content, the structure in Figure 1 and (2) models unmarked feature values for German /r/ (see Section 3.2 for discussion of the composition of phonological features). Finally, the contrastive hierarchy that results from the application of the SDA to a language’s phonemic inventory consists of both the distinctive features specifying phonemes and sets of phonemes, as well as the order in which these features partition the inventory. The specific composition of a language’s contrastive hierarchy is based on a phonological analysis of that language, not on linguistic universals (although there may be common typological tendencies). I now elaborate on the nature and substance of contrastive features.

3.2 The architecture of phonological contrast

The negative phonological content of German /r/ and, I argue, rhotics more generally is represented here via unmarked privative specifications. Rather than encoding distinctive features with positive and negative values, contrasts are represented by the presence vs. the absence of a given feature (e.g. Mester & Itô 1989; Lombardi 1996). Arguments that draw on the privative composition of features are rooted in analyses for a range of phonological phenomena, including palatalization (Mester & Itô 1989), laryngeal assimilations (Iverson & Salmons 1995; Avery & Idsardi 2001; Allen 2016), vowel diachrony (Purnell & Raimy 2015; Kwon 2019; Purnell et al. 2019), phonetic and phonological contact (Natvig 2019), and historical rhotic patterns in Germanic (Natvig & Salmons forthcoming).

A further insight from this line of research, specifically from Avery & Idsardi (2001) and developed in Purnell & Raimy (2015) and Purnell et al. (2019), is the proposal that the phonology operates on structures that are more abstract than traditional features, what Avery & Idsardi (2001) call dimensions. These are nodes within a unified feature geometry that organize mutually exclusive articulatory gestures (Avery & Idsardi 2001: 44–45). For example, [spread glottis] and [constricted glottis] (gestures for aspiration and glottalization, respectively) fall under the Glottal Width dimension; they are antagonistic gestures of a single muscle group and cannot be activated simultaneously (Avery & Idsardi 2001: 44). The gesture for voicing, [slack], is subordinate to the Glottal Tension dimension, and is antagonistic to [stiff] (voiceless). Dimensions are completed with one of the opposing gestures, or set at a neutral configuration, at a post-phonological level of representation (see Section 3.3). Furthermore, completions may be sensitive to phonological contexts. For example, fortis stops in English are aspirated initially, produced with the [spread glottis] gesture, but may be unreleased and glottalized with [constricted glottis] in codas (Avery & Idsardi 2001: 51–52), as in pin [pʰɪ̃n] and nip [nɪʔp], respectively. Specifying the fortis stops in English with Glottal Width captures these sub-phonological alternations, as well as English laryngeal assimilations (Iverson & Salmons 1995). The full set of phonological dimensions, with possible completion gestures and their phonetic outputs, from Avery & Idsardi (2001: 66) and advanced in Purnell & Raimy (2015) and Purnell et al. (2019), is provided in Figure 2.

Figure 2

Dimensions, gestures, and phonetics (from Purnell & Raimy 2015: 526).

Purnell et al. (2019) build on this framework by developing a system for marking the unspecified members of privative contrasts with what they refer to as “superordinate null marking.” When a set of phonemes is specified for a dimension, the contrasting set is represented with the dominating node. For example, in a system with nasal sonorants specified for Soft Palate, completed [nasal], the remaining sonorants are marked with the superordinate Root feature [sonorant]SP, i.e. unspecified with respect to Soft Palate (SP). This notation follows the conventions in Purnell et al. (2019) of placing the specified feature in superscript notation on the unspecified superordinate node. Both dimensions and superordinate structures are ordered as a feature chain within a phoneme’s representational architecture based on the application of the SDA (Spahr 2016: 65–66; Purnell et al. 2019: e466). Purnell et al. (2019) show that Old English phonological changes with respect to fronting of /a/, retraction of /æ/, and breaking of /a/ fall out from the stacking of dimensional and superordinate structures of sonorants within a defined phonological domain. I adopt this method for representing specified and unspecified contrastive pairs in liquid and rhotic contexts. The advantage of employing superordinate null marking for representations is that it provides a structure to the unspecified categories. This structure is then built up through conversion processes that produce pronounceable, phonetic forms (Section 3.3). With respect to rhotics, because [sonorant] dominates the Oral Place cavity (see Figure 2), the labial, coronal, and dorsal articulators may be available for non-contrastive articulatory purposes (Section 3.3). This architecture supports findings from e.g. Proctor (2011) that show simultaneous (non-contrastive) coronal and dorsal articulations for liquids.

Based on the feature geometry presented in Figure 2, the relevant dimensions for specifying German sonorants are Soft Palate (completed [nasal]) for /m, n, ŋ/ and Tongue Groove (completed [concave], the lateral articulatory gesture) for /l/. The phonological representation for /r/ is unspecified at the level of the dimension: It is unspecified relative to Soft Palate ([sonorant]SP) and Tongue Groove (CoronalTG). A partial contrastive hierarchy of German consonants with privative dimensions and superordinate null marking is shown in (3). The features [consonant] and [vowel] are assumed to be the first contrastive opposition of the SDA, distinguishing these major classes (Purnell et al. 2019); [sonorant] and [consonant] are not designated for superordinate marking because they both occur as Root features without any dominating nodes. Because superordinate marking represents the unmarked member of a contrastive pair, it does not predict a primary, i.e. categorical and exclusive, articulator node for those segments (Section 3.3). Specifically, the structure in (3) does not require /r/ to have a coronal, rather than uvular, primary place of articulation. Instead, superordinate structures are representations that serve as the input for sub-phonological operations. The particular variant or variants of any phoneme result from the processes that convert contrastive dimensions and superordinate nodes into pronounceable forms. I now turn to a discussion of those processes.

(3)
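As a rough illustration of how superordinate null marking builds the feature chains described above for the German sonorants, consider the following Python sketch. It encodes only what the prose states: Soft Palate is dominated by [sonorant], Tongue Groove by Coronal, nasals are specified for Soft Palate, and /l/ for Tongue Groove. The caret notation (e.g. `[sonorant]^SP`) is a plain-text stand-in for the superscript convention, and the names `DIVISIONS` and `feature_chain`, as well as the assumption that a specified phoneme undergoes no further division in this partial hierarchy, are hypothetical.

```python
from typing import List, Set, Tuple

# (dimension, abbreviation, dominating node, phonemes specified for it),
# listed in the order in which the divisions apply to the sonorants.
DIVISIONS: List[Tuple[str, str, str, Set[str]]] = [
    ("Soft Palate", "SP", "[sonorant]", {"m", "n", "ŋ"}),
    ("Tongue Groove", "TG", "Coronal", {"l"}),
]


def feature_chain(phoneme: str) -> List[str]:
    """Build the ordered chain of dimensions and superordinate nodes."""
    # Root features shared by all consonantal sonorants; these receive
    # no superordinate marking because nothing dominates them.
    chain = ["[consonant]", "[sonorant]"]
    for dimension, abbreviation, dominator, specified in DIVISIONS:
        if phoneme in specified:
            chain.append(dimension)
            # Assumption: no later division applies to the marked set.
            break
        # Unspecified member: mark the dominating node with the dimension.
        chain.append(f"{dominator}^{abbreviation}")
    return chain


for phoneme in ["m", "l", "r"]:
    print(phoneme, feature_chain(phoneme))
```

Under these assumptions /r/ is the only sonorant whose chain contains superordinate nodes alone, which is the sense in which it carries no dimensional marking.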

3.3 Levels of representation and modular processes

Although the phonological dimensions that mark distinctive categories for phonemic representations classify articulatory actions, they are not pronounceable in and of themselves. Not only must each dimension be completed with a gesture, the unspecified categories also require their own additional structure to be rendered pronounceable. Therefore, additional processes interact with abstract representations to convert them into viable phonetic forms. These procedures involve distinct modules, and the conversions of representations of one module to representations in another (e.g. Scheer 2014), with each module operating on its own sets of representations and properties. Purnell & Raimy (2015) argue that the sound system is comprised of at least three distinct levels of representation: The Phonological level, which governs discrete, underspecified phonological categories via contrastive dimensions; the Phonetic-Phonological level, which builds structure into these categories by completing and filling in articulatory gestures; and the Phonetic level, which converts those gestures into continuous units in the speech signal.

Dimensions are completed with one of two possible gestures, or left inert (neutral), at the Phonetic-Phonological level of representation. These completions are both language-specific and context sensitive. They are “part of the work that phonology must do to convert phonologically sparse memorized representations into more fully specified phonetic representations that are the basis for instructions for motor control systems” (Purnell & Raimy 2015: 527). The activation of one muscle — and not its antagonistic pair — for any given dimensional category, along with any meaningful phonological environments (e.g. aspiration and glottalization in English) is part of the acquisition of that language’s sound system and patterns. Furthermore, other gestures (i.e. those corresponding to non-contrastive dimensions) are added at the Phonetic-Phonological level of representation through the enhancement process. Enhancements are non-contrastive (redundant or predictable) features that increase the perceptual saliency of phonemic contrasts (Stevens et al. 1986; Keyser & Stevens 2006; Hall 2011). For example, Tongue Groove (completed [concave]) marks /l/, distinguishing it from unspecified /r/ (CoronalTG). The opposing Tongue Groove gesture [convex] for an alveolar /r/ results in an articulation that is phonetically distinct from the lateral. An inactivated Coronal articulator, producing a dorsal rhotic variant, may also be considered an enhancement to this lateral-rhotic contrast. Accordingly, enhancement is the implementation of non-contrastive features at the Phonetic-Phonological level of representation, facilitating the pronounceability of underspecified categories.

Differentiating between gestures that result from the completion of contrastive dimensions and enhancements has valuable implications for liquid and rhotic representations. The completion and implementation of contrastive dimensions display categorical properties, whereas enhancements tend to be variable (Keyser & Stevens 2006; Hall 2011). Therefore, the processes that fill in structure to the unspecified member of a contrastive set are more subject to geographical and social variability. The more underspecified a phonological category, the broader the range of phonetic variants it can express. A representation of r-sounds that requires no categorical place or manner dimension structurally captures that range of surface variability. The specific manner and place features are negotiated sub-phonologically via regional, social, and contextual factors. For example, German varieties present a wide range of /r/ phones, four of which include coronal [r, ɽ] and dorsal [ʀ, ʁ] (Salmons 2018: 318, based on Göschel 1971 and Wiese 2003), all of which are licit phonetic outputs based on the representations presented in (3). Furthermore, Wiese (2001b) finds changes from one generation to the next between coronal and dorsal /r/ with no clear directionality. The shallow time depth in which these changes occur, along with the lack of evidence that they alter the system of contrasts for German consonants, supports a negative specification for German /r/. Table 2 schematizes the three levels of representation with German /l/ and /r/ as comparative examples. Articulations in italics are enhancements; plain text gestures are those that complete a contrastive dimension. Additional information added at the Phonetic-Phonological level is excluded from Table 2, including the dependents of the Dorsal and Coronal nodes and their corresponding gestures (possible enhancements in parentheses), as well as information that defines manners of articulation (trill vs. fricative). 
All major /r/ types have options for co-occurring coronal and dorsal (and possibly labial) gestures; for the uvular phones [ʀ, ʁ] no Coronal articulation is realized. With respect to /l/ and /r/, the contrastive hierarchy in (3) and the levels of representation in Table 2 broadly capture contemporary and historic liquid patterns in Germanic (e.g. Natvig & Salmons forthcoming). Specifically, /r/ demonstrates a higher degree of phonetic/phonological arbitrariness than its lateral counterpart.

Table 2

Levels of representation for German liquids.

Level of representation | /l/ | /r/
(Dimension: Discrete) | Tongue Groove |
(Gestures: Completions, enhancements) | |
(Implementation: Continuous) | [l] | [r], [ɽ]; [ʀ], [ʁ]

The contrastive hierarchy in (3) depicts the negative phonological content of /r/. It is unspecified relative to nasals, as part of a liquid class composed of /l, r/ ([sonorant]SP), and it is again unspecified against the lateral /l/, specified for Tongue Groove (CoronalTG). As a consequence, this /r/ is an unspecified liquid. I draw on these phonological relationships to put forward a contrastive definition for both liquids and rhotics in (4). I assume here that contrastive features for glides are either based on corresponding vowel representations, or that they fall under their own, independent natural class (see Nevins & Chitoran 2008).

(4) Representational definitions of LIQUID and RHOTIC
  a. A LIQUID is any phoneme, or set of phonemes, that belongs to the class of non-nasal consonant sonorants, where present. If no nasal sonorants are present, LIQUID comprises the only class of consonant sonorants. The LIQUID class is represented via the phonological marking [consonant] [sonorant] ([sonorant]SP).
  b. A RHOTIC is the LIQUID phoneme that does not receive any dimensional marking in its phonological representations. RHOTIC is always the unmarked member within any LIQUID set.

Both the LIQUID and RHOTIC categories refer to underspecified sets within a contrastive structure.4 They do not determine liquid surface forms, but describe a class and subclass characterized by both broad phonetic variation and relative phonological stability. I define RHOTIC as the dimensionally unspecified LIQUID phoneme, not as a phonetic category of r-sounds. Whether a language presents the RHOTIC category as an r-sound, l-sound, or any other sonorant that patterns as a liquid is the result of that language’s organization of contrastive substance and the sub-phonological processes that convert underspecified categories into phonetic objects. Although the examples of r-sounds discussed here behave consistently as the RHOTIC subset, this does not need to be the case for every language. For example, an r-sound may be specified within the LIQUID set, contrasting against an unspecified liquid (see the discussion of Arabic rhotics in Section 4.1). In such an alignment, this model predicts that the specified rhotic will participate in phonologically active processes and the unspecified liquid will be subject to more variability.

The definitions in (4) do not require any phonological content limiting the places or manners of articulation for r-sounds, other than that they are phonologically sonorants, expressed with a continuant manner. The degree of relative closure (spanning fully sonorous to partial obstruction) is mediated through post-phonological operations. The definitions stipulate that /r/ is not specified for phonological content, i.e. dimensions, which crucially depends on the phonemic makeup of consonantal sonorants for any given language. This proposal captures Hall’s (1997) attempt to unify rhotics as a phonological category via [+rhotic] without proposing a feature for only this purpose. Likewise, the superordinate structure that falls out of (4) is in line with Walsh Dickey’s (1997) proposal to model coronal and dorsal rhotic feature geometry, however here as nodes that accommodate language-specific structure building in accordance with the more variation-prone properties of sub-phonological modules of representation.

Chabot (2019), however, understands the complexities of producing trills and taps as a motivating factor for rhotic variation, stating:

Trills are phonetically difficult segments to articulate, and the variation they are subject to is the manifestation of speaker strategies to overcome the difficulty they pose. This difficulty is not phonological, it is strictly phonetic and speakers readily adopt variations of rhotic phones in order to cope with the impediments imposed by phonetics. (Chabot 2019: 14)

However complex these articulatory combinations may be, they do little to explain other patterns of structured variation. Take, for example, American English /u/, which varies regionally and socially across the horizontal plane of the vowel space (Jacewicz et al. 2011). This variation occurs relative to an unspecified representation in the present framework, in which /i/ is specified for Tongue Thrust against unmarked /u/ (Purnell & Raimy 2015; Purnell et al. 2019). Furthermore, Natvig (2018) finds that patterns of variance in Norwegian vowels’ first and second formant (F1, F2) trajectories correspond to underspecified, yet substantive, phonological representations. It is unclear that a difficulty with producing a vowel relatively higher, lower, further back, or further forward would contribute to this type of variable pattern. This system of underspecified contrastive representation models these socially and contextually indexed variations. While physiological factors may contribute to the wide range of variants for rhotics and other complex sounds, considering that variation in relation to phonological representations situates it within a wider set of structured, heterogeneous patterns.

The contrastive hierarchies in (5) and (6) depict the structural relationships between LIQUID and RHOTIC. In (5) the phonemic inventories lack a contrast between /l/ and /r/. In the contrastive hierarchy in (5a), liquids are the only consonantal sonorants, whereas (5b) includes nasals. For both, LIQUID and RHOTIC comprise a single node, which contrasts with any specified nasal sonorant, if present, and is itself unspecified. Because Tongue Groove is not present in these inventories, there is no phonological condition that prohibits a lateral variant within a LIQUID = RHOTIC contrastive arrangement. The particular form, whether lateral or not, is mediated through language-specific conversion and implementation processes (see Section 3.3).

(5) Sample contrastive hierarchies with LIQUID = RHOTIC structure
  a.
  b.

In contrast to (5), the phonological inventories represented in the contrastive hierarchies in (6) both have distinct /l/ and /r/ phonemes, marked by the Tongue Groove specification for /l/. Here, RHOTIC is a subset of LIQUID. The contrastive hierarchy for German consonants in (3) is structurally identical to the contrastive hierarchy in (6b), and the surface form of /r/, whether characterized as coronal or dorsal for example, is negotiated through the same sub-phonological processes that influence the output of the LIQUID = RHOTIC category in (5).

(6) Sample contrastive hierarchies of LIQUID and RHOTIC distinction
  a.
  b.
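The way specifications fall out of hierarchies of the kind in (5) and (6) can be sketched as a top-down division of an inventory by ordered dimensions. The Python code below is a minimal sketch, not the procedure assumed in this analysis: the function `divide`, the schematic symbols `N` (any nasal) and `L` (a single liquid without an /l-r/ contrast), and the string labels for dimensions are all hypothetical. A dimension is assigned only when it actually splits the set under consideration.

```python
def divide(phonemes, ordered_dimensions, marked):
    """Assign dimensions top-down; a dimension is recorded only if it
    separates marked from unmarked members of the current set."""
    specs = {p: [] for p in phonemes}

    def split(group, dims):
        if len(group) <= 1 or not dims:
            return
        dim, rest = dims[0], dims[1:]
        plus = [p for p in group if p in marked[dim]]
        minus = [p for p in group if p not in marked[dim]]
        if plus and minus:          # the dimension is contrastive here
            for p in plus:
                specs[p].append(dim)
            split(plus, rest)
            split(minus, rest)
        else:                       # not contrastive: try the next one
            split(group, rest)

    split(list(phonemes), list(ordered_dimensions))
    return specs


ORDER = ["SoftPalate", "TongueGroove"]
MARKED = {"SoftPalate": {"N"}, "TongueGroove": {"l"}}

no_contrast = divide(["N", "L"], ORDER, MARKED)         # LIQUID = RHOTIC
with_contrast = divide(["N", "l", "r"], ORDER, MARKED)  # RHOTIC within LIQUID
print(no_contrast)
print(with_contrast)
```

In both schematic inventories the unmarked liquid ends up with an empty specification, which is the structural property that (4b) refers to.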

In this section, I presented arguments for defining the category RHOTIC as the unspecified consonant sonorant by appealing to underspecified phonological representations and the relationship between specification and variability in a modular sound system. This position both unifies rhotics under a cohesive phonological category and models the widespread surface variability of r-sounds as a consequence of their representation within a phonemic system. In the next section, I show how these structures may characterize languages with varying numbers of liquid phonemes and discuss the relationship between RHOTIC representation and Chabot’s (2019) notions of diachronic and procedural stability.

4 Predictions and implications

Treating RHOTIC as an underspecified category within the present modular framework assumes that the RHOTIC phoneme lacks contrastive features for phonological alternations. Furthermore, systems without r-sounds and those with multiple rhotics each need to be taken into account. For instance, Hawaiian lacks an /l-r/ distinction and both Malayalam and Arabic have multiplex LIQUID systems. These inventories shed light on the relationships between contrastive underspecification, phonological activity, and modular operations within the sound system. This analysis further provides structural explanations in support of Chabot’s (2019) findings that rhotics demonstrate both diachronic and procedural stability. In the following subsections, I consider each of these issues in turn.

4.1 Modeling diverse liquid systems

Some languages, Hawaiian for example, have a LIQUID = RHOTIC system and do not have an r-sound. The Hawaiian consonant inventory has the stops /p, k, h, ʔ/ and the consonant sonorants /m, n, l/ (Herd 2005: 97). Following the contrastive hierarchy for sonorants in (5), the nasals /m, n/ are marked for Soft Palate, leaving /l/ unspecified ([sonorant]SP) in the proposed Hawaiian representations in (7).5 In this particular case, /l/ is the unspecified sonorant consonant, satisfying the definitions for both LIQUID and RHOTIC in (4). The fact that the unspecified phoneme in this class occurs as a lateral, and not a classic r-sound, is a result of the language-specific processes that convert the RHOTIC category into a phonetic form. Because [r] and [l] do not contrast in Hawaiian, there is no operation to distinguish the two phones into distinct categories (see Lahiri & Reetz 2002; 2010 for a model mapping phonetic forms to underspecified representations). Compare New Zealand Māori, another Polynesian language with a LIQUID = RHOTIC organization, but with a surface form [r] (Herd 2005: 101). In this case, the Phonetic-Phonological level operations convert the unspecified sonorant category into a rhotic variant instead of a lateral one.

(7)

In contrast to Hawaiian and New Zealand Māori, Malayalam requires specifications to distinguish five liquids /l, ɭ, ʐ, ɾ, r/, where /ʐ/ is considered a post-alveolar approximant (Punnoose et al. 2013). In Malayalam it appears that the distinction between laterals and r-sounds is less salient than between the retracted and advanced, or “dark” vs. “clear” sets (Punnoose et al. 2013: 282, 294). Accordingly, I propose the contrastive hierarchy of Malayalam liquids in (8), where the dimension Tongue Root, completed [RTR] (Retracted Tongue Root), contrasts dark /r, ɭ/ from clear /ʐ, ɾ, l/ as the first contrast within the LIQUID group. The dark phonemes /r, ɭ/ are realized phonetically with a pharyngealized resonance (Punnoose et al. 2013: 276), characteristic of an [RTR] gesture (see Figure 2). Within each of those sets, Tongue Groove, completed [concave], distinguishes the laterals /ɭ/ and /l/ from /r/ and /ʐ, ɾ/, respectively. Tongue Curl, completed [up], marks /ʐ/; /ɾ/ is unspecified for dimensions and the RHOTIC based on (4). Finally, although /ʐ/ tends to lack sublaminal contact like other Malayalam retroflex consonants, “the type of weak retroflexion […] in [/ʐ/] may be due to both its approximant nature and its clear resonance” (Punnoose et al. 2013: 292). Therefore, Tongue Curl ([up]) contributes here to the phoneme’s post-alveolar place of articulation even if it is not retroflex to the same degree as other retroflex sounds in Malayalam.

(8)

These phonological specifications model two interactions between liquids and vowels as a product of the liquids’ contrastive specifications. Namely, /ʐ/ aligns with clear /ɾ, l/ in one pattern, but with dark /r, ɭ/ in another. First, Punnoose et al. (2013) find significant differences in F1 and F2 of vowels preceding and following /ʐ, ɾ, l/ compared to /r, ɭ/. Specifically, vowels have lower F2 (further back) and higher F1 (lower) in /r, ɭ/ contexts than in /ʐ, ɾ, l/ ones (Punnoose et al. 2013: 285–292). The Tongue Root specification that groups /r, ɭ/ as a natural class, and completes as [RTR], induces vowel backing and lowering relative to the unspecified ([sonorant]TR) class /ʐ, ɾ, l/. On the other hand, although /ʐ/ patterns with /ɾ, l/ with respect to their influences on the horizontal and vertical positions of neighboring vowels, it patterns with /r, ɭ/ in causing a decrease in the third and fourth formants (F3, F4). These are the acoustic effects of retroflex-like articulations (Hamann 2003). Retroflexion is the result of a variable set of complex, gradient gestures, including raising, flexing, and/or retracting the tongue tip, retracting the tongue root, lowering the tongue body, and (sub)laminal constriction (Punnoose et al. 2013: 277). The lowering of F3 and F4, then, is an acoustic outcome of the Tongue Root specification, and [RTR] completion, for /r, ɭ/, but of the Tongue Curl ([up]) feature for /ʐ/. Both of the specifications are related to retroflexion gestures (retracting the tongue root and raising the tongue tip for Tongue Root and Tongue Curl, respectively) and each may be enhanced further by other interrelated gestures.
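The two groupings of /ʐ/ can be read directly off the proposed specifications. The following Python sketch encodes the Malayalam liquids as described for (8); the dictionary layout, the string labels, and the helper `natural_class` are hypothetical conveniences for illustration only.

```python
# Specifications as described for (8): Tongue Root marks the dark set,
# Tongue Groove marks the laterals, Tongue Curl marks /ʐ/, /ɾ/ is bare.
MALAYALAM_LIQUIDS = {
    "r": {"TongueRoot"},
    "ɭ": {"TongueRoot", "TongueGroove"},
    "l": {"TongueGroove"},
    "ʐ": {"TongueCurl"},
    "ɾ": set(),
}


def natural_class(specs, *dimensions):
    """Phonemes specified for at least one of the given dimensions."""
    wanted = set(dimensions)
    return {p for p, dims in specs.items() if dims & wanted}


# Vowel backing and lowering: conditioned by Tongue Root alone.
dark = natural_class(MALAYALAM_LIQUIDS, "TongueRoot")
# F3/F4 lowering: conditioned by Tongue Root or Tongue Curl.
f3_f4_lowering = natural_class(MALAYALAM_LIQUIDS, "TongueRoot", "TongueCurl")
# The RHOTIC of (4b): the liquid with no dimensional marking.
unspecified = [p for p, dims in MALAYALAM_LIQUIDS.items() if not dims]

print(sorted(dark), sorted(f3_f4_lowering), unspecified)
```

On this encoding /ʐ/ falls outside the Tongue Root class but inside the class defined by the two retroflexion-related dimensions, matching the split between the F1/F2 and F3/F4 patterns.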

Similar to Malayalam’s dark and clear liquid classes, many Arabic varieties have emphatic and non-emphatic r-sounds. Emphatic /rˤ/ in Arabic is produced with an “articulation involving the retraction of the tongue dorsum toward the pharyngeal wall” (Youssef 2019: 5) and, when phonemic, triggers emphasis spreading, or “bidirectional, long-range pharyngealization” that targets vowels and consonants (Youssef 2019: 10). Here emphasis spreading again demonstrates the contrastive status of the Tongue Root dimension, completed with [RTR], for the emphatic phonemes. In a survey of phonological representations for rhotics across a range of Arabic dialects, Youssef (2019) finds four major groups, shown in (9):

(9) a. non-emphatic /r/
  b. emphatic /rˤ/
  c. both non-emphatic /r/ and emphatic /rˤ/
  d. both uvular /ʁ/ and tap-trill /r/

The uvular in (9d), however, has undergone a merger with the fricatives and no longer patterns as a rhotic with respect to sonority (Youssef 2019: 28). Therefore, the /r/s in (9a) and (9d) do not contrast with other r-sounds and are consistent with the RHOTIC category defined in (4). The contrast between /r/ and /rˤ/ in (9c) occurs via Tongue Root specification for the latter. It is the occurrence of emphatic /rˤ/ in (9b) — showing clear evidence of phonological marking by triggering emphasis spreading (Youssef 2019: 13–14), but without an unspecified /r/ counterpart — that illustrates the contrastive relationships among Arabic sonorant phonemes and the ways in which these contrasts vary across dialects.

Consider the contrastive hierarchy with emphatic and non-emphatic liquids in (10). Tongue Root is again specified above Tongue Groove within the LIQUID class, contrasting emphatic liquids against the unmarked, non-emphatic set.6 Varieties with inventories like that in (9c), then, have a Tongue Groove specification under the [sonorant]TR node, resulting in phonemic representations for both non-emphatic /l/ and /r/. In contrast, /rˤ/ in (9b) is specified for Tongue Root and the unspecified RHOTIC category has a lateral surface form. That is, there is no Tongue Groove specification under either Tongue Root or [sonorant]TR for (9b). In a contrastive alignment with /l, rˤ/ as the only LIQUID segments, and with /rˤ/-triggered alternations, /l/ is the unspecified consonant sonorant. The lateral, and not the r-sound, in this arrangement satisfies the definition of RHOTIC in (4) because of the Tongue Root dimension’s phonological activity and, accordingly, its role in defining contrasts within a particular phonemic inventory.

(10)
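The dialect types in (9) can be compared by asking, for each inventory, which liquid is unspecified and which liquids could trigger emphasis spreading. The Python sketch below is illustrative only. The encodings are assumptions: emphatic laterals are left out, (9d) is omitted because its uvular no longer patterns as a liquid, and the Tongue Groove marking of /l/ in (9a) is assumed by analogy with (9c). The names `ARABIC_TYPES`, `unspecified_liquid`, and `emphasis_triggers` are hypothetical.

```python
# Simplified liquid sets for three of the dialect types in (9).
ARABIC_TYPES = {
    "9a": {"l": {"TongueGroove"}, "r": set()},   # assumed marking for /l/
    "9b": {"l": set(), "rˤ": {"TongueRoot"}},
    "9c": {"l": {"TongueGroove"}, "r": set(), "rˤ": {"TongueRoot"}},
}


def unspecified_liquid(specs):
    """The liquid with no dimensional marking, i.e. RHOTIC as in (4b)."""
    bare = [p for p, dims in specs.items() if not dims]
    return bare[0] if len(bare) == 1 else None


def emphasis_triggers(specs):
    """Liquids marked for Tongue Root, the dimension tied to emphasis."""
    return {p for p, dims in specs.items() if "TongueRoot" in dims}


for label, specs in ARABIC_TYPES.items():
    print(label, unspecified_liquid(specs), sorted(emphasis_triggers(specs)))
```

For the (9b) type the function returns the lateral as the unspecified liquid, while /rˤ/ is the only potential trigger, which is the alignment described in the text.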

Youssef (2019) provides strong evidence for phonological activity driving the underlying representations of Arabic rhotics, with emphasis spreading a principal component of that activity. In this framework, emphasis spreading with /rˤ/ as a trigger indicates Tongue Root in its phonological representation. The broader variation among Arabic varieties can be understood in terms of whether the LIQUID class is marked for Tongue Root and whether those emphatic and non-emphatic classes are further specified, e.g. for Tongue Groove. These representations are consistent and compatible with the feature geometry supported in Youssef (2019), with the advantage that the present framework considers, in addition to the phonologically active features of these r-sounds, their relationships to the rest of the phonemic systems to which they belong.

Although Hawaiian, Malayalam, and Arabic differ in terms of their sonorant and liquid inventories, they demonstrate the value of considering representations based on language-specific LIQUID and RHOTIC patterns and treatments of those categories. The proposed contrastive hierarchies for these languages are hypotheses that require further investigation of each language’s phonological systems, as well as additional cross-linguistic testing, but they provide a means for analyzing rhotics as a negatively defined category and for explaining the distributions of LIQUID surface forms along with the variety of articulations available for their realizations.

4.2 On diachronic and procedural stability

The position that rhotics (or laterals without a contrast among liquids) occupy the unspecified category of consonantal sonorants has diachronic implications. Specifically, because of their unspecified nature, surface forms can vary and change without disrupting the relationship that these phonemes have with the rest of the representational system. Chabot refers to this as diachronic stability, specifically that “rhotics can vary on the diachronic axis without provoking a realignment in the phonological system” (2019: 1). I argue that it is the very representational structure that contributes to that phonological stability over time.

One outcome of the RHOTIC definition in (4) is that, depending on the makeup of the rest of the phonemic system, a host of changes in place and manner may be changes in the outcomes of building structure into phonologically unspecified domains. They are changes at the Phonetic or Phonetic-Phonological levels of representation, not in the contrastive, Phonological level dimensions. Take German again as an example in Table 3, where four major variants reflect differences in the implementation of the unspecified side of contrastive sets (see (3)). Like the examples in Table 2, the primary difference between coronal and dorsal surface forms of /r/ is that the latter are not produced with coronal constriction. Variation among the coronals results from — at the very least — differences in the specific position of the tip of the tongue, curled upward (TC-[up]) for retroflex [ɽ]. The uvular variants [ʀ, ʁ] are articulated with the [back] gesture dominated by the Tongue Thrust (TT) node, with the latter produced with a [fricative] gesture (OP-[fricative]).

Table 3

Levels of representation for major variants of German /r/.

Phonological Phonetic-Phonological
/r/ [r] [ɽ] [ʀ] [ʁ]
(Oral Place)
(Oral Place)
(Oral Place)

Diachronic changes of /r/ in some varieties of German demonstrate no clear directionality: “While alveolar [r] changes into uvular [ʀ], the reverse is found as well” (Wiese 2001b: 21). Furthermore, they are the result of changes not in representations, but in the processes that render the phonological category pronounceable, which are subject to social and individual variation (Hall 2011; Natvig 2018). In this type of sound change that targets rhotics, there are no changes to any contrastive features and, therefore, no change to, or realignment of, the phonological system.

There are, however, sound changes involving rhotics that do alter the phonological system. The loss of a distinction between /l, r/ in Polynesian is one such example. In this case, the products of this merger are either [r] or [l] (see Section 4.1). Based on Herd (2005) and previous discussions, I propose the contrastive hierarchy for Proto-Polynesian, with an /l-r/ contrast, in (11).

(11)

In this change from Proto-Polynesian to Hawaiian and New Zealand Māori, Tongue Groove is lost from the set of contrastive features and the earlier lateral-rhotic distinction is merged into a single category with either [l] or [r] as a viable form, i.e. a LIQUID = RHOTIC alignment. The difference between Hawaiian and New Zealand Māori, then, is that in Hawaiian the [concave] completion of Tongue Groove is realized as an enhancement at the Phonetic-Phonological level of representation for the new unspecified category, but lost in New Zealand Māori where the conversion processes that rendered the Proto-Polynesian RHOTIC category pronounceable remain. The merger of /l/ and /r/, with differential adoption of the liquid variant from Proto-Polynesian, is shown in (12).

(12)

This kind of change supports Oxford’s Sisterhood Merger Hypothesis, where structural mergers occur at the lowest level within a contrastive hierarchy, i.e. “contrastive sisters” (Oxford 2015: 315). As this example further demonstrates, the loss of the Tongue Groove specification does not mean that /l/ becomes /r/. Rather, the merger produces the structural circumstances in which [l] and [r] may exist as allophones, either conditioned or unconditioned. If the situation is such that the lateral and the rhotic are in free variation, one may occur as a prototypical (or default) reflex of the unspecified category, mediated through language-specific conversion and implementation processes, not representations. For any merger, or loss of contrast, in the present framework the more unspecified category remains at the expense of the more specified one. Because RHOTIC comprises the unspecified set among consonant sonorants, the category will always exist unless there is a complete loss of consonant sonorants. This is the category that provides the representational conditions for a phoneme to behave like a liquid generally, and a rhotic specifically. Furthermore, this category will be diachronically stable with respect to representation, irrespective of changes in its conversions and, therefore, phonetic reflexes.
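The structural side of this merger can be sketched as the removal of one dimension from the set of contrastive features, after which phonemes with identical specifications collapse into a single category. The Python code below is a schematic illustration under stated assumptions: `N` stands for any nasal, and the function name `lose_dimension` and the dictionary encoding are hypothetical.

```python
def lose_dimension(specs, dimension):
    """Drop one contrastive dimension and group phonemes whose remaining
    specifications are identical (i.e. former contrastive sisters)."""
    merged = {}
    for phoneme, dims in specs.items():
        key = frozenset(dims - {dimension})
        merged.setdefault(key, []).append(phoneme)
    return merged


# Schematic earlier system with an /l-r/ contrast.
EARLIER = {"N": {"SoftPalate"}, "l": {"TongueGroove"}, "r": set()}

later = lose_dimension(EARLIER, "TongueGroove")
print(later)
```

The merged category carries the empty specification of the earlier unspecified member. Which phone, [l] or [r], realizes it is not decided by this structure and is left to the conversion processes discussed in the text.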

According to this model, historical changes to the pronunciation of these rhotics are not due to changes in phonological features (i.e. dimensions), because the category lacks dimensional specification.7 This amount of underspecification further contributes to the procedural stability that characterizes rhotics, or that they “are implicated in phonological processes [that] can vary in a phonetically arbitrary manner without perturbing the process itself” (Chabot 2019: 1). The implication for the present model is that processes involving the RHOTIC phoneme occur at a sub-phonological level of representation. This is a specific hypothesis about the influences that the RHOTIC phoneme has on other sounds within a given phonological domain, and testing it is an extremely promising area of future research.

An example of an alternation process that targets a sub-phonological level is a retroflexion pattern in certain varieties of Norwegian. This is a variable process in which /r, ɽ/ preceding the laminal coronals /th, t(d), s, l, n/ across word and morpheme boundaries coalesce into retroflexes (Kristoffersen 2000: 96–97, 315–316; Johannessen & Vaux 2013: 49). This alternation also includes some varieties that have uvular /r/ in isolation (Stausland Johnsen 2012). Patterns of retroflexion in a number of environments are shown in Table 4. Note that Johannessen & Vaux (2013: 54) argue that there is considerable variation with respect to whether retroflexion occurs, concluding that /r/ allomorphy and, therefore, retroflexion itself, is sensitive to phonological, as well as individual, geographical, and social factors. The multiple variants of the compound soldag ‘sun(ny) day’ are one example.

Table 4

Norwegian retroflexion patterns (adapted from Kristoffersen 2000: 96–97; Johannessen & Vaux 2013: 49).

| Environment | Orthography | Isolated pronunciation | Output | Gloss |
|---|---|---|---|---|
| root + inflectional affix | sur-t | [1sʉːɾ] | [1sʉːʈh] | sour-NEUT.SG |
| root + derivational affix | kjør-sel | [1çɵːɾ] | [1çɵʂ.ʂl̩] | drive-NMLZ |
| word + clitic | spør-n | [1spɵɾ] | [1spɵɳ] | ask-him |
| compound | sol-dag | [1suːɽ] or [1suːl] | [2suː.ɖɑːɡ] | sun(ny) day |
| verb + object | har tid | [1hɑːɾ] | [1hɑ(ː) 1ʈhiː] | has time |

Kristoffersen (2000: 96ff) views retroflexion as a subset of related phonological rules, where an apical feature spreads from the rhotic to the following coronal, and the rhotic’s remaining features, and therefore the segment itself, delete. The outcome is a single retroflex phone, which represents the fusion of two adjacent segments into one (Kristoffersen 2000: 96). There are a number of constraints to the application of this rule depending on the domain of application — tautomorphemic vs. heteromorphemic for example — that interact with suprasegmental patterns and representations (Kristoffersen 2000: 99, 317). However, in all cases the retroflexion process results in the loss of the source rhotic segment on the surface. This is represented not only in the forms for soldag ‘sun(ny) day’ in Table 4, but across word boundaries as well, as in (13). These examples demonstrate higher-order restrictions on the domains in which retroflexion can apply. Notice the relationship between r-deletion and retroflexion. When retroflexion can and does apply, the /r/ coalesces with the following coronal.

(13) a. Per ser en stor løve.
     ‘Per sees a big lion.’
     [1pheːʂeː.ɾɛn. 1stuː. 2ɭɵː.ʋə]
  b. Per, Siris bror, ser en stor løve.
     ‘Per, Siri’s brother, sees a big lion.’
     [1pheːɾ. | 2siːɾis. 1bɾuːɾ | 1seː.ɾɛn. 1stuː. 2ɭɵː.ʋə]
     ??[ 1pheː | 2ʂiːɾis. 1bɾuː | 1ʂeː.ɾɛn. 1stuː. 2ɭɵː.ʋə]
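The surface pattern in Table 4 and (13) can be sketched as an optional coalescence across a boundary. The Python code below models only the surface correspondence and takes no position on whether the pattern is a phonological rule or an enhancement. Transcriptions are simplified (tone marks, aspiration, and syllable boundaries are left out), and the names `RETROFLEX`, `TRIGGERS`, and `coalesce`, as well as the `applies` switch standing in for variability and domain restrictions, are hypothetical.

```python
# Coronal-to-retroflex correspondences attested in Table 4 and (13).
RETROFLEX = {"t": "ʈ", "d": "ɖ", "s": "ʂ", "l": "ɭ", "n": "ɳ"}
# Liquid phones that condition the pattern; a hypothetical default set.
TRIGGERS = ("ɾ", "r", "ɽ")


def coalesce(left, right, applies=True, triggers=TRIGGERS):
    """Join two forms across a boundary. If the pattern applies, the final
    liquid of `left` is lost and the initial coronal of `right` surfaces
    as the corresponding retroflex; otherwise both forms are kept intact."""
    if applies and left.endswith(triggers) and right[:1] in RETROFLEX:
        return left[:-1] + RETROFLEX[right[0]] + right[1:]
    return left + right


print(coalesce("sʉːɾ", "t"))                      # sur-t
print(coalesce("suːɽ", "dɑːɡ"))                   # sol-dag
print(coalesce("peːɾ", "seːɾ"))                   # Per ser, as in (13a)
print(coalesce("peːɾ", "siːɾis", applies=False))  # blocked, as in (13b)
```

Passing a different `triggers` tuple, for example `("ʁ",)`, gives a comparable sketch for varieties with uvular /r/ in isolation.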

Uvular /r/ varieties with retroflexion can be seen in the sample phrases [ɡɔː.ʁʉːth] går ut ‘go out’ and [ɡɔː.ɖɪk.khə] går det ikke ‘if it’s not possible’ (Stausland Johnsen 2012: 514). Retroflexion with /r/ as a trigger, however, does not reflect the spreading of an inherent (phonetic or phonological) feature of the coronal or uvular r-sounds in Norwegian. There is no indication that Norwegian retroflexion draws on the categorical phonological representations of the conditioning /r/, a fact that supports the unspecified representation of /r/.

Due to its variability and non-assimilatory characteristics, Norwegian retroflexion is more consistent with a conditioned enhancement of coronals, operating at the Phonetic-Phonological level of representation. Whether it occurs or not is a function of the variable nature of the representations and structures for that particular module, i.e. enhancement. The contemporary pattern, however, requires a diachronic explanation. It likely emerged from a phonological rule in which retroflex features of /ɽ/ spread to following coronals, which then extended to environments following /r/ (see Stausland Johnsen 2012: 511–513 for a discussion). Take for instance the partial contrastive hierarchy in (14) for a Norwegian variety with phonemic /ɽ/ marked with Tongue Curl (completed [up]) to contrast it against /r/.

(14)

The first stage of this change is an adaptation of Kristoffersen’s (2000: 96) retroflex rule into the present framework. Here, /ɽ/ spreads its Tongue Curl dimension to the following coronal and then deletes, as in /kʉɽ-th/ [1ɡʉːʈh] ‘yellow (NEUT.SG)’, cf. [1ɡʉːɽ] in isolation (Kristoffersen 2000: 96). The deletion, or lack of a retroflex flap reflex, in these environments introduces ambiguity into the speech signal. Based solely on surface forms, the retroflex gesture (Tongue Curl [up]) can be interpreted by learners either as an assimilation from /ɽ/ followed by deletion of that segment, or an enhancement of the coronal in that same context. The articulatory gestures that derived from the phonological specification of /ɽ/ and spread to a following coronal are aligned with that segment and reanalyzed as filling in structure at the Phonetic-Phonological level of representation. This type of change is what Blevins (2004) defines as “chance,” where:

The phonetic signal is accurately perceived by the listener, but is intrinsically phonologically ambiguous, and the listener associates a phonological form with the utterance which differs from the phonological form in the speaker’s grammar. (Blevins 2004: 32)

In this instance it is not the phonological representations that change, but the conversion processes that underlie them. Due to this reanalysis, this gesture [up] no longer completes the Tongue Curl dimension for /ɽ/, which results in a disruption of the conversion from the Phonological to the Phonetic-Phonological levels of representation. Therefore, /ɽ/ is not converted into a pronounceable segment. This unimplemented /ɽ/ representation provides the environment for the retroflex enhancement of coronals. The fact that compounds such as soldag ‘sun(ny) day’ in Table 4 only have retroflexion in connection with the loss of the preceding liquid is consistent with these bare representations conditioning the process. This bare environment domain, i.e. unimplemented phoneme, extends from [consonant, sonorant, Tongue Curl] to [consonant, sonorant], thereby including /r/.

This diachronic process is represented in Figure 3. /T/ is any non-retroflex laminal phoneme, with the assumption that the Coronal articulator, either as superordinate structure or as the dominating node of a coronal dimension, defines the natural class. Boxes underneath each linked X segment show relevant structure across three levels of representation; the intervening {#} signifies the boundary (word, morpheme, clitic) between the sources and triggers of the process. The initial retroflex rule is presented in 1, with the conditioned enhancement in 2, and its extension to include /r/ in 3. The last stage, 3, results in a sub-phonological operation that spreads to other Norwegian varieties, including those that do not have the retroflex flap phoneme and to those that have the uvular /r/ in isolation. Stage 3 models the synchronic pattern: Coronals are enhanced with Tongue Curl (TC) [up] following the unimplemented CoronalTG phonemes. Finally, because stem-final r-deletion spans most of the Norwegian-speaking area (see Papazian & Helleland 2005: 76–77; Johannessen & Vaux 2013), there were and are ample conditions for varieties that historically and presently lack the contrastive features for retroflexion to adopt this enhancement. Other types of sub-phonological conversions and operations have been shown to be particularly susceptible to transfer in contact scenarios (Natvig 2019), and there is reason to believe that dialect or sociolect contact undergoes similar processes.

Figure 3

Diachrony of Norwegian retroflex enhancement.

Norwegian retroflex enhancement requires a considerable amount of machinery at the Phonetic-Phonological level of representation. However, this reanalysis is the same type of rule telescoping that other accounts posit (e.g. Stausland Johnsen 2012), but here resulting in Phonetic-Phonological level change instead of Phonological change. Furthermore, the forms, features, and distributions of the enhancement are consistent with sub-phonological processes:

  1. The retroflexion gesture, Tongue Curl [up], does not categorically specify or describe the conditioning phoneme ([ɾ]/[r] or [ʁ]) in isolation (Stausland Johnsen 2012).

  2. The process requires the loss of the previous liquid; it does not co-occur with surface-level conditioning phonological features or phonetic co-articulations ([2suːɽ.dɑːɡ] vs. [2suː.ɖɑːɡ]; Kristoffersen 2000: 96).

  3. It does not always occur when conditioning factors are met (Johannessen & Vaux 2013).

  4. Conditioned Phonetic-Phonological operations are attested elsewhere, where they may be optional (e.g. English unreleased/glottalized final stops in coda position: [næˀp] and [næph] are possible pronunciations for nap).

These observations are some of many possible metrics for evaluating where a given sound pattern may occur within a modular sound system. It is also worth investigating how these types of enhancements integrate with more detailed modular conceptualizations of the sound system (e.g. Keyser & Stevens 2006: 39). They are, however, demonstrations of how LIQUID and RHOTIC as underspecified phonological categories participate in alternations. These illustrations further show how underspecified representations model and predict the extent to which their isolated phonetic forms are opaque to those processes, both synchronically and diachronically (see Howell 1991; Wiese 2001b; Natvig & Salmons forthcoming).

I presented here one illustration each of diachronic and procedural stability of the RHOTIC category, crucially drawing on its lack of phonological content. In both cases, the stability of the category with respect to highly variable surface forms is the result of sub-phonological processes that govern those variants. In sum, both the diachronic and synchronic stability of rhotics phonologically and the diachronic and synchronic variation of rhotics phonetically are not conflicting or incompatible properties, but interconnected outcomes based on the negative phonological content of the RHOTIC category, as defined in (4).

5 Conclusion

There are no exclusive acoustic or articulatory features that contrastively define rhotics. However, it is their synchronic and diachronic behaviors relative to other phonemes that characterize them phonologically (Chabot 2019). Considering the wide range of rhotic-patterning segments across dimensions of place and manner, I argue here that a category RHOTIC is the unspecified sonorant consonant in the phonological system, where LIQUID is a broader underspecified category that may encompass multiple phonemes. The particular surface form of RHOTIC is a direct result of its relationship to other potential LIQUID phonemes and their phonological properties, as well as the computations to make all underspecified representations pronounceable. These processes contribute to the extreme degree of rhotic arbitrariness across phonological and phonetic domains. The underspecified definitions presented and supported here bring liquid and rhotic phonological structure in line with gestural accounts and descriptions (Proctor 2011), while defining discrete categories that exhibit consistent phonological behavior (Wiese 2001b; Chabot 2019; Youssef 2019).

Although non-linear relationships are characteristic of cross-modular interactions (Purnell 2009), other phonemic categories do not vary in manner and place to the extent that is not only possible for r-sounds but a hallmark of them. The interrelated models in this analysis capture this observation not through intrinsic or universal properties of different types of sounds, but through the relationships that those sounds have with the others in a given phonological inventory. Specifically, the degree to which any particular phoneme can vary phonetically is a direct result of its phonological specification and the representations that distinguish the other phonemes in an inventory. The more phonemes a system has and the more distinctive features a phoneme bears, the more restricted that phoneme’s phonetic variants may be. The extent of /r/ variation in a language is thus directly related to that language’s representational structure. Finally, rhotic underspecification permits the variation of r-sounds over time and space, but it does not require it. Any individual phonetic form and the amount of contextualized or free variation it has are governed by conversion processes at sub-phonological levels of representation. Given the appropriate phonological, phonetic, and social conditions, there are very few structural restrictions on what may be the archetypal allophone of an unspecified phoneme in a particular phonemic set.

In this analysis, I advance the representational underpinnings of the cross-linguistic behavior of rhotics and of their complex articulatory properties. Furthermore, this investigation supports hierarchically organized phonological representations within a modular sound system, with each module operating on different types of representations that have their own properties and characteristics. Finally, I demonstrate a way forward for interpreting and analyzing diachronic and synchronic rhotic and liquid patterns predicated on their unspecified, or underspecified, representations. These hypotheses deserve further cross-linguistic testing to enrich our understanding of the phonology of rhotics and liquids, as well as phonological theory in general.


Abbreviations

# = word, morpheme, or clitic boundary, ATR = Advanced Tongue Root, F1 = first formant, F2 = second formant, F3 = third formant, F4 = fourth formant, MCS = Modified Contrastive Specification, NEUT = neuter, NMLZ = nominalizer, OP = Oral Place, RTR = Retracted Tongue Root, SDA = Successive Division Algorithm, SG = singular, SP = Soft Palate, T = coronal consonant, TC = Tongue Curl, TG = Tongue Groove, TR = Tongue Root, TT = Tongue Thrust, X = segmental timing slot


Notes

  1. Note that this contrast is represented with phonemic aspiration as /pʰ, tʰ, kʰ/ <p, t, k> (fortis) against plain voiceless /p, t, k/ <b, d, g> (lenis) and that the lenis series may passively receive voicing (see Allen 2016).
  2. Norwegian has tonal accents (see Kristoffersen 2000: 233–298) that do not interact with processes discussed here. They are excluded when they are not provided in cited examples.
  3. Preaspiration and postaspiration in Norwegian codas are variable and sensitive to regional and phonological contexts (e.g. Endresen 1991: 61; Allen 2016). I transcribe the neuter form trygt with aspiration between [k] and [t] to show that the final consonant cluster shares a [spread glottis] gesture; for some speakers, aspiration may begin prior to the stop closure for [k].
  4. Small caps are used to indicate phonological categories, without referring to specific surface forms.
  5. See, however, Herd (2005: 99), who includes the glide /w/ with consonantal sonorants, specifying [labial] above [nasal].
  6. The contrastive hierarchy in (10) includes /lˤ/ to show how an emphatic lateral would be specified, but it is not required for the present analysis.
  7. In a system with multiple r-sounds, diachronic change may be phonological, resulting in mergers or splits with other rhotics and liquids, or even into a different major class like the reanalysis of uvular /ʁ/ in Arabic as a fricative (Youssef 2019: 28). The latter change exemplifies a type of Oxford’s (2015: 317) Segmental Reanalysis Hypothesis, where “a segment may be reanalyzed as having a different contrastive status.” In this case a previously unspecified manner of articulation (i.e. frication) is reanalyzed as a phonological feature.


Acknowledgements

I would like to thank Samantha Litty, Eric Raimy, and Joe Salmons for discussions and comments on previous versions of this article. I am also very appreciative of the feedback I received from the Glossa editorial staff and the anonymous reviewers. All mistakes and oversights are my own.

Funding Information

This work has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement number 838164. It is also partly supported by the Research Council of Norway through its Centres of Excellence funding scheme, project number 223265.

Competing Interests

The author has no competing interests to declare.


References

Allen, Brent. 2016. Laryngeal phonetics and phonology in Germanic. Madison, WI: University of Wisconsin–Madison dissertation.

Avery, Peter & Keren Rice. 1989. Segment structure and coronal underspecification. Phonology 6(2). 179–200. DOI:  http://doi.org/10.1017/S0952675700001007

Avery, Peter & William J. Idsardi. 2001. Laryngeal Dimensions, completion and enhancement. In Tracy Alan Hall (ed.), Distinctive feature theory, 41–70. Berlin: Mouton de Gruyter.

Blevins, Juliette. 2004. Evolutionary phonology: The emergence of sound patterns. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511486357

Browman, Catherine P. & Louis M. Goldstein. 1995. Gestural syllable position effects in American English. In Fredericka Bell-Berti & Lawrence J. Raphael (eds.), Producing speech: Contemporary issues, 19–34. New York: AIP Press.

Chabot, Alex. 2019. What’s wrong with being a rhotic? Glossa: A Journal of General Linguistics 4(1): 38. 1–24. DOI:  http://doi.org/10.5334/gjgl.618

Cyran, Eugeniusz. 2014. Between phonology and phonetics: Polish voicing. Berlin: Mouton de Gruyter. DOI:  http://doi.org/10.1515/9781614515135

Dresher, B. Elan. 2009. The contrastive hierarchy in phonology. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511642005

Dresher, B. Elan, Glyne Piggott & Keren Rice. 1994. Contrast in Phonology: Overview. In Carrie Dyck (ed.), Toronto Working Papers in Linguistics 13. iii–xvii. Toronto: Department of Linguistics, University of Toronto.

Endresen, Rolf Thiel. 1991. Fonetikk og fonologi: Ei elementær innføring [Phonetics and phonology: An elementary introduction]. Oslo: Universitetsforlaget.

Göschel, Joachim. 1971. Artikulation und Distribution der sogenannten Liquida r in den europäischen Sprachen [Articulation and distribution of the so-called liquid r in the European Languages]. Indogermanische Forschungen [Indo-Germanic Studies] 76. 84–126.

Hale, Mark & Charles Reiss. 2008. The phonological enterprise. Oxford: Oxford University Press.

Hall, Daniel Currie. 2007. The role and representation of contrast in phonological theory. Toronto: University of Toronto dissertation.

Hall, Daniel Currie. 2011. Phonological contrast and phonetic enhancement: Dispersedness without dispersion. Phonology 28(1). 1–54. DOI:  http://doi.org/10.1017/S0952675711000029

Hall, Tracy Alan. 1997. The phonology of coronals. Amsterdam & Philadelphia: John Benjamins. DOI:  http://doi.org/10.1075/cilt.149

Halle, Morris, Bert Vaux & Andrew Wolfe. 2000. On feature spreading and the representation of place of articulation. Linguistic Inquiry 31(3). 387–444. DOI:  http://doi.org/10.1162/002438900554398

Hamann, Silke. 2003. The phonetics and phonology of retroflexes. Utrecht: Utrecht University dissertation.

Hayes, Bruce, Robert Kirchner & Donca Steriade (eds.). 2004. Phonetically based phonology. Cambridge: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511486401

Herd, Jonathan. 2005. Loanword adaptation and the evaluation of similarity. In Chiara Frigeni, Manami Hirayama & Sara Mackenzie (eds.), Toronto Working Papers in Linguistics 24, 65–116. Toronto: Linguistics Graduate Course Union. (https://twpl.library.utoronto.ca/index.php/twpl/article/view/6195) (Accessed 2019-02-10).

Honeybone, Patrick. 2005. Diachronic evidence in segmental phonology: The case of obstruent laryngeal specifications. In Marc van Oostendorp & Jeroen van de Weijer (eds.), The internal organization of phonological segments, 317–352. Berlin: Mouton de Gruyter. DOI:  http://doi.org/10.1515/9783110890402.317

Howell, Robert B. 1991. Old English breaking and its Germanic analogues. Tübingen: Max Niemeyer. DOI:  http://doi.org/10.1515/9783111356501

Iverson, Gregory & Joseph Salmons. 1995. Aspiration and laryngeal representation in Germanic. Phonology 12(3). 369–396. DOI:  http://doi.org/10.1017/S0952675700002566

Iverson, Gregory & Joseph Salmons. 2003. Legacy specification in the laryngeal phonology of Dutch. Journal of Germanic Linguistics 15(1). 1–26. DOI:  http://doi.org/10.1017/S1470542703000242

Jacewicz, Ewa, Robert Allen Fox & Joseph Salmons. 2011. Cross-generational vowel change in American English. Language Variation and Change 23(1). 45–86. DOI:  http://doi.org/10.1017/S0954394510000219

Johannessen, Janne Bondi & Bert Vaux. 2013. Retroflex variation and methodological issues: A reply to Simonsen, Moen, and Cowen (2008). Journal of Phonetics 41(1). 48–55. DOI:  http://doi.org/10.1016/j.wocn.2012.09.002

Kehrein, Wolfgang & Chris Golston. 2004. A prosodic theory of laryngeal contrasts. Phonology 21(3). 325–357. DOI:  http://doi.org/10.1017/S0952675704000302

Keyser, Samuel & Kenneth Stevens. 2006. Enhancement and overlap in the speech chain. Language 82(1). 33–63. DOI:  http://doi.org/10.1353/lan.2006.0051

Kristoffersen, Gjert. 2000. The phonology of Norwegian. Oxford: Oxford University Press.

Kwon, Joy. 2019. Korean vowel mergers: Contrastive hierarchies and distinctive features. University of Pennsylvania Working Papers in Linguistics 25(1). 159–168. (https://repository.upenn.edu/pwpl/vol25/iss1/18) (Accessed 2019-09-20).

Ladefoged, Peter & Ian Maddieson. 1996. The sounds of the world’s languages. Oxford: Blackwell.

Lahiri, Aditi & Henning Reetz. 2002. Underspecified recognition. In Carlos Gussenhoven & Natasha Warner (eds.), Laboratory Phonology 7, 637–676. Berlin: Mouton de Gruyter.

Lahiri, Aditi & Henning Reetz. 2010. Distinctive features: Phonological underspecification in representation and processing. Journal of Phonetics 38(1). 44–99. DOI:  http://doi.org/10.1016/j.wocn.2010.01.002

Lindau, Mona. 1985. The story of r. In Victoria A. Fromkin (ed.), Phonetic linguistics, 157–168. Orlando: Academic Press.

Lombardi, Linda. 1996. Postlexical rules and the status of privative features. Phonology 13(1). 1–38. DOI:  http://doi.org/10.1017/S0952675700000178

Magnuson, Thomas J. 2007. The story of /r/ in two vocal tracts. In Jürgen Trouvain & William J. Barry (eds.), Proceedings of the 16th International Congress of Phonetic Sciences, 1193–1196.

Mester, R. Armin & Junko Itô. 1989. Feature predictability and underspecification: Palatal prosody in Japanese mimetics. Language 65(2). 258–293. DOI:  http://doi.org/10.2307/415333

Natvig, David. 2018. Contrast, variation, and change in Norwegian vowel systems. Madison, WI: University of Wisconsin–Madison dissertation.

Natvig, David. 2019. Levels of representation in phonetic and phonological contact. In Jeroen Darquennes, Joseph Salmons & Wim Vandenbussche (eds.), Language contact: An international handbook (Handbücher zur Sprach- und Kommunikationswissenschaft/Handbooks of Linguistics and Communication Science 45) 1. 88–100. Berlin & Boston, MA: De Gruyter. DOI:  http://doi.org/10.1515/9783110435351-008

Natvig, David & Joseph Salmons. Forthcoming. Fully accepting variation in (pre)history: The pervasive heterogeneity of Germanic rhotics. Festschrift chapter.

Nevins, Andrew & Ioana Chitoran. 2008. Phonological representations and the variable patterning of glides. Lingua 118(12). 1979–1997. DOI:  http://doi.org/10.1016/j.lingua.2007.10.006

Oxford, Will. 2015. Patterns of contrast in phonological change: Evidence from Algonquian vowel systems. Language 91(2). 308–357. DOI:  http://doi.org/10.1353/lan.2015.0028

Papazian, Erik & Botolv Helleland. 2005. Norsk talemål: Lokal og sosial variasjon [Norwegian spoken language: Local and social variation]. Kristiansand: Høyskoleforlaget.

Proctor, Michael. 2011. Towards a gestural characterization of liquids: Evidence from Spanish and Russian. Laboratory Phonology 2(2). 451–485. DOI:  http://doi.org/10.1515/labphon.2011.017

Punnoose, Reenu, Ghada Khattab & Jalal Al-Tamimi. 2013. The contested fifth liquid in Malayalam: A window into the lateral-rhotic relationship in Dravidian languages. Phonetica 70(4). 274–297. DOI:  http://doi.org/10.1159/000356359

Purnell, Thomas. 2009. Phonetic influence on phonological operations. In Eric Raimy & Charles Cairns (eds.), Contemporary views on architecture and representations in phonology, 337–354. Cambridge, MA: MIT Press. DOI:  http://doi.org/10.7551/mitpress/9780262182706.003.0017

Purnell, Thomas & Eric Raimy. 2015. Distinctive features, levels of representation and historical phonology. In Patrick Honeybone & Joseph Salmons (eds.), The Oxford handbook of historical phonology, 522–544. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/oxfordhb/9780199232819.013.002

Purnell, Thomas, Eric Raimy & Joseph Salmons. 2019. Old English vowels: Diachrony, privativity, and phonological representations. Language, Research reports 95(4). e447–e473. DOI:  http://doi.org/10.1353/lan.2019.0083

Riad, Tomas. 2014. The phonology of Swedish. Oxford: Oxford University Press. DOI:  http://doi.org/10.1093/acprof:oso/9780199543571.001.0001

Rice, Keren. 1999. Featural markedness in phonology: Variation. Part 1. Glot International 4(7). 3–6. Part 2. Glot International 4(8). 3–7.

Rice, Keren. 2009. Nuancing markedness: A place for contrast. In Eric Raimy & Charles Cairns (eds.), Contemporary views on architecture and representations in phonology, 311–321. Cambridge, MA: MIT Press. DOI:  http://doi.org/10.7551/mitpress/9780262182706.003.0015

Salmons, Joseph. 2018. A history of German: What the past reveals about today’s language. 2nd edn. Oxford: Oxford University Press.

Scheer, Tobias. 2014. Spell-out, post-phonological. In Eugeniusz Cyran & Jolanta Szpyra-Kozłowska (eds.), Crossing phonetics-phonology lines, 255–275. Newcastle upon Tyne: Cambridge Scholars.

Sebregts, Koen. 2014. The sociophonetics and phonology of Dutch r. Utrecht: Utrecht University dissertation.

Spahr, Christopher. 2016. Contrastive representations in non-segmental phonology. Toronto: University of Toronto dissertation.

Stausland Johnsen, Sverre. 2012. A diachronic account of phonological unnaturalness. Phonology 29(3). 505–531. DOI:  http://doi.org/10.1017/S0952675712000243

Stevens, Kenneth, Samuel Keyser & Haruko Kawasaki. 1986. Toward a phonetic and phonological theory of redundant features. In Joseph Perkell & Dennis Klatt (eds.), Invariance and variability in speech processes, 426–449. Hillsdale: Erlbaum.

Trubetzkoy, Nicolai S. 1969. Principles of phonology, translated by Christiane A. M. Baltaxe. Berkeley, CA: University of California Press.

Walsh Dickey, Laura. 1997. The phonology of liquids. Amherst, MA: University of Massachusetts Amherst dissertation.

Wiese, Richard. 2001a. The phonology of /r/. In Tracy Alan Hall (ed.), Distinctive feature theory, 335–368. Berlin: Mouton de Gruyter.

Wiese, Richard. 2001b. The unity and variation of (German) /r/. In Hans Van de Velde & Roeland van Hout (eds.), ’r-atics: Sociolinguistic, phonetic and phonological characteristics of /r/, 11–26. Brussels: Etudes & Travaux.

Wiese, Richard. 2003. The unity and variation of (German) /r/. Zeitschrift für Dialektologie und Linguistik [Journal for Dialectology and Linguistics] 70. 25–43.

Wiese, Richard. 2011. The representation of rhotics. In Marc van Oostendorp, Colin J. Ewen, Elizabeth Hume & Keren Rice (eds.), The Blackwell companion to phonology, 711–729. Oxford: Blackwell. DOI:  http://doi.org/10.1002/9781444335262.wbctp0030

Youssef, Islam. 2019. The phonology and micro-typology of Arabic R. Glossa: A Journal of General Linguistics 4(1): 131. 1–36. DOI:  http://doi.org/10.5334/gjgl.1002