1 Introduction

In the study of individual languages or language families, it is often convenient to refer to concepts that may not have a clear correspondence with general, cross-linguistic grammatical categories. Examples include the consonantal root in Semitic languages, tone groups in Sinitic tone sandhi, the accentual phrase and intermediate phrase in Japanese, and pitch accent in languages like Japanese and Swedish. Some of these concepts have been successfully equated with cross-linguistic grammatical categories. For example, Mandarin Chinese tone groups, and the accentual and intermediate phrases of Japanese, have been analyzed as instantiations of units on a more universal prosodic hierarchy (Cheng 1987; Ito & Mester 2012).

Other language-specific grammatical concepts may involve the interplay of multiple, independent phenomena. For example, the consonantal root in Hebrew has been argued not to be a grammatical primitive, but instead to arise from the interaction of segmental affixation and prosodic constraints on word size (Bat-El 1994; 2003; Ussishkin 2005). Likewise, pitch accent is not a typological category of its own, but rather reflects the combination of lexical tone with properties like obligatoriness and culminativity (Hyman 2011).

Lastly, it may be that some language-specific grammatical concepts are truly sui generis, and cannot be related to more universal characteristics of human language.

Insofar as linguistic research aims to determine (i) whether the languages of the world all share a limited set of grammatical categories (Haspelmath 2010; Beck 2016), and (ii) what those categories might be, it is important to address whether such language-specific concepts and units are analytically necessary, or can instead be understood in more typologically-general terms.

One language-specific unit which features prominently in the study of the phonology of Mixtec languages is the couplet. The term couplet appears in nearly every piece of scholarly work that touches on the phonology of Mixtec languages. A non-exhaustive list of sources that use this term in describing sound patterns in Mixtec languages includes Pike (1948), Mak (1953), Longacre (1955), Pike & Cowan (1967), Pike & Small (1974), Pike & Oram (1976), North & Shields (1977), Josserand (1983), Marlett (1992), Macaulay & Salmons (1995), Gerfen (1996), Iverson & Salmons (1996), Macken & Salmons (1997), Daly & Hyman (2007), McKendry (2013), DiCanio et al. (2014), Herrera Zendejas (2014), Carroll (2015), Mendoza Ruiz (2016), León Vásquez (2017), Peters (2018), Becerra Roldán (2019), Peters & Mendoza (2020), Rueda Chaves (2021), Uchihara & Mendoza Ruiz (2022), and Belmar Viernes (2024), among many others. The nature of the couplet is treated at length in Penner (2019).

Roughly speaking, the couplet is used to describe the prototypical shape of lexical roots in Mixtec languages. Roots are generally bimoraic, consisting of two short vowels or one long vowel, and they tend to be no larger and no smaller (Josserand 1983: 460). Examples of these root shapes are given below for San Martín Peras Mixtec (SMP Mixtec), a variety of Mixtec in Josserand’s (1983) Southern Baja dialect group.1

Examples of the Mixtec couplet in SMP Mixtec

    1. (1)
    1. Saà
    2. [saà]
    1. ‘Bird/pájaro’
    1. (2)
    1. Sânà
    2. [sânã̀]
    1. ‘Crazy/loco’
    1. (3)
    1. Ìvì
    2. [ìβì]
    1. ‘Two/dos’
    1. (4)
    1. Iin
    2. [ĩĩ]
    1. ‘One/uno’

The couplet is the domain of many phonological patterns across Mixtec languages (Penner 2019; Rueda Chaves 2021). For example, the devoicing of approximants and pre-nasalized consonants occurs couplet-initially in some varieties (Becerra Roldán 2019), and the contrast between modal [V] and laryngealized [Vˀ] vowels is only made within the couplet (Macaulay & Salmons 1995). Additionally, tone sandhi patterns are often bounded by the couplet (e.g. Pike & Cowan 1967), and co-occurrence restrictions between oral and nasal segments often hold only within the couplet (Marlett 1992). It is easy to see, then, why so much of the Mixtec literature references the couplet as a crucial domain.

Despite its ubiquity and usefulness in describing phonological patterns, there is disagreement about how best to define the couplet. As described in Penner (2019), there are two main characterizations. Under the first approach, the couplet corresponds to the morphological root. This means that, broadly speaking, the phonological patterns described above are all root-bounded: they hold root-internally, but not necessarily across morpheme boundaries or in affixes. Proponents of the morphological analysis include Pike & Oram (1976); Marlett (1992); and Macaulay & Salmons (1995). The second main approach to defining the couplet equates it to a bimoraic foot. Assuming that only roots are footed, this means that the phonological patterns described above are foot-bounded: they hold or apply within the domain of a bimoraic foot, but not necessarily outside of the foot. Proponents of this prosodic analysis include Gerfen (1996); Macken & Salmons (1997); Carroll (2015); and Penner (2019). These two views are not always easily distinguishable. For example, proponents of the morphological analysis still recognize prosodic constraints placed on roots (e.g., Marlett 1992: 425–426, fn2). However, most researchers treat the couplet as defined either morphologically or prosodically, but not both. These opposing viewpoints are summarized in Table 1 below.

Table 1: Opposing analyses of the Mixtec ‘couplet’.

| Couplet = morphological root | Couplet = bimoraic foot |
|---|---|
| The Mixtec couplet is representationally equivalent to the morphological root. Any property attributed to the couplet can be recast in terms of the root. | The Mixtec couplet is representationally equivalent to a bimoraic foot. Any property attributed to the couplet can be recast in terms of the foot. |

In this paper, we argue that the couplet, at least in SMP Mixtec, is not coextensive with a single grammatical representation. Instead, some phonological properties of SMP Mixtec that are attributable to the couplet are best understood in terms of a bimoraic foot, while others are best understood in terms of a morphological root. This means that, like the consonantal root or pitch accent, the SMP Mixtec couplet is shorthand for a constellation of properties that arise from the interaction of independent pieces of a language’s grammar, and is not necessarily a grammatical unit of its own. To make this point, we provide some necessary language background in §2, describe phonological patterns that make reference to the bimoraic foot in §3, and then describe patterns that make reference to the morphological root in §4. §5 concludes.

2 Background

SMP Mixtec is spoken by roughly 11,500 people in and around the municipality of San Martín Peras in the district of Juxtlahuaca in western Oaxaca, Mexico (Instituto Nacional de Estadística y Geografía 2020), as well as by diaspora communities throughout Mexico and the US, especially in the Californian towns of Oxnard, Santa Maria, Salinas, and Watsonville (Mendoza 2020). It is an Otomanguean language in the Eastern Otomanguean branch, Amuzgo-Mixtecan subgroup, and Mixtecan major subgroup (Campbell 2017).

2.1 Basic phonology

This section describes the phonological inventory of SMP Mixtec, with all generalizations drawn from Eischens & Hedding (2025). Table 2 lists the phonemic consonants. One notable feature is the contrast between plain and prenasalized stops and affricates, a common contrast in Mixtec languages (Marlett 1992; Iverson & Salmons 1996).

Table 2: Consonant phonemes in SMP Mixtec.

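| | Bilabial | Alveolar | Palatalized alveolar | Palatal | Post-alveolar | Velar | Labio-velar | Palatalized velar |
|---|---|---|---|---|---|---|---|---|
| Stop | p mp | t nt | tj ntj | | | k ŋk | kw | kj |
| Fricative | | s | | | ʃ | | | |
| Affricate | | | ͡tsj n͡tsj | | ͡tʃ n͡tʃ | | | |
| Nasal | m | n | | ɲ | | | | |
| Tap | | ɾ | | | | | | |
| Approximant | β | l | | j | | | | |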

SMP Mixtec has five phonemic oral vowels and three phonemic nasal vowels (Table 3). Important for our discussion in §4 is the fact that the language has oral mid vowels ([e] and [o]), but not nasal mid vowels ([ẽ] and [õ]). Restrictions on the distribution of mid vowels are common across Mixtec languages (Pike 1947: 168–169).

Table 3: Oral and nasal vowels in SMP Mixtec.

| | Front | Mid | Back |
|---|---|---|---|
| High | i ĩ | | u ũ |
| Mid | e | | o |
| Low | | a ã | |

Finally, in SMP Mixtec, vowels only contrast for nasality after the plain (non-prenasalized) stops, affricates, and fricatives. Everywhere else, their orality/nasality is predictable. We discuss these distributional patterns in depth in §4.

There are at least five contrastive tones in SMP Mixtec (Peters 2018). Three are level tones: High [V́], Mid [V], and Low [V̀]. These contrast with contour tones, which include a Low-to-High rise [V̌] and at least one falling tone [V̂]. The mora is the tone-bearing unit, meaning that any mora may bear any one of the language’s five contrastive tones. Contours of three tones or more cannot be hosted on a single mora – these are only found on bimoraic long vowels.

SMP Mixtec also contrasts modal, laryngealized, and breathy phonation on the penultimate mora of stems (though see Peters 2018 for a consonantal analysis of [ʔ] and [h]). We concur with Macaulay & Salmons (1995) that these are best analyzed as suprasegmental features of the morpheme, though this analysis is not crucial for the main points of this paper. These contrasts are discussed at length in §3.

SMP Mixtec has a series of pronominal clitics which typically follow their hosts. In §2.2 we discuss vocalic enclitics of the form /=V/. In §4.4 we argue that these enclitics are parsed into the same prosodic word as their hosts, and form a bimoraic foot with root material. In §3.1.2 we argue that larger, CV enclitics are instead external to the prosodic words of their hosts.2

2.2 Vocalic enclitics

SMP Mixtec features a number of pronominal enclitics that overwrite the final vowel of the stem to which they attach (Ostrove 2018; Peters & Mendoza 2020; Mendoza 2020; Hedding 2022). Because they feature prominently in §4, we spell out their relevant morphophonological behavior in detail here. (Larger, CV clitic pronouns are discussed in §3.1.2.)

Vocalic enclitic pronouns /=V/, which are used primarily as subject markers, markers of possession, and the objects of prepositions, are listed in Table 4 below:

Table 4: Vocalic enclitic pronouns in SMP Mixtec.

| 1sg | 2sg | 1pl.incl | 3sg.neut |
|---|---|---|---|
| =ì | =ṹ | =é | =à |

We classify these morphemes as clitics because they are non-selective with respect to their hosts: they may directly follow, and ‘lean’ on, verbs, adverbs, nouns (in possession), and prepositions (e.g. Macaulay 1987a; b; 2005).

Similar vocalic enclitics are found in other varieties, where they interact in various ways with stem-final vowels, including coalescence (Ixtayutla, Penner 2019: 217), diphthong and glide formation (Alcozauca, Uchihara & Mendoza Ruiz 2022: 626–627), the loss of a stem-final vowel (Alcozauca, Uchihara & Mendoza Ruiz 2022: 626–627; Huajuapan, Pike & Cowan 1967: 12–13), and in some cases, unrepaired vowel hiatus (Huajuapan, Pike & Cowan 1967: 13). Of these strategies, SMP Mixtec displays the complete loss of stem-final vowels (5) and the conversion of stem-final vowels into consonant off-glides (6).

    1. (5)
    1. a.
    1. Xíxi
    1. [ʃíʰʃi]
    2. cont.eat
    1. ‘Eats/come’
    1.  
    1. b.
    1. Xíxi ì
    1. [ʃíʰʃ=ì]
    2. cont.eat=1sg
    1. ‘I eat/como’
    1.  
    1. c.
    1. Xíxi ún
    1. [ʃíʰʃ=ṹ]
    2. cont.eat=2sg
    1. ‘You eat/comes’
    1.  
    1. d.
    1. Xíxi é
    1. [ʃíʰʃ=é]
    2. cont.eat=1pl.incl
    1. ‘We eat/comemos’
    1.  
    1. e.
    1. Xíxi à
    1. [ʃíʰʃ=à]
    2. cont.eat=3sg.n
    1. ‘They (non-sp.) eat/come’
    1. (6)
    1. a.
    1. Kìsi
    1. [kìʰsi]
    2. cooking.pot
    1. ‘Cooking pot/olla’
    1.  
    1. b.
    1. Kìsi ì
    1. [kìʰs=î]
    2. cooking.pot=1sg
    1. ‘My cooking pot/mi olla’
    1.  
    1. c.
    1. Kìsi ún
    1. [kìʰsʲ=ũ̌]
    2. cooking.pot=2sg
    1. ‘Your cooking pot/tu olla’
    1.  
    1. d.
    1. Kìsi é
    1. [kìʰsʲ=ě]
    2. cooking.pot=1pl.incl
    1. ‘Our cooking pot/nuestra olla’
    1.  
    1. e.
    1. Kìsi à
    1. [kìʰsʲ=â]
    2. cooking.pot=3sg.n
    1. ‘Their (non-sp.) cooking pot/su olla’

Whether the stem-final vowel is lost or becomes an off-glide depends on the identity of the stem-final vowel and the identity of the preceding consonant. For example, stem-final /i/ is lost after [ʃ], as in (5), but retained as an off-glide after [s], as in (6). For some words, the choice between loss and off-glide formation is also subject to interspeaker variation. Even with these restrictions, each enclitic is capable of completely overwriting stem-final vowels.

Vocalic enclitics also undergo tonal variation when overwriting stem-final vowels (Peters & Mendoza 2020). When a high-toned enclitic attaches to a low-final stem, the result is a low-to-high rise (7). If a low-toned enclitic combines with a high-tone-final stem (8) or a rise-final stem (9), then the result is a falling tone. Sometimes, contour tones are produced through the combination of an enclitic with a mid-tone-final stem, though this is lexically-determined (Peters & Mendoza 2020, cf. (5)–(6)).

    1. (7)
    1. a.
    1. Kòntò
    2. [kòⁿtò]
    1. ‘Knee/rodilla’
    1.  
    1. b.
    1. Kòntò é
    1. [kòⁿt=ě]
    2. knee=1pl.incl
    1. ‘Our knees/nuestras rodillas’
    1. (8)
    1. a.
    1. Xù’ún
    2. [ʃũ̀ˀṹ]
    1. ‘Money/dinero’
    1.  
    1. b.
    1. Xù’ún ì
    1. [ʃũ̀ˀ=ĩ̂]
    2. money=1sg
    1. ‘My money/mi dinero’
    1. (9)
    1. a.
    1. Sàtǎ
    2. [sàʰtǎ]
    1. ‘Back/espalda’
    1.  
    1. b.
    1. Sàtǎ ì
    1. [sàʰt=î]
    2. back=1sg
    1. ‘My back/mi espalda’
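
The tonal outcomes described above can be restated as a small lookup. The following Python sketch is illustrative only: the function name `combine_tones`, the tone labels, and the result labels "rise" and "fall" are assumptions made for the example. It encodes just the three combinations stated in the text; mid-final stems and any other combination are returned as undetermined, since they are described as lexically determined or are not described here.

```python
# Illustrative sketch only: encodes the three tonal outcomes stated in the text.
# Tone labels ("L", "H", "LH") and result labels ("rise", "fall") are assumptions.
STATED_OUTCOMES = {
    ("L", "H"): "rise",   # low-final stem + high-toned enclitic, as in (7)
    ("H", "L"): "fall",   # high-final stem + low-toned enclitic, as in (8)
    ("LH", "L"): "fall",  # rise-final stem + low-toned enclitic, as in (9)
}


def combine_tones(stem_final, enclitic):
    """Return the stated surface contour, or None if the text does not fix one."""
    return STATED_OUTCOMES.get((stem_final, enclitic))


if __name__ == "__main__":
    print(combine_tones("L", "H"))   # pattern of (7b)
    print(combine_tones("LH", "L"))  # pattern of (9b)
    print(combine_tones("M", "L"))   # mid-final stem: lexically determined
```

Returning `None` rather than guessing keeps the sketch from claiming more than the description above supports.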

These enclitics also interact with the nasality of stem-final vowels, as summarized in Table 5. These patterns are the focus of §4, so we defer detailed discussion till that point.

Table 5: Vocalic pronominal enclitics in SMP Mixtec.

| Pronoun | Nasality |
|---|---|
| 1sg [=ì] | Takes on nasality/orality of final vowel of root |
| 1pl.incl [=é] | Always oral |
| 2sg [=ṹ] | Always nasal |
| 3sg.n [=à] | Alternates somewhat predictably in nasality |
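
The content of Table 5 can also be read as a simple decision procedure. The sketch below is a hypothetical restatement in Python, not an analysis: the function name and the use of `None` for the 3sg.n enclitic are assumptions, and the conditions governing the 3sg.n alternation are deliberately left out because they are deferred to §4.

```python
# Illustrative restatement of Table 5; names are hypothetical.
def enclitic_is_nasal(pronoun, stem_final_vowel_is_nasal):
    """Return True (nasal), False (oral), or None (not settled by Table 5)."""
    if pronoun == "1sg":
        return stem_final_vowel_is_nasal  # copies the root-final vowel
    if pronoun == "1pl.incl":
        return False                      # always oral
    if pronoun == "2sg":
        return True                       # always nasal
    if pronoun == "3sg.n":
        return None                       # alternation not specified in Table 5
    raise ValueError(f"unknown pronoun: {pronoun}")
```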

Similar patterns of nasality have been described for vocalic enclitics in other Mixtec languages, such as Huajuapan Mixtec (Pike & Cowan 1967), Ixtayutla Mixtec (Penner 2019), Yoloxochitl Mixtec (DiCanio et al. 2020), and San Martín Duraznos Mixtec (Auderset et al. 2024). In these varieties, it is common for the 2sg enclitic to be invariably nasal (Pike & Cowan 1967; North & Shields 1977; Gerfen 1996; Becerra Roldán 2019; Penner 2019; DiCanio et al. 2020; Auderset et al. 2024), for the 1pl.incl to be invariably oral (Pike & Cowan 1967; DiCanio et al. 2020; Auderset et al. 2024), and for the 1sg enclitic to preserve the nasality/orality of the base (Pike & Cowan 1967; Becerra Roldán 2019; Penner 2019; Auderset et al. 2024) (just as in Table 5). In other Mixtec languages, vocalic enclitics show uniform patterns of nasality. For example, all vocalic enclitics in Mixtepec Mixtec take on the nasality/orality of the final vowel of the root to which they attach (Belmar Viernes 2024).

3 The bimoraic foot

As discussed in §1, some researchers have argued that the couplet is coextensive with a bimoraic foot, and others that the couplet is best defined as a morphological root. In this section, we concur with and replicate the logic of Penner (2019) to show that some factors attributed to the couplet across Mixtec are best understood in terms of a bimoraic foot aligned to the right edge of the prosodic word.

The relevant evidence in SMP Mixtec comes from the distribution of [ʔ] and [h], as well as the distribution of rising tones. Specifically, [ʔ] and [h] may only occur in the middle of a bimoraic foot, and underived rising tones may occur foot-internally but not foot-externally. (On foot-sensitive phonotactics more generally, see e.g. Bennett 2012 and references there.)

3.1 The distribution of laryngeals

Across Mixtec languages, [ʔ] has a restricted distribution, which is often defined in terms of the couplet. For example, Josserand (1983: 228) writes that “[ʔ] occurs in couplet-medial position”, and Macaulay & Salmons (1995: 54) claim that “[ʔ] is…a feature of the couplet”. Because the distribution of [ʔ] is analyzed in terms of the couplet, this section outlines the behavior of laryngeals (both [ʔ] and [h]) in SMP Mixtec, arguing that their distributional characteristics are best understood in terms of a bimoraic foot instead of a morphological root. The argumentation broadly follows the lines of Penner (2019), who convincingly argues that the distribution of [ʔ] in Ixtayutla Mixtec is best defined in terms of a bimoraic foot.

Our argumentation relies on a non-isomorphism between the morphological root and the bimoraic foot in long roots. Given that most roots in SMP Mixtec are bimoraic, the root and bimoraic foot are usually coextensive. However, there are also a number of roots in SMP Mixtec which are trimoraic or larger. Many of these roots contain prefixes that are no longer productive, or are fossilized compounds. Examples of such constructions can be seen in (10)–(13).3

    1. (10)
    1. Tsìkiva
    2. [͡tsìkiβa]
    1. ‘Butterfly/mariposa’
    1. (11)
    1. Nakava
    2. [nãkaβa]
    1. ‘Will fall/se caerá’
    1. (12)
    1. Kachîñu
    2. [ka͡tʃîɲũ]
    1. ‘Will work/trabajará’
    1. (13)
    1. Tanta’ǎ
    2. [taⁿtaˀǎ]
    1. ‘Will marry/se casará’

The semantic contribution of most fossilized prefixes in SMP Mixtec, such as those in (10)–(11), is no longer clear. For example, the [͡tsì-] in (10) likely descends from a classifier used for animals and round things, since similar prenominal classifiers are productive in other varieties (e.g. de León Pasquel 1988: 142 on Coatzoquitengo Mixtec). In SMP Mixtec, [͡tsì-] is used in many animal names, but also with some plants (e.g. [͡tsìkʷàʰǎ], ‘orange’) and adjectives (e.g. [͡tsìkàβà], ‘crooked’). Additionally, it appears in some words in which it is not clearly a prefix but is instead integrated into the bimoraic root, such as in [͡tsìnã̀] ‘dog’ and [͡tsĩ̀ĩ́] ‘mouse’. Finally, it has a phonologically distinct form in some words. For example, it is [͡tʃì-] in [͡tʃìkʷii] ‘fox’. The [nã-] in (11) may derive from a ‘repetitive’ marker described as unproductive in other Mixtec varieties (e.g. Alexander 1988: 245–246; Zylstra 1991: 103–104). It is similarly unproductive in SMP Mixtec, where repetition of an action is signaled by the post-verbal adverb [tùʰkù] ‘again’.

The forms in (12) and (13) may derive historically from compounds, though they are now lexicalized. For example, the form in (12) may be a compound of [kaʰsa] ‘will do’ and [͡tʃiɲũ] ‘work (n)’, with [ka-] a truncated form of the first member, though such CV truncation in compounds is not synchronically productive. Finally, the [ta-] in (13) is of uncertain origin, though the full form likely derived from a compound containing the word [ⁿtaˀǎ] ‘hand’. Because of the non-productivity and opaque meaning of such prefixes and compounds, a number of scholars maintain that at least some words that are trimoraic or larger, like those in (10)–(13), are synchronically mono-morphemic, with the antepenultimate syllable now analyzed as a part of the root (e.g. Penner 2019; DiCanio et al. 2020).

Roots of this shape allow for the morphological and prosodic analyses of the Mixtec couplet to be tested against each other, because each analysis makes different predictions (Table 6). If the couplet corresponds to the morphological root, then generalizations that hold of the couplet should hold of both bimoraic and trimoraic roots. However, if the couplet corresponds to the bimoraic foot, then there should be an asymmetry between bimoraic and trimoraic roots, given that the foot may only include two morae. For reasons which will become clear in this section, we follow Penner (2019: 211) in analyzing the foot as aligned to the right edge of the prosodic word. Under this assumption, there is a non-isomorphism between the root (enclosed in vertical bars) and the foot (enclosed in parentheses) in trimoraic roots, as seen in (14) vs. (15). This non-isomorphism leads to the predictions in Table 6.

    1. (14)
    1. Bimoraic root
    1. Nta’ǎ
    2. Rt| Ft(ⁿtaˀǎ) |
    1. ‘Hand/mano’
    1. (15)
    1. Trimoraic root
    1. Tanta’ǎ
    2. Rt| ta Ft(ⁿtaˀǎ) |
    1. ‘Will marry/se casará’

Table 6: Predictions of opposing views of the Mixtec ‘couplet’.

| Couplet = morphological root | Couplet = bimoraic foot |
|---|---|
| Bimoraic and trimoraic roots have the same properties | Bimoraic roots have the same properties as the final two morae of trimoraic roots |
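
The right-aligned parse assumed in (14) and (15) can be made concrete with a short sketch. The Python below is a minimal illustration under simplifying assumptions: words are supplied as a list of mora-sized strings (a hypothetical, toneless segmentation that also omits laryngeals), and the function names are invented for the example. It reproduces the bracketing convention used above, with the root in vertical bars and the foot in parentheses.

```python
# Minimal sketch: a bimoraic foot aligned to the right edge of the word.
# Input segmentation into morae is assumed, not computed.
def parse_right_aligned_foot(morae):
    """Split a list of morae into (pre-foot material, bimoraic foot)."""
    if len(morae) < 2:
        raise ValueError("a bimoraic foot needs at least two morae")
    return morae[:-2], morae[-2:]


def bracket(morae):
    """Render the parse in the notation of (14)-(15)."""
    prefoot, foot = parse_right_aligned_foot(morae)
    footed = "Ft(" + "".join(foot) + ")"
    inner = "".join(prefoot) + " " + footed if prefoot else footed
    return "Rt| " + inner + " |"


if __name__ == "__main__":
    print(bracket(["nta", "a"]))        # bimoraic root, cf. (14)
    print(bracket(["ta", "nta", "a"]))  # trimoraic root, cf. (15)
```

In the bimoraic case the root and the foot coincide, while in the trimoraic case the first mora is left outside the foot, which is the non-isomorphism that the predictions in Table 6 rely on.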

3.1.1 Contrastive [ʔ] and [h]

As discussed in §2, SMP Mixtec has both contrastive [ʔ] and contrastive [h], which are analyzable as contrastive, non-modal phonation (Eischens & Hedding 2025). In bimoraic words, these may only occur following the first mora, as shown in (16) for [ʔ] and (17) for [h]. Hypothetical roots that begin with underlying /ʔ/ or /h/ or that end with [ʔ] or [h] are nearly unattested.4

    1. (16)
    1. a.
    1. Ntsí’i
    2. [ⁿ͡tsíˀi]
    1. ‘Blue/azul’
    1.  
    1. b.
    1. Sì’và
    2. [sìˀβà]
    1. ‘Seed/semilla’
    1. (17)
    1. a.
    1. Ntsíjǐ
    2. [ⁿ͡tsîʰǐ]
    1. ‘Sunny/soleado’
    1.  
    1. b.
    1. Ntsìjvǐ
    2. [ⁿ͡tsìʰβǐ]
    1. ‘Egg/huevo’

When considering trimoraic roots, however, the generalization is that contrastive [ʔ] and [h] may only occur following the penultimate mora, whether that mora is followed by another vowel or by a consonant. This can be seen in (18) for [ʔ] and in (19) for [h].

    1. (18)
    1. a.
    1. Tsìmá’à
    2. [͡tsìmã́ˀã̀]
    1. ‘Raccoon/mapache’
    1.  
    1. b.
    1. Ikǒ’yò
    2. [ikǒˀjò]
    1. ‘Mexico City/México’
    1. (19)
    1. a.
    1. Ntakòjo
    2. [ⁿtakòʰo]
    1. ‘Will get up/se levantará’
    1.  
    1. b.
    1. Natìjvi
    2. [nãtìʰβi]
    1. ‘Will appear/aparecerá’

Hypothetical roots in which [ʔ] and/or [h] occur following the antepenultimate or final mora are unattested, as shown in (20) and (21).

    1. (20)
    1. Unattested word shapes (formas imposibles)
    1.  
    1. a.
    1. *Tsi’maa
    2. *[͡tsìˀmãã]
    1.  
    1. b.
    1. *Tsimaa’
    2. *[͡tsimããˀ]
    1. (21)
    1. Unattested word shapes (formas imposibles)
    1.  
    1. a.
    1. *Najyava
    2. *[nãʰjaβa]
    1.  
    1. b.
    1. *Nayavaj
    2. *[nãjaβaʰ]

Finally, no enclitic pronouns contain [ʔ] or [h], whether they incorporate into the stem, or occur to its right. Only independent pronouns contain a laryngeal, namely [ʔ], which follows the penultimate mora (e.g. [jùˀù] ‘1sg’). The distribution of [ʔ] and contrastive [h] can be cleanly stated in prosodic terms: [ʔ] and contrastive [h] may only occur following the penultimate mora of a root, i.e. within a right-aligned, bimoraic foot, Rt| …Ft(μμ) |.

While this pattern is exceptionless in the lexicon, there are no synchronic alternations showing the addition of [ʔ] or [h] in foot-medial position, or deletion of [ʔ] or [h] outside of foot-medial position. We assume that this reflects the fact that there are no morphological or phonological operations that shift word-level foot structure away from the right edge of the prosodic word in SMP Mixtec.
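
The distributional generalization for [ʔ] and contrastive [h] can be stated as a simple well-formedness check. The following Python sketch is illustrative only: it assumes a hypothetical encoding in which a word is described by its number of morae and by the set of mora positions (counted from zero) that are immediately followed by a laryngeal.

```python
# Illustrative check: laryngeals are licit only after the penultimate mora,
# i.e. in the middle of a right-aligned bimoraic foot.
def laryngeals_licit(n_morae, laryngeal_after):
    """laryngeal_after: set of 0-based indices of morae followed by a laryngeal."""
    if n_morae < 2:
        return not laryngeal_after  # no foot, so no licensed position
    penult = n_morae - 2
    return all(i == penult for i in laryngeal_after)


if __name__ == "__main__":
    print(laryngeals_licit(2, {0}))  # shape of (16a)
    print(laryngeals_licit(3, {1}))  # shape of (18a)
    print(laryngeals_licit(3, {0}))  # shape of (20a), unattested
    print(laryngeals_licit(3, {2}))  # shape of (20b), unattested
```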

3.1.2 Non-contrastive [h]

In addition to contrastive [h], SMP Mixtec also has a predictable, non-contrastive [h]. In bimoraic roots, this [h] predictably precedes root-medial voiceless obstruents, as seen in (22). Voiceless consonants are not preceded by [h] anywhere else. For example, word-initial voiceless consonants are not preceded by [h], even when following vowels in connected speech (23).

    1. (22)
    1. a.
    1. Sàtǎ
    2. [sàʰtǎ]
    1. ‘Back/espalda’
    1.  
    1. b.
    1. Kuíkà
    2. [kʷíʰkà]
    1. ‘Rich/rico’
    1.  
    1. c.
    1. Leso
    2. [leʰso]
    1. ‘Rabbit/conejo’
    1. (23)
    1. a.
    1. Iin sàtǎ
    1. [ĩĩ sàʰtǎ]
    2. one back
    1. ‘A back/una espalda’
    1.  
    1. b.
    1. Rà tsiàja kuíkà
    1. [ɾà ͡tsʲàʰa kʷíʰkà]
    2. 3m male rich
    1. ‘The rich man/el hombre rico’

Additionally, the initial [t] of the clitic pronoun tún ([=tṹ], ‘wood noun class enclitic’) is not preceded by [h] when it attaches to a root.

    1. (24)
    1. Tá’vi tún
    1. [táˀβi=tṹ]
    2. broken=3wd
    1. ‘It (car) is broken/está roto (el carro)’

This and other CV clitics presumably attach outside of the domain of the prosodic word. Evidence for this comes from the fact that CV clitics must attach to a foot-sized element in order to surface.

CV clitics may not occur in isolation. Instead, they must co-occur with a separate, bimoraic word. For example, the 3m pronoun [ɾà] is ungrammatical as a fragment answer to a question like ‘Who is in the photo?’ (25), but is grammatical when it co-occurs with a demonstrative (26).

    1. (25)
    1. *Rà
    2.   [ɾà]
    1.   Intended: ‘Him/él’
    1. (26)
    1. Rà káa
    1. [ɾà=káa]
    2. 3m=dem
    1. ‘Him/él’

CV clitics also may not surface with another monomoraic item, such as the negative nominal marker [nĩ] (27). Instead, they must still co-occur with a bimoraic item (28).

    1. (27)
    1. *Ni rà
    1.   [nĩ=ɾà]
    2.   not=3m
    1.   Intended: ‘Not even him/ni él’
    1. (28)
    1. Ni rà káa
    1. [nĩ=ɾà=káa]
    2. not=3m=dem
    1. ‘Not even him/ni él’

It appears that even when two CV clitics co-occur—in principle meeting the bimoraic word requirement—they cannot surface because they must be external to the prosodic word (which is itself minimally bimoraic). Because each utterance must contain a prosodic word, an utterance like (27) is ungrammatical. When there is a separate morpheme that can form its own prosodic word, as in (28), the utterance is grammatical.
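
The difference between the two conceivable requirements can be sketched as follows. In this hypothetical Python illustration, an utterance is reduced to a list of mora counts, one per item; the function names are assumptions made for the example. The first function counts morae across the whole utterance, while the second requires that a single item be large enough to form a prosodic word on its own, which is the requirement that matches (25)–(28).

```python
# Illustrative contrast between two candidate requirements on utterances.
def total_is_bimoraic(mora_counts):
    """Naive requirement: the utterance contains two morae in total."""
    return sum(mora_counts) >= 2


def has_prosodic_word(mora_counts):
    """Requirement matching (25)-(28): some single item is at least bimoraic."""
    return any(m >= 2 for m in mora_counts)


if __name__ == "__main__":
    utterances = {
        "(25)": [1],        # CV clitic alone
        "(26)": [1, 2],     # CV clitic + bimoraic demonstrative
        "(27)": [1, 1],     # two monomoraic items
        "(28)": [1, 1, 2],  # two monomoraic items + demonstrative
    }
    for label, counts in utterances.items():
        print(label, total_is_bimoraic(counts), has_prosodic_word(counts))
```

The two functions diverge only on (27), where the total mora count is sufficient but no single item can form a prosodic word.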

In bimoraic roots, then, non-contrastive [h] may only follow the penultimate mora, just like contrastive [ʔ] and [h]. Similarly, non-contrastive [h] never follows root-medial, antepenultimate morae, regardless of the fossilized prefix or compound element involved. This is most apparent in trimoraic roots which contain multiple voiceless consonants. In words of this shape, only the voiceless consonant following the penultimate mora is preceded by [h]; the voiceless consonant following the antepenult is not (29)–(31).

    1. (29)
    1. Nakatsiǎ
    2. [nãkaʰ͡tsʲǎ]
    1. ‘Will wash/lavará’
    1. (30)
    1. Yùtǎtá
    2. [jùtǎʰtá]
    1. ‘Mirror/espejo’
    1. (31)
    1. Ixìko
    2. [iʃìʰko]
    1. ‘Will sell/venderá’

As with contrastive [ʔ] and contrastive [h], non-contrastive [h] in SMP Mixtec has a restricted distribution: it occurs only after the penultimate mora in a prosodic word, and thus foot-internally, Rt| …Ft(μμ) |.

Finally, there is evidence from code-switching that constraints on the distribution of non-contrastive [h] are synchronically active. Specifically, when a speaker code-switches into Spanish during a Mixtec utterance, voiceless consonants following the penultimate mora are often preceded by [h] (32), while voiceless consonants elsewhere are never preceded by [h] (33).

    1. (32)
    1. Brecha
    2. [bɾéʰ͡tʃà]
    1. ‘Road/brecha’
    1. (33)
    1. Máquina
    2. [mákinã̀]
    1. ‘Machine/máquina’
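
The distribution of non-contrastive [h] described in this section can be sketched as a rule that applies at a single position in the prosodic word. The Python below is a simplified illustration, not an implementation of the analysis: words are given as toneless (onset, vowel) pairs, one per mora, and the set of voiceless obstruents is an assumed, partial list based on Table 2. The rule is written as categorical, although in code-switched Spanish words [h] is described above only as occurring often.

```python
# Illustrative sketch: [h] precedes a voiceless obstruent that follows the
# penultimate mora of a prosodic word, and appears nowhere else.
VOICELESS_OBSTRUENTS = {"p", "t", "k", "kʷ", "s", "ʃ", "tsʲ", "tʃ"}  # assumed subset


def add_predictable_h(morae):
    """morae: list of (onset, vowel) pairs for one prosodic word."""
    last = len(morae) - 1
    pieces = []
    for i, (onset, vowel) in enumerate(morae):
        # Only the onset of the final mora directly follows the penultimate mora.
        if len(morae) >= 2 and i == last and onset in VOICELESS_OBSTRUENTS:
            pieces.append("ʰ" + onset + vowel)
        else:
            pieces.append(onset + vowel)
    return "".join(pieces)


if __name__ == "__main__":
    print(add_predictable_h([("s", "a"), ("t", "a")]))                # cf. (22a)
    print(add_predictable_h([("n", "a"), ("k", "a"), ("tsʲ", "a")]))  # cf. (29)
    print(add_predictable_h([("m", "a"), ("k", "i"), ("n", "a")]))    # cf. (33)
```

Because only the onset of the final mora is inspected, voiceless consonants following the antepenult, as in (29)–(31) and (33), are left unchanged.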

3.1.3 Interim summary

In SMP Mixtec, all laryngeals (except epenthetic [ʔ]) share the same restriction, whether they are contrastive or not: they may only follow the penultimate mora in a prosodic word, Rt| …Ft(μμ)|.

Because at least some trimoraic words are monomorphemic, the morphological view of the couplet is insufficient to capture the distribution of laryngeals: it has no way to draw a distinction between the word in (34) and the word in (35).

    1. (34)
    1. Bimoraic root
    1. Nta’ǎ
    2. Rt| ⁿtaˀǎ |
    1. ‘Hand/mano’
    1. (35)
    1. Trimoraic root
    1. Tanta’ǎ
    2. Rt| taⁿtaˀǎ |
    1. ‘Will marry/se casará’

On the other hand, prosodic structure provides a convenient means to capture the pattern at hand. If SMP Mixtec words contain a bimoraic foot aligned to the right edge of a prosodic word, then the generalization is clear: laryngeals may only follow the initial mora in a foot, Rt| …Ft(μμ) |. Antepenultimate and preantepenultimate morae are foot-external and therefore incapable of being followed by [ʔ] and [h]. Final morae are foot-final, not foot-internal, and therefore cannot occur with [ʔ] and [h] either.

    1. (36)
    1. Bimoraic root
    1. Nta’ǎ
    2. Ft(ⁿtaˀǎ)
    1. ‘Hand/mano’
    1. (37)
    1. Trimoraic root
    1. Tanta’ǎ
    2. ta Ft(ⁿtaˀǎ)
    1. ‘Will marry/se casará’

The inadequacy of the root-based definition of the couplet and the success of the foot-based definition both provide evidence in favor of the view that some characteristics traditionally attributed to the couplet are best defined in terms of a bimoraic foot rather than a morphological root, as argued in Penner (2019).

However, a foot-based analysis is not the only one capable of capturing the distribution of laryngeals. One might appeal to stress, contending that the penultimate mora of a root is stressed, and that [ʔ] and [h] may only occur in stressed syllables, as opposed to foot-medially (though see Macaulay & Salmons 1995 for an argument that [ʔ] is not restricted to the position of stress in all Mixtec languages). Since stress may be analyzed without recourse to feet (Prince 1983; Gordon 2002), the data discussed so far are not uniquely analyzable as a property of the foot. Instead, they are simply more amenable to a foot-based analysis than a root-based analysis. To show that there is independent evidence for the foot (and not just stress) in SMP Mixtec, we now turn to the distribution of underlying rising tones.

3.2 Rising tones

Tonal phonotactics, and tonal melodies in particular, are frequently described for Mixtec languages in terms of the couplet. In fact, Pike’s (1948) original coining of the term tonemic couplet was in the service of describing tonal melodies in a Mixtec language. Since tonal distributions are often analyzed in terms of the couplet, we describe one aspect of tonal phonotactics that points to the foot as an active unit in SMP Mixtec phonology. In particular, we show that underlying rising tones may only surface foot-internally.5 That is, rising tones may be found on either mora of the bimoraic foot, but not on material that precedes or follows the foot.

Many words in SMP Mixtec contain monomoraic rising tones. These are most common on the final mora of the word (38), but there are many bimoraic words with an initial rising tone (39). For readability, we include the tonal melodies in parentheses next to the IPA transcriptions in this section.

    1. (38)
    1. a.
    1. Ká’nǐ
    2. [káˀnĩ̌] (H-LH)
    1. ‘Fever/fiebre’
    1.  
    1. b.
    1. Sa’mǎ
    2. [saˀmã̌] (M-LH)
    1. ‘Embroidered cloth/servilleta’
    1.  
    1. c.
    1. Sòkǔn
    2. [sòʰkũ̌] (L-LH)
    1. ‘Neck/cuello’
    1. (39)
    1. a.
    1. Xǐyò
    2. [ʃǐjò] (LH-L)
    1. ‘Dress/vestido’
    1.  
    1. b.
    1. Sǎnǐ
    2. [sǎnĩ̌] (LH-LH)
    1. ‘Corn cob/olote’
    1.  
    1. c.
    1. Mǎ’nà
    2. [mã̌ˀnã̀] (LH-L)
    1. ‘Sleepless/desvelado’

Rising tones are phonologically distinct from low (40), mid (41), and high (42) tones, with which they contrast.

    1. (40)
    1. a.
    1. Ñu’ù
    2. [ɲũˀũ̀] (M-L)
    1. ‘Light/luz’
    1.  
    1. b.
    1. Ñu’ǔ
    2. [ɲũˀũ̌] (M-LH)
    1. ‘Ground/tierra’
    1. (41)
    1. a.
    1. Xiyò
    2. [ʃijò] (M-L)
    1. ‘Side/lado’
    1.  
    1. b.
    1. Xǐyò
    2. [ʃǐjò] (LH-L)
    1. ‘Dress/vestido’
    1. (42)
    1. a.
    1. Ntsí’i
    2. [ⁿ͡tsíˀi] (H-M)
    1. ‘Blue/azul’
    1.  
    1. b.
    1. Ntsǐ’i
    2. [ⁿ͡tsǐˀi] (LH-M)
    1. ‘Muscular/musculoso’

Despite their ubiquity, rising tones have a restricted distribution in SMP Mixtec. Specifically, underlying rising tones are only ever found on the penultimate or final mora of a prosodic word. That is, there are no underlying rising tones on fossilized prefixes, and neither are there rising tones on pronominal enclitics.

This pattern means that there are no trimoraic or quadrimoraic roots in which a rising tone occurs on the antepenult or preantepenult. This is in spite of there being many trimoraic and larger roots in which rising tones occur within the right-aligned foot, as shown in (43)–(45).

    1. (43)
    1. Tsìkuǐì
    2. [͡tsìkʷǐì] (L-LH-L)
    1. ‘Water/agua’
    1. (44)
    1. Ixǎni
    2. [iʃǎnĩ] (M-LH-M)
    1. ‘Will dream/soñará’
    1. (45)
    1. Tsìkantsìjǐ
    2. [͡tsìkaⁿ͡tsìʰǐ] (L-M-L-LH)
    1. ‘Sun/sol’

Additionally, CV clitic pronouns that occur to the right of the root do not host rising tones or falling tones – they only host level tones. This is in spite of the fact that some clitic pronouns have independent counterparts with contour melodies. For example, (46a) shows an independent pronoun with an H-L melody, whose dependent form has only an H tone (46b) instead of an HL contour tone, though falling tones are allowed root-internally (Peters 2018; Eischens & Hedding 2025). This tonal distinction between independent and dependent pronoun pairs suggests a pressure against contour tones on dependent pronouns. This can be interpreted either as contour tones being restricted to the foot, or as contour tones being restricted to the domain of the prosodic word, which we have argued CV clitics fall outside of.

    1. (46)
    1. a.
    1. Ntó’ò
    1. [ⁿtóˀò] (H-L)
    2. 2pl.ind
    1. ‘You all/ustedes’
    1.  
    1. b.
    1. Ntó
    1. [ⁿtó] (H)
    2. 2pl.dep
    1. ‘You all/ustedes’

So, in a phrase like (47), underlying rising tones may only occur in the bimoraic foot, which is in parentheses and excludes the CV clitic pronoun.

    1. (47)
    1. Kutuntosǒ ɾà
    1. [kutu(ⁿtoʰsǒ)=ɾà] (M-M-M-LH=L)
    2. pot.try=3m
    1. ‘He will try/él intentará’

As was the case for laryngeals, the restricted distribution of rising tones is straightforwardly accounted for by a generalization in terms of a right-aligned, bimoraic foot Rt| …Ft(μμ) |: underlying rising tones only occur foot-internally, not foot-externally. Under the assumption that at least some words that are trimoraic or larger constitute a single root, the absence of any underlying rising tones on the antepenult or preantepenult in such words makes clear that the root is not an especially useful unit for describing the distribution of underlying rising tones.
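The foot-based generalization can be made concrete with a minimal sketch. It assumes a toy representation that is not part of the original analysis: a prosodic word is a list of per-mora tone labels (as in the parenthesized melodies above), CV clitics are listed separately, and the foot is simply the final two morae of the word. The function name `rising_tones_licensed` is hypothetical.

```python
def rising_tones_licensed(word_melody, clitic_melody=()):
    """Return True if every rising tone (LH) sits inside the right-aligned bimoraic foot.

    word_melody: per-mora tones of the prosodic word, e.g. ["L", "LH", "L"].
    clitic_melody: per-mora tones of CV enclitics, which fall outside the foot.
    """
    # Assumption: the foot is exactly the last two morae of the prosodic word.
    foot_start = max(0, len(word_melody) - 2)
    outside_foot = list(word_melody[:foot_start]) + list(clitic_melody)
    return all(tone != "LH" for tone in outside_foot)


# Melodies taken from examples (43)-(45) and (47) in the text.
examples = [
    (["L", "LH", "L"], []),          # (43) 'water'
    (["M", "LH", "M"], []),          # (44) 'will dream'
    (["L", "M", "L", "LH"], []),     # (45) 'sun'
    (["M", "M", "M", "LH"], ["L"]),  # (47) 'he will try'
]
for word, clitic in examples:
    print(word, clitic, rising_tones_licensed(word, clitic))
```

Under these assumptions, every attested melody passes the check, while a hypothetical melody with a rising tone on the antepenult or on a clitic would fail it.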

Additionally, it is worth pointing out that the distribution of laryngeals was potentially analyzable either as a property of the foot or of stressed morae. The distribution of rising tones cannot be accounted for by making reference to stressed morae — underlying rising tones are licensed on both morae in the foot, not just the stressed one (whether feet are taken to be trochaic or iambic in SMP Mixtec).

3.3 Summary

In this section, we have described two phonotactic patterns in SMP Mixtec that have been analyzed as properties of the couplet in other Mixtec languages. These are the distribution of laryngeals and rising tones.

Using the logic laid out in Penner (2019), we have argued that both of these phonological patterns point to the existence of a bimoraic foot aligned to the right edge of a prosodic word. That is, when considering roots that are trimoraic or larger, it is clear that the final two morae have different phonological properties from preceding material. Since all of this phonological material is contained within the same root, the generalizations in question cannot be accounted for by appealing to morphological structure. Instead, both generalizations are straightforwardly accounted for by assuming the existence of a bimoraic foot that is right-aligned to the prosodic word.

In this sense, these results align with the arguments in Penner (2019) that the bimoraic foot is representationally equivalent to the cross-Mixtec couplet. Since the couplet has been used to demarcate the distribution of laryngeals and tonal melodies in Mixtec languages (Macaulay & Salmons 1995; Daly & Hyman 2007), and since the foot can be used in the same way to account for the same generalizations, an analysis of the couplet as coextensive with a bimoraic foot is appropriate.

However, this is not the whole story: patterns of co-occurrence between oral and nasal segments have also been analyzed as a property of the Mixtec couplet, but these patterns are not compatible with a solely prosodic analysis. Instead, any account of the data must refer directly to morphological structure, at least in SMP Mixtec. We turn to these facts in the next section.

4 The morphological root

In this section, we show that some phonotactic constraints on the sequencing of nasal and non-nasal segments are exceptionless root-internally, but violated across morpheme boundaries. Importantly, morpheme boundaries and foot boundaries do not always line up in SMP Mixtec, allowing us to tease apart morphological and prosodic structure.

4.1 Representative nasal air pressure data

In this section, we illustrate patterns of vowel and consonant nasality with nasal air pressure data from two speakers of SMP Mixtec (see Gerfen 1996 and Herrera Zendejas 2014 for previous nasal airflow studies on Mixtec languages). Oral and nasal air pressure recordings were made with Glottal Enterprises (GE) PT-2E pressure transducers, mounted on an oro-nasal mask. The GE oro-nasal mask has separate oral and nasal chambers, which allows oral and nasal air pressure to be recorded more-or-less independently. The oral pressure transducer was mounted directly on the mask, and the nasal pressure transducer was mounted on a GE DRTH-1 handle connected to the mask.

The air pressure transducers were connected to a GE MS-110 pressure transducer unit with a GE BFC-2 cable. For both speakers, the air pressure recordings were made at 11,025 Hz using Audacity. The amplitudes of the nasal and oral air pressure channels were normalized to [0,1] for each speaker separately. All data processing, including normalization, was carried out with custom scripts in Praat and R (R Core Team 2013; Boersma & Weenink 2020).
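The text does not spell out how the per-speaker normalization was implemented. One simple possibility, offered here only as an illustrative sketch, is peak scaling: divide the absolute amplitude of every sample by the largest absolute amplitude found across all of that speaker's recordings. The function name `normalize_per_speaker` and the dictionary layout are assumptions, and Python stands in for the Praat and R scripts actually used.

```python
def normalize_per_speaker(recordings):
    """Scale absolute amplitudes to [0, 1] using each speaker's own peak.

    recordings: dict mapping a speaker label to a list of recordings,
    where each recording is a list of pressure samples.
    """
    normalized = {}
    for speaker, signals in recordings.items():
        # Assumption: the reference value is the speaker's overall peak amplitude.
        peak = max((abs(x) for sig in signals for x in sig), default=0.0)
        if peak == 0:
            normalized[speaker] = [[0.0 for _ in sig] for sig in signals]
        else:
            normalized[speaker] = [[abs(x) / peak for x in sig] for sig in signals]
    return normalized


# Toy data (not from the study): two speakers with different recording levels.
toy = {"A": [[-2.0, 1.0], [0.5]], "B": [[4.0, -1.0]]}
print(normalize_per_speaker(toy))
```

Scaling each speaker separately keeps differences in mask fit or recording level between speakers from being mistaken for differences in nasality.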

Our measure of vowel nasality is ‘nasalance’. Nasalance is calculated by dividing the amplitude of pressure in the nasal chamber by the sum of the amplitude of pressure in the oral and nasal chambers ($A_{\mathrm{nasal}} / (A_{\mathrm{nasal}} + A_{\mathrm{oral}})$). Nasalance thus ranges from 0 to 1, is lower in oral sounds, and is higher in nasal sounds.6

Nasalance was calculated by low-pass filtering the absolute value of the normalized oral and nasal air pressure signals from [0, 280] Hz, smoothing the filtered signals with an 11 ms averaging window centered on each measurement point, then taking the ratio $A_n / (A_n + A_o)$ at each point. Nasalance was set to N/A for any point with nasal air pressure at or below 1% of the maximum for each speaker.
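The smoothing, ratio, and masking steps just described can be sketched as follows. This is a simplified illustration in Python, not the Praat and R scripts used in the study. It assumes that the inputs are already normalized and low-pass filtered (the filtering step is omitted), that the 1% floor is applied to the smoothed nasal signal, and that the speaker's maximum nasal pressure is passed in by the caller. All function and parameter names are hypothetical.

```python
def centered_moving_average(signal, window):
    """Average over a window centered on each point; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def nasalance(oral, nasal, speaker_nasal_max,
              sample_rate_hz=11025, window_ms=11.0, floor=0.01):
    """Return A_n / (A_n + A_o) per sample, or None where nasal pressure is too low."""
    # Convert the window length to an odd number of samples so it can be centered.
    window = max(1, round(sample_rate_hz * window_ms / 1000))
    if window % 2 == 0:
        window += 1
    oral_s = centered_moving_average([abs(x) for x in oral], window)
    nasal_s = centered_moving_average([abs(x) for x in nasal], window)
    threshold = floor * speaker_nasal_max  # 1% of the speaker's maximum by default
    return [None if n <= threshold else n / (n + o)
            for o, n in zip(oral_s, nasal_s)]


# Toy signals (not from the study): equal oral and nasal pressure gives 0.5.
print(nasalance([0.5] * 5, [0.5] * 5, speaker_nasal_max=1.0,
                sample_rate_hz=1000, window_ms=3.0))
```

Masking low nasal-pressure points keeps the ratio from being computed on near-silent stretches, where small fluctuations would otherwise dominate it.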

In plots, we only show nasalance for vowels, nasal consonants, and [ʔ] and [h], which may be contextually nasalized in SMP Mixtec. A representative token of the amplitudes of individual chambers, as well as nasalance, is given in Figure 1. The first row shows the normalized amplitude of air pressure from the oral cavity, the second shows the normalized amplitude of air pressure from the nasal cavity, and the third shows low-pass filtered nasalance. Note that the increased nasal air pressure in the second syllable in Figure 1 corresponds to increased nasalance on nasal [ĩ] vs. oral [i] in the first syllable. For reasons of space, we only present low-pass filtered nasalance in subsequent figures, omitting the original oral and nasal pressure channels. In all nasalance plots, numbers below segmental transcriptions indicate duration in ms.7

Figure 1: Oral air pressure, nasal air pressure, and low-pass filtered nasalance during chìxin [͡tʃìʰʃĩ] ‘stomach/estómago’ (Speaker RDC).

We collected data from two participants, each of whom produced target words from a wordlist embedded in the carrier phrase in (48).

    1. (48)
    1. Só’o káchi é __ tù’un ntá’vi
    1. [sóˀo
    2. like.this
    1. káʰ͡tʃ=ě
    2. cont.say=1pl.incl
    1. __
    2. __
    1. tũ̀ˀũ
    2. word
    1. ⁿtáˀβi]
    2. Indigenous
    1. ‘This is how we say ‘__’ in Mixtec/Así decimos ‘__’ en mixteco’

Speaker NGC produced 105 items, with 4 repetitions each (420 total productions). Speaker RDC produced a 70-item subset of the wordlist produced by NGC, with 2 repetitions each (140 total productions). Summary plots like Figure 2 also include tokens from a separate dataset in which speaker RDC produced 84 morphologically-simplex items, with 3 repetitions each (252 total productions). This extra dataset was included to increase the number of tokens used to calculate baseline nasalance levels for oral and nasal vowels. Full wordlists are included in the supplementary materials, along with nasalance plots for each recorded token.

Nasalance levels may vary by vowel height. In particular, nasal [ã] has markedly lower nasalance in our data than nasal [ĩ] or [ũ]. This is clear in Figure 2, which shows a summary of nasalance across vowel qualities for contrastively oral /V/ and nasal /Ṽ/. The weaker nasalance for nasal /ã/ compared to /ĩ ũ/ may be a result of a wider aperture in the oral cavity for low vowels, leading to increased oral airflow, relative to nasal airflow (e.g. Young et al. 2001).

Figure 2: Mean nasalance across vowel qualities for contrastively oral /V/ (left panel) and nasal /Ṽ/ (right panel). Measurements taken from root-internal vowels which were not adjacent to any nasal consonant. Measurements reflect middle 40–75% of each vowel (steps 7–13 of 17 total). Circles indicate means, bars ±1 standard deviation. Dashed red lines and surrounding bands indicate mean nasalance for contrastively oral /V/ and nasal /Ṽ/ across all vowel qualities, and ±1 standard deviation. Numbers at bottom of plot indicate observations per condition.

4.2 The phonotactics of nasality

A contrast between oral and nasal vowels is a consistent characteristic of Mixtec languages (Rueda Chaves 2021). As is common in languages with such a contrast, oral and nasal vowels are not contrastive in every phonological context. For example, vowels following nasal consonants are usually obligatorily nasal (Pike & Cowan 1967; Zylstra 1980; Marlett 1992; Gerfen 1996; León Vásquez 2017), as shown for SMP Mixtec in (49).

    1. (49)
    1. Ñuù
    2. [ɲũũ̀]
    1. ‘Town/pueblo’

Phonotactic restrictions on nasality have often been cast in terms of the couplet in the Mixtec literature. There are many examples of this, and we list a few here: Pike & Cowan (1967: 5) write that “If the first of two vowels is nasal in a monomorphemic couplet, the second vowel is usually nasal”. Hunter & Pike (1969: 30) say, “Allophonic nasalization of vowels is best described in relation to the couplet”. North & Shields (1977: 28) state that “Nasal vowels not preceding and following /m n ñ/ are restricted in their distribution in the couplet”. Zylstra (1980: 21) notes that “Nasalization of vowels extends through couplets of the form CVV and CVʔV where the vowels are identical”. Gerfen (1999: 260) says of nasalization that “a rule will associate [nasal] to the rightmost vowel of a couplet, while another [rule] will subsequently spread [nasal] to the left”. Paster & Beam de Azcona (2004: 67) write that, in the Yucunany dialect of Mixtepec Mixtec, “In couplets, vowel nasalization usually occurs in both syllables or neither”. Rueda Chaves (2019: xv) treats nasalization as a characteristic feature of couplet boundaries (original: “se abordan la nasalización y la glotalización como rasgos característicos de los lindes del couplet”). For Penner (2019: 265), “The domain of nasal phonotactics in [Ixtayutla Mixtec] is the couplet”. Finally, after defining the couplet as a bimoraic minimal word, Becerra Roldán (2023: 75) writes, “Nasal morphemes have a floating [+nasal] feature that associates to the right boundary of the minimal word and spreads leftward to sonorants” (original: “Los [morfemas nasales] poseen un rasgo flotante [+nasal] que se asocia al linde derecho de la palabra mínima y se propaga a la izquierda a segmentos resonantes”).

As these quotes show, patterns of nasalization are intimately tied to the notion of the couplet. Because of this, nasalization is another appropriate domain for testing the two main hypotheses about the nature of the couplet.

We argue in this section that restrictions on nasal and oral segments are, contrary to the results in §3, best analyzed in terms of the morphological root, and not the foot. As we show, restrictions on nasality hold exceptionlessly within roots, but not necessarily within feet, pointing to the root as a correspondent of the Mixtec couplet.

4.3 Restrictions in mono-morphemic words

This section describes the phonotactic restrictions that hold on sequences of nasal and non-nasal segments in mono-morphemic roots, focusing on three exceptionless restrictions, which are common across Mixtec languages (Marlett 1992; Rueda Chaves 2021).8

The first restriction is that vowels in mono-morphemic roots are obligatorily oral when following approximants like [β], [l], and [j], as shown in (50)–(51) and Figure 3, where nasalance values are within the range for oral vowels established in Figure 2.

    1. (50)
    1. Yivà
    2. [jiβà]
    1. ‘Plant/hierba’
    1. (51)
    1. Tsiâyì
    2. [͡tsʲâjì]
    1. ‘Chair/silla’

Figure 3: [jiβà] (50) (Speaker RDC).

Additionally, vowels in roots are obligatorily oral when following prenasalized consonants, as shown in (52)–(55), and illustrated for (52) in Figure 4.

    1. (52)
    1. Kòntò
    2. [kòⁿtò]
    1. ‘Knee/rodilla’
    1. (53)
    1. Ntsìì
    2. [ⁿ͡tsìì]
    1. ‘Dead/muerto’
    1. (54)
    1. Chi’nki
    2. [͡tʃiˀⁿki]
    1. ‘Acorn/bellota’
    1. (55)
    1. Chínchi
    2. [͡tʃíⁿ͡tʃi]
    1. ‘Cricket/grillo’

Figure 4: [kòⁿtò] (52) (Speaker NGC).

The second restriction, noted at the beginning of this section, is that vowels following nasal consonants are obligatorily nasal in roots.9 This is exemplified in (56)–(57) and Figure 5 (recall that nasal [ã] has lower nasalance than other nasal vowels; Figure 2).

    1. (56)
    1. Ñani
    2. [ɲãnĩ]
    1. ‘Brother (of a man)/hermano (de hombre)’
    1. (57)
    1. Ñǔù
    2. [ɲũ̌ũ̀]
    1. ‘Night/noche’

Figure 5: [ɲãnĩ] (56) (Speaker NGC).

The third and final restriction discussed here concerns adjacent vocalic morae. Specifically, sequences of adjacent vowels must match in nasality or orality within roots (58). The same is also true of vowels separated by [ʔ] or [h], as exemplified in (59)–(60). This is illustrated using air pressure data for (58) in Figure 6, and for (60) in Figures 7, 8. Note especially the nasalization of [h] in Figure 8, which suggests the spread of nasality between the two vowels.10

    1. (58)
    1. a.
    1. Kuáà
    2. [kʷáà]
    1. ‘Blind/ciego’
    1.  
    1. b.
    1. Kuáàn
    2. [kʷã́ã̀]
    1. ‘Yellow/amarillo’
    1. (59)
    1. a.
    1. Kuá’à
    2. [kʷáˀà]
    1. ‘Red/rojo’
    1.  
    1. b.
    1. Kuá’àn
    2. [kʷã́ˀã̀]
    1. ‘Go!/vete!’
    1. (60)
    1. a.
    1. Tsìkuàjǎ
    2. [͡tsi̥kʷàhǎ]
    1. ‘Orange (n)/naranja’
    1.  
    1. b.
    1. Kuájǎn
    2. [kʷã́ʰã̌]
    1. ‘Unmarried/soltero’

Figure 6: [kʷáà] (58a) vs. [kʷã́ã̀] (58b) (Speaker NGC).

Figure 7: [͡tsi̥kʷàhǎ] (60a) (Speaker NGC).

Figure 8: [kʷã́ʰã̌] (60b) (Speaker NGC).

These three generalizations, which are listed together in Table 7, hold exceptionlessly within monomorphemic forms.

Table 7: Nasal-oral co-occurrence restrictions.

Restriction 1: Vowels are oral after ⁿCs and approximants
Restriction 2: Vowels are nasal after nasals
Restriction 3: Adjacent vowels match in nasality
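The three restrictions in Table 7 can be stated as a small checker over segment sequences. The sketch below is purely illustrative: the `Segment` encoding, the segment classes, and the function name `violations` are assumptions made for the example, and the treatment of [ʔ] and [h] as transparent to Restriction 3 follows the description of (59)–(60).

```python
from typing import NamedTuple


class Segment(NamedTuple):
    symbol: str
    kind: str            # "vowel", "nasal", "prenasalized", "approximant", "laryngeal", "other"
    nasal: bool = False  # only meaningful for vowels


def violations(segments):
    """Return the numbers of the Table 7 restrictions violated by a segment sequence."""
    found = []
    prev = None        # immediately preceding segment
    prev_vowel = None  # last vowel, if only laryngeals have intervened since
    for seg in segments:
        if seg.kind == "vowel":
            if prev is not None:
                if prev.kind in ("prenasalized", "approximant") and seg.nasal:
                    found.append(1)  # nasal vowel after a prenasalized C or approximant
                if prev.kind == "nasal" and not seg.nasal:
                    found.append(2)  # oral vowel after a nasal consonant
            if prev_vowel is not None and prev_vowel.nasal != seg.nasal:
                found.append(3)      # adjacent vowels mismatch in nasality
            prev_vowel = seg
        elif seg.kind != "laryngeal":
            prev_vowel = None        # a true consonant breaks vowel adjacency
        prev = seg
    return found


# (50) [jiβà] 'plant': two approximants, each followed by an oral vowel.
jiva = [Segment("j", "approximant"), Segment("i", "vowel"),
        Segment("β", "approximant"), Segment("a", "vowel")]
print(violations(jiva))
```

Under this encoding, the monomorphemic forms cited above yield no violations, which is what "exceptionless within roots" amounts to in this toy setting.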

Having defined these restrictions, we next turn to a construction in which the morphological and prosodic views of the couplet make differing predictions.

4.4 Vocalic enclitics

The morphological and prosodic views of the Mixtec couplet make different predictions regarding the behavior of a particular construction in SMP Mixtec in which there is a non-isomorphism between the morphological root and the bimoraic foot. The key forms involve bimoraic roots followed by the enclitic pronouns described in §2, which completely or partially overwrite the preceding vowel (61).

    1. (61)
    1. Nta’ǎ ì
    1. /ⁿtaˀǎ =î/ → [ⁿtaˀ=î]
    2. hand=1sg
    1. ‘My hand/mi mano’

When vocalic enclitics like [=ì] overwrite the final vowel of the root to which they attach, they take over the timing slot of the root-final vowel and are therefore foot-internal, as we will argue shortly. In these cases, the foot and morpheme boundaries are non-isomorphic, as visualized in the following pair of examples. (62) shows the bimoraic root [ⁿtaˀǎ] (‘hand’), in which the root and foot boundaries are aligned, and (63) shows the morphologically-complex foot [ⁿtaˀ=î] (‘my hand’), in which the morpheme and foot boundaries are misaligned.11

    1. (62)
    1. Nta’ǎ
    2. Ft(Rt| ⁿtaˀǎ |)
    1. ‘Hand/mano’
    1. (63)
    1. Nta’ǎ ì
    1. Ft(
    2.  
    1. Rt|
    2.  
    1. ⁿtaˀ
    2. hand
    1. |
    2.  
    1. =î
    2. =1sg
    1. )
    2.  
    1. ‘My hand/mi mano’

Given this non-isomorphism between root boundaries and foot boundaries, morphological and prosodic analyses of the Mixtec couplet make differing predictions. Since (62) involves a single root and (63) involves a morphologically-complex structure, the ‘couplet = root’ approach allows for the two words to behave differently with respect to patterns that take the couplet as their domain. However, since (62) and (63) each constitute a single foot, the ‘couplet = foot’ approach predicts no difference between them with respect to characteristics of the couplet.

As we show in detail below, restrictions on the co-occurrence of oral and nasal segments within roots in SMP Mixtec may be violated in morphologically complex structures. These violations occur across morpheme boundaries that are foot-internal (63). This means that morphological structure must be used to define the domains in which nasal phonotactics hold—prosodic structure alone is not sufficient.

4.4.1 Vocalic enclitics are foot-internal

We begin with some arguments that vocalic enclitics are, in fact, foot-internal. First, when vocalic enclitic pronouns overwrite root-final vowels, the resulting vowels are monomoraic and not bimoraic. This can be diagnosed through tonal patterning. Root-final vowels with a low-to-high rising tone can be overwritten by an enclitic that has a low tone (§2). Instead of concatenating these three tones into a low-high-low contour, the sequence simplifies to a high-low falling tone, as shown in (64).

    1. (64)
    1. a.
    1. Sàtǎ
    2. [sàʰtǎ]
    1. ‘Back/espalda’
    1.  
    1. b.
    1. Sàtǎ ì
    1. [sàʰt=î] (cf. tri-tonal *[sàʰt=i᷈])
    2. back=1sg
    1. ‘My back/mi espalda’

This pattern suggests that the surface form in (64b) is bimoraic, not trimoraic, since a single mora may host at most a bi-tonal contour in SMP Mixtec (§2). Any tri-tonal or larger contours are spread across two morae (Peters 2018; Eischens & Hedding 2025). The reduction of an LHL sequence to HL in (64b) indicates that the second vowel is mono-moraic and therefore unable to host a tri-tonal contour. As a result, the entire structure in (64b) is bimoraic, meeting the criteria for constituting a single foot.

Another argument that enclitic vowels are foot-internal is that they trigger alternations on preceding consonants in the root. Palatalized consonants may not appear immediately before the high front vowel [i] (Stremel 2022; Eischens & Hedding 2025). When the 1sg enclitic [=ì] attaches to roots ending in a CV syllable with a palatalized onset consonant, it triggers depalatalization (65). When the consonant is separated from the [i] by another vowel, the alternation does not apply (66).

    1. (65)
    1. a.
    1. Xá’ntsia
    1. [ʃáˀⁿ͡tsʲa]
    2. cont.cut
    1. ‘Is cutting/está cortando’
    1.  
    1. b.
    1. Xá’ntsia ì
    1. [ʃáˀⁿ͡ts=ì]
    2. cont.cut=1sg
    1. ‘I am cutting/estoy cortando’
    1. (66)
    1. a.
    1. Tsiàà
    2. [͡tsʲàà]
    1. ‘Clothes/ropa’
    1.  
    1. b.
    1. Tsiàà ì
    1. [͡tsʲà=ì]
    2. clothes=1sg
    1. ‘My clothes/mi ropa’

We take this as evidence that the enclitic vowel is the nucleus of the second syllable, as in (67).

    1. (67)
    1. Xá’ntsia ì
    1. [.ʃáˀ.ⁿ͡ts=ì.]
    2. cont.cut=1sg
    1. ‘I am cutting/estoy cortando’

To propose that the enclitic [=ì] is foot-external in (67) requires the onset and nucleus of a syllable to be separated by a foot boundary, violating the requirement that syllable and foot boundaries align (e.g. McCarthy & Prince 2004).

A final argument in favor of the foot-internal analysis of vocalic enclitics comes from the non-contrastive [h] discussed in §3.1.2. In particular, vocalic enclitics do not trigger deletion of non-contrastive [h], as seen earlier in (64). Given that non-contrastive [h] only precedes foot-internal voiceless consonants, the medial consonant in examples like [sàʰt=î] (64) must be foot-internal. For the vocalic enclitic to be foot-external while the medial consonant is foot-internal would require implausible combinations of foot and syllable structure, visualized below. (68a) violates the principle of syllable integrity, because it requires a foot boundary to fall within a syllable (as just discussed for (67)). (68b) requires the medial consonant to be exceptionally syllabified as a coda, even though codas are otherwise banned without exception in SMP Mixtec. Finally, (68c) requires a degenerate syllable with no vowel at all, another construction for which there is no evidence in SMP Mixtec.

    1. (68)
    1. a.
    1. (.CV.C)=V.
    1.  
    1. b.
    1. (.CVC.)=V.
    1.  
    1. c.
    1. (.CV.C.)=V.

The simplest solution to these issues is to assume that vocalic enclitics are instead foot-internal [(.CV.C=V.)], with a prosodic structure that is entirely unremarkable. We thus reject the analysis of vocalic enclitic pronouns as foot-external (and, by extension, prosodic-word-external).
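The comparison among the candidate parses in (68) and the foot-internal alternative can be sketched mechanically. The snippet below is a toy evaluator and not part of the original analysis: it reads the bracket notation used above ("." for syllable boundaries, parentheses for foot boundaries, "=" for the clitic boundary) and reports which of the three problems discussed in the text a parse incurs. The function name `evaluate_parse` and the problem labels are assumptions.

```python
def evaluate_parse(parse):
    """List the structural problems of a parse such as "(.CV.C)=V."."""
    problems = []
    syllables = []
    current = ""  # segments of the syllable currently being read
    for ch in parse:
        if ch in "()":
            # A foot edge met while a syllable is still open splits that syllable.
            if current:
                problems.append("foot boundary inside a syllable")
        elif ch == ".":
            if current:
                syllables.append(current)
                current = ""
        elif ch in "CV":
            current += ch
        # "=" marks the clitic boundary and carries no prosodic information here.
    if current:
        syllables.append(current)
    for syl in syllables:
        if "V" not in syl:
            problems.append("syllable without a vowel")
        elif syl.endswith("C"):
            problems.append("coda consonant")
    return problems


for candidate in ["(.CV.C)=V.", "(.CVC.)=V.", "(.CV.C.)=V.", "(.CV.C=V.)"]:
    print(candidate, evaluate_parse(candidate))
```

Only the foot-internal parse comes out with no problems, mirroring the argument in the text.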

4.5 Violations of nasal phonotactics across morpheme boundaries

We turn now to cases in which phonotactic restrictions on nasal and oral segments—repeated in Table 8—are violated across the boundary between a root and vocalic enclitic. These patterns point to the root, not the foot, as the domain within which constraints on nasality are defined.

Table 8: Nasal-oral co-occurrence restrictions.

Restriction 1: Vowels are oral after ⁿCs and approximants
Restriction 2: Vowels are nasal after nasals
Restriction 3: Adjacent vowels match in nasality

Recall that vocalic enclitic pronouns in SMP Mixtec behave differently with respect to nasality (Table 9). One is invariably nasal, another is invariably oral, and the others alternate in their nasality. In this section, we focus on enclitics with invariable nasality/orality; the following section considers alternations in enclitic nasality.

Table 9: Vocalic pronominal enclitics in SMP Mixtec.

Pronoun Nasality
1sg [=ì] Takes on nasality/orality of final vowel of root
1pl.incl [=é] Always oral
2sg [=ṹ] Always nasal
3sg.n [=à] Alternates somewhat predictably in nasality
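Table 9 can be read as a lookup from enclitic and stem-final vowel nasality to predicted enclitic nasality. The sketch below is illustrative only: the function name `enclitic_prediction` is hypothetical, and the treatment of 3sg.n [=à] as following the stem-final vowel by default, flagged as non-categorical, is a simplification of the variable behavior discussed in §4.6.

```python
def enclitic_prediction(enclitic, stem_final_vowel_nasal):
    """Return (predicted_nasal, is_categorical) for a vocalic enclitic.

    enclitic: one of "1sg", "1pl.incl", "2sg", "3sg.n".
    stem_final_vowel_nasal: True if the overwritten root-final vowel is nasal.
    """
    if enclitic == "2sg":
        return (True, True)    # [=ṹ] is always nasal
    if enclitic == "1pl.incl":
        return (False, True)   # [=é] is always oral
    if enclitic == "1sg":
        return (stem_final_vowel_nasal, True)   # [=ì] copies the stem-final vowel
    if enclitic == "3sg.n":
        # Assumption: default to the stem-final vowel, but mark as variable.
        return (stem_final_vowel_nasal, False)
    raise ValueError(f"unknown enclitic: {enclitic}")


# Oral-final root plus 2sg gives a nasal enclitic; nasal-final root plus 1pl.incl gives an oral one.
print(enclitic_prediction("2sg", False))
print(enclitic_prediction("1pl.incl", True))
```

The two invariant enclitics are the ones that can create sequences at odds with the root-internal restrictions, since their nasality ignores the stem.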

We illustrate patterns of enclitic nasality with nasalance traces from individual examples/tokens, as well as with summary nasalance plots. In summary nasalance plots, we include the first three enclitics in Table 9 (1sg [=ì], 1pl.incl [=é], and 2sg [=ṹ]), but not 3sg.n [=à]. The reason for excluding 3sg.n [=à] is that its nasal/oral alternation is more variable than that of the other three enclitics, making it less straightforward to interpret in summary plots; we discuss this variability in §4.6.

The pronouns that are invariably oral or nasal often give rise to violations of the phonotactic restrictions in Table 8. For example, restriction 1 states that prenasalized consonants and approximants are always followed by oral vowels. This can be violated in constructions involving the 2sg enclitic [=ṹ]. (69)–(70) show the nasal enclitic [=ṹ] following approximants. The air pressure patterns in Figure 9 show high and rising nasalance on the word-final vowel.

    1. (69)
    1. a.
    1. Yivà
    2. [jiβà]
    1. ‘Plant/hierba’
    1.  
    1. b.
    1. Yivà ún
    1. [jiβ=ũ̌]
    2. plant=2sg
    1. ‘Your plant/tu hierba’
    1. (70)
    1. a.
    1. Tsiâyì
    2. [͡tsʲâjì]
    1. ‘Chair/silla’
    1.  
    1. b.
    1. Tsiâyì ún
    1. [͡tsʲâj=ũ̌]
    2. chair=2sg
    1. ‘Your chair/tu silla’

Figure 9: [jiβ=ũ̌] (69b) (Speaker NGC) (compare with Figure 3 [jiβà] (69a)).

This pattern is consistent within and between the two speakers recorded, as shown in Figure 10. Specifically, when following approximants, nasalance for the 2sg [=ṹ] is clearly higher than for the other enclitics and for unmodified root-final vowels.

Figure 10: Mean nasalance for enclitic vowels and root-final vowels after approximants /β j l/. Apparent violation of nasal phonotactics highlighted in grey. See Figure 2 for additional details.

In the same way, the 2sg enclitic [=ṹ] can follow prenasalized consonants, as shown in (71)–(72) and Figure 11.12

    1. (71)
    1. Kòntò
    2. [kòⁿtò]
    1. ‘Knee/rodilla’
    1. (72)
    1. Kòntò ún
    1. [kòⁿt=ũ̌]
    2. knee=2sg
    1. ‘Your knee/tu rodilla’

Figure 11: [kòⁿt=ũ̌] (72) (Speaker NGC) (compare with Figure 4 [kòⁿtò] (71)).

The 2sg [=ṹ] is consistently nasal after pre-nasalized stops and affricates across the items tested, as shown in Figure 12. Nasalance for other enclitics and unmodified, root-final vowels is low in this environment. It is clear that restriction 1 can be violated in morphologically-complex constructions involving inherently nasal vocalic enclitics.

Figure 12: Mean nasalance for enclitic vowels and root-final vowels after prenasalized stops and affricates /ⁿt ⁿ͡ts ⁿ͡tsʲ ⁿ͡tʃ/. Apparent violation of nasal phonotactics highlighted in grey. See Figure 2 for additional details.

Restriction 2 can also be violated in morphologically-complex environments. This restriction states that vowels following nasal consonants are obligatorily nasal. However, this generalization can be violated in constructions containing the 1pl.incl enclitic [=é].13 This is exemplified in (73)–(74). The air pressure example for speaker NGC in Figure 13 shows final [=é] with lower nasalance than for preceding nasal [ã] — the nasal vowel which generally has the lowest nasalance overall in our data (Figure 2).14

While there is clearly some coarticulatory nasality on [=é], due to the influence of the preceding nasal [ɲ], nasalance declines significantly over the course of the vowel. This pattern of descending nasalance is not characteristic of phonologically nasal vowels in our data (as is clear from plots provided here). We conclude that [=é] remains phonologically oral in this environment.

    1. (73)
    1. a.
    1. Ñani
    2. [ɲãnĩ]
    1. ‘Brother/hermano’
    1.  
    1. b.
    1. Ñani é
    1. [ɲãɲ=ě]
    2. brother=1pl.incl
    1. ‘Our brother/nuestro hermano’
    1. (74)
    1. a.
    1. Ìjmǎ
    2. [ìʰmã̌]
    1. ‘Wax/cera’
    1.  
    1. b.
    1. Ìjmǎ é
    1. [ìʰm=ě]
    2. wax=1pl.incl
    1. ‘Our wax/nuestra cera’

Figure 13: [ɲãɲ=ě] (73b) (Speaker NGC) (compare with Figure 5 [ɲãnĩ] (73a)).

The low nasalance of 1pl.incl [=é] after a nasal consonant is apparent in the summary plot in Figure 14. Though nasalance for [=é] is somewhat higher than the oral baseline, this is likely due to coarticulation with preceding nasal consonants; its nasalance is far lower than that of the other enclitics, and root-final vowels, which are all clearly nasalized.

Figure 14: Mean nasalance for enclitic vowels and root-final vowels after nasal /m n ɲ/. Apparent violation of nasal phonotactics highlighted in grey. See Figure 2 for additional details.

Restriction 3, which requires that adjacent vowels match in nasality, can also be violated in morphologically-complex constructions. In fact, it can be violated in both directions. First, if an inherently nasal enclitic overwrites the second of two adjacent oral vowels, the result is an oral-nasal sequence (75)–(76). A nasalance plot for (75b) is given in Figure 15.

    1. (75)
    1. a.
    1. Xàà
    2. [ʃàà]
    1. ‘Chin/barbilla’
    1.  
    1. b.
    1. Xàà ún
    1. [ʃà=ũ̌]
    2. chin=2sg
    1. ‘Your chin/tu barbilla’
    1. (76)
    1. a.
    1. Tsìkuǐì
    2. [͡tsì̥kʷǐì]
    1. ‘Water/agua’
    1.  
    1. b.
    1. Tsìkuǐì ún
    1. [͡tsì̥kʷǐ=ũ̌]
    2. water=2sg
    1. ‘Your water/tu agua’

Figure 15: [ʃà=ũ̌] (75b) (Speaker RDC).

Second, if the inherently oral 1pl.incl [=é] overwrites the second of two nasal vowels, the result is a nasal-oral sequence (77). This is illustrated in Figure 16.

    1. (77)
    1. a.
    1. Tsiáàn
    2. [͡tsʲã́ã̀]
    1. ‘Forehead/frente’
    1.  
    1. b.
    1. Tsiáàn é
    1. [͡tsʲã́=ě]
    2. forehead=1pl.incl
    1. ‘Our foreheads/nuestras frentes’

Figure 16: [͡tsʲã́=ě] (77b) (Speaker NGC).

The nasality of the 2sg [=ṹ] in hiatus with a preceding oral vowel, and the orality of the 1pl.incl [=é] in hiatus with a preceding nasal vowel, are shown in a summary plot in Figure 17.

Figure 17: Mean nasalance for enclitic vowels immediately following stem-final vowels, after vowel overwriting in /CV1V1 = V2/ → [CV1 = V2]. Apparent violation of nasal phonotactics highlighted in grey. See Figure 2 for additional details.

We have seen that restrictions on the co-occurrence of nasal and oral segments hold robustly root-internally, but are violated across morpheme boundaries. Importantly, these restrictions do not necessarily hold foot-internally, as shown schematically in (78)–(80). If vocalic enclitic pronouns are foot-internal, as we claim, segment sequences which violate nasal-oral co-occurrence restrictions are contained within the same foot.

    1. (78)
    1. Nasal vowel after prenasalized stop
    1. Kòntò ún
    2. Ft(Rt| kòⁿt|=ũ̌)
    1. ‘Your knee/tu rodilla’
    1. (79)
    1. Oral vowel after nasal consonant
    1. Nánà é
    2. Ft(Rt| nã́n|=ě)
    1. ‘Our mother/nuestra madre’
    1. (80)
    1. Mismatching adjacent vowels
    1. Xàà ún
    2. Ft(Rt| ʃà|=ũ̌)
    1. ‘Your chin/tu barbilla’

Enclitic pronouns are foot-internal, but external to the domain of nasal phonotactics (being outside of the root, morphologically). It follows that the foot does not define the domain of phonotactic restrictions on nasality.
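The contrast between the two candidate domains can be illustrated with a compact sketch. It assumes a coarse, hypothetical string encoding of segment classes (P = prenasalized consonant, A = approximant, N = nasal consonant, C = other consonant, H = [ʔ] or [h], o = oral vowel, n = nasal vowel) and checks the three restrictions once over the root alone and once over the whole foot, that is, the truncated root plus the enclitic vowel. The names `violated` and `compare_domains` are not from the source.

```python
import re

# Each restriction is stated as a pattern that matches a VIOLATING sequence.
RESTRICTIONS = {
    1: re.compile(r"[PA]n"),       # nasal vowel after prenasalized C or approximant
    2: re.compile(r"No"),          # oral vowel after nasal consonant
    3: re.compile(r"oH*n|nH*o"),   # mismatched vowels, adjacent or across [ʔ]/[h]
}


def violated(domain):
    """Return the restrictions violated somewhere inside the given domain."""
    return sorted(k for k, pattern in RESTRICTIONS.items() if pattern.search(domain))


def compare_domains(root, enclitic):
    """Check the restrictions over the root alone and over root plus enclitic (the foot)."""
    return {"root": violated(root), "foot": violated(root + enclitic)}


# Surface forms from (78)-(80), encoded as (truncated root, enclitic vowel).
forms = {
    "kòⁿt=ũ̌ (78)": ("CoP", "n"),
    "nã́n=ě (79)": ("NnN", "o"),
    "ʃà=ũ̌ (80)": ("Co", "n"),
}
for label, (root, enclitic) in forms.items():
    print(label, compare_domains(root, enclitic))
```

In each case the root domain is clean while the foot domain contains a violation, which is the pattern expected if the restrictions are stated over the root and not over the foot.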

These patterns provide evidence that the couplet in SMP Mixtec is not exclusively coextensive with the foot, contrary to the patterns discussed in §3. Instead, nasal phonotactics—argued to hold over the couplet in many Mixtec languages (§4.2)—coincide with the morphological root in SMP Mixtec.

The data discussed in this section have involved enclitic pronouns whose nasality/orality is invariant. This leaves out two enclitics, namely 1sg [=ì] and 3sg.n [=à], which alternate in nasality. In the following section, we briefly describe the behavior of these enclitics, arguing that they are fully consistent with our claim that the root, and not the foot, is the domain of nasal co-occurrence restrictions in SMP Mixtec.

4.6 Alternations across morpheme boundaries

As noted in §2, the 1sg and 3sg.n enclitics alternate in nasality depending on the stem they attach to. In this section, we first discuss their shared behavior, and then we highlight cases in which the 3sg.n enclitic [=à] behaves variably. In most cases, however, the nasality of both of these enclitics can be predicted from the nasality of the stem-final vowel that they overwrite.

The nasality of the 1sg enclitic [=ì] is completely predictable based on the nasality of the stem-final vowel, and the same is usually true of the 3sg.n enclitic [=à]. Specifically, they surface as oral when overwriting an oral vowel (81), and as nasal when overwriting a nasal vowel (82), illustrated in Figures 18, 19.

    1. (81)
    1. a.
    1. Kùku
    1. [kùʰku]
    2. pot.sew
    1. ‘Sew!/borda!’
    1.  
    1. b.
    1. Kúku ì
    1. [kúʰk=ì]
    2. cont.sew=1sg
    1. ‘I sew/yo bordo’
    1.  
    1. c.
    1. Kúku à
    1. [kúʰkʲ=à]
    2. cont.sew=3sg.n
    1. ‘They sew/borda’
    1. (82)
    1. a.
    1. Sòkǔn
    2. [sòʰkũ̌]
    1. ‘Neck/cuello’
    1.  
    1. b.
    1. Sòkǔn ì
    1. [sòʰk=ĩ̂]
    2. neck=1sg
    1. ‘My neck/mi cuello’
    1.  
    1. c.
    1. Sòkǔn à
    1. [sòʰk=ã̂]
    2. neck=3sg.n
    1. ‘Their neck/su cuello’

Figure 18: [kúʰk=ì] (81b) vs. [sòʰk=ĩ̂] (82b) (Speaker NGC).

Figure 19: [kúʰkʲ=à] (81c) vs. [sòʰk=ã̂] (82c) (Speaker NGC).

This generalization is illustrated for the 1sg enclitic [=ì] in the summary plot in Figure 20, which shows the nasalance of vocalic enclitic pronouns when they overwrite contrastive oral and nasal root-final vowels. Alternating 1sg [=ì] has low nasalance when overwriting an oral vowel, and high nasalance when overwriting a nasal vowel. Note also the invariant nasalance of 1pl.incl [=é] and 2sg [=ṹ]; again, 3sg.n [=à] is not included in the plot due to the patterns of variability we discuss shortly.

Figure 20: Mean nasalance for stem-final and enclitic vowels after voiceless obstruents, where nasality is contrastive. Alternations based on nasality of the stem-final vowel highlighted in grey. See Figure 2 for additional details.

The same patterns hold when the nasality of the final vowel of the root is predictable: 1sg [=ì] is oral after approximants and prenasalized consonants, and the same is usually true of 3sg.n [=à] (83)–(84). Though representative nasalance traces are excluded for reasons of space, the low nasalance of 1sg [=ì] in these contexts can be seen in the earlier summary plots in Figures 10 and 12.

    1. (83)
    1. a.
    1. Tsiâyì ì
    1. [͡tsʲâj=ì]
    2. chair=1sg
    1. ‘My chair/mi silla’
    1.  
    1. b.
    1. Tsiâyì à
    1. [͡tsʲâj=à]
    2. chair=3sg.n
    1. ‘Their chair/su silla’
    1. (84)
    1. a.
    1. Kòntò ì
    1. [kòⁿt=ì]
    2. knee=1sg
    1. ‘My knee/mi rodilla’
    1.  
    1. b.
    1. Kòntò à
    1. [kòⁿt=à]
    2. knee=3sg.n
    1. ‘Their knee/su rodilla’

When following a nasal consonant, both enclitics always surface as nasal. This is exemplified in (85) and illustrated for 1sg [=ì] in the summary plot shown earlier in Figure 14.

    1. (85)
    1. a.
    1. Koñù ì
    1. [koɲ=ĩ̀]
    2. meat=1sg
    1. ‘My meat/mi carne’
    1.  
    1. b.
    1. Koñù à
    1. [koɲ=ã̀]
    2. meat=3sg.n
    1. ‘Their meat/su carne’

Finally, when overwriting the second of two consecutive vowels, 1sg [=ì] takes on the nasality of the preceding vowel (86a), (87a), as seen earlier in Figure 17. The 3sg.n enclitic usually does, as well (86b), (87b).

    1. (86)
    1. a.
    1. Tsìkuǐì ì
    1. [͡tsì̥kʷuǐ=ì]
    2. water=1sg
    1. ‘My water/mi agua’
    1.  
    1. b.
    1. Tsìkuǐì à
    1. [͡tsì̥kʷǐ=à]
    2. water=3sg.n
    1. ‘Its water/su agua’
    1. (87)
    1. a.
    1. Káchúun ì
    1. [ká͡tʃṹ=ĩ̀]
    2. work.cont=1sg
    1. ‘I work/trabajo’
    1.  
    1. b.
    1. Káchúun à
    1. [ká͡tʃṹ=ã̀]
    2. work.cont=3sg.n
    1. ‘They work/trabaja’

4.6.1 Variability in 3sg.n [=à]

The 3sg.n [=à] enclitic displays some variability that merits discussion. While it often preserves the nasality or orality of the stem-final vowel, it sometimes surfaces as nasal when overwriting an oral vowel, as seen in (88) and Figures 21 and 22. In these examples, there is no coarticulatory source for the increased nasalance on the 3sg.n enclitic.

    1. (88)
    1. a.
    1. Xàà à
    1. [ʃà=ã̀]
    2. chin=3sg.n
    1. ‘Their chin/su barbilla’
    1.  
    1. b.
    1. Yiva à
    1. [jiβ=ã̀]
    2. plant=3sg.n
    1. ‘Their plant/su hierba’

Figure 21: [ʃà=ã̀] (88a) (Speaker NGC).

Figure 22: [jiβ=ã̀] (88b) (Speaker RDC).

Almost all of the cases of this exceptional nasality in our dataset occur when the enclitic overwrites a low vowel /a/. However, the nasality of 3sg.n [=à] is not entirely predictable from the quality of the vowel it overwrites, and may be subject to interspeaker variation. For example, speaker NGC produced all repetitions of /ⁿ͡tʃiʰʃǐ =à/ ‘their corncob/su elote’ in (89) with a final oral vowel (Figure 23), while speaker RDC produced it with a final nasal vowel (Figure 24).

    1. (89)
    1. Nchixǐ à
    1. [ⁿ͡tʃiʰʃ=â] ∼ [ⁿ͡tʃiʰʃ=ã̂]
    2. corncob=3sg.n
    1. ‘Their corncob/su elote’

Figure 23: [ⁿ͡tʃiʰʃ=â] (89) (Speaker NGC).

Figure 24: [ⁿ͡tʃiʰʃ=ã̂] (89) (Speaker RDC).

It appears that the 3sg.n enclitic [=à] does alternate in nasality, but less predictably than the 1sg enclitic [=ì]. Because of this, we have excluded it from summary plots throughout, and we leave a precise analysis of its alternation for future work. However, we note that it, too, can trigger violations of the nasal phonotactic constraints described in §4.2. For example, it occurs as nasal in hiatus with an oral vowel (Figure 21) and after an approximant (Figure 22), providing another case where nasal phonotactic constraints are violated across morpheme boundaries.

4.6.2 Analysis of enclitic alternations for nasality

In summary, the 1sg enclitic (and, usually, the 3sg.n enclitic) alternates to maintain the nasality of the stem-final vowel. Importantly, these alternations do not reflect co-occurrence restrictions on nasal and oral segments. For example, /=V/ enclitics maintain the nasality of the stem-final vowel even when the preceding segment is [k] (81)–(82), which imposes no restrictions on the nasality or orality of following vowels. Given this, we understand these alternations not to be driven purely by phonotactic restrictions on the co-occurrence of oral and nasal segments (cf. Penner 2019: 266–267), but rather to be driven by a pressure to maintain the nasality/orality of the stem-final vowel. The question raised by this fact is why some enclitics alternate, while others do not.

The first pattern to consider is the non-variation of the 1pl.incl enclitic [=é], which is invariably oral. We propose that the solution lies in a general ban on nasal [ẽ] in SMP Mixtec. Because /ẽ/ is absent from the phoneme inventory (Peters 2018; Ostrove 2018; Eischens & Hedding 2025), and because there are no environments or constructions in which [ẽ] appears, either within roots or across morpheme boundaries, we propose that there is an undominated, inviolable constraint penalizing nasal [ẽ].

The second pattern to consider is that 1sg [=ì] and 3sg.n [=à] alternate in nasality, but 2sg [=ṹ] is invariably nasal. Importantly, both of these patterns serve to avoid the deletion of underlying [nas] features: when 1sg [=ì] is nasalized, the underlying [nas] feature of the stem is maintained (90).

    1. (90)
    1. Nchixǎn ì
    1. /ⁿ͡tʃiʰʃã̌=ì/
    2. shoe=1sg
    1.  
    1. [ⁿ͡tʃiʰʃ=ĩ̂]
    2.  
    1. ‘My shoe/mi zapato’

Though 3sg.n [=à] alternates less predictably in nasality, it does not appear to be inherently nasal, meaning its alternation does not involve the deletion of a [nas] feature.

Additionally, the non-alternation of the 2sg [=ṹ] also serves to maintain an underlying [nas] feature, this time associated to the enclitic (91).

    1. (91)
    1. Kíxa ún
    1. /kíʰʃa=ṹ/
    2. cont.do=2sg
    1.  
    1. [kíʰʃ=ṹ]
    2.  
    1. ‘You do/haces’

For the 2sg [=ṹ] to surface as oral [=ú] would require the deletion of a [nas] feature. Deletion of a [nas] feature appears to be disallowed in SMP Mixtec, except to avoid the creation of a nasal [ẽ] when the 1pl.incl /=é/ attaches to a stem ending in a nasal vowel. In this light, the alternation in 1sg [=ì] and 3sg.n [=à], as well as the lack of alternation in 2sg [=ṹ], can be understood as driven by a pressure against the deletion of a [nas] feature.
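The interaction described in this section can be sketched as a small ranked-constraint evaluation. The following Python fragment is only an illustrative sketch, not a formal analysis proposed in the text: the names `Input`, `violations`, and `winner` are hypothetical, and the lowest-ranked constraint against inserting [nas] is an added assumption, needed so that the oral candidate wins when no [nas] feature is present underlyingly. The less predictable behavior of 3sg.n [=à] (§4.6.1) is deliberately not modeled.

```python
from typing import NamedTuple, Tuple


class Input(NamedTuple):
    stem_final_nasal: bool  # does the overwritten stem-final vowel carry [nas]?
    enclitic_vowel: str     # vowel quality of the enclitic: "i", "a", "e", or "u"
    enclitic_nasal: bool    # does the enclitic carry its own underlying [nas]?


def violations(inp: Input, surface_nasal: bool) -> Tuple[int, int, int]:
    """Violation profile of one candidate, highest-ranked constraint first."""
    underlying_nas = int(inp.stem_final_nasal) + int(inp.enclitic_nasal)
    # 1. Ban on nasal [ẽ] (undominated, following the text).
    star_nasal_e = int(surface_nasal and inp.enclitic_vowel == "e")
    # 2. No deletion of an underlying [nas] feature (following the text).
    max_nas = 0 if surface_nasal else underlying_nas
    # 3. No insertion of [nas] (an assumption added for this sketch).
    dep_nas = int(surface_nasal and underlying_nas == 0)
    return (star_nasal_e, max_nas, dep_nas)


def winner(inp: Input) -> bool:
    """Return True if the optimal candidate has a nasal enclitic vowel."""
    # Tuples compare lexicographically, which mimics strict constraint ranking.
    return min((False, True), key=lambda cand: violations(inp, cand))


# 1sg /=ì/ on a nasal-final stem, as in (90): nasal enclitic vowel.
print(winner(Input(True, "i", False)))
# 1pl.incl /=é/ on a nasal-final stem: oral, because nasal [ẽ] is banned.
print(winner(Input(True, "e", False)))
# 2sg /=ṹ/ on an oral-final stem, as in (91): nasal enclitic vowel.
print(winner(Input(False, "u", True)))
```

Under these assumptions, the alternating and non-alternating enclitics fall out from the same ranking, with the only loss of [nas] occurring where nasal [ẽ] would otherwise arise.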

4.7 Summary

In this section, we have shown that phonotactic restrictions on the co-occurrence of nasal and oral segments hold exceptionlessly within the root, while strings that violate these restrictions are always separated by a morpheme boundary. Considered in isolation, this result is consistent with the morphological definition of the couplet: the domain in which nasal phonotactic rules are upheld is the root.

Purely prosodic analyses of nasal phonotactics, based on foot structure, do not make the right predictions. Sequences of segments which violate nasal phonotactic restrictions are not, in general, separated by foot boundaries. Instead, the entire string is contained in the same foot.
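The contrast between the two candidate domains can be made concrete with a small check over segment strings. The Python sketch below is purely illustrative: the `Seg` record, the coarse segment classes, and the simplified transcriptions (tone and pre-aspiration omitted) are assumptions made for the example, and only two of the restrictions from §4.2 are encoded (no nasal vowel after an approximant, and no adjacent vowels that mismatch in nasality). It assumes, following §4.4, that all of the material shown is inside a single foot.

```python
from typing import List, NamedTuple, Tuple


class Seg(NamedTuple):
    symbol: str
    kind: str      # "approximant", "obstruent", or "vowel" (coarse classes)
    nasal: bool
    in_root: bool  # False for enclitic material


def violations(form: List[Seg]) -> List[Tuple[Seg, Seg]]:
    """Adjacent pairs that break either of the two encoded restrictions."""
    bad = []
    for left, right in zip(form, form[1:]):
        nasal_v_after_approx = (
            left.kind == "approximant" and right.kind == "vowel" and right.nasal
        )
        mismatched_vv = (
            left.kind == "vowel" and right.kind == "vowel"
            and left.nasal != right.nasal
        )
        if nasal_v_after_approx or mismatched_vv:
            bad.append((left, right))
    return bad


def root_internal(pair: Tuple[Seg, Seg]) -> bool:
    """True if both segments of a violating pair belong to the root."""
    left, right = pair
    return left.in_root and right.in_root


# 'your plant': approximant followed by the nasal 2sg enclitic vowel.
your_plant = [
    Seg("j", "approximant", False, True),
    Seg("i", "vowel", False, True),
    Seg("β", "approximant", False, True),
    Seg("ũ", "vowel", True, False),
]
# 'your chin', as in (80): oral root vowel followed by the nasal 2sg enclitic.
your_chin = [
    Seg("ʃ", "obstruent", False, True),
    Seg("a", "vowel", False, True),
    Seg("ũ", "vowel", True, False),
]

for form in (your_plant, your_chin):
    for pair in violations(form):
        print(pair[0].symbol, pair[1].symbol, "root-internal:", root_internal(pair))
```

In both forms the violating pair is foot-internal by hypothesis but straddles the root=enclitic boundary, which is the configuration that a foot-based domain does not predict.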

In sum, the patterns described here constitute evidence that the morphological root is a crucial domain in defining the behavior of nasality. This conclusion is different from §3, where the bimoraic foot and not the morphological root was necessary to describe the domain in which laryngeals and underlying rising tones may surface.

These two results are not necessarily at odds with each other: in SMP Mixtec, some phonological patterns are defined in terms of the foot, and some are defined in terms of morphological structure. However, from the perspective that the Mixtec ‘couplet’ stands in a one-to-one correspondence relation with a single grammatical unit, these results are problematic. The correct conclusion appears to be that properties attributed to the couplet are not all properties of a single category. Instead, the Mixtec ‘couplet’ is just a shorthand term for the set of patterns which occur in canonical roots: morphologically-simplex, bimoraic words (Carroll 2015: 56). We return to this theme in §5, after a consideration of alternative analyses of nasal phonotactics which attempt to maintain a purely prosodic characterization of those patterns.

4.8 Alternative analyses

4.8.1 Directional spreading with morpheme structure constraints

Assume—counter to our actual analysis—that the foot is the domain of nasal phonotactics in SMP Mixtec, rather than the root. Given the evidence that enclitics are foot-internal, just like root-final vowels (§4.4), the challenge is then to find some other way to capture the exceptional behavior of enclitics with respect to nasal phonotactics.

One approach is to assume that the feature [nasal] may only spread rightward, not leftward. Enclitics, being at the right edge, could then only be the targets of nasal spreading, and not the source. This predicts the possibility of mismatches in nasality between a [nasal] enclitic and a preceding segment.

Under this approach, the 1sg enclitic [=ì] and 3sg.n enclitic [=à] nasalize via rightward spread of a [nas] feature (92).

    1. (92)
    1. Tsiáàn ì
    1. /͡tsʲã́ã̀ =ì/
    2. forehead=1sg
    1.  
    1. [͡tsʲã́=ĩ̀]
    2.  
    1. ‘My forehead/mi frente’

Forms like [͡tsʲã́=ě] ‘our foreheads’, with a nasal-oral vowel sequence, could be explained by a blanket constraint against nasal [ẽ], as in our approach.

In contrast, in a form like (93), the lack of nasality on the penultimate mora would be due to the inability of the [nas] feature to spread leftward. This results in a mismatched oral-nasal vowel sequence.

    1. (93)
    1. Tsìkuǐì ún
    1. /͡tsì̥kʷǐì=ũ̌/
    2. water=2sg
    1.  
    1. [͡tsì̥kʷǐ=ũ̌]
    2.  
    1. ‘Your water/tu agua’

Importantly, this alternative analysis does not refer to morphological structure in any direct way. However, it faces at least three problems. First, it has no account of forms like /jiβa=ũ̌/ → [jiβ=ũ̌] ‘your plant/tu hierba’, where a nasal enclitic vowel follows a root consonant which cannot normally be followed by a nasal vowel (here, an approximant). We have argued that such forms are footed [(jiβ=ũ̌)], plainly contradicting the claim that the foot is the domain of nasal phonotactics (including spreading, if any such spreading occurs). While a ban on leftward nasal spreading correctly rules out /jiβa=ũ̌/ → *[jim=ũ̌] (for example), it cannot explain why root=enclitic combinations like [jiβ=ũ̌] occur, even though roots like *[jiβũ̌] are entirely unattested. Some reference to morphology is still required.

Along the same lines, if one assumes that there are no constraints on underlying forms (‘richness of the base’, Prince & Smolensky 1993/2004), then this approach predicts that roots of the shape [CVṼ] should exist. Since nasality may only spread rightward, an input /CVṼ/ would surface unchanged as [CVṼ]. Such forms do not occur in SMP Mixtec.

This problem could be addressed by adopting constraints which directly ban roots with the underlying form /CVṼ/ (contra richness of the base). However, such an approach also refers to morphological structure, distinguishing roots from multi-morphemic forms, independent of their surface foot structure.

To resolve these problems, the ban on leftward spreading of nasality could be revised so that it only bans leftward spreading from enclitics to roots, and not leftward spreading in general. But likewise, this ‘solution’ still invokes morphological structure, essentially recapitulating our claim that the domain of nasal phonotactics is the morphological root in SMP Mixtec.

Lastly, many analyses of nasality in Mixtec languages represent nasalization as a feature that originates at the right edge of the couplet and spreads leftward (Marlett 1992; Gerfen 1999), making a ban on leftward spread unusual among Mixtec languages.
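The richness-of-the-base argument against this alternative can be illustrated with a toy implementation of rightward-only spreading. The sketch below is a deliberate simplification made for illustration: the function name `spread_rightward` is hypothetical, each form is reduced to the nasality of its vowels in left-to-right order, and intervening consonants (which might block spreading) are ignored.

```python
from typing import List


def spread_rightward(vowel_nasality: List[bool]) -> List[bool]:
    """Spread [nas] from each nasal vowel to every vowel to its right.

    The input holds one boolean per vowel, in left-to-right order.
    Nothing ever spreads leftward, as in the alternative analysis.
    """
    surface = []
    seen_nasal = False
    for is_nasal in vowel_nasality:
        seen_nasal = seen_nasal or is_nasal
        surface.append(seen_nasal)
    return surface


# A nasal vowel followed by an oral enclitic vowel, as in (92):
# the enclitic is nasalized.
print(spread_rightward([True, False]))

# A hypothetical root input /CVṼ/: the oral-nasal sequence surfaces
# unchanged, because [nas] cannot spread leftward.
print(spread_rightward([False, True]))
```

The second case is the problematic one: without some further restriction that refers to roots, the mapping lets an oral-nasal vowel sequence surface inside a root, although such roots are unattested.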

4.8.2 Treating enclitics as systematically foot-external

A second alternative is to insist that (i) the foot is the domain of nasal phonotactics, and (ii) vocalic /=V/ enclitic pronouns are foot-external. This would be contrary to our claim that vocalic /=V/ enclitics are parsed inside a foot with preceding root material (§4.4).

This approach is partly motivated by the fact that morpho-syntactic clitics have a tendency to be prosodically deficient, and to occur outside of the prosodic domain containing their host (Anderson 2005). Further, we claim that consonant-initial /=CV/ enclitics are outside the prosodic word of their host, and by extension, external to the bimoraic foot (§3.1.2). It is of course reasonable to wonder whether /=V/ enclitics might have the same prosodic status as unfooted /=CV/ enclitics.

If we assume that /=V/ enclitics are foot-external, the fact that nasal phonotactics are disobeyed across root-enclitic boundaries reflects the presence of a foot boundary between the root and the enclitic, as in (94).

    1. (94)
    1. Tsìkuǐì ún
    1. /͡tsì̥Ft(kʷǐì)=ũ̌/
    2. water=2sg
    1.  
    1. [͡tsì̥Ft(kʷǐ)ũ̌]
    2.  
    1. ‘Your water/tu agua’

Here, the prosodic analysis of the couplet remains intact: the domain of nasal phonotactics is the foot, and vocalic enclitics are unfooted.

This analysis faces two main problems. First, we have already laid out several arguments in §4.4 that vocalic enclitics are foot-internal. Those arguments would need to be addressed before this alternative analysis could be considered plausible.

Second, this approach still crucially refers to morphological structure. In proposing that vocalic enclitics are foot-external, it equates a morphological distinction with a prosodic one. The reason that the enclitics are (supposedly) foot-external is simply because they are root-external. In other words, the only rationale for the proposed prosodic difference between roots and enclitics is their different morphological status.

Given that this analysis must ultimately refer to morphological structure, it does not account for the data by making reference to the foot alone. For that reason, we do not regard this approach as meaningfully different from our own proposal that the Mixtec couplet must be defined in both morphological and prosodic terms.

4.8.3 Separating nasalization from the couplet

An anonymous reviewer suggests that it is possible to accept that nasal phonotactics in SMP Mixtec are root-bounded, while maintaining that the couplet is always coextensive with the prosodic foot. In this approach, nasalization is not a feature of the couplet at all, but of some other domain like the root. Researchers who ascribe nasal phonotactics to the couplet (§4.2) would then simply be mistaken to do so (in SMP Mixtec, or in general). The support for this analysis comes from the fact that even researchers who define the couplet as the root still acknowledge the prosodic constraints imposed on roots (e.g., Marlett 1992: 425–426, fn2).

We do not adopt this view, for several reasons. First, though the majority of researchers take a prosodic view of the couplet, some do define it partially or entirely in morphological terms (see e.g. Penner 2019: 19 for discussion). It is not self-evidently true that the couplet is equivalent to the foot, in any variety of Mixtec: this needs to be shown on a case-by-case basis, particularly given the extensive internal linguistic diversity of the Mixtec family.¹⁵

Second, the perspective suggested by the reviewer is unfalsifiable. When a process ascribed to the couplet fails to coincide with the foot, one can of course claim that the process in question is simply irrelevant for understanding the couplet. But this is circular reasoning: the couplet is presupposed to be the foot, rather than shown to be the foot. Indeed, the reviewer’s proposal boils down to the tautology that ‘only patterns which are bounded by the foot are bounded by the foot’, because the foot and couplet are assumed to be the same thing from the outset. This is a vacuous claim, and not falsifiable.

One other point bears mentioning. If every phonotactic pattern attributed to ‘the couplet’ can instead be attributed to some combination of the foot and the root, then ‘the couplet’ ceases to have any explanatory power. It is then unclear to us what is gained by invoking ‘the couplet’ in the first place, except as a kind of loose descriptive shorthand, or as an acknowledgment of past usage of the term.

These issues do not arise in our analysis, because for us ‘the couplet’ has no independent grammatical status: it is a descriptive cover term for a set of phonological patterns in Mixtec languages, some of which are bounded by the foot, and some of which are bounded by the root. These patterns coincide in bimoraic roots, but not necessarily elsewhere.

5 Conclusion

In this paper, we have pursued two main questions. First, can the properties of the SMP Mixtec couplet be described in terms of cross-linguistically general grammatical units? And second, if so, can the SMP Mixtec couplet be unambiguously equated with one of these categories?

The answer to the first question is clearly affirmative: the distribution of laryngeals, underived rising tones, and nasal phonotactics may all be analyzed in terms of general grammatical categories like the root and the foot.

However, the answer to the second question is negative: there is no one, single grammatical category which uniquely captures both (i) the distribution of laryngeals and underived rising tones, as well as (ii) the domain in which nasal phonotactic restrictions hold. Instead, the domain of (i) appears to be the bimoraic foot, while the domain of (ii) appears to be the morphological root. This means that both views of the Mixtec couplet in Table 10 are incomplete.

Table 10: Opposing analyses of the Mixtec ‘couplet’.

| Couplet = morphological root | Couplet = bimoraic foot |
| --- | --- |
| The Mixtec couplet is representationally equivalent to the morphological root. Any property attributed to the couplet can be recast in terms of the root. | The Mixtec couplet is representationally equivalent to a bimoraic foot. Any property attributed to the couplet can be recast in terms of the foot. |

While this state of affairs is not surprising—it is common for different grammatical generalizations to hold in different domains—it does undermine the notion that the Mixtec couplet is a single, clear-cut, internally-coherent, grammatically primitive unit. In this sense, the Mixtec couplet is similar to some analysts’ view of categories like ‘pitch accent’ (cf. Hyman 2011): these are not primitive concepts, but instead the result of intersecting several true grammatical primitives (e.g. lexical tone and culminativity).

The only structure in which every phonological characteristic of the couplet is attested is the mono-morphemic, bimoraic root. Put in these terms, the couplet in SMP Mixtec can be understood as an emergent category which arises out of the interaction of (at least) the morphological root and the bimoraic foot.

Indeed, the best definition of the couplet may be the set of phonotactic properties that hold in canonical roots—mono-morphemic words that are minimally and maximally bimoraic (Carroll 2015: 56; see Uchihara & Mendoza Ruiz 2022 on a bimoraic maximum for prosodic words). In this sense, the Mixtec couplet is not a unit of its own—the properties of the couplet are amenable to analysis in terms of more general grammatical categories. At the same time, the couplet in SMP Mixtec cannot be identified with any one grammatical category: it is a constellation of properties associated either with the root or with the bimoraic foot.

A reviewer takes issue with us “chang[ing] how the couplet is defined”, relative to previous work on Mixtec languages. This fundamentally misunderstands our position. We are not just proposing to modify terms or definitions: we are claiming that the couplet is not an independent, or even consistent, grammatical unit of any kind.

An analogy may help clarify. Imagine two transparent sheets of plastic, one red, and one blue. When those sheets are partially overlapped, they will produce a region which appears to be purple (Figure 25).

Figure 25: The couplet as an epiphenomenon of overlap between objects: analogy with transparent sheets of colored plastic.

The purple region is certainly ‘real’: we can see it, talk about it, and so on. And at times it may be convenient to refer to ‘the purple rectangle’ in Figure 25. But there is no physical object corresponding to that purple rectangle. It only exists by virtue of the overlap between the red and blue pieces of plastic. Its size and shape depend entirely on how those actual, physical pieces of plastic relate to each other.

This is what we mean when we say that the Mixtec couplet is not a consistent grammatical unit. The SMP Mixtec couplet exists only as the overlap between the morphological root and the bimoraic foot — it has absolutely no independent status of its own.

To put it bluntly, we claim that there are no grammatical generalizations whatsoever in SMP Mixtec which refer to something called ‘the couplet’. There are generalizations which refer to roots, to bimoraic feet, and to the relationship between them. When these units align, they may give the impression — falsely, in our view — that a third type of unit is involved (namely, the couplet). Our claim is that deeper insight is gained by decomposing the putative ‘couplet’ into its component parts, rather than presupposing that something called ‘the couplet’ exists on its own, and reasoning from there.

Our position is straightforwardly compatible with the fact that patterns attributed to the couplet, in particular nasal phonotactics, operate over different domains across different varieties of Mixtec (e.g. footnote 15). In fact, the same may be true of tonal patterns, as Carroll (2015: 202, 218) describes inter-varietal differences in whether proclitics and ‘couplets’ have different tonal characteristics. In contrast, contortions are required to reconcile these facts with the claim that ‘the couplet’ exists as a pan-Mixtec grammatical unit (§4.8).

The term ‘couplet’ remains useful for describing grammatical patterns that hold in bimoraic roots, the canonical, prototypical root in Mixtec languages. But descriptive terms, even when useful, are often the wrong tools for grammatical analysis.¹⁶ We have argued that all generalizations attributed to ‘the couplet’ in SMP Mixtec can, and should, be attributed to the morphological root or the foot instead. ‘The couplet’ is simply not up to the job: it is a convenient descriptive shorthand (albeit a rough and approximate one), but it is not a proper unit of formal, grammatical analysis.

Abbreviations

cont = continuative, dem = demonstrative, dep = dependent, ft = foot, incl = inclusive, ind = independent, m = human masculine, n = neutral noun class, rt = root, sg = singular, pl = plural, pot = potential, wd = wooden noun class.

Data availability

A .zip file containing nasalance plots for all tokens analyzed here, along with a full wordlist, is included at https://osf.io/q7n4u/overview.

Ethics and consent

Research was performed in accordance with the UCLA and UC Santa Cruz Institutional Review Boards (UCLA IRB# 22-000994, UCSC IRB# HS-FY2022-176). Informed consent was obtained from all participants involved in the study.

Acknowledgements

We are extremely grateful to the members of the San Martín Peras community in Oaxaca and California, especially Natalia Gracida Cruz and Roselia Durán Cruz, for their generosity in sharing their language with us. We also thank Andrew Hedding, Hiroto Uchihara, Christian DiCanio and audiences at WAIL 26 and the UCLA Phonology Seminar for their helpful feedback, as well as Associate Editor Björn Köhnlein and two anonymous reviewers for their insightful contributions to this project.

Competing interests

The authors have no competing interests to declare.

Notes

  1. Examples are given in a three- or four-line gloss. The first line is written in an orthography broadly following the recommendations of Ve’e Tu’un Savi (Mixtec Language Academy) (Instituto Nacional de Lenguas Indígenas 2022), and the second line is written in the IPA, with morpheme boundaries indicated. In examples with multiple morphemes, the third line is a morpheme-by-morpheme gloss, and the fourth is a free translation in English and Spanish, separated by a forward slash. In examples with just one morpheme, the morpheme-level gloss is omitted.
  2. In this paper we assume a standard version of prosodic hierarchy theory, as described in e.g. Ito & Mester (2012) and many other sources.
  3. The forms in (11)–(13) are all given in their potential form, but they do not undergo segmental changes in the continuative or completive, as can occur with other verbs.
  4. [ʔ] is often inserted at the beginning of vowel-initial words. However, it can be analyzed as epenthetic, inserted phrase-initially and in contexts of vowel hiatus across word/morpheme boundaries. Additionally, [h] is present at the beginning of one lexical item, the demonstrative/locative [hã̀ã́] ‘that (proximal)’.
  5. We only discuss underlying rising tones because there are two morphological constructions that can create rising tones outside the domain of the foot, namely negative grammatical tone (cf. Eischens 2024) and the causative prefix. These are omitted for reasons of space. The exceptionality of these rises can be viewed as a derived environment effect, where a restriction on morphologically simplex structures is violated in morphologically complex constructions.
  6. Recordings made with dual-chamber masks will often show small amounts of oral air pressure in entirely nasal sounds (e.g. [m]), and small amounts of nasal air pressure in entirely oral sounds (e.g. [v]) (e.g. Kochetov 2020). For this reason, it is important to interpret nasalance in relative rather than absolute terms.
  7. For the sake of readability, in our nasalance plots we transcribe tone with superscript numbers: 1 = low, 2 = mid, 3 = high, 31 = falling, 13 = rising.
  8. These are a subset of the patterns outlined in Marlett (1992), who argues that nasality in Mixtec is an autosegmental feature associated to the right edge of a morpheme, which spreads leftward until it is blocked by an obstruent. The patterns discussed here could be derived from such a process of nasal spread, either synchronically or diachronically.
  9. Nasal vowels generally show lower nasal air pressure, and thus lower nasalance, than nasal consonants, as is visible in the nasalance trace for [ɲãnĩ] in Figure 5. This follows from aerodynamic principles: in nasal consonants, all of the airflow leaving the vocal tract exits through the nose, while in nasal vowels, airflow is divided between the oral and nasal channels.
  10. Figures 6, 7, 8 come from separate recordings of speaker NGC at approximately 10,000 Hz in each channel using Glottal Enterprises DualView software. This setup was necessary to accurately record the lower-frequency signal of oral and nasal [h].
  11. We assume that vocalic enclitic pronouns are internal to the prosodic word, as argued later in this section, while CV enclitic pronouns are external to the prosodic word, as discussed in §3.1.2.
  12. There is some anticipatory nasalization of vowels preceding nasal and pre-nasalized consonants, as seen in Figure 11. However, these vowels are unlikely to be phonologically nasal for two reasons. First, approximants may precede these vowels (e.g. [laⁿtu] ‘bellybutton/ombligo’, [jòʰɲũ̌] ‘net/red’), while they cannot precede nasal vowels in roots. Second, high and mid vowels contrast preceding prenasalized stops (e.g. [lúⁿtú] ‘short/corto’; [lóⁿtó] ‘tadpole/renacuajo’). This suggests that these vowels are phonologically oral, since there is no contrast between /ũ/ and /õ/ ([õ] being entirely absent from SMP Mixtec, in both underlying and surface forms).
  13. This is potentially subject to interspeaker variation. We know of one speaker who, based on impressionistic observations, seems to nasalize [=é] when it overwrites a nasal vowel. This speaker’s data is not presented here; it remains to be seen whether nasalization of [=é] after nasals in their speech pattern reflects coarticulation for nasality, or a truly nasal [ẽ].
  14. Since SMP Mixtec does not in general have nasal [ẽ] (at least for the speakers whose data we report here), we cannot directly compare Figure 13 to an unambiguously nasal [ẽ]. Hence the more indirect comparison with nasal [ã] in Figure 13.
  15. For example, in stark contrast with SMP Mixtec, enclitics in Yoloxóchitl and San Pedro Tulixtlahuaca Mixtec do participate in phonotactic patterns related to nasality. Specifically, when the addition of an enclitic might derive a prohibited combination of oral and nasal segments, stem segments alternate in order to conform with broader nasal phonotactics, e.g. Yoloxóchitl /ka3nã3=e4/ → [ka3nd=e4] ‘we will call’ (see DiCanio et al. 2020: 351–2; Becerra Roldán 2019: 81, 87 for details). However, Yoloxóchitl Mixtec has a number of trimoraic roots, and root-final stress, implying the footing μ(μμ). Nasal phonotactics nonetheless hold in the initial syllable of trimoraic roots like /na3ʃa2a2/ → [nda3(ʃa2a2)] ‘to arrive to live’, just as they do in SMP Mixtec (10)–(13) (DiCanio et al. 2018; 2020). Given the behavior of enclitics, this implies that the domain of nasal phonotactics is neither the foot nor the root in Yoloxóchitl Mixtec, but rather something larger, like the prosodic word. Regardless of the analysis here, it is clear that any proposal made for a particular variety of Mixtec will not necessarily be appropriate for another variety (a point also raised by DiCanio et al. 2020).
  16. All areas of linguistics offer examples of this type: e.g. descriptive terms like ‘subject’ and ‘object’ are exceedingly useful for many purposes, but have essentially no status in modern syntactic theory (e.g. Chomsky 1965: 68–74; McCloskey 1997). For a similar example in phonology, consider the fact that mid vowels exist, but the distinctive feature [± mid] apparently does not (e.g. Odden 2005; Hayes 2009).

References

Alexander, Ruth Mary. 1988. A syntactic sketch of Ocotepec Mixtec. Studies in the Syntax of Mixtecan Languages 1. 151–304.

Anderson, Stephen R. 2005. Aspects of the theory of clitics. Oxford: Oxford University Press. DOI: http://doi.org/10.1093/acprof:oso/9780199279906.001.0001

Auderset, Sandra & Hernández Martínez, Carmen & Ventayol-Boada, Albert. 2024. Constituency in Tù’un Ntá’ví (Mixtec) of San Martín Duraznos. In Tallman, Adam J. R. & Auderset, Sandra & Uchihara, Hiroto (eds.), Constituency and convergence in the Americas, 265–304. Language Science Press.

Bat-El, Outi. 1994. Stem modification and cluster transfer in Modern Hebrew. Natural Language & Linguistic Theory 12(4). 571–596. DOI:  http://doi.org/10.1007/BF00992928

Bat-El, Outi. 2003. Semitic verb structure within a universal perspective. In Shimron, Joseph (ed.), Language processing and acquisition in languages of Semitic, root-based, morphology, 29–59. Amsterdam: John Benjamins. DOI:  http://doi.org/10.1075/lald.28.02bat

Becerra Roldán, Braulio. 2019. Análisis sincrónico y consideraciones diacrónicas sobre la fonología del mixteco de San Pedro Tulixtlahuaca. Mexico City: Universidad Nacional Autónoma de México MA thesis.

Becerra Roldán, Braulio. 2023. Las vocales nasales en el mixteco de Pinotepa Nacional: comparación dialectal, pérdida de la nasalidad y consonantización. Ruta Antropológica 13. 65–100.

Beck, David. 2016. Some language-particular terms are comparative concepts. Linguistic Typology 20(2). 395–402. DOI:  http://doi.org/10.1515/lingty-2016-0013

Belmar Viernes, Guillem. 2024. Community-informed documentary linguistics and community-led participatory research: describing Sà’án Sàvǐ ñà ñuù Xnúvíkó and analyzing speakers’ insights on intelligibility with Tlahuapa Mixtec. Santa Barbara: University of California dissertation.

Bennett, Ryan. 2012. Foot-conditioned phonotactics and prosodic constituency. Santa Cruz: University of California dissertation.

Boersma, Paul & Weenink, David. 2020. Praat: Doing phonetics by computer (version 6.1.27).

Campbell, Eric W. 2017. Otomanguean historical linguistics: Exploring the subgroups. Language and Linguistics Compass 11(7). e12244. DOI:  http://doi.org/10.1111/lnc3.12244

Carroll, Lucien Serapio. 2015. Ixpantepec Nieves Mixtec word prosody. San Diego: University of California dissertation.

Cheng, Lisa Lai Shen. 1987. On the prosodic hierarchy and tone sandhi in Mandarin. In Toronto working papers in linguistics, vol. 7, 24–52.

Chomsky, Noam. 1965. Aspects of the theory of syntax. Cambridge, MA: MIT Press. DOI:  http://doi.org/10.21236/AD0616323

Daly, John P. & Hyman, Larry M. 2007. On the representation of tone in Peñoles Mixtec. International Journal of American Linguistics 73(2). 165–207. DOI:  http://doi.org/10.1086/519057

de León Pasquel, María de Lourdes. 1988. Noun and numeral classifiers in Mixtec and Tzotzil: A referential view. United Kingdom: University of Sussex dissertation.

DiCanio, Christian & Amith, Jonathan & García, Rey Castillo. 2014. The phonetics of moraic alignment in Yoloxóchitl Mixtec. In Proceedings of the 4th international symposium on tonal aspects of language, 203–210.

DiCanio, Christian & Benn, Joshua & García, Rey Castillo. 2018. The phonetics of information structure in Yoloxóchitl Mixtec. Journal of Phonetics 68. 50–68. DOI:  http://doi.org/10.1016/j.wocn.2018.03.001

DiCanio, Christian T. & Zhang, Caicai & Whalen, Douglas H. & García, Rey Castillo. 2020. Phonetic structure in Yoloxóchitl Mixtec consonants. Journal of the International Phonetic Association 50(3). 333–365. DOI:  http://doi.org/10.1017/S0025100318000294

Eischens, Ben. 2024. Classifying negated nominals across Mixtec. International Journal of American Linguistics 90(2). 141–183. DOI:  http://doi.org/10.1086/728641

Eischens, Ben & Hedding, Andrew. 2025. San Martín Peras Mixtec. Journal of the International Phonetic Association 54(2). 811–852. DOI:  http://doi.org/10.1017/S0025100324000124

Gerfen, Chip. 1996. Topics in the phonology and phonetics of Coatzospan Mixtec. Tucson: The University of Arizona dissertation.

Gerfen, Chip. 1999. Phonology and phonetics in Coatzospan Mixtec, vol. 48. Springer Science & Business Media.

Gordon, Matthew. 2002. A factorial typology of quantity-insensitive stress. Natural Language & Linguistic Theory 20(3). 491–552. DOI:  http://doi.org/10.1023/A:1015810531699

Haspelmath, Martin. 2010. Comparative concepts and descriptive categories in crosslinguistic studies. Language 86(3). 663–687. DOI:  http://doi.org/10.1353/lan.2010.0021

Hayes, Bruce. 2009. Introductory phonology. Malden, MA: Wiley-Blackwell.

Hedding, Andrew A. 2022. How to move a focus: The syntax of alternative particles. Santa Cruz: University of California dissertation.

Herrera Zendejas, Esther. 2014. Mapa fónico de las lenguas mexicanas: Formas sonoras 1 y 2, vol. 19. Mexico City: El Colegio de México AC.

Hunter, Georgia G. & Pike, Eunice V. 1969. The phonology and tone sandhi of Molinos Mixtec. Linguistics 47. 24–40. DOI:  http://doi.org/10.1515/ling.1969.7.47.24

Hyman, Larry M. 2011. The representation of tone. The Blackwell companion to phonology, 1–25. DOI:  http://doi.org/10.1002/9781444335262.wbctp0045

Instituto Nacional de Estadística y Geografía. 2020. Censo de población y vivienda.

Instituto Nacional de Lenguas Indígenas. 2022. Norma de Escritura del Tu’un Savi (Idioma Mixteco).

Ito, Junko & Mester, Armin. 2012. Recursive prosodic phrasing in Japanese. In Borowsky, Toni & Kawahara, Shigeto & Shinya, Takahito & Sugahara, Mariko (eds.), Prosody matters: Essays in honor of Elisabeth Selkirk, 280–303. London: Equinox.

Iverson, Gregory K. & Salmons, Joseph C. 1996. Mixtec prenasalization as hypervoicing. International Journal of American Linguistics 62(2). 165–175. DOI:  http://doi.org/10.1086/466284

Josserand, Judy Kathryn. 1983. Mixtec dialect history. Ann Arbor: University of Michigan dissertation.

Kochetov, Alexei. 2020. Research methods in articulatory phonetics II: Studying other gestures and recent trends. Language and Linguistics Compass 14(4). e12371. DOI:  http://doi.org/10.1111/lnc3.12371

León Vásquez, Octavio. 2017. Sandhi tonal en el mixteco de Yucuquimi de Ocampo. Mexico City: Centro de Investigaciones y Estudios en Antropología Social MA thesis.

Longacre, Robert Edmondson. 1955. Proto-Mixtecan. Philadelphia: University of Pennsylvania dissertation.

Macaulay, Monica. 1987a. Cliticization and the morphosyntax of Mixtec. International Journal of American Linguistics 53(2). 119–135. DOI:  http://doi.org/10.1086/466049

Macaulay, Monica. 1987b. Morphology and cliticization in Chalcatongo Mixtec. Berkeley: University of California dissertation.

Macaulay, Monica. 2005. The syntax of Chalcatongo Mixtec: Preverbal and postverbal. Verb first: On the syntax of verb-initial languages 73. 341–366. DOI:  http://doi.org/10.1075/la.73.21mac

Macaulay, Monica & Salmons, Joseph C. 1995. The phonology of glottalization in Mixtec. International Journal of American Linguistics 61(1). 38–61. DOI:  http://doi.org/10.1086/466244

Macken, Marlys A. & Salmons, Joseph C. 1997. Prosodic templates in sound change. Diachronica 14(1). 31–66. DOI:  http://doi.org/10.1075/dia.14.1.03mac

Mak, Cornelia. 1953. A comparison of two Mixtec tonemic systems. International Journal of American Linguistics 19(2). 85–100. DOI:  http://doi.org/10.1086/464197

Marlett, Stephen A. 1992. Nasalization in Mixtec languages. International Journal of American Linguistics 58(4). 425–435. DOI:  http://doi.org/10.1086/ijal.58.4.3519777

McCarthy, John J. & Prince, Alan. 2004. Generalized alignment: prosody. Optimality theory in phonology: A reader, 165–177. DOI:  http://doi.org/10.1002/9780470756171.ch7

McCloskey, Jim. 1997. Subjecthood and subject positions. In Haegeman, Liliane (ed.), Elements of grammar, 197–235. Dordrecht: Kluwer Academic Publishers. DOI:  http://doi.org/10.1007/978-94-011-5420-8_5

McKendry, Inga. 2013. Tonal association, prominence and prosodic structure in South-Eastern Nochixtlán Mixtec. Edinburgh: The University of Edinburgh dissertation.

Mendoza, Inî. 2020. Syntactic sketch of San Martín Peras Tu’un Savi. Santa Barbara: University of California BA thesis.

Mendoza Ruiz, Juana. 2016. Fonología segmental y patrones tonales del Tu’un Savi de Alcozauca de Guerrero. Mexico City: Centro de Investigaciones y Estudios en Antropología Social MA thesis.

North, Joanne & Shields, Jäna. 1977. Silacayoapan Mixtec phonology. In Merrifield, William R. (ed.), Studies in Otomanguean phonology, 21–33. Summer Institute of Linguistics.

Odden, David. 2005. Introducing phonology. Cambridge, UK: Cambridge University Press. DOI:  http://doi.org/10.1017/CBO9780511808869

Ostrove, Jason. 2018. When phi-agreement targets topics: the view from San Martín Peras Mixtec. Santa Cruz: University of California dissertation.

Paster, Mary & Beam de Azcona, Rosemary. 2004. A phonological sketch of the Yucunany dialect of Mixtepec Mixtec. In Proceedings of the 7th annual workshop on American Indian languages, 61–76.

Penner, Kevin. 2019. Prosodic structure in Ixtayutla Mixtec: Evidence for the foot. Edmonton: University of Alberta dissertation.

Peters, Simon. 2018. The inventory and distribution of tone in Tù’un Ndá’vi, the Mixtec of Piedra Azul (San Martín Peras), Oaxaca. Santa Barbara: University of California MA thesis.

Peters, Simon & Mendoza, Inî. 2020. Morphophonological processes in Piedra Azul Tù’un Ndá’vi (Mixtec, San Martín Peras). Presentation at the winter meeting of the Society for the Study of the Indigenous Languages of the Americas.

Pike, Eunice V. & Cowan, John H. 1967. Huajuapan Mixtec phonology and morphophonemics. Anthropological Linguistics 9(5). 1–15.

Pike, Eunice V. & Oram, Joy. 1976. Stress and tone in the phonology of Diuxi Mixtec. Phonetica 33(5). 321–333. DOI:  http://doi.org/10.1159/000259780

Pike, Eunice V. & Small, Priscilla. 1974. Downstepping terrace tone in Coatzospan Mixtec. Advances in tagmemics, 105–134.

Pike, Kenneth L. 1947. Grammatical prerequisites to phonemic analysis. Word 3(3). 155–172. DOI:  http://doi.org/10.1080/00437956.1947.11659314

Pike, Kenneth L. 1948. Tone languages: A technique for determining the number and type of pitch contrasts in a language, with studies in tonemic substitution and fusion. University of Michigan Publications in Linguistics 4.

Prince, Alan S. 1983. Relating to the grid. Linguistic Inquiry 14(1). 19–100.

Prince, Alan & Smolensky, Paul. 1993/2004. Optimality Theory: constraint interaction in generative grammar. Malden, MA: Blackwell. DOI:  http://doi.org/10.1002/9780470759400. Revision of 1993 technical report, Rutgers University Center for Cognitive Science. Available online as ROA-537, Rutgers Optimality Archive, http://roa.rutgers.edu/.

R Core Team. 2013. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. http://www.R-project.org/.

Rueda Chaves, John Edinson. 2019. La interacción entre el tono y el acento en el mixteco de San Jerónimo de Xayacatlán. Mexico City: El Colegio de México dissertation.

Rueda Chaves, John Edinson. 2021. La caracterización fonológica del grupo mixteco: 70 años de descripciones segmentales. Cuadernos de Lingüística de El Colegio de México 8. DOI:  http://doi.org/10.24201/clecm.v8i0.199

Stremel, Sophia. 2022. /i/ deletion in San Martín Peras Mixtec. Unpublished manuscript.

Uchihara, Hiroto & Mendoza Ruiz, Juana. 2022. Minimality, maximality and perfect prosodic word in Alcozauca Mixtec. Natural Language & Linguistic Theory 40(2). 599–649. DOI:  http://doi.org/10.1007/s11049-021-09517-y

Ussishkin, Adam. 2005. A fixed prosodic theory of nonconcatenative templatic morphology. Natural Language & Linguistic Theory 23(1). 169–218. DOI:  http://doi.org/10.1007/s11049-003-7790-8

Young, Lisa H. & Zajac, David J. & Mayo, Robert & Hooper, Celia R. 2001. Effects of vowel height and vocal intensity on anticipatory nasal airflow in individuals with normal speech. Journal of Speech, Language, and Hearing Research 44(1). 52–60. DOI:  http://doi.org/10.1044/1092-4388(2001/005)

Zylstra, Carol F. 1980. Phonology and morphophonemics of the Mixtec of Alacatlatzala, Guerrero. SIL-Mexico Workpapers 4. 15–42.

Zylstra, Carol F. 1991. A syntactic sketch of Alacatlatzala Mixtec. Studies in the Syntax of Mixtecan Languages 3. 1–177.