Vowel contrast and neutralization are often constrained to particular phonological contexts. For example, tense and lax vowels are contrastive in English, but crucially not word-finally; only tense vowels (and schwa) surface word-finally in English. Barnes (2006) defines positional neutralization as any instance of an asymmetrical capacity of two positions (or sets of positions) to license phonological contrasts, such that one set of structural positions licenses a different (potentially larger) set of contrasts than the other. The role of phonetics in positional neutralization and the licensing of contrast has been explored by many researchers (Flemming 1995; Steriade 1997; Padgett 2003; Barnes 2006). In this paper, I focus on how the phonetic quality of laryngeal sounds can enable the positional licensing of a wider array of vocalic contrasts.
The laryngeal licensing that is the focus of this paper comes in the form of apparent exceptionality in the surface realization of an underlying mid-high vowel contrast in Chamorro. In native forms this contrast is typically neutralized to either mid or high depending on prosodic position: mid vowels in closed stressed syllables and high vowels elsewhere (Topping & Dungca 1973; Chung 1983; 2020).
However, the contrast seems to be exceptionally preserved in several native forms (Chung 2020).
While the distribution of forms with an underlying mid versus high vowel is lexically conditioned, as the variation is not fully predictable and most words follow the normal neutralization pattern, there is a definite trend within the variation. The exceptionality of certain mid vowels will be shown to be patterned in the sense of Zuraw (2000; 2010), where the likelihood of a native form to preserve, rather than neutralize, an underlying mid vowel is greater than chance when the vowel precedes an intervocalic laryngeal consonant.
This paper will argue that laryngeal consonants in Chamorro provide an ideal phonetic environment for the perception of distinct cues to vowel quality. This laryngeal licensing has led to an instance of patterned exceptionality in the language, where an underlying vowel contrast is preserved rather than neutralized. This allowance of contrast is shown to be phonetically conditioned, and not due to prosodic structure. Overall, this surfacing of a mid-high contrast before laryngeals will serve as evidence of potentially language-specific asymmetrical phonetic licensing, bearing more generally on how positional vowel contrast is subject to the perception of distinct phonetic cues.
The paper is structured as follows. §2 gives an overview of the integration of phonetics into the phonological licensing of contrast, followed by an overview of Chamorro phonology, a description of the exceptional forms in Chamorro, and statistical confirmation that the exceptionality should be treated as patterned. Phonetic evidence in support of a licensing account based on formant cues for vowel quality will be provided in §3. In §4 the licensing will be formalized using dispersion-theoretic constraints (Flemming 1995; Padgett 2003), accompanied by a discussion of hiatus and its role in licensing. In §5, two alternative explanations for the patterned exceptionality will be considered and dismissed. Finally, remaining issues regarding other exceptional mid vowels in the language, the typological predictions of the account, and the role of phonetics in phonology will be discussed in §6.
Phonetic factors, both articulatory and acoustic, play a significant role in the preservation and neutralization of segmental contrasts. One well-known line of work on vocalic contrast and perceptual licensing is Flemming’s (1995) dispersion theory of contrast, which builds on earlier work by Lindblom (1986; 1990). The core of these theories is that when contrasts develop in languages (through speaker interaction), three goals are in play:
- Maximize the number of contrasts
- Maximize the distinctiveness of contrasts
- Minimize articulatory effort
This paper is concerned only with the first two goals: maximizing the number of contrasts while also maximizing their distinctiveness. Flemming argues that languages seek to maximize contrasts through preservation or enhancement of said contrasts, while also eliminating contrasts which are not sufficiently distinct in some environment (neutralization). Although questions of articulatory effort will not be considered in this paper, effects of articulation are still expected to be observable in the acoustics. This paper will treat the impact of articulation as something interpretable from the acoustics, and any articulatory effects will be addressed through their resulting acoustic properties.
The maximization of contrasts often comes about through the preservation of already existing contrasts that would otherwise be neutralized in another environment. This tendency to maximize contrasts conflicts with the pressure to maintain maximally distinct contrasts, as the more vowels in an inventory that are contrastive, the less space between the vowels in said inventory. As Flemming discusses, this creates a situation where languages vary in what sort of compromise they will allow in the development of contrastive inventories, and what strategies they will employ to enhance, or reduce, contrast.
Enhancement as a strategy is seen through the modification of a sound or its environment to increase the perceptibility of cues in that environment. This is useful for increasing the distinctiveness of contrasts. For example, Flemming (1995) discusses contrasts between front and back vowels, where in some systems rounding is used to enhance the backness contrast. Front and back vowels are primarily differentiated by the value of the second formant (F2), with front vowels having a high F2, and back vowels having a low F2. Lip rounding lowers the value of F2, thereby enhancing backness.
- F2 comparisons
The contrast between front rounded and back unrounded vowels is thereby less distinct than that between front unrounded and back rounded vowels (4). Rounding lowers F2 in both cases, but with opposite consequences: it brings the F2 of front rounded vowels closer to that of back unrounded vowels, whereas it moves the F2 of back rounded vowels further from that of front unrounded vowels. The greater F2 distance in the latter case leads languages to enhance a front-back contrast by employing rounded back vowels rather than rounded front vowels.
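The arithmetic behind this asymmetry can be made concrete with a small sketch. The F2 values and the size of the rounding effect below are hypothetical round numbers chosen purely for illustration; they are not measurements from Flemming (1995) or from any language, and treating rounding as a constant drop in F2 is a simplifying assumption.

```python
# Illustrative only: hypothetical F2 values in Hz, not measured data.
FRONT_F2 = 2000.0      # assumed F2 of an unrounded front vowel
BACK_F2 = 1200.0       # assumed F2 of an unrounded back vowel
ROUNDING_DROP = 300.0  # assumed F2 lowering caused by lip rounding


def f2(backness, rounded):
    """Return the assumed F2 of a vowel, lowered if the vowel is rounded."""
    base = FRONT_F2 if backness == "front" else BACK_F2
    return base - ROUNDING_DROP if rounded else base


def f2_distance(vowel_a, vowel_b):
    """Absolute F2 distance between two (backness, rounded) vowels."""
    return abs(f2(*vowel_a) - f2(*vowel_b))


# Front unrounded vs. back rounded: rounding pushes the back vowel away.
enhanced = f2_distance(("front", False), ("back", True))
# Front rounded vs. back unrounded: rounding pulls the front vowel closer.
reduced = f2_distance(("front", True), ("back", False))

print(enhanced, reduced)
```

Under these assumed values the first pairing is separated by a larger F2 distance than the second, which is the direction of the asymmetry described above; the specific magnitudes carry no empirical weight.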
Contrast can be enhanced not only by articulatory or acoustic effects of the segment itself, but also by altering the environment around the segment. Considering a contrast between forms like [pe] and [po], the vowels are primarily distinguished by the value of F2, and secondarily F3. However, this contrast can be enhanced by increasing the duration of the vowel’s transition through palatalization or labiovelarization respectively. By signaling a larger difference in formant transitions, as well as the difference in F2, [pje] and [pwo] ultimately provide better contexts for contrast than [pe] and [po] (Flemming 1995); this enhancement is seen in Nupe, for example (Hyman 1970, cited in Flemming 1995). Duration of characteristic vocalic formants will play a significant role in the account provided here, especially when considering contexts which allow additional vocalic transition information to surface.
If the cues which signal a distinct contrast are not sufficiently realized, due to limitations of the phonological environment or a lack of enhancement, then contrast may be neutralized. Neutralization serves as a strategy to eliminate contrasts which are not distinct enough. For example, many dialects of American English restrict the appearance of a palatal glide immediately following a coronal, while ‘Received Pronunciation’ (RP) English allows this sequence (Flemming 1995). This results in RP English having a distinct phonological contrast for dew [dju] versus do [du], while Standard American English speakers have no contrast of the sort. For Standard American English, neutralization of this potential contrast occurs due to the F2 of the palatal glide not being distinguishable enough from the F2 of the plain coronal, which already conditions a fairly high F2 at the onset of the following vowel.
Researchers besides Flemming have proposed similar approaches to contrast preservation and neutralization, especially from the perspective that acoustic and perceptual cues guide the licensing of contrasts. Steriade (1997) argues that cues are directly referenced in the phonology, to account for the preservation or neutralization of voicing contrasts for obstruents. Steriade shows that positions of poor perceptibility for obstruent voicing contrasts (e.g. before other obstruents and word-finally after an obstruent) correlate with neutralization of said contrast. Steriade attributes neutralization to a single factor, relative poverty of cues, which implies that the more perceptible cues a context possesses, the greater likelihood of preserving contrast. Under Steriade’s approach, there is significant reference to the cues used by speakers, and this may be an area of cross-linguistic variation. Phonetic implementation can thus be studied and evaluated from language to language to determine how it shapes neutralization.
One constant in the work on contrast and neutralization is the idea that contexts are more or less helpful for the perceptibility of a contrast. For example, Barnes (2006) discusses how final syllables are potentially strong positions for contrast preservation due to final lengthening. Additional duration in a final syllable can provide a superior environment for the realization of contour tones usually reserved for more sonorous positions, and may also license a full range of vocalic contrast in languages like Hausa and Pasiego Spanish (Barnes 2006). Barnes notes that these licensing cases are less common than final syllables merely resisting some reduction or assimilation process. Final syllables are just one example of an environment that facilitates greater perception of cues, often resulting in the preservation of contrast. It will be shown that before intervocalic laryngeal consonants in Chamorro, phonetic context leaves vocalic formant information unperturbed, providing clearer information about the vowel quality than what typical intervocalic supralaryngeal consonants provide.
The language of focus, Chamorro, presents a unique environment for vocalic contrast; namely, open syllables before laryngeal consonants. Chamorro is an Austronesian language spoken by approximately 60,000 people in the Mariana Islands, as well as in various communities throughout the United States. Chamorro has a fairly standard stress pattern and inventory of vowels and consonants when compared to other Austronesian languages (Topping & Dungca 1973).
Among the native vocabulary, stress is for the most part regular. For the majority of native roots, stress falls on the penultimate syllable. In very few native forms does stress fall on the antepenultimate syllable or word-finally (Topping & Dungca 1973; Chung 2020). Stress appears to condition syllable weight in Chamorro, as all stressed syllables either have a long vowel or are closed by a coda consonant (Chung 1983).
- ‘to make’
- ‘slink away’
There are 21 consonants in the inventory, with two laryngeal consonants: [ʔ] and [h] (Topping & Dungca 1973; Chung 1983; 2020). Consonant clusters are possible in the language, as evidenced in (5). Glottal stop can appear intervocalically in roots as a contrastive segment (6-a). Glottal stop may also appear contrastively in coda position, both word-medially (6-b) and word-finally (6-c). Additionally, glottal stop may appear as an epenthetic segment between vowels at prefix-stem boundaries (6-d). An elaborated argument for epenthetic, rather than underlying, glottal stop is found in Chung (2020); some of the evidence includes a lack of minimal pairs with vowel-initial and glottal-stop-initial forms, a lack of word-initial glottal stop when the vowel-initial syllable is reduplicated, and the optional appearance of glottal stop following sonorant-final prefixes attached to vowel-initial stems.
- Glottal stop distribution
- /mi + attsuʔ/
- ‘rocky’ (Revised Chamorro-English dictionary database)
[h] appears contrastively in word-initial (7-a) and word-medial (7-b) positions, but never appears word-finally. When word-medial in a consonant cluster, [h] appears as both an onset and a coda, (7-c) and (7-d) respectively. Notably, within the Saipan and Guam dialects of Chamorro, speakers variably pronounce [h], often resulting in assimilation to a preceding consonant or categorical deletion (Chung 1983; 2020). The specific behavior of [h] and its effect on the transparency of the preceding vowel distribution will be discussed in greater detail in §3 and §4.1.
- [h] distribution
- ‘stand up’
The surface vowel inventory of Chamorro consists of six vowels, (8). Low vowels under stress surface as either [a] or [ɑ]. In unstressed syllables, the low vowels neutralize to [a], which is affected by the backness of adjacent segments. A surface generalization for the distribution of mid and high vowels is stated in (9).
While this surface generalization seems to hold constant for the majority of native forms in Chamorro, there exist forms which suggest that the underlying inventory consists of contrastive mid and high vowels.
- ‘star-shaped rice-cake’
Previous descriptions of the language suggest Chamorro began with a three vowel system /i,u,a/ that was expanded through the introduction of loan words during Spanish colonization (Topping & Dungca 1973; Blust 2000). Further discussion of the development of the vowel inventory may be found in later sections, but here I adopt the hypothesis that modern Chamorro has an underlying six vowel system.
While the language may have a six vowel system underlyingly, the appearance of mid vowels has been attributed to an active process of vowel lowering (Chung 1983; Crosswhite 1998). Evidence for this comes from two main sources: regular alternations of both lowering and raising in native forms, and a regular process of raising in loan word adaptation. Both sources show that there is, in the most conservative interpretation, a phonotactic restriction on the appearance of mid vowels outside of closed stressed syllables.
Regular vowel lowering and raising alternations are seen primarily in suffixation, where stress predictably shifts due to the preference for penultimate stress in the language. When stress shifts to a closed syllable that contains a high vowel, that vowel is obligatorily lowered to the corresponding mid vowel, as shown in (12-a) and (12-b). This lowering is predictable and categorical, and is not affected by speech rate or register. Lowering does not occur if the syllable that becomes stressed is open, confirming that both prosodic conditions must hold for lowering to occur.
- Raising and lowering in native words
- [gék.pu] ‘flyer’ → [gik.pók.ku] ‘my flyer’
- [mét.gut] ‘strong’ → [mit.gót.ɲa] ‘stronger’
- [néː.ni] ‘baby’ → [ni.níː.hu] ‘my baby’
- [tsóʔ.gwi] ‘to do’ → [tsuʔ.gwíː.ji] ‘to do for’
Alongside lowering, raising reliably occurs when a mid vowel in a closed syllable is destressed. Chamorro has a restriction on secondary stressed syllables immediately preceding primary stressed syllables, and removes stress completely from the secondary stressed syllable to avoid this clash of prominence. The loss of stress creates an unstressed closed syllable, and prompts raising of the mid vowel, as demonstrated by the behavior of the first vowels throughout (12). This alternation has been characterized as optional by previous researchers (Chung 1983; 2020), with the optionality being contingent mostly on speech rate and register. The optionality will not be discussed in explicit detail here, as the availability of raising alone suggests a repair in response to a phonotactic restriction.
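The lowering and raising alternations can be summarized in a short sketch. This is an illustration of the surface generalization only, not a formal analysis: the function name `neutralize`, the representation of words as lists of syllable strings, and the treatment of raising as categorical (it has been described as optional) are all assumptions made for the example.

```python
# Assumed mappings between high vowels and their mid counterparts.
HIGH_TO_MID = {"i": "e", "u": "o"}
MID_TO_HIGH = {"e": "i", "o": "u"}
# Characters counted as part of the nucleus (vowels and the length mark).
NUCLEUS = set("aeiouɑː")


def is_closed(syllable):
    """A syllable counts as closed if it ends in a non-nucleus character."""
    return syllable[-1] not in NUCLEUS


def neutralize(syllables, stressed_index):
    """Make non-low vowels mid in closed stressed syllables, high elsewhere."""
    surface = []
    for index, syllable in enumerate(syllables):
        wants_mid = index == stressed_index and is_closed(syllable)
        mapping = HIGH_TO_MID if wants_mid else MID_TO_HIGH
        surface.append("".join(mapping.get(ch, ch) for ch in syllable))
    return surface


# Stress on the first syllable, then shifted rightward under suffixation.
print(neutralize(["gik", "pu"], 0))         # ['gek', 'pu']
print(neutralize(["gik", "puk", "ku"], 1))  # ['gik', 'pok', 'ku']
```

The two calls mirror the pair [gék.pu] and [gik.pók.ku]: the same vowel surfaces as mid when its closed syllable is stressed and as high once stress moves away. Forms with mid vowels in open stressed syllables fall outside what this sketch models.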
The second source of evidence for regular alternations determining the appearance of mid vowels is loan word adaptation. Chamorro has a significant number of borrowed words, the majority from Spanish, with a number of other loans from Japanese, English, and other Austronesian languages. Most loans have been nativized enough to follow native phonological patterns. Stress is often preserved from the source language, but Chamorro phonotactic restrictions are followed fairly regularly. This bears on the distribution of mid vowels in the following way: mid vowels outside of stressed syllables are raised to their corresponding high vowels.
- Raising in loan words
- [hóː.dzu] < Spanish [ójo]
- [béː.lu] < Spanish [bélo]
- [kάt.ni] < Spanish [kárne]
- [lén.ti] < Spanish [lénte]
While the occurrence of mid vowels in stressed open syllables in (13-a) and (13-b) might concern the reader at first, it is a common assumption that stressed syllables, being positions of prominence, can require stricter faithfulness (Beckman 2004). This stricter faithfulness appears to hold for most loans, but does not apply to native Chamorro forms. The same pattern of preservation under stress is seen for high vowels in closed stressed syllables, which would be expected to lower if fully nativized but have remained high (14).
- [mís.mu] < Spanish [mísmo]
- [pún.tu] < Spanish [púnto]
These apparent exceptions may also reflect the lexical stratification of the Chamorro vocabulary (Ito & Mester 1995). Depending on the level of nativization seen in a loan, it is reasonable to posit that the loan exists within a different phonological stratum than that of native forms or other loans. In this way, the appearance of mid vowels in open stressed syllables, and high vowels in closed stressed syllables, in most loan words is not surprising. What is important is the systematic raising of mid vowels in unstressed syllables for loan words. This regular alternation provides further support to the claim that mid vowels in native forms are the result of an active process of vowel lowering, and that [e o] are restricted in short syllables.
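The vowel side of this loan pattern can be sketched as follows. The function name `adapt_loan_vowels` and the syllable-list representation are hypothetical, and the sketch deliberately covers only vowel height: consonantal adaptations (such as [r] surfacing as [t]) and vowel lengthening are not modeled.

```python
# Assumed mapping from mid vowels to their high counterparts.
MID_TO_HIGH = {"e": "i", "o": "u"}


def adapt_loan_vowels(syllables, stressed_index):
    """Keep the stressed syllable faithful; raise mid vowels elsewhere."""
    adapted = []
    for index, syllable in enumerate(syllables):
        if index == stressed_index:
            # Stressed position: vowel height is preserved from the source.
            adapted.append(syllable)
        else:
            # Unstressed position: mid vowels raise to high vowels.
            adapted.append("".join(MID_TO_HIGH.get(ch, ch) for ch in syllable))
    return adapted


print(adapt_loan_vowels(["len", "te"], 0))  # ['len', 'ti']
print(adapt_loan_vowels(["mis", "mo"], 0))  # ['mis', 'mu']
```

Unlike the native pattern, nothing here lowers a stressed high vowel in a closed syllable or raises a stressed mid vowel in an open one, which is the sense in which loans show faithfulness under stress.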
Mid vowels, the result of phonological vowel lowering, appear to be in complementary distribution with high vowels throughout the native vocabulary. However, there is crucial evidence to the contrary that this paper seeks to address. There is a tendency, previously noted by Chung (2020), for exceptional mid vowels to occur in open stressed syllables before intervocalic laryngeals. I will be setting aside loan words in the exploration of this pattern; further discussion of loans can be found in §6.1.
- Native roots with mid vowels before intervocalic laryngeals
- ‘wrapping, bandage’
This tendency itself could just be an arbitrary specification by the grammar, as one might be tempted to believe given the label “exceptional”. Exceptionality is not always arbitrary, though, and may instead be subject to regular conditioning. Patterned exceptionality (Zuraw 2000; 2010) has been identified in many languages, where some amount of regularity can be found within cases of apparent exceptionality. In Tagalog, when a nasal-final prefix attaches to an obstruent-initial stem, there are two possibilities for what happens to the nasal. The first option is place assimilation to the obstruent. This pattern is what typically happens, and is the same typical behavior that occurs when a nasal-final prefix is attached to a sonorant-initial stem. The second option, subject to lexical variation, is a process of nasal substitution. Nasal substitution causes the final nasal of the prefix and the initial obstruent of the stem to both be replaced by a nasal that is homorganic to the stem-initial obstruent. Although nasal substitution is more variable and subject to various restrictions, the distribution of when substitution occurs is by no means even. Within this pattern, Zuraw identifies reliable skews in the distribution. Substitution is more likely when the obstruent is voiceless, and the likelihood of substitution increases based on place of articulation (bilabial >> alveolar >> dorsal). Thus the regularity found within nasal substitution is reasoned to be an instance of patterned exceptionality. Returning to Chamorro, whether the occurrence of mid vowels before laryngeals is patterned rather than arbitrary can be evaluated empirically. I will bring quantitative evidence to bear on the question of whether the occurrence of mid vowels outside of the canonical environment is dependent on the occurrence of intervocalic laryngeals.
I employed a chi-squared (χ²) test to address this intuition statistically. Using the Revised Chamorro-English dictionary database, I extracted all bisyllabic native roots which contained an intervocalic laryngeal or supralaryngeal consonant with a preceding high or mid vowel. Native roots are overwhelmingly bisyllabic, with some exceptions including monosyllabic and trisyllabic forms. All loans were excluded from the set of forms. Suspected loans were compared to Spanish, Japanese, and English forms with the same or similar definition, and were confirmed as loans if the phonological shape of the word could be derived via alternations conforming to native Chamorro phonotactic restrictions. For example, a form like katni [kάtni] ‘meat’ in Chamorro was compared to the Spanish word for ‘meat’, carne [kárne]. The vowel differences align with normal alternations conforming to phonotactic patterns in Chamorro, as do the consonantal differences ([r] often becoming [t] in Chamorro). A sample of the forms selected is provided in (16).
A chi-squared test is used here to evaluate whether the observed frequency of mid vowels before intervocalic laryngeals is due to mere chance, or if vowel quality and the presence of a laryngeal are interacting factors which warrant some further investigation.
- Example forms taken from the Revised Chamorro-English dictionary database for the χ² test
- (laryngeal x mid vowel)
- (supralaryngeal x mid vowel)
- (laryngeal x high vowel)
- (supralaryngeal x high vowel)
Forms were extracted from the Chamorro dictionary database and used to construct a table with the observed number of forms with mid and high vowels that also have either an intervocalic laryngeal or supralaryngeal consonant (Table 1). A similar table with the “expected values” (the values that would be found if vowel height and presence of a laryngeal were independent of each other) was also constructed.
Table 1: Observed frequencies. Columns: Mid Vowel | High Vowel | Total
The expected frequencies in Table 2 are filled in proportionally based on the observed frequency totals (notice that the totals in both tables are the same). In total, 77/501 = 15% of the forms have an intervocalic laryngeal, so if vowel height and the presence of a laryngeal were independent, about 15% of the 113 forms with a mid vowel (roughly 17 forms) would be expected to have a following intervocalic laryngeal.
Table 2: Expected frequencies. Columns: Mid Vowel | High Vowel | Total
Inspecting the tables visually, it appears that forms with an intervocalic laryngeal and preceding mid vowel are more common than expected. However, this must be quantified statistically to be sure that these variables do indeed interact. To test the significance of the differences between the observed and expected values, a χ² value is calculated, which is the sum, over all table cells (excluding totals), of (observed - expected)²/expected. This results in χ² = 10.9 with 1 degree of freedom, and thus p < .01.
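The calculation just described can be sketched in a few lines. The function names are hypothetical, and the counts in `toy_table` are placeholders chosen so the arithmetic is easy to follow; they are not the dictionary counts. The p-value helper relies on the standard identity for a χ² distribution with one degree of freedom and is only valid for 2×2 tables.

```python
import math


def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared(observed):
    """Pearson statistic: sum of (observed - expected)^2 / expected over cells."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def p_value_df1(statistic):
    """Upper-tail p-value for a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Placeholder counts, NOT the paper's data.
# Rows: laryngeal, supralaryngeal. Columns: mid vowel, high vowel.
toy_table = [[30, 10], [20, 40]]
toy_statistic = chi_squared(toy_table)
print(expected_counts(toy_table), toy_statistic, p_value_df1(toy_statistic))
```

Applied to the reported statistic, `p_value_df1(10.9)` falls below .01, in line with the significance level stated above.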
A significant result on this test of independence indicates that the occurrence of mid vowels before intervocalic laryngeal consonants has a level of regularity: vowel quality is not independent of the presence of a following laryngeal consonant. This supports the suspicion, previously expressed by Chung (2020), that the lexically conditioned exceptionality of these forms is also patterned. The next step in this paper will be to consider possible explanations for this patterned exceptionality. Notably outside this patterned exceptionality are occurrences of mid vowels preceding supralaryngeal consonants; these will be discussed separately in §6.1.
3 Phonetic evidence for perceptual cues licensing the mid-high contrast
I turn now to an account based on the perceptibility of cues to explain the patterned exceptionality of the mid-high contrast before laryngeals. This licensing of mid-high contrast will be shown to stem from the availability of clearly perceptible vocalic formant information through the laryngeal gesture. The availability of vocalic formant information in the laryngeal environment differs noticeably from that offered by the supralaryngeal environment. It’s this difference that drives the patterned exceptionality.
Recall that a chi-squared test for independence confirms the suspicion that there is a higher occurrence of mid vowels in open syllables before laryngeals than expected. Before explaining why this contrast arises exceptionally in this specific position, the possible cues for this contrast must be identified and understood. The focus here is on a height contrast between mid and high vowels. Acoustically, vowel height is determined primarily through the value of the first formant (F1); the lower the F1 value, the higher the vowel.
Formants serve as distinctive acoustic cues for vowel quality both in the steady state of the vowel and in the dynamic transitions into and out of vowels. A vowel’s steady state is usually characterized as the acoustic portion of the vowel that maintains consistent formant structure around the target or midpoint vowel formant values. The transition is the result of articulators moving towards a different target, resulting in different formants, which creates a transition period acoustically between one sound and the next. Previous work has shown that perceivers use both the vowel steady state and more dynamic transitional information as cues for vowel identification (Strange 1989; Ohde & German 2011).
The difference between supralaryngeal and laryngeal consonants is that the former come with an articulatory target, and a resulting formant target, while the latter provide no vocal tract modulation and thus lack formant targets. Supralaryngeal consonants will always have a characteristic formant transition as formants move from the vowel’s target formants to those of the following consonant (Stevens 1989; Johnson & V. C. Sherman & S. G. Sherman 2011). However, laryngeal consonants do not have formant targets (Keating 1988; Bessell 1992; McCarthy 1994; Borroff 2007), as they do not modulate the vocal tract; though see discussion in §5.2 on ways laryngeals may modulate upper laryngeal musculature. Because of this lack of modulation of the vocal tract, vowels preceding laryngeals will not show formant transitions into the laryngeal. The lack of formant transitions results in the preservation of vocalic formant structure through the laryngeal. When the laryngeal does not share the same formant structure with either the preceding or following (supralaryngeal) sound, its formants are always transitions from the preceding vowel into whatever oral sound follows the laryngeal consonant.
This lack of target place of articulation, and corresponding lack of formant perturbation, allows the preceding vowel’s F1 to continue longer, into the laryngeal. I argue this continuation of vocalic formant structure serves as an external cue, i.e. a cue not during the vowel itself, to vowel height contrast. External cues are not exclusive to laryngeal consonants, but external cues present on supralaryngeal consonants will have more to do with noise than with vocalic formants. For example, the burst from an oral stop [t], or the noisy frication during [s], can both provide cues to frontedness (Holt & Lotto & Kluender 2000; Rysling & Jesse & Kingston 2019). While these are examples of external cues, they are not ones directly pertaining to F1. I will argue that the persistence of F1 through laryngeals serves as an external cue to vowel height, thereby providing better perception of the mid-high vowel contrast.
3.1 Data collection & segmentation criteria
As this paper makes significant use of acoustic data, I’ll briefly cover how the data was collected and analyzed. All of the Chamorro data, unless otherwise specified, is the result of work with two native speaker consultants residing in Northern California. Both speakers are adult males, one born in Guam and the other in Saipan. Both moved to California in their teens, and actively use Chamorro with immediate family members in the home. The data was collected through multiple elicitation sessions, where the consultants were asked to translate words, phrases, and sentences from English into Chamorro. Consultants were also often asked to define a Chamorro word, retrieved from the Revised Chamorro-English dictionary database, and use that word in a larger sentence. Each session was recorded in quiet conditions using a headset microphone and a Zoom H1 recorder.
The data was analyzed using PRAAT (Boersma & Weenink 2022) and was segmented by hand. Each individual sound across tokens was segmented in accordance with the following criteria for determining boundaries of the segment based on typical acoustic correlates for the identification of individual sounds (Stevens 1989; Johnson & V. C. Sherman & S. G. Sherman 2011).
- Segmentation criteria
- Supralaryngeal stops: begins at offset of all higher frequency components; ends at release burst
- Supralaryngeal fricatives: begins at onset of clear frication noise relative to the fricative (e.g. very high frequency noise for sibilants); ends at frication noise offset
- Sonorant consonants: begins at point of amplitude change, and/or frequency shifts relative to the consonant (e.g. lowering of F3 signalling the onset of [l], or a reduction in amplitude signalling the onset of a nasal); ends at opposite point of these amplitude or frequency changes
- Glottal stop ([ʔ]): begins at onset of irregular voicing, signaled by aperiodicity in the waveform, a decrease in overall energy across all frequencies, increased distance between glottal pulses seen in the spectrogram, and a reduction in intensity; ends at the onset of regular, periodic voicing if followed by a vowel, or the onset of associated acoustic correlates for a following consonant
- Glottal fricative ([h]): begins at the onset of noisy frication and apparent formant structure with reduced amplitude across frequencies, suggesting formant excitation is the result of noise and not periodic voicing; ends at the onset of regular, periodic voicing if followed by a vowel, or the onset of associated acoustic correlates for a following consonant
- Vowel: begins at onset of complex, periodic voicing with higher frequency components; ends at the offset of periodic voicing with higher frequency components. Onset and offset will also be determined relative to the onset and offset of correlates for other sounds
- Hiatus: hiatus (a vowel-vowel sequence) is identified when no formant dropout is apparent between the adjacent vowel sounds. Segmentation is done with respect to the end of the plateau in the second formant (F2), with the boundary aligned to a zero crossing of the waveform
It should be noted that this set of segmentation criteria does not necessarily control whether vocalic offset transitions, those transitions from the offset of a vowel into a following consonant, are included as part of the vowel or as part of the following consonant. This is because criteria such as the offset of energy in higher formants and a drop in intensity are not necessarily consistent across all tokens. Vowel durations reported in this paper should be understood as including the steady state portion of the vowel and any vocalic offset transitions that result from applying the segmentation criteria in (17).
All acoustic measurements were determined using PRAAT, through the use of various PRAAT scripts.
3.2 Phonetic evidence
This paper has suggested that acoustic cues to vowel height should be more perceptible when the vowel precedes an intervocalic laryngeal: because laryngeals lack an articulatory target in the oral tract, vocalic F1 persists into the laryngeal, where it serves as an external cue to vowel quality. This acoustic evidence for vowel quality, specifically F1 formant structure through the laryngeal interval, aids the perception of a vowel height contrast. Evidence for this claim in Chamorro can be seen in the spectrograms provided for the forms te’uk [téːʔuk] and bohan [bóːhan], Figures 1 and 2 respectively.
The spectrogram for te’uk in Figure 1 shows how laryngeals provide an environment with little to no perturbation of vocalic formant information. As has been noted, the intervocalic glottal stop has no full glottal closure, instead being realized as creakiness. Regardless of the extent of glottal closure, no vocal tract modulation is found that would lead to some modulation of vocalic formant structure. A smooth transition of both F1 and F2 from [e] to [u] through the laryngeal gesture is observed. What this means is Chamorro laryngeal consonants essentially inherit the formants from the adjacent supralaryngeals, in this case the vowels. The interpolation of these formants has been highlighted in the spectrograms for the benefit of the reader, and this sort of interpolation of formant structure from adjacent vowels has been demonstrated previously for other languages (Keating 1988). The persistence of vocalic formants through the laryngeal, most importantly F1, provides ideal conditions for the perception of both steady state and transition cues for vowel height. Results are similar for bohan with an intervocalic [h], as observed in Figure 2. There is a smooth transition from [o] to [a], with both F1 and F2 being clearly interpretable from the acoustic information.3 The laryngeal gesture of [h], characterized by more noise in the spectrogram and an overall decrease in intensity seen in the waveform, allows more information regarding the dynamic transitions for the preceding vowel to appear.
To further demonstrate the point that laryngeals provide no formant targets of their own, Figure 3 shows transitions between two identical vowels.4 As Figure 3 shows, there is no discernible change in the formants through the laryngeal [h], which is consistent with the evidence shown in Figures 1 and 2. The acoustic features of laryngeals, together with the fact that the vowel is stressed, create a unique environment with respect to other phonetic contexts in the word, allowing distinct perceptual cues external to the vowel to surface for the perception of a mid-high contrast.
To understand fully why vowels adjacent to laryngeals are more perceivable, intervocalic laryngeals must be compared to intervocalic supralaryngeal consonants. Ultimately, supralaryngeal consonants do not permit the same persistence of vocalic formants through the consonantal gesture. This results in less acoustic information about the transition between vowels. This is due to supralaryngeal consonants having their own target in the oral tract for articulation, changing the filter, and thereby the resonance, that determines formant values. Supralaryngeal consonants will effectively suppress characteristic vocalic formant structure and produce their own acoustic profile. While transitions from a vowel into a supralaryngeal consonant serve as useful internal (i.e. during the vowel) cues in vowel perception (Strange 1987), there is a clear asymmetry in the information provided by the laryngeal environment compared to the typical supralaryngeal context.
The spectrograms for vowels adjacent to supralaryngeal consonants in Figure 4 show that the environment suppresses any potential continuation of vocalic formant information. Although one could posit that this type of suppression would only appear for stop consonants, the representative example in disu’ [díː.suʔ] shows that even without full oral closure, there is no interpretable vocalic formant structure outside of the initial transition from the vowel into the gesture for [s]. Supralaryngeal consonants may still provide coarticulatory cues which are external to the vowel, such as differences in spectral center of gravity in the determination of [s] frontedness. However, these external cues often don’t pertain to F1, instead pertaining to spectral noise. With F1 being the most informative cue to vowel height, a continuation of F1 through a consonant should serve as an informative external cue for vowel height. Returning to Figure 4, clear formant structure for the F1 and F2 of [i] is observed, but this formant structure is nullified after the transition into [s]. One might observe some visible F2 throughout the duration of [s]; this can be attributed to the closure made by [s] being quite forward in the mouth, and so a maintenance of high F2 is consistent with the place of articulation for [s], as well as the preceding [i]. The main point here is the complete cessation of F1, showing no clear F1 external cue for the interpretation of the vowel through the supralaryngeal gesture. This contrasts with the formant behavior through laryngeal gestures, which maintain very clear acoustic cues, in the form of external cues to F1, to the preceding vowel quality. The cessation of formants is also seen in the production of intervocalic supralaryngeal stop consonants, like in tugi’ [túː.giʔ] (Figure 5).
The spectrogram shows some F2 movement which is typical of velar consonants (characterized by a raising of F2 and lowering of F3). This “velar pinch” is a cue to the velar place of the following consonant, and is an example of a cue that is external to the vowel, but internal to the adjacent velar consonant. In comparison, because there is cessation of F1 during the supralaryngeal articulation, there is no presence of F1 through the consonant to serve as an external cue to vowel height.
The perturbation of F1 is clear for obstruent supralaryngeals, but is perhaps less noticeable for sonorants. However, sonorant supralaryngeals still have articulatory targets which shape the vocal tract, resulting in formant structure that differs from that of the preceding vowel. This is overwhelmingly apparent for the behavior of F3 leading into and out of the [l] in ulu’ [úː.luʔ], shown in Figure 6.
While a shift in F1 is difficult to see visually in Figure 6, some shift in F1 is still likely depending on the preceding vowel. Furthermore, the F1 during the lateral should still be attributed to the lateral itself rather than a continuation of vocalic F1. This is supported by the fact that F1 can assist in differentiating laterals from nasal consonants (O’Connor et al. 1957), and is overall supported by the fact that [l], being a sonorant consonant, still predictably shapes the vocal tract resulting in formant structure used by perceivers in consonant identification (Miyawaki et al. 1975).
It appears that no matter the supralaryngeal consonant, there is no persistence of vocalic formant structure through the supralaryngeal consonant that could be used as an external cue to the preceding vowel.5
4 Formal account
Acoustic evidence suggests that vowels before intervocalic laryngeal segments are more easily distinguished than vowels before supralaryngeal consonants. The implication of this increased perceptibility is that the presence of an intervocalic laryngeal licenses a mid-high vowel contrast in a position that normally triggers neutralization of mid vowels to high. This contrast preservation can be accounted for using constraints on the licensing of contrast in these various positions. This account is able to efficiently capture the fact that any underlying mid and high vowels will be preserved before intervocalic laryngeals, while being neutralized elsewhere due to the lack of cues to contrast. I choose to use a dispersion-theoretic approach similar to that of Flemming (1995) and Padgett (2003) for this account. Note that nothing in principle hinges on the formal machinery for capturing this licensing: an account using implementational constraints like those of licensing-by-cue would also be able to capture the licensing of contrast before laryngeals (Steriade 1997; 2008).
The main principles of dispersion theory can be summarized as maximizing the number of contrasts that are maximally distinct and also articulatorily easy. I will forego discussion of articulatory ease, as it holds little significance in the consideration of this contrast. In capturing the tendency to maximize the number of vowel contrasts for an inventory or particular environment, I appeal to a constraint which prevents the neutralization of contrasts. This constraint, labeled NoMerge, is defined in (18).
- NoMerge: Assign one violation for every neutralized vowel contrast.
The constraint NoMerge serves the function of limiting neutralization to maximize the number of contrasts. Alongside this constraint is another to capture the relative markedness of mid vowels in Chamorro. Mid vowels, both in Chamorro and typologically, are more marked than vowels on the periphery of the vowel space (J. Beckman 2004; Gouskova 2012). Peripheral vowels, [i u a], are characterized by more extreme articulations relative to other vowels, making them good targets for maximizing distinctiveness of contrast (Crosswhite 1998). Specific to native words in Chamorro, mid vowels always neutralize to the corresponding high vowel outside of the environments which they are licensed; a Peripheral constraint (cf. J. Beckman (2004) for the similar NoMid constraint) therefore captures the typological trend of preferring distinct vowels, and the language-specific pattern seen in Chamorro.
- Peripheral: Assign one violation for every non-peripheral vowel [e o].
To capture the need for maximally distinct contrast, Flemming (1995) posits a ranked set of constraints which reference the perceptual distance between vocalic cues; specifically between formant levels. For this constraint formalization, minimum distance will be measured along a single dimension, F1, corresponding to vowel height. To represent more abstractly the perceptual distance that is controlled, levels for F1 corresponding to vowel height are posited in (20).
The particular values in (20) are arbitrary. These could, in theory, be computed from specific acoustic values for F1 in Hertz that are averaged across speakers; this would require many productions from several speakers. However, conceptually, these levels could be translated into more abstract levels of representation when encoding of acoustic information occurs, so the specification of exact values is unnecessary.
With specified values of F1 corresponding to vowel height, a constraint capturing the necessity for a particular distance between levels can be posited. The goal of the constraint is to ensure that the perceptual distance between formant levels is distinct enough to be correctly interpreted by speakers, resulting in a viable contrast. The minimum distance constraint relevant to this analysis is specified as follows:
- MinDist:F1:2: Assign one violation if the distance between F1 levels is less than or equal to 2.6
All constraints evaluate inventories of contrasting sounds in a specific context, rather than evaluating complete words (Flemming 1995). Using this set of constraints a potential vowel contrast in a supralaryngeal environment is correctly neutralized, as shown in Table 3. The ranking needed for Chamorro prioritizes having a sufficient distance between segments in the inventory over maximizing contrast. This results in a crucial ranking of MinDist:F1:2 above NoMerge; Peripheral holds no crucial ranking, but correctly neutralizes the potential contrast to the more peripheral high vowel, in this case [i].7
The mid/high distribution within a supralaryngeal environment is handled very simply as shown in Table 3. However, to correctly handle laryngeal environments and their unique licensing, an additional component of the formal account must be introduced. Evidence in §3 has shown that intervocalic laryngeals facilitate greater perceptibility of vocalic cues compared to intervocalic supralaryngeal consonants. As there is a clear asymmetry in the vowel contrasts licensed in these two environments, a factor must be encoded that can augment the perceptual distance between F1 levels to ensure that the contrast is sufficiently more distinct for laryngeal contexts. This will also capture the fact that intervocalic supralaryngeal consonants do not positively affect the distinctiveness of the height contrast, and therefore provide no augmentation of the perceptual distance between F1 levels. Here I will posit that a scaling factor is mapped onto output candidates. This scaling factor is dependent on the consonantal context which serves as the environment for the vowel contrast. Mechanistically, these scaling factors will apply to the perceptual distance between the F1 levels for the vowel contrast in the generated output candidate. This allows the perceptual distance to change from the underlying distance in F1 levels, and will be contingent on the phonetic environment in which the vowel is being realized. By positing that the scaling factor is present on output candidates, this becomes a claim not about the evaluative constraint, but about the specific phonetic context.
Inherently, supralaryngeal consonant environments will carry a non-scaling factor: a multiplier of 1. This scaling factor essentially represents that supralaryngeal consonant environments will transparently realize the distance between F1 levels, and will neither augment nor hinder the perceptual distance. The scaling factor was already at work in Table 3, where the perceptual distance of 2 was not sufficient to satisfy the minimum distance constraint and as a result, neutralization was correctly obtained. The supralaryngeal context crucially contrasts with the intervocalic laryngeal context, which as shown previously, provides greater perceptual cues for interpreting vocalic contrast. Because of this, the intervocalic laryngeal environment will carry a scaling factor of 1.5. The value of 1.5 is an estimate, based on the typical amount of F1 information that laryngeals provide over and above that provided by the vowel itself, demonstrated by Figures 1, 2, 3, 4 in §3. The value need only be greater than 1 in order to ensure that intervocalic laryngeal environments license superior perceptual cues compared to intervocalic supralaryngeal consonants. This also enables the correct formulation of MinDist presented in (21).
The tableau in Table 4 shows an underlying vocalic contrast before an intervocalic laryngeal consonant. Because the context involves a following intervocalic laryngeal, the distance has been scaled by 1.5. This results in a distance crucially above 2, which satisfies the MinDist constraint. Notice here that NoMerge must be crucially ranked above Periph in order to correctly allow for non-peripheral vowels to be contrastive. Because of the crucial rankings, MinDist over both NoMerge and Periph, candidate (a), which preserves the [i] – [e] contrast, wins over candidates (b) and (c). Candidates (b) and (c) both incorrectly neutralize a licensed contrast permitted by the context, making them inferior candidates overall.
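The evaluations described for Tables 3 and 4 can be sketched computationally. The following is a minimal Python sketch under stated assumptions, not a definitive implementation of the analysis: the numeric F1 levels assigned to [i] and [e] are hypothetical placeholders (only their distance of 2 and the scaling factors of 1 and 1.5 come from the text), and all function and variable names are illustrative. Candidates are inventories of contrasting vowels, and the winner is chosen by comparing violation profiles in ranked order, MinDist:F1:2 over NoMerge over Peripheral.

```python
# Hypothetical F1 levels: only the i-e distance of 2 is taken from the text.
F1_LEVEL = {"i": 1, "e": 3}

# Scaling factors by consonantal context, as posited in the account.
SCALE = {"supralaryngeal": 1.0, "laryngeal": 1.5}

PERIPHERAL_VOWELS = {"i", "u", "a"}


def violations(inventory, context, n_underlying=2):
    """Return (MinDist:F1:2, NoMerge, Peripheral) violations for an inventory.

    The tuple order mirrors the ranking MinDist >> NoMerge >> Peripheral,
    so tuples can be compared lexicographically.
    """
    mindist = 0
    if len(inventory) == 2:
        first, second = inventory
        distance = abs(F1_LEVEL[first] - F1_LEVEL[second]) * SCALE[context]
        # MinDist:F1:2 is violated when the scaled distance is <= 2.
        mindist = 1 if distance <= 2 else 0
    # One NoMerge violation per contrast lost relative to the input.
    nomerge = n_underlying - len(inventory)
    # One Peripheral violation per non-peripheral vowel in the inventory.
    peripheral = sum(v not in PERIPHERAL_VOWELS for v in inventory)
    return (mindist, nomerge, peripheral)


def winner(context):
    """Pick the optimal inventory for an underlying /i e/ contrast."""
    candidates = [("i", "e"), ("i",), ("e",)]
    return min(candidates, key=lambda c: violations(c, context))


if __name__ == "__main__":
    for context in ("supralaryngeal", "laryngeal"):
        print(context, winner(context))
```

Under these assumptions, the supralaryngeal context selects the merged inventory containing only [i], while the laryngeal context selects the [i] – [e] contrast, because the scaled distance of 3 exceeds the threshold of 2.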
It should be noted that while intervocalic laryngeals seem to license the appearance of both mid and high vowels, the same cannot be said of word-final laryngeals. In Chamorro, only glottal stop may occur word-finally; /h/ is banned from word-final positions (Topping & Dungca 1973; Chung 2020). A simple solution to this lack of mid-high vowel contrast before word-final glottal stops is to appeal to regular contrast reduction. Recall that only peripheral vowels /i, u, a/ are allowed in unstressed syllables in Chamorro. Stress in native roots is also typically penultimate, ensuring that word-final syllables with a glottal stop coda are in fact unstressed. This means the lack of mid vowels before word-final glottal stop is entirely predictable, and is simply accounted for by the regular neutralization of vowels in unstressed syllables, as demonstrated in §2.1.8 This also entails that all unstressed environments can simply be accounted for by the typical neutralization pattern in the language. However, the account presented here can also adequately handle the fact that an unstressed, short /i e/ contrast is not permitted. This case is shown in Table 5, where a potential contrast of short, unstressed /i e/ is neutralized to /i/ due to an insufficient perceptual distance to satisfy the MinDist constraint.
The formal account presented here captures the asymmetry between intervocalic laryngeals and intervocalic supralaryngeal consonants in the licensing of vocalic contrast. A vowel height contrast is preserved only when the minimum perceptual distance between the relevant cue (F1) is obtained, and this is really only obtained when the context can sufficiently allow both the steady state and dynamic transition portions of vocalic formants to be realized acoustically. The laryngeal environment provides a more robust phonetic context to realize these cues when compared to the supralaryngeal environment.
Native forms exhibiting patterned exceptionality have been accounted for through direct reference to acoustic cue information, explaining the licensing of a mid-high vowel contrast. Before laryngeal consonants however, these forms also exhibit free variation between acoustically overt laryngeals and hiatus.
Observing the spectrogram and waveform for two instances of the form bohan [bóːhan], a clear difference is seen in the presence of acoustic correlates for [h] between the items. In Figure 7, higher formants appear weaker due to their excitation by noise rather than voicing. This coincides with a reduction in intensity of the waveform. These observations present clear acoustic evidence for the presence of [h]. However, the spectrogram in Figure 8 does not exhibit either of these qualities, strongly suggesting that there is no breathiness being produced by the speaker. The same type of hiatus is observed for the form seha ‘back up’ in Figure 9, where there is no acoustic correlate for breathiness, but a clear formant transition from [e] to [a] excited only by voicing.
The free variation exemplified by the contrast in Figures 7 and 8 presents several interesting questions for Chamorro phonology, and this account specifically. The first is, how should the behavior of the laryngeal and the free variation of these forms be accounted for? This raises the question of whether the absence of laryngeals in these forms is due to some variable phonetic realization, or whether this phenomenon is more categorical. The former wouldn’t raise any red flags for the account presented in this paper, and would actually align well with the typological trends that laryngeals are generally more phonetically transparent than other segments (Borroff 2007). The latter possibility is more of a strict puzzle for the phonology, as hiatus is prohibited in Chamorro, as evidenced by hiatus resolution such as glottal stop insertion and allomorphy of suffixes (Chung 2020). This would also further obscure the exceptional vowel contrast already present in this position. Deletion of the intervocalic laryngeal would create further opacity, as the typical licensing environment would be removed from the surface form. Here I will walk through how hiatus is treated in Chamorro in general, and present some evidence and arguments for why hiatus may be produced variably in this position. Ultimately it will be shown that forms exhibiting hiatus provide the same phonetic licensing as forms with laryngeals, and are thereby captured by the same account presented in §4.
Typologically, hiatus is often not permitted due to languages following the Onset Principle, which has languages seeking an onset for every syllable. Examining Chamorro’s phonotactic restrictions, it seems that the language actively and fairly rigidly follows this tendency. Word-medially there is no example of hiatus in native Chamorro forms (Topping & Dungca 1973; Chung 2020). Hiatus that would be produced at morpheme boundaries is repaired through various means, the most prominent being the insertion of glottal stop between prefixes and roots, and the use of consonant-initial suffix allomorphs. Glottal stop insertion is most readily seen between prefixal reduplicative morphemes and roots in Chamorro, as in (22-a). An allomorph of the transitivizing suffix -i is -yi [-dzi], as in (22-b), which only occurs when vowel-final stems are transitivized (Topping & Dungca 1973; Chung 2020).
- /RED + akunseha/ → [á.ʔa.kun.se.ha]
- /sɑga + i/ → [sa.gɑ́.dzi]
With these facts in place, it is hard to see why hiatus would be the result of a phonological process of deleting the underlying laryngeal segment. However, there is some suggestive data to show that at least [h] is being lost as a completely realized segment in the output for speakers. Let us recall from earlier in the discussion, Chung (1983; 2020) notes that /h/ is assimilated with an adjacent consonant, or categorically deleted in most contexts for speakers of the Saipan and Guam dialects. This results in a surface-level opaque distribution of mid vowels, as a form like /tuhgi/, ‘stand up’, is realized as [tóː.gi] without any phonetic evidence of [h]. However, phonotactic conditions on the distribution of mid vowels for these speakers still appear to hold.
- /tuhgi/ → [tó(h).gi] / [tu(h).gém.mu]
This is evidenced by the fact that mid vowels in these positions are still raised to high vowels when stress shift occurs and gemination still occurs. The occurrence of gemination also rules out the possibility that the UR has changed from *tuhgi > /togi/, or that this is merely an exceptional form.
If there were complete phonological deletion of the segment, the mid vowel would not surface, due to the lack of a conditioning coda consonant. However, for forms like tohgi ‘stand up’, even though speakers do not typically produce a laryngeal,9 there is still apparent vowel lowering resulting in a mid vowel. An example of tohgi produced without a laryngeal is shown in Figure 10. This lowering in the absence of an [h] leads to the hypothesis that deletion of the laryngeal here is more phonetic, or at the very least has not been phonologized such that it affects other phonological processes. Deletion of laryngeals has also occurred for native forms which have word-initial /h/, such as hulu /hulu/ surfacing as [úː.lu], suggesting a wider loss of [h] than just within a single word-medial environment. It should be noted that this deletion does not apply to glottal stop in the same contexts; coda glottal stop is typically realized, with only occasional deletion when not part of a stressed syllable (Chung 1983; 2020).
The question of import here is whether or not these forms necessarily pose a problem for the licensing account provided in this paper. The evidence found within these forms points to this free variation being trivial given the phonetic account presented in §4. This is because the acoustic cues shown to be present in forms with intervocalic laryngeals – a longer vowel steady state and longer vowel transitions – are also present in forms showing hiatus. Returning to the spectrograms for bohan and seha provided in Figures 8 and 9, vowel formant information is clearly visible, with a duration of vowel steady state similar to that of vowels before intervocalic laryngeals that are acoustically present. With a conservative approach to segmentation – segmenting before it is visually clear that there is formant transition – there is a long period of formant transition from the mid vowel into the following vowel; again, something not provided by forms with intervocalic supralaryngeal consonants. Therefore, I conclude that hiatus, while an interesting problem for the behavior of laryngeals and phonotactic constraints in Chamorro more generally, completely parallels the behavior of intervocalic laryngeals for the licensing of an underlying height contrast.
One alternative approach to handling the vowel height opacity that arises in the free variation between forms such as [seha] and [sea] (from underlying /seha/) is put forth by Kawahara (2002). Variant forms which differ from the output form most faithful to the input (i.e. the base) are regulated via “OV-Correspondence”. OV-Correspondence requires that the variant form be faithful to the base, and this is regulated by faithfulness constraints specified for OV-Correspondence. In one case, Kawahara (2002) uses OV-Correspondence to account for the appearance of variant forms in Japanese which seem to disobey the typical requirement that nasal-obstruent clusters must be voiced throughout (i.e. *NT). The typical voicing pattern is exemplified in (24), where the initial stop for the verbal suffix is voiceless [t] adjacent to a vowel, but is voiced [d] next to a nasal. However, variants to this regular voicing occur, where opacity due to syncope causes the appearance of voiceless stops adjacent to nasals, as in (25). This is unexpected given the ban on NT clusters, and these variants therefore require some account.
- ‘with what’
Kawahara accounts for the appearance of variant forms through a ranking of OV-faithfulness above the typical markedness constraint, *NT, which is normally ranked over the IO-faithfulness constraint. The ranking of Ident-OV-[Voi] above both *NT and Ident-IO-[Voi] prioritizes the variant form, the one with an atypical voiceless stop following a nasal, to surface. With a base-variant correspondent relationship, the opaque variant form ([anta]) is preferred to a form which would follow the normal voicing pattern ([anda]), as seen in (26).
Using the approach of OV-correspondence, the opaque hiatus variants of forms like seha and bohan in Chamorro may be the result of a ranking such as OV-ID[high] >> PERIPH >> IO-ID[high]. This ranking will prioritize preserving the height of a vowel from the base to a variant over the requirement of banning mid vowels. This parallels the example pattern in (26), where the same ranking of OV faithfulness above both markedness and IO faithfulness results in the allowance of a typically banned phonotactic pattern.
This results in a base-variant faithfulness preserving underlying mid vowels in both hiatus, and laryngeal environments. The machinery of OV-correspondence is effective for capturing why mid vowels are preserved in the free variants of the exceptional forms in a more purely phonological sense.
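The effect of the ranking OV-ID[high] >> PERIPH >> IO-ID[high] can be illustrated with a small evaluator. This is a sketch under simplifying assumptions rather than a full OT implementation: each form is reduced to the height of its first vowel, the candidate set contains only a mid and a high variant (e.g. [sea] versus a hypothetical raised [sia] for the base [seha]), and the violation definitions and all names are my own illustrative choices.

```python
HIGH_VOWELS = {"i", "u"}
MID_VOWELS = {"e", "o"}


def profile(candidate, base, underlying):
    """Violations incurred by a variant's vowel, keyed by constraint name."""
    return {
        # Variant must match the base (most faithful output) in height.
        "OV-ID[high]": int((candidate in HIGH_VOWELS) != (base in HIGH_VOWELS)),
        # Markedness: mid vowels are non-peripheral.
        "PERIPH": int(candidate in MID_VOWELS),
        # Variant must match the underlying form in height.
        "IO-ID[high]": int((candidate in HIGH_VOWELS) != (underlying in HIGH_VOWELS)),
    }


def optimal(candidates, base, underlying, ranking):
    """Return the candidate whose violations are best under strict ranking."""
    def ranked_violations(candidate):
        p = profile(candidate, base, underlying)
        return tuple(p[name] for name in ranking)
    return min(candidates, key=ranked_violations)


RANKING = ("OV-ID[high]", "PERIPH", "IO-ID[high]")

if __name__ == "__main__":
    # Base [seha] and underlying /seha/ both have a mid vowel.
    print(optimal(["e", "i"], base="e", underlying="e", ranking=RANKING))
```

With OV-ID[high] ranked highest, the mid-vowel variant wins; if PERIPH were instead ranked above OV-ID[high], the same evaluator would select the raised variant, which is the regular neutralization pattern.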
5 Other possible approaches
One might reasonably suspect alternative analyses that explain the licensing of vocalic contrast before laryngeals as the result of how Chamorro treats laryngeals phonotactically. Here I will consider two alternative explanations for the appearance of mid and high vowels before laryngeals: (i) laryngeals are syllabified as coda consonants which trigger the regular lowering of high vowels to mid in closed stressed syllables; (ii) laryngeals condition vowel lowering separately from the phonological lowering in closed stressed syllables. Both of these alternative explanations will be dismissed: treating laryngeals as codas intervocalically is inconsistent with facts of syllable structure in Chamorro, and hypothesizing that laryngeals condition lowering in the language is inconsistent with both the phonetic behavior of laryngeals and the fact that not all forms exhibit a mid vowel before a laryngeal.
5.1 Argument against coda hypothesis
A possible account for the appearance of mid vowels before laryngeals is to assume that these intervocalic laryngeals may be syllabified exceptionally as codas. Recall that there is a regular, predictable pattern of vowel lowering in Chamorro, which I assume is what normally derives mid vowels in closed, stressed syllables in native forms. If intervocalic laryngeals were exceptionally syllabified as codas, then the regular vowel lowering that occurs in closed stressed syllables would apply, thereby deriving the patterned exceptionality observed thus far.
- Hypothetical syllabification as codas
While it is widely assumed that languages seek to maximize onsets and merely tolerate codas, it is not unreasonable to think that some languages may exceptionally violate the principle of maximizing onsets. This has been claimed to be especially true in exactly the prosodic environment explored in this paper, i.e. [V́CV], and claimed explicitly for English (Kahn 1976), among other languages. The principle of maximizing onsets has been recognized under different names and within many phonological theories, with the Maximum Onset Principle and the OT constraints Onset (alongside NoCoda) being two of the most well known monikers. There are of course exceptions to the maximization of onsets, especially given a certain prosodic environment, quality of segment, or other phonological factor. This “coda hypothesis” builds off of laryngeals being the target for exceptional syllabification, a not uncommon phenomenon typologically.
In her 2007 dissertation, Borroff references several patterns in which intervocalic laryngeals seem to pattern as codas, or as ambisyllabic. These include Balantak (Central Sulawesi), Chickasaw, and Choctaw (Western Muskogean). Another case is Palu’e, a Malayo-Polynesian language spoken in Indonesia on the island of Palu. Palu’e has been described to have closed syllable vowel laxing, meaning that tense vowels occur in open syllables.
- ‘their porridge’
- ‘leg’ (Donohue, p.c. handout (2006) cited by Borroff (2007))
Notably, the lax low vowel occurs before the intervocalic laryngeal in (29-c), suggesting that the laryngeal has been syllabified as a coda resulting in a lax vowel rather than the expected tense vowel. Borroff (2007) notes that this phenomenon also occurs in Tukang Besi, another Malayo-Polynesian language of Indonesia.
A language-specific phonotactic explanation for apparent exceptional syllabification of laryngeal segments doesn’t seem unreasonable. Many languages of the world (including those noted above) have general restrictions on where laryngeals (especially glottal stop) may be found in the word. Restrictions on the distribution of laryngeals are also found in Chamorro; underlying glottal stop only occurs intervocalically and word-finally (Topping & Dungca 1973; Chung 2020). While it should be noted that this pattern does not hold for [h] in Chamorro, it is still support for the possible lexicalization of exceptional syllabification for forms with an intervocalic laryngeal.
This analysis seems tempting given the typological tendencies for laryngeals to pattern separately from supralaryngeal consonants intervocalically, as well as the language-internal facts regarding the distribution of glottal stop. However, this analysis suggests that if intervocalic laryngeals pattern as codas for vowel lowering, they should also pattern as codas for other regular alternations reliant on syllable structure. This is exactly where this coda hypothesis ultimately breaks down. The next sections will show that for two very regular alternations in Chamorro, intervocalic laryngeals do not in fact pattern as codas. I argue that this lack of patterning with other coda-contingent alternations makes an analysis based on exceptional coda assignment untenable.
In Chamorro, gemination can occur with a small set of consonant-initial suffixes: 1st, 2nd, and 3rd person possessive suffixes, as well as the intensifier suffix which attaches to adjectives (Topping & Dungca 1973; Chung 1983; 2020). The gemination targets the initial consonant of the suffix, so the third person suffix -ña [ɲa] geminates as ñña [ɲɲa]. What makes this gemination process particularly interesting is the set of conditions on its application, given in (30).
- Gemination conditions
- Must be a first, second, or third person possessive suffix (-hu,10 -mu, -ña, -ta), or the intensifier suffix (-ña)
- The stem must have a closed primary stressed syllable in non-final position, in its bare/isolation/citation form
- The stem must end in an open syllable (Chung 1983)
Gemination only applies to these suffixes when the pre-suffixed stem ends in an open syllable while also having a closed stressed syllable somewhere within the stem, as shown in (31). There is no constraint on where the closed stressed syllable may occur, as long as it corresponds to primary stress in some smaller (possibly decomposed) form of the word. As shown in (31-a), affixing the first person possessive suffix /-hu/ to the form [gék.pu] yields gemination of the suffix, resulting in [gik.pók.ku]. Gemination occurs because [gekpu] satisfies the two conditions on the stem needed for gemination: (i) the stem must have a closed primary stressed syllable in non-final position, and (ii) the stem must end in an open syllable.
- [gék.pu] → [gik.pók.ku] ‘my flyer’
- [tém.mu] → [tim.móɲ.ɲa] ‘his/her knee’
This gemination process serves as a great diagnostic for closed syllables, and is thus useful for testing the validity of the coda hypothesis.
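The conditions in (30) can be restated as a small predicate. The following is only a minimal sketch: the function name `predicts_gemination`, the vowel set, and the representation of a stem as a list of plain-text syllables plus a stress index are illustrative assumptions rather than part of the analysis. The sketch also ignores the provision that the closed stressed syllable may belong to a smaller, decomposed form of the word.

```python
# Illustrative sketch only; all names and representations are hypothetical.
VOWELS = set("aeiouæɑ")  # assumed vowel symbols, without stress or length marks
GEMINATING_SUFFIXES = {"hu", "mu", "ña", "ta"}  # suffixes listed in (30)


def is_closed(syllable: str) -> bool:
    """A syllable counts as closed if it ends in a consonant."""
    return syllable[-1] not in VOWELS


def predicts_gemination(syllables: list[str], stressed_index: int, suffix: str) -> bool:
    """Return True if the conditions in (30) are met for this syllabified stem."""
    if suffix not in GEMINATING_SUFFIXES:
        return False
    # Condition (i): closed primary stressed syllable in non-final position.
    closed_stressed_nonfinal = (
        stressed_index < len(syllables) - 1 and is_closed(syllables[stressed_index])
    )
    # Condition (ii): the stem ends in an open syllable.
    ends_open = not is_closed(syllables[-1])
    return closed_stressed_nonfinal and ends_open


print(predicts_gemination(["gek", "pu"], 0, "hu"))  # stem of (31-a): True
print(predicts_gemination(["bo", "ʔu"], 0, "hu"))   # laryngeal parsed as onset: False
print(predicts_gemination(["boʔ", "u"], 0, "hu"))   # laryngeal parsed as coda: True
```

The last two calls show why gemination works as a diagnostic: the two candidate syllabifications of a stem with an intervocalic laryngeal make opposite predictions.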
The prediction made by the exceptional coda assignment hypothesis is that the intervocalic segments in the exceptional forms should function as codas for regular processes in the language. Let us consider a form such as [boʔu]. If the intervocalic laryngeal is syllabified as a coda, in line with the coda hypothesis, then a closed syllable should be present in the stem and gemination should occur. If instead the intervocalic laryngeal is syllabified as an onset, then no gemination is expected as a closed syllable would not occur in the stem. Looking at an example involving [boʔu], affixation of the first person possessive suffix /-hu/ shows that gemination does not occur (32-b). The lack of gemination is especially apparent in this case because no strengthening of [h] occurs. This lack of gemination signals the intervocalic laryngeal is not a coda creating a closed syllable in the stem, and thus provides no support for the coda hypothesis.
- No gemination
- [be.ʔíː.ɲa] ‘his/her bandage’
- [bo.ʔúː.hu] ‘my bubble’
It should be noted that the sequence [.ʔVC.] is permitted in Chamorro generally, as exemplified by a form like ma’aksum [ma.ʔák.sum] ‘sour’. However, there is no gemination in these cases, evidenced by a lack of lowering of the previously word-final high vowel. Lowering should occur in this case, as a closed stressed syllable would be produced by gemination. Thus, gemination serves as the first solid evidence against an exceptional coda assignment hypothesis.
5.1.2 Penultimate vowel lengthening
The second piece of evidence for dismissing an exceptional coda assignment hypothesis lies in the regular process of penultimate vowel lengthening in Chamorro. When a vowel occurs in an open, stressed, penultimate syllable the vowel is lengthened, seen in (33). This can possibly be attributed to a simple stress-to-weight principle,11 which also explains the tendency for closed syllables to be stressed. Notably, vowels are only lengthened in open penultimate syllables which receive primary stress.
- Penultimate vowel lengthening in open stressed syllables
Conveniently, this can be used as a reliable diagnostic for whether a penultimate syllable is open or closed in the language. This will help determine whether intervocalic laryngeals can exceptionally be codas in Chamorro. Durations in Figure 11 show a significant difference in the duration of stressed vowels in open versus closed syllables, previously observed by Chung (1983) and Crosswhite (1998).
It’s therefore expected that any vowel lengthened due to penultimate vowel lengthening should match (or exceed) the durations for the stressed vowels found in forms like [díː.suʔ] and [túː.giʔ]. Conversely, if intervocalic laryngeals were to function as codas, a lack of lengthening must be shown, meaning the durations for the mid vowels found in forms such as bohan and te’uk in (34) should be comparable to the vowels’ durations in closed stressed syllables with a following supralaryngeal.
- Penultimate vowel lengthening in open stressed syllables with following laryngeal
- ‘white spot on eye’
- ‘shoo; back-up’
The durations in Figure 12 show that mid vowels before intervocalic laryngeals have comparable duration compared to other lengthened vowels found in open, stressed, penultimate syllables.12 As penultimate lengthening is clearly applying, there is no way to confidently assert that the intervocalic laryngeals seen in (34) could be reanalyzed as codas.
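The logic of this diagnostic can be sketched in a few lines. This is a simplified illustration, not the analysis itself: the function name `lengthen_penult`, the vowel set, and the plain-text syllable representation are assumptions made for the example, and the rule is stated categorically rather than in terms of measured durations.

```python
# Illustrative sketch only; all names and representations are hypothetical.
VOWELS = set("aeiouæɑ")  # assumed vowel symbols, without stress or length marks


def lengthen_penult(syllables: list[str], stressed_index: int) -> list[str]:
    """Lengthen the vowel of an open, stressed, penultimate syllable."""
    out = list(syllables)
    penult = len(out) - 2
    is_stressed_penult = penult >= 0 and stressed_index == penult
    if is_stressed_penult and out[penult][-1] in VOWELS:
        out[penult] += "ː"
    return out


print(lengthen_penult(["di", "suʔ"], 0))  # open penult: ['diː', 'suʔ']
print(lengthen_penult(["bo", "han"], 0))  # laryngeal as onset: ['boː', 'han']
print(lengthen_penult(["boh", "an"], 0))  # laryngeal as coda: ['boh', 'an']
```

Only the onset parse of bohan predicts lengthening, which is the outcome the duration measurements support.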
Both gemination and penultimate vowel lengthening serve as strong evidence against an exceptional coda assignment hypothesis. In addition to this phonological evidence, it is useful to consider what it would mean if laryngeals patterned as codas more broadly. If laryngeals were exceptional codas, then the puzzle gets shifted to why high vowels can occur before intervocalic laryngeals.
- High vowels before intervocalic laryngeals
- /liʔuf/ → [líː.ʔuf]
- /tuhuŋ/ → [túː.huŋ]
- /siʔuk/ → [síː.ʔuk]
- /hihut/ → [híː.hut]
- /suʔun/ → [súː.ʔun]
- /guʔut/ → [gúː.ʔut]
In (35) high vowels are shown to regularly occur before intervocalic laryngeal consonants. Considering the hypothetical that intervocalic laryngeals were syllabified as codas, it becomes a mystery as to why high vowels would be found before intervocalic laryngeals, as laryngeal codas would provide a closed stressed syllable for the regular lowering process in Chamorro. The appearance of high vowels before intervocalic laryngeals, in addition to the facts regarding gemination and penultimate lengthening, proves the exceptional coda hypothesis untenable.
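The argument can be made concrete with a toy version of the regular lowering process (high vowels surface as mid in closed stressed syllables). The sketch below is an illustration under stated assumptions: the function name `lower_stressed`, the vowel set, and the plain-text syllable representation are hypothetical, and penultimate lengthening is deliberately left out.

```python
# Illustrative sketch only; all names and representations are hypothetical.
VOWELS = set("aeiouæɑ")        # assumed vowel symbols, without stress or length marks
LOWERING = {"i": "e", "u": "o"}  # high vowels and their mid counterparts


def lower_stressed(syllables: list[str], stressed_index: int) -> list[str]:
    """Lower a high vowel to mid if the stressed syllable is closed."""
    out = list(syllables)
    target = out[stressed_index]
    if target[-1] not in VOWELS:  # the stressed syllable ends in a consonant
        out[stressed_index] = "".join(LOWERING.get(ch, ch) for ch in target)
    return out


print(lower_stressed(["liʔ", "uf"], 0))  # coda parse: ['leʔ', 'uf']
print(lower_stressed(["li", "ʔuf"], 0))  # onset parse: ['li', 'ʔuf']
```

Under the coda parse the rule wrongly predicts a mid vowel for /liʔuf/, whereas the attested form [líː.ʔuf] keeps its high vowel, as the onset parse predicts.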
5.2 Do laryngeals in Chamorro condition lowering?
Given the patterned exceptionality discussed here involves laryngeal sounds, one might hypothesize that the laryngeal segments have actually conditioned the lowering of high to mid vowels in Chamorro. Notably this is a different hypothesis than the one discussed in §5.1. Many languages appear to have phonological lowering processes that seem to be conditioned by surrounding glottal consonants (Bessell 1992; Rose 1996; Sylak-Glassman 2014). An articulatory explanation for this pattern may be that some laryngeal consonants are accompanied by larynx raising (Esling et al. 2019), which shortens the back cavity and correspondingly raises F1. However, not all laryngeal consonants always have associated larynx raising: [h], or the corresponding “breathy” voice quality, typically has associated larynx lowering (Esling et al. 2019), but at the same time has been observed with larynx raising. It is also not a given that laryngeal consonants are always associated with low vowels more than with non-low vowels, as some previous literature may suggest (Brunner & Żygis 2011). For example, coda /t/ glottalization in American English is found to be less common when the vowel is high, but crucially not more common when the vowel is low (Seyfarth & Garellek 2020). One may ask then if laryngeal consonants in Chamorro have conditioned a phonologized lowering of preceding high vowels to mid. I find several issues with this hypothesis, ultimately leading to its dismissal.
First, there is a good amount of evidence and reasoning behind why glottal stop and glottalization (creakiness) may lower vowels; however, the evidence is less straightforward to support the claim that [h] or breathiness provides the same effect. To produce [h] and other breathy sounds, the glottis is held almost entirely open, with the vocal folds loose (Edmondson & Esling 2006). Because of the articulator’s position, tautness, and effect on surrounding muscle and tissue, glottal/creaky sounds are not comparable in their effects on surrounding sounds when compared to breathy sounds (Edmondson & Esling 2006). Also as previously mentioned, [h] is associated with lowering of the larynx rather than raising of the larynx, though both appear possible.
A second more direct argument is that if lowering does occur due to the presence of a laryngeal, it becomes a question of why mid vowels are not found before all laryngeals, intervocalic or otherwise. If glottal stop conditions lowering, rather than simply providing an environment to realize some underlying form faithfully, it should be the case that all (or at least most) forms with a vowel preceding a laryngeal lower an underlying high vowel. This should be especially true for the case of word-final glottal stop. Most vowels in word-final syllables are unstressed in Chamorro, which should provide the ideal environment for lowering. This is characteristic of patterns of closed syllable vowel laxing, where tense-lax contrasts are only allowed in stressed syllables (Storme 2017). Closed syllable lowering is also found in Austronesian generally (e.g. Javanese, Indonesian, Ngaju Dayak). However, there is no evidence of increased mid vowel occurrence before word-final glottal stops in Chamorro, as would be expected if laryngeal-conditioned vowel lowering were the cause. This finding patterns with Seyfarth & Garellek (2020), who show that coda /t/ glottalization in American English is less common when the vowel is high, but crucially not more common when the vowel is low. Furthermore, there are plenty of native forms which show a high vowel before intervocalic glottal stop, and even more before intervocalic [h].
Although no true minimal pairs appear to exist for these forms in Chamorro, near minimal pairs do exist: e.g. [sóːʔon] and [súːʔun]. Of the 77 native bisyllabic roots with a high vowel before an intervocalic laryngeal, 48 have a high vowel before [h], with the remainder having a high vowel before glottal stop. For reference, all known bisyllabic native roots with an intervocalic laryngeal are provided in the appendix.
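The split implied by these counts can be worked out directly. The snippet below is only arithmetic on the two figures stated above (77 and 48); the variable names are illustrative.

```python
# Counts taken from the text; variable names are illustrative.
total_high_before_laryngeal = 77  # native bisyllabic roots with a high vowel before a laryngeal
high_before_h = 48                # of those, roots with the high vowel before [h]

# The remainder have the high vowel before glottal stop.
high_before_glottal_stop = total_high_before_laryngeal - high_before_h

share_h = high_before_h / total_high_before_laryngeal
share_glottal_stop = high_before_glottal_stop / total_high_before_laryngeal

print(high_before_glottal_stop)                 # 29
print(f"{share_h:.1%} before [h]")              # 62.3% before [h]
print(f"{share_glottal_stop:.1%} before [ʔ]")   # 37.7% before [ʔ]
```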
Given the distribution of high vowels before laryngeals, it seems inconsistent at best to claim that the laryngeals in Chamorro regularly condition vowel lowering.
6 Remaining issues
6.1 Mid vowels before supralaryngeal consonants
One important issue that this paper has not yet addressed is the existence of native forms in Chamorro which have a mid vowel outside of a closed stressed syllable, but crucially not preceding a laryngeal consonant.
- ‘star-shaped rice-cake’
These forms are potentially problematic for the account, as mid vowels have only been shown to occur outside of the typical environment when they are licensed by the presence of enough distinct perceptual cues. This licensing has only been shown to occur when vowels occur before intervocalic laryngeals; therefore, probable reasons for why these forms occur must be considered.
A hypothesis that can be immediately dismissed is that these forms are licensed for the same reasons as mid vowels before laryngeals. The intervocalic supralaryngeal consonants in these forms do not permit the same persistence of formants, and the environment does not appear to provide sufficient cues to preserve a distinct contrast in this position. In this way, mid vowels before supralaryngeal consonants are not licensed by phonetic or perceptual factors. These exceptions should also not be attributed to the exceptional coda hypothesis presented in §5.1, as the intervocalic supralaryngeal consonant in these forms does not provide the right closed stressed syllable environment for gemination, and they also appear to have the same duration (seen in Figures 13 & 14) as other penultimate lengthened syllables.
A possible explanation is that these forms were historically forms with coda consonants, e.g. */gusfis/ and */puktu/. The mid vowels would be derived simply from the typical lowering of high vowels to mid in closed stressed syllables, and then an idiosyncratic loss of the coda consonant resulted in the synchronic forms in (37). This explanation is difficult to prove, as no research on the historical development of Chamorro mentions the loss of coda consonants (Blust 2000) and there is no evidence of these forms maintaining a coda consonant for other regular processes relying on codas, i.e. gemination and penultimate lengthening. So while this hypothesis is tenable, it is difficult to falsify.
One logical explanation for these forms is that they are merely loan words which are part of a non-native stratum which permits faithfulness to vowels in stressed positions, as suggested for other loans in §2.1. However, it is fairly clear that many of these forms are not loan words from Spanish, Japanese, or English, which are the three primary languages that words are borrowed from into Chamorro. In addition, there is no indication that these words stem from another Austronesian language. Some loans in Chamorro are taken from Tagalog, or can be related to a cognate from the Austronesian family; however, a search through the Austronesian comparative dictionary does not reveal any convincing results that forms like (37) are borrowed from another Austronesian language (Blust & Trussel 2010).
A hypothesis worth considering is that these words simply exist in a separate stratum from other native Chamorro forms. This stratum would be composed of forms that don’t quite conform to typical native phonotactics, and might include loan forms that also demonstrate this behavior. A stratal analysis of this kind has been suggested by Chung (1983), and has also been shown quite convincingly for Japanese (Ito & Mester 2008). To help contextualize the frequency of these forms, they make up roughly 17% of bisyllabic native roots (84/501), and are shown to have a lower observed than expected frequency (see Table 1 and Table 2). However, a difference in frequency alone does not provide an adequate explanation for why these forms in particular would be situated in a stratum separate from other native Chamorro forms. There is unfortunately no evidence to suggest that these forms are loans from another language. These forms also don’t appear to have some common phonotactic reason for the appearance of mid vowels. Stated another way, there is nothing patterned about the phonotactic environment that suggests some explanatory trend for the appearance of mid vowels. I choose not to dismiss the strata hypothesis outright, but the lack of patterning among these forms offers little synchronic or diachronic support for positing a separate stratum for them.
A very interesting, but equally murky, question is where the mid vowels in these seemingly native forms come from. This question is different from how these forms should be accounted for in the grammar, as they are clearly lexically conditioned with no sufficient phonetic or phonological explanation as already shown. The origin of these mid vowels is more about the diachronic path of vowels in Chamorro, i.e. when did mid vowels become either introduced into, or created in, Chamorro. Because the occurrence of mid vowels is an issue of distribution rather than alternation, the question of where mid vowels come from is connected to the issue of their appearance synchronically.
Ultimately, nothing definitive can be said regarding the true origin of mid vowels in native forms in Chamorro. One theory put forth by Topping & Dungca (1973), and later adopted by Blust (2000), is that Chamorro started with an underlying three-vowel system /i, u, a/.13 Mid vowels would be derived through a process of lowering, causing complementary distribution for native forms. Through the introduction of loan words, primarily through Spanish colonization beginning in the late 1600s, mid vowels eventually came to be treated as phonemic (Blust 2000). It seems the change could have been a merger of strata, similar to other cases of nativization, where eventually some native forms were lexicalized with an underlying mid vowel that is now visible in the synchronic phonology.
Alternatively, one could imagine that some proto-Chamorro language had phonemic mid vowels that eventually went through another merger, or instead became neutralized on the surface and the mid vowels were still dormant in the phonology. With the introduction of loans, or some proper licensing environment such as stressed open syllables before intervocalic laryngeals, the mid vowels resurfaced and became lexicalized once more. This potential diachronic path is, at this point, indistinguishable from the one mentioned by Topping & Dungca (1973) and Blust (2000), but is still an important point to consider as the mid vowels must, for the licensing account of the facts presented here, be underlying in some previous iteration of the language.
This account makes some specific predictions regarding the shape of vowel inventories cross-linguistically. First, this account predicts that long vowels should license more height contrasts. This is because long vowels will ultimately provide more phonetic cues for the perceptibility of contrast, thus resulting in more distinct vowel quality contrasts within a set of long vowels. Short vowels, however, should not be able to provide the same distinct phonetic cues for contrast perception, as the duration of vowel steady state and transitional formant information will generally be shorter, which will result in a smaller distribution of vowel qualities in the set of short(er) vowels. An example of a system just like this is provided by Maddieson (1984): Nenets (formerly Yuraks), which has 5 short vowels /i, e, a, o, u/ and 3 “overshort” vowels /ĭ, ă, ŭ/, is exactly the system that is predicted if longer vowels, which provide better cues for contrast perception, license more distinct contrasts in vowel inventories. A similar system to Nenets is Telefol,14 containing 5 long vowels, /iː, ɛː, aː, ɔː, uː/ and 3 short vowels, /i, a, u/ (Maddieson 1984), again emphasizing the presence of inventories which are readily predicted by the licensing account formalized here.
A system opposite to that of Nenets should also be considered; namely, a system which has more short vowels than long vowels. Systems of this type can be seen in languages like Atayal, which has 5 short vowels /i, ɛ, a, ɔ, u/ and 2 long vowels /iː, uː/, as well as Chipewyan (3 long /iː, aː, uː/, 5 short /i, e, a, o, u/) (Maddieson 1984). In terms of the account provided in §4, this type of system with more short vowels than long is not very well predicted. This is because longer vowel steady states and transitional information should serve as better cues for contrast, meaning that long vowels should be more prevalent as underlying contrastive sounds. The licensing account presented here is in some ways committed to the typological prediction that systems with more long vowels than short should serve as a default. Atayal and Chipewyan would be apparent exceptions to this default. Exceptions to the default of more contrastive long vowels than short vowels could arise from several factors, both language specific and more general. Long vowels have been claimed to be more marked than short vowels cross-linguistically (Greenberg 1966), a markedness pattern that may work against the predictions laid out in this paper in the derivation of inventories. More specifically, Maddieson has claimed that vowels of a certain quality are more likely to appear as long in languages. From Maddieson’s survey of languages, there is a greater likelihood for tense mid vowels to be longer than high vowels, and for high vowels to be longer than mid lax vowels (Maddieson 1984). This may skew inventories towards fewer long vowels depending on the layout of features. There is also the tendency for mid vowels to be marked cross-linguistically (Beckman 1997), a pressure that could narrow the potential set of contrastive long vowels.
So Atayal and Chipewyan may serve as tentative exceptions to the prediction of more contrastive long vowels than short, but these could be explainable cases, either through the interaction of other typological pressures, or even language specific pressures on the composition of the vowel inventory.
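The typological check described above can be sketched as a simple comparison over the four inventories cited from Maddieson (1984). This is a rough illustration under stated assumptions: the dictionary layout and the function name `fits_prediction` are hypothetical, and the number of vowel qualities in each series is used as a crude proxy for the number of licensed contrasts.

```python
# Inventories as given in the text; data structure and names are hypothetical.
INVENTORIES = {
    "Nenets": {"longer": ["i", "e", "a", "o", "u"], "shorter": ["ĭ", "ă", "ŭ"]},
    "Telefol": {"longer": ["iː", "ɛː", "aː", "ɔː", "uː"], "shorter": ["i", "a", "u"]},
    "Atayal": {"longer": ["iː", "uː"], "shorter": ["i", "ɛ", "a", "ɔ", "u"]},
    "Chipewyan": {"longer": ["iː", "aː", "uː"], "shorter": ["i", "e", "a", "o", "u"]},
}


def fits_prediction(inventory: dict[str, list[str]]) -> bool:
    """True if the longer series has at least as many qualities as the shorter one."""
    return len(inventory["longer"]) >= len(inventory["shorter"])


results = {name: fits_prediction(inv) for name, inv in INVENTORIES.items()}
for name, ok in results.items():
    print(f"{name}: {'as predicted' if ok else 'apparent exception'}")
```

On this crude count, Nenets and Telefol pattern as predicted, while Atayal and Chipewyan come out as the apparent exceptions discussed above.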
6.3 Phonetics in the phonology: synchronic versus diachronic approaches
This paper’s formal account builds some reference to phonetic information into the phonology. Specifically the phonetic information that is referenced is perceptual information regarding vowel contrast. This integration of phonetics into the phonology should be differentiated from approaches that have a strict separation of phonetics and phonology (Hale & Reiss 2000), and those approaches that treat phonetics and phonology as one and the same system (Ohala 1990). The account in this paper sits somewhere in the middle of these extremes, motivating that some reference to the perception of acoustic information is necessary in the phonological grammar’s regulation of the distribution of mid and high vowels. Chamorro presents an interesting case study to explore this issue of integration, as not every part of the distribution of vowels is phonetically informed (e.g. regular vowel lowering in closed stressed syllables), but accounting for mid vowels before intervocalic laryngeal consonants without acknowledging the role of the phonetic environment would lack explanatory depth.
Throughout this paper, there has been the assumption that the licensing of a mid-high contrast before laryngeals is a part of the synchronic phonology of Chamorro. By motivating this account through the use of dispersion theory, there is a commitment to the idea that there should be sufficiently distinct cues for vowel contrast. This licensing of sufficient cues could result from increased duration, such as that of final-lengthening, or other phonetically transparent sounds adjacent to the vowel allowing formants to persist, as observed with laryngeal consonants. Empirically, only intervocalic laryngeals immediately following a stressed syllable seem to provide a sufficient phonetic environment for the licensing of vowel contrast in Chamorro. As a result, the synchronic account presented here is consistent with the Chamorro data as shown.
In positing that the phonetics synchronically informs the phonology, this account takes somewhat of an opposing stance to theories proposing a strict separation of phonetics and phonology. One possible approach would be an account in which the more frequent occurrence of mid vowels before laryngeals is an accidental property of the lexicon which is not integrated into the grammar. This approach relies heavily on idiosyncratic specification, and does not provide any explanatory depth. Another example of separation between phonetics and phonology would be the phonologization approach in Barnes (2006). This approach suggests that the phonetics and the phonology are separate in the synchronic grammar. Phonetics may influence the phonology diachronically, through the reinterpretation of phonetic regularities that may or may not still be present in the language. For example, a phonetic environment could have conditioned some merger or preservation of contrast, but ultimately this pattern becomes concretized in the phonology, leaving behind the necessity for the phonetic environment to be present for the pattern to occur. This theory of phonologization is more dependent on a strong understanding of the diachronic pattern of Chamorro and the phonetic precursors behind the mid/high distribution. Given this, one might wonder whether there is a story about the patterned exceptionality of mid vowels before laryngeals in Chamorro that does not require reference to phonetic cues in the synchronic phonology. Here I will lay out two potential approaches that I believe are reasonable, but are ultimately not sufficient or convincing enough to trump the synchronic approach I’ve laid out.
One approach that would resemble a phonologization approach would assume that the mid vowels in Chamorro were conditioned by loan words being introduced into the language. This approach was encountered earlier in the paper, where mid vowels arose in native Chamorro forms by extension from mid vowels in loaned forms. However, regardless of whether the mid vowels currently in the underlying Chamorro inventory were a result of loan word extension, this would not explain the clear patterned exceptionality found with mid vowels in stressed syllables before intervocalic laryngeals. The overwhelming majority of loans into Chamorro come from Spanish, which does not have native intervocalic glottal stop and has few instances of mid vowels before intervocalic [h], at least in the distribution of loans found in Chamorro. The extremely low number of loan forms with a mid vowel before a laryngeal does not provide sufficient evidence to suggest any phonotactic or phonetic precursors that would lead to the pattern of having more mid vowels before laryngeals than expected. This first potential approach to providing a more diachronic explanation for the patterned exceptionality is therefore insufficient and significantly less convincing than the current synchronic approach.
The other phonologization-based approach would be exactly of the kind laid out in §5.2, where the laryngeals in Chamorro have conditioned vowel lowering. However, it was already determined that while there are articulatory reasons for glottal stop to lower vowels, due to vocal fold closure causing constriction at the pharynx (Edmondson & Esling 2006), this does not seem to occur in Chamorro. The evidence can be seen in the acoustics, where if there were any pharyngeal constriction resulting in a raising of F1, it should be apparent from the spectrograms. Instead, a smooth transition from nucleus to nucleus (vowel to vowel) is found, with seemingly no raising in F1 which would suggest some epilaryngeal constriction. This is again due to the fact that glottal stops do not condition F1 changes. Furthermore, [h] and breathy segments, at least articulatorily, do not cause the same pharyngeal constriction as glottal stop or creaky segments; in fact, it is exactly the opposite as described in fuller detail in §5.2 and by Edmondson & Esling (2006). Therefore, articulatory lowering would not help explain why 12 out of 29 exceptional forms have a mid vowel before [h], which crucially does not induce articulatory lowering. So while it is clear that a phonologization approach is beneficial to explaining opaque distributions in languages, if there is no clear diachronic pathway or necessary phonetic precursors to explain the distribution, reasonable approaches seem to fail. In contrast, the acoustic evidence presented in §4 points strongly to the use of perceptual cue information to determine licensing environments. Because this phonetic information is transparent and seemingly accessible to speakers, it lends itself strongly to the integration of these perceptual factors in the synchronic phonology.
All of this discussion warrants some positive proposal as to how this distribution (mid vowels before laryngeals) came about in the first place. The hypothesis I believe to be the most tenable is that Chamorro began with a three-vowel inventory consisting of the typical peripheral vowels, /i, u, a/ (Topping & Dungca 1973; Blust 2000). Due to the influence of Spanish, mid vowels were introduced into the language, but not in the environment of intervocalic laryngeals, due to a lack of loans demonstrating this pattern. Supposing that glottal stop can have the effect of both articulatory (Edmondson & Esling 2006) and potentially perceptual (Kingston et al. 1997; Brunner & Żygis 2011) lowering, some bias towards lower vowel perception for the high vowels in the original inventory is expected. With the introduction of mid vowels from Spanish, this bias was then allowed to emerge as the appearance of mid vowels in the environment of glottal stop. This pattern was then extended to /h/, generalizing to the other laryngeal in the inventory.
As has been argued throughout this paper, phonetic details are able to influence the phonology of vowel systems. Chamorro provides supportive evidence in this regard, with a phonetic environment provided by intervocalic laryngeals (following a stressed syllable) enabling the perception of distinct phonetic cues to interpret an otherwise neutralized underlying vowel contrast. The distinct perceptible cues consist of increased vowel steady state and formant transition information which assist in the accurate perception of F1, the primary cue to vowel height.
To represent the licensing of contrast that this phonetic environment provides, a formal account using dispersion theoretic constraints was proposed. This account used constraints for the maximization of contrasts (NoMerge), the maximization of the distinctiveness of contrast (MinDist), and the preference for peripheral vowels (Peripheral), which are also conveniently the ones which maximize the dispersion of an inventory. These constraints evaluated potential positional vowel inventories, and the potential output vowel inventories could be affected by the phonetic environment they would appear in, such that the perceptual distance for vowels in the environment could be scaled based on the phonetic context. This enabled the account to effectively capture that intervocalic laryngeals were the source of a more distinct perceptual distance between the formants of the contrasting vowels. This phonetically based account was adopted over an account based on syllable structure, as it was shown that interpreting intervocalic laryngeals as codas did not coincide with other regular, prosodically conditioned phenomena in the language. The free variation of an intervocalic laryngeal and hiatus appears to still be captured in the licensing account presented here, but it also raises the interesting question of whether the production of hiatus is due to phonological or phonetic factors.
There are also points throughout this paper that are possibly open to debate. The occurrence of mid vowels before intervocalic supralaryngeal consonants can neither be explained by the licensing account presented here, nor by the dismissed exceptional coda hypothesis presented. Native forms with mid vowels before intervocalic supralaryngeal consonants may in fact be true exceptions in the language, rather than the result of the patterned exceptionality of forms with intervocalic laryngeals accounted for here. Furthermore, the integration of phonetics into the synchronic phonology of Chamorro makes strong predictions regarding typology and the potential for other environments within the language to realize some underlying contrast given a sufficient phonetic environment. Whether these predictions are truly borne out crosslinguistically is a question that must be answered following a more exhaustive investigation of vowel inventories of many different languages.
Extending this account will require much more work, both within Chamorro and beyond it. It is clear that the predictions made by the account must be tested more thoroughly to determine whether such a direct integration of phonetics and phonology is required, but as of now this account succeeds in motivating that phonetic cues play a significant role in the phonology of Chamorro.
Appendix 1: Forms with intervocalic laryngeals
| Form | Gloss | Form | Gloss |
|---|---|---|---|
| be’e | bandage | be’i | to put bandage on |
| dihao | careless | dihao | shortside at cock pit |
| do’ak | cataract | do’an | carry on shoulder (with stick) |
| fiha | buy on credit | fihu | often |
| fo’i | leader (fishing line) | fuhut | bind lightly |
| geha | fan (from coconut leaves) | goha | fan |
| nuhung | shady | nuhut | stem of coconut leaf |
| nehum | lurk | peha | hot or cold compress |
| puha | capsize | puhut | press into a ball |
| seha | back up | si’i | weaving tool |
| si’ing | crowd in | siha | they (3.PL) |
| to’a | mature | to’i | put liquid into |
| tuhus | have too much fun | Yu’us | god |
| yuhi | that | oha | solid covering for doorway |
Abbreviations: RED = reduplicative morpheme, IO = input-output, OV = output-variant, [voi] = voice
- The word for ‘bandage’ is also variably produced with two mid vowels, [béːʔe]. Native forms with two mid vowels outside of closed stressed syllables are few: [bóːʔok] ‘uproot’, [móːhon] ‘wish/desire’, [sóːʔon] ~ [sóːhon] ‘stagger’. Only native forms containing an intervocalic laryngeal participate in this pattern. As these forms are particularly infrequent, they will not be discussed in depth in this paper. However, one could posit that the realization of two mid vowels of the same quality has been conditioned by the phonetic transparency of laryngeal consonants, discussed in Borroff (2007), which results in vowel harmony or harmony-like processes.
- The persistence of vocalic formant structure into and through the laryngeal, which then serves as an external cue to vowel quality contrast, is not predicted to be exclusive to F1. For example, one might expect laryngeals to license more backness contrasts as a result of the persistence of F2 through the laryngeal. This is reasonable, but the licensing of the mid-high contrast before laryngeals in Chamorro calls for explanation because height contrast is regularly neutralized among a large set of the vowels in the language. Backness neutralization in Chamorro roots only really exists for low vowels, where the low vowel is then subject to the backness of surrounding consonants (Chung 2020). There is no phonetic reason to expect that greater licensing is impossible for other contrasts, so I leave the investigation of potential backness licensing for future work.
- One may observe a bump in F1 during the [h] gesture in the Figure 2 spectrogram. This apparent rise in F1 is actually attributable to an inaccuracy in the Praat formant tracking, where F1 and F2 are conflated.
- This form is typically attested with an intervocalic glottal stop, but for this speaker (MAC) is variably produced with intervocalic [h].
- Also notable is the durational difference between vowels before intervocalic supralaryngeals and vowels before intervocalic laryngeals. A t-test reveals that vowels are longer before laryngeals than before supralaryngeals (t = 2.7998, df = 42.827, p = 0.007638; supralaryngeal context mean = 101.4 ms, laryngeal context mean = 120.4 ms). Token counts are somewhat low for these comparisons (25 tokens for supralaryngeals, 50 for laryngeals), so this difference should be taken as suggestive. It nonetheless points to another potential difference between laryngeal and supralaryngeal environments, which might help explain why contrast perception is facilitated before laryngeals. I leave the exploration of this duration difference for future work.
- While the “less than or equal to” aspect of this constraint is not typical in the MinDist constraints formulated by Flemming (1995), it works well conceptually for representing how the supralaryngeal consonant environment and laryngeal consonant environment modulate the perceptual distance between F1 levels.
- One might wonder what prevents neutralization to [a] or [ɑ] instead of the corresponding high vowel. The alternation from high to mid (and vice versa) is an existing pattern in the phonology. One proposal is to assume that the neutralization here defaults to this existing, independently motivated mid-high alternation, similar to free-ride learning (McCarthy 2005).
- I thank an anonymous reviewer for the suggestion that the lack of mid vowels before word-final glottal stop is due to categorical reduction in unstressed syllables, à la Crosswhite (2001).
- Chung (p.c.) points out that some speakers of the Guam dialect and the Rota dialect retain [h] in these positions, though my consultant, who speaks a Guam dialect, does not produce [h] in coda position.
- Geminate [h] is not permissible in Chamorro, and instead strengthens to [kk].
- An anonymous reviewer suggests that stress placement could instead be characterized as conditioned by weight. While intriguing and potentially tenable, I leave this for future work on Chamorro stress.
- As previously stated in a footnote in §3, the duration of vowels before intervocalic laryngeals actually seems to exceed that of vowels before supralaryngeals in open stressed syllables. Vowel duration was determined with respect to the segmentation criteria outlined in §3. These criteria may contribute to the apparently longer duration of vowels before laryngeals; however, what is crucial is that the durations are at least comparable.
- Blust (2000) also notes that the reconstructed inventory of Proto-Austronesian contained a schwa-like mid vowel that became /u/ in Chamorro (*e > u).
- Nenets and Telefol are unrelated to each other, being Uralic and Trans-New Guinean languages respectively. Neither Nenets nor Telefol is related to Chamorro.
Thank you to Ryan Bennett, whose encouragement, insight, and patience helped this work come together. Thanks are also due to Jérémie Beauchamp, Sandy Chung, Ben Eischens, Junko Ito, Amanda Rysling, participants of UCSC’s Phlunch, and participants of PHREND 2019. Lastly, I thank the reviewers for their helpful comments.
The author has no competing interests to declare.
Barnes, Jonathan. 2006. Strength and weakness at the interface, positional neutralization in phonetics and phonology. Berlin, Boston: De Gruyter Mouton. https://www.degruyter.com/view/product/178714 (6 April, 2019). DOI: http://doi.org/10.1515/9783110197617
Beckman, Jill. 2004. Positional faithfulness. In McCarthy, John J. (ed.), Optimality Theory in Phonology, 310–342. Oxford, UK: Blackwell Publishing Ltd. http://doi.wiley.com/10.1002/9780470756171.ch16 (17 May, 2019). DOI: http://doi.org/10.1002/9780470756171.ch16
Beckman, Jill N. 1997. Positional faithfulness, positional neutralisation and Shona vowel harmony. Phonology 14(1). 1–46. JSTOR: 4420090. DOI: http://doi.org/10.1017/S0952675797003308
Bessell, Nicola. 1992. The typological status of /ʔ, h/. Proceedings of the 28th Meeting of the Chicago Linguistic Society.
Blust, Robert. 2000. Chamorro historical phonology. Oceanic Linguistics 39(1). 83–122. JSTOR: 3623218. DOI: http://doi.org/10.2307/3623218
Blust, Robert & Trussel, Stephen. 2010. Austronesian comparative dictionary, web edition. http://www.trussel2.com/acd/.
Boersma, Paul & Weenink, David. 2022. Praat: Doing phonetics by computer. Version 6.2.08. http://www.praat.org/.
Borroff, Marianne L. 2007. A landmark underspecification account of the patterning of glottal stop. Stony Brook University PhD Thesis.
Brunner, Jana & Żygis, Marzena. 2011. Why do glottal stops and low vowels like each other? ICPhS XVII, Hong Kong. 4.
Chung, Sandra. 1983. Transderivational relationships in Chamorro phonology. Language 59(1). 35–66. JSTOR: 414060. DOI: http://doi.org/10.2307/414060
Crosswhite, Katherine. 1998. Segmental vs. prosodic correspondence in Chamorro. Phonology 15(3). 281–316. http://www.journals.cambridge.org/abstract_S0952675799003619 (22 January, 2019). DOI: http://doi.org/10.1017/S0952675799003619
Crosswhite, Katherine. 2001. Vowel reduction in optimality theory. New York: Routledge.
Edmondson, Jerold A. & Esling, John H. 2006. The valves of the throat and their functioning in tone, vocal register and stress: Laryngoscopic case studies. Phonology 23(2). 157–191. http://www.journals.cambridge.org/abstract_S095267570600087X (28 October, 2019). DOI: http://doi.org/10.1017/S095267570600087X
Esling, John H. & Moisik, Scott R. & Benner, Allison & Crevier-Buchman, Lise. 2019. Voice quality: The laryngeal articulator model. 1st edn. Cambridge University Press. https://www.cambridge.org/core/product/identifier/9781108696555/type/book (22 February, 2022). DOI: http://doi.org/10.1017/9781108696555
Flemming, Edward S. 1995. Auditory Representations in Phonology. Los Angeles: University of California.
Gouskova, Maria. 2012. Unexceptional segments. Natural Language & Linguistic Theory 30(1). 79–133. http://link.springer.com/10.1007/s11049-011-9142-4 (6 May, 2021). DOI: http://doi.org/10.1007/s11049-011-9142-4
Greenberg, Joseph H. 1966. Language universals, with special reference to feature hierarchies. The Hague: Mouton.
Hale, Mark & Reiss, Charles. 2000. “Substance abuse” and “dysfunctionalism”: Current trends in phonology. Linguistic Inquiry 31(1). 157–169. JSTOR: 4179099. DOI: http://doi.org/10.1162/002438900554334
Holt, Lori L. & Lotto, Andrew J. & Kluender, Keith R. 2000. Neighboring spectral content influences vowel identification. The Journal of the Acoustical Society of America 108(2). 710–722. http://asa.scitation.org/doi/10.1121/1.429604 (3 December, 2020). DOI: http://doi.org/10.1121/1.429604
Ito, Junko & Mester, Armin. 1995. The core-periphery structure of the lexicon and constraints on reranking. University of Massachusetts Occasional Papers in Linguistics 18. 181–209.
Ito, Junko & Mester, Armin. 2008. Lexical classes in phonology. Oxford University Press. http://oxfordhandbooks.com/view/10.1093/oxfordhb/9780195307344.001.0001/oxfordhb-9780195307344-e-004 (31 May, 2019). DOI: http://doi.org/10.1093/oxfordhb/9780195307344.013.0004
Johnson, Keith & Sherman, V. Clayton & Sherman, Stephanie G. 2011. Acoustic and auditory phonetics. Hoboken, United Kingdom: John Wiley & Sons, Incorporated. http://ebookcentral.proquest.com/lib/ucsc/detail.action?docID=698133 (27 February, 2022).
Kahn, Daniel. 1976. Syllable-based generalizations in English phonology. Massachusetts Institute of Technology Thesis. http://dspace.mit.edu/handle/1721.1/16397 (6 March, 2019).
Kawahara, Shigeto. 2002. Similarity among variants: Output-Variant correspondence. Tokyo, Japan: International Christian University.
Keating, Patricia. 1988. Underspecification in phonetics. Phonology 5. 275–292. DOI: http://doi.org/10.1017/S095267570000230X
Kingston, John & Macmillan, Neil A. & Dickey, Laura Walsh & Thorburn, Rachel & Bartels, Christine. 1997. Integrality in the perception of tongue root position and voice quality in vowels. The Journal of the Acoustical Society of America 101(3). 1696–1709. http://asa.scitation.org/doi/10.1121/1.418179 (8 June, 2020). DOI: http://doi.org/10.1121/1.418179
Maddieson, Ian. 1984. Patterns of sounds. Cambridge: Cambridge University Press. 422 pp. DOI: http://doi.org/10.1017/CBO9780511753459
McCarthy, John J. 1994. The phonetics and phonology of Semitic pharyngeals. In Keating, Patricia A. (ed.), Phonological Structure and Phonetic Form, 191–233. Cambridge: Cambridge University Press. https://www.cambridge.org/core/product/identifier/CBO9780511659461A021/type/book_part (8 October, 2020). DOI: http://doi.org/10.1017/CBO9780511659461.012
McCarthy, John J. 2005. Taking a free ride in morphophonemic learning. Catalan Journal of Linguistics 4(1). 19. http://revistes.uab.cat/catJL/article/view/v4-mccarthy (9 December, 2020). DOI: http://doi.org/10.5565/rev/catjl.112
Miyawaki, Kuniko & Jenkins, James J. & Strange, Winifred & Liberman, Alvin M. & Verbrugge, Robert & Fujimura, Osamu. 1975. An effect of linguistic experience: The discrimination of [r] and [l] by native speakers of Japanese and English. Perception & Psychophysics 18(5). 331–340. http://link.springer.com/10.3758/BF03211209 (27 February, 2022). DOI: http://doi.org/10.3758/BF03211209
O’Connor, J. D. & Gerstman, L. J. & Liberman, A. M. & Delattre, P. C. & Cooper, F. S. 1957. Acoustic cues for the perception of initial /w, j, r, l/ in English. WORD 13(1). 24–43. http://www.tandfonline.com/doi/full/10.1080/00437956.1957.11659626 (27 February, 2022). DOI: http://doi.org/10.1080/00437956.1957.11659626
Ohala, John J. 1990. There is no interface between phonology and phonetics: A personal view. Journal of Phonetics 18(2). 153–171. https://linkinghub.elsevier.com/retrieve/pii/S0095447019303997 (5 July, 2022). DOI: http://doi.org/10.1016/S0095-4470(19)30399-7
Ohde, Ralph N. & German, Sarah R. 2011. Formant onsets and formant transitions as developmental cues to vowel perception. The Journal of the Acoustical Society of America 130(3). 1628–1642. pmid: 21895100. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3188975/ (6 May, 2019). DOI: http://doi.org/10.1121/1.3596461
Padgett, Jaye. 2003. Contrast and post-velar fronting in Russian. Natural Language & Linguistic Theory 21(1). 39–87. JSTOR: 4048035. DOI: http://doi.org/10.1023/A:1021879906505
Rose, Sharon. 1996. Variable laryngeals and vowel lowering. Phonology 13(1). 73–117. https://www.cambridge.org/core/product/identifier/S0952675700000191/type/journal_article (7 June, 2020). DOI: http://doi.org/10.1017/S0952675700000191
Rysling, Amanda & Jesse, Alexandra & Kingston, John. 2019. Regressive spectral assimilation bias in speech perception. Attention, Perception, & Psychophysics 81(4). 1127–1146. https://doi.org/10.3758/s13414-019-01720-9 (29 October, 2020). DOI: http://doi.org/10.3758/s13414-019-01720-9
Seyfarth, Scott & Garellek, Marc. 2020. Physical and phonological causes of coda /t/ glottalization in the mainstream American English of central Ohio. Laboratory Phonology 11(1). https://www.journal-labphon.org/article/id/6282/ (11 July, 2022). DOI: http://doi.org/10.5334/labphon.213
Steriade, Donca. 1997. Phonetics in phonology: The case of laryngeal neutralization. 111.
Steriade, Donca. 2008. The phonology of perceptibility effects: The P-Map and its consequences for constraint organization. In Hanson, Kristin & Inkelas, Sharon (eds.), The Nature of the Word, 150–178. The MIT Press. http://mitpress.universitypressscholarship.com/view/10.7551/mitpress/9780262083799.001.0001/upso-9780262083799-chapter-7 (6 April, 2019). DOI: http://doi.org/10.7551/mitpress/9780262083799.003.0007
Stevens, Kenneth N. 1989. On the quantal nature of speech. Journal of Phonetics 17(1). 3–45. https://www.sciencedirect.com/science/article/pii/S0095447019315207 (27 February, 2022). DOI: http://doi.org/10.1016/S0095-4470(19)31520-7
Storme, Benjamin. 2017. Contrast enhancement motivates closed-syllable laxing and open-syllable tensing. http://ling.auf.net/lingbuzz/003700.
Strange, Winifred. 1987. Information for vowels in formant transitions. Journal of Memory and Language 26(5). 550–557. http://www.sciencedirect.com/science/article/pii/0749596X87901410 (27 April, 2019). DOI: http://doi.org/10.1016/0749-596X(87)90141-0
Strange, Winifred. 1989. Evolving theories of vowel perception. The Journal of the Acoustical Society of America 85(5). 2081–2087. pmid: 2659637. DOI: http://doi.org/10.1121/1.397860
Sylak-Glassman, John Christopher. 2014. Deriving natural classes: The phonology and typology of postvelar consonants.
Topping, Donald & Dungca, Bernadita. 1973. Chamorro reference grammar. University of Hawaii Press. DOI: http://doi.org/10.1515/9780824841263
Zuraw, Kie. 2000. Patterned exceptions in phonology. UCLA PhD Thesis.
Zuraw, Kie. 2010. A model of lexical variation and the grammar with application to Tagalog nasal substitution. Natural Language & Linguistic Theory 28(2). 417–472. http://link.springer.com/10.1007/s11049-010-9095-z (6 April, 2019). DOI: http://doi.org/10.1007/s11049-010-9095-z