<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<!--<?xml-stylesheet type="text/xsl" href="article.xsl"?>-->
<article article-type="research-article" dtd-version="1.1" xml:lang="en" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id journal-id-type="issn">2397-1835</journal-id>
<journal-title-group>
<journal-title>Glossa: a journal of general linguistics</journal-title>
</journal-title-group>
<issn pub-type="epub">2397-1835</issn>
<publisher>
<publisher-name>Ubiquity Press</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5334/gjgl.826</article-id>
<article-categories>
<subj-group>
<subject>Research</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Phonotactic restrictions and morphology in Aymara</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Gallagher</surname>
<given-names>Gillian</given-names>
</name>
<email>ggillian@nyu.edu</email>
<xref ref-type="aff" rid="aff-1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Gouskova</surname>
<given-names>Maria</given-names>
</name>
<xref ref-type="aff" rid="aff-1">1</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Rios</surname>
<given-names>Gladys Camacho</given-names>
</name>
<xref ref-type="aff" rid="aff-2">2</xref>
</contrib>
</contrib-group>
<aff id="aff-1"><label>1</label>New York University, 10 Wash. Pl., NY, US</aff>
<aff id="aff-2"><label>2</label>University of Texas at Austin, Austin, TX, US</aff>
<pub-date publication-format="electronic" date-type="pub" iso-8601-date="2019-02-18">
<day>18</day>
<month>02</month>
<year>2019</year>
</pub-date>
<pub-date pub-type="collection">
<year>2019</year>
</pub-date>
<volume>4</volume>
<issue>1</issue>
<elocation-id>29</elocation-id>
<history>
<date date-type="received" iso-8601-date="2018-09-29">
<day>29</day>
<month>09</month>
<year>2018</year>
</date>
<date date-type="accepted" iso-8601-date="2018-11-22">
<day>22</day>
<month>11</month>
<year>2018</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright: &#x00A9; 2019 The Author(s)</copyright-statement>
<copyright-year>2019</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See <uri xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</uri>.</license-p>
</license>
</permissions>
<self-uri xlink:href="http://www.glossa-journal.org/articles/10.5334/gjgl.826/"/>
<abstract>
<p>Nonlocal phonological interactions are often sensitive to morphological domains. Bolivian Aymara restricts the cooccurrence of plain, ejective, and aspirated stops within, but not across, morphemes. We document these restrictions in a morphologically parsed corpus of Aymara. We further present two experiments with native Aymara speakers. In the first experiment, speakers are asked to repeat nonce words that should be interpreted as monomorphemic. Speakers are more accurate at repeating nonce words that respect the nonlocal phonotactic restrictions than nonce words that violate them. In a second experiment, some nonce words are interpretable as morphologically complex, while others suggest a monomorphemic parse. Speakers show a sensitivity to this difference, and repeat the words more accurately when they can be interpreted as having a morpheme boundary between two consonants that tend to not cooccur inside a morpheme. Finally, we develop a computational model that induces nonlocal representations from the baseline grammar. The model posits projections when it notices that certain segments often cooccur when separated by a morpheme boundary. The model generates a full Maximum Entropy phonotactic grammar, which makes distinctions between attested and rare/unattested sequences in a way that aligns with the speaker behavior.</p>
</abstract>
<kwd-group>
<kwd>inductive learning</kwd>
<kwd>morphologically sensitive phonology</kwd>
<kwd>Aymara</kwd>
<kwd>nonce words</kwd>
<kwd>corpus study</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec>
<title>1 Introduction</title>
<p>One of the challenges for a theory of phonotactics is recognizing that constraints can hold inside morphemes but be lifted at morpheme boundaries (<xref ref-type="bibr" rid="B43">Trubetzkoy 1939</xref>; <xref ref-type="bibr" rid="B11">Chomsky and Halle 1968</xref>; et seq.). This situation is quite common: for example, in English, the cluster [md] is not found inside morphemes, but it is allowed in suffixed verbs such as &#8220;hemm-ed&#8221;. Likewise, when it comes to nonlocal phonological interactions, some languages respect the relevant constraints in any phonological word, but it seems to be equally if not more common for nonlocal phonotactics to apply differently inside vs. across morphemes. These kinds of patterns present an interesting learnability problem: if a learner attends to phonological words only, the relevant constraints may be violated, so how, if at all, do speakers arrive at the knowledge of correct constraints that hold morpheme-internally?</p>
<p>Our paper investigates this question in a study of nonlocal phonological interactions in Bolivian Aymara. Within Aymara morphemes, plain-aspirate, plain-ejective and heterorganic ejective-ejective combinations are described as restricted (see (1)), though these combinations may arise across morpheme boundaries (see (2)):</p>
<table-wrap>
<table content-type="example">
<tbody>
<tr>
<td>(1)</td>
<td colspan="4"><italic>Aymara laryngeal phonotactics inside morphemes</italic> (<xref ref-type="bibr" rid="B31">MacEachern 1997</xref>)</td>
</tr>
<tr>
<td>&#160;</td>
<td>a.</td>
<td><bold>p<sup>h</sup></bold>u<bold>t</bold>u</td>
<td>&#8216;heat&#8217;</td>
<td>*<bold>t</bold>u<bold>p<sup>h</sup></bold>u</td>
</tr>
<tr>
<td>&#160;</td>
<td>b.</td>
<td><bold>k&#8217;</bold>a<bold>p</bold>a</td>
<td>&#8216;cartilage&#8217;</td>
<td>*<bold>k</bold>a<bold>p&#8217;</bold>a</td>
</tr>
<tr>
<td>&#160;</td>
<td>c.</td>
<td><bold>k&#8217;</bold>as<bold>k&#8217;</bold>a</td>
<td>&#8216;acid to the taste&#8217;</td>
<td>*<bold>t&#8217;</bold>an<bold>k&#8217;</bold>a</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap>
<table content-type="example">
<tbody>
<tr>
<td>(2)</td>
<td colspan="3"><italic>Aymara laryngeal phonotactics across morpheme boundaries (our fieldwork data)</italic></td>
</tr>
<tr>
<td>&#160;</td>
<td>a.</td>
<td><bold>p</bold>a&#654;+<bold>t&#8217;</bold>a+&#626;a</td>
<td>&#8216;about to choose&#8217;</td>
</tr>
<tr>
<td>&#160;</td>
<td>&#160;</td>
<td><bold>t</bold>i&#626;+<bold>&#679;&#8217;</bold>uki+&#626;a</td>
<td>&#8216;to color carefully&#8217;</td>
</tr>
<tr>
<td>&#160;</td>
<td>b.</td>
<td><bold>&#679;</bold>a&#654;m+<bold>t<sup>h</sup></bold>api+&#626;a</td>
<td>&#8216;to finish chewing&#8217;</td>
</tr>
<tr>
<td>&#160;</td>
<td>&#160;</td>
<td><bold>q</bold>a<bold>q</bold>+<bold>t<sup>h</sup></bold>api+&#626;a</td>
<td>&#8216;to finish scratching&#8217;</td>
</tr>
<tr>
<td>&#160;</td>
<td>c.</td>
<td><bold>&#679;&#8217;</bold>um+<bold>t&#8217;</bold>a+&#626;a</td>
<td>&#8216;about to drain&#8217;</td>
</tr>
<tr>
<td>&#160;</td>
<td>&#160;</td>
<td><bold>t&#8217;</bold>isn+<bold>&#679;&#8217;</bold>uki+&#626;a</td>
<td>&#8216;to thread carefully&#8217;</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Impressionistic descriptions of phonological patterns are often made more nuanced by explorations of natural language corpora, experimentation with native speakers, and computational modeling. In this paper, we look at phonological generalizations in Bolivian Aymara through these three lenses.</p>
<p>Our examination of a morphologically parsed web corpus partially confirms traditional descriptions in the literature for the plain-ejective and plain-aspirate restrictions: while there are numerous exceptions to the restriction in tautomorphemic combinations, there are far more exceptions heteromorphemically than tautomorphemically. Ejective-ejective combinations, however, are infrequent regardless of morphological context.</p>
<p>Despite the exceptions in the lexicon and the overall infrequency of ejective-ejective combinations, two experiments support the synchronic status of restrictions on plain-ejective and ejective-ejective combinations. Native Aymara speakers make more repetition errors on nonce words that violate the putative restrictions than on control words, and speakers make fewer errors on nonce words when the interacting stops may be interpreted as belonging to different morphemes than when they must be parsed as tautomorphemic.</p>
<p>After establishing the corpus and behavioral evidence for the restrictions, we present a computational model that learns the morphologically sensitive, nonlocal phonotactic restrictions from our corpus. The modeling work shows that while certain aspects of the phonotactic restrictions are observable in an unparsed data set (cf. <xref ref-type="bibr" rid="B32">Martin 2007</xref>), training on a parsed corpus with morpheme boundaries is necessary to capture the full range of patterns in our experiments and the descriptive literature.</p>
<p>The modeling work in this paper expands on the model developed in Gouskova and Gallagher (<xref ref-type="bibr" rid="B24">to appear</xref>). There, we proposed a method for inductively learning nonlocal projections that capitalizes on the observation that nonlocal interactions can be observed in local phonotactics: if X and Y cannot cooccur at longer distances inside a word, they usually cannot be separated by a single segment, either (<xref ref-type="bibr" rid="B42">Suzuki 1998</xref>). Our learner induces nonlocal projections by attending to the properties of the language&#8217;s segment-level phonotactics. In languages with nonlocal phonological interactions, segments within a certain natural class are restricted from cooccurring across an arbitrary amount of intervening material: e.g., in Quechua pairs of ejectives may not cooccur across an intervening vowel *[k&#8217;ap&#8217;i], an intervening vowel and consonant *[k&#8217;amp&#8217;i] or across more material *[k&#8217;amip&#8217;a] (<xref ref-type="bibr" rid="B20">Gallagher 2016</xref>). The arbitrary nature of the intervening segmental material has supported analyses of these patterns that reference an autosegmental <italic>tier</italic> or <italic>projection</italic> where only the interacting segments are visible.<xref ref-type="fn" rid="n1">1</xref> For the Quechua case, this would mean that there is one level of representation in which all segments are visible to the grammar, and another level of representation in which only ejectives are visible; it is on this &#8220;ejective projection&#8221; that the cooccurrence restriction can be stated as a simple bigram *[+cg][+cg].</p>
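<p>The idea of a projection can be made concrete with a small sketch. The Python below is illustrative only and is not the learner described in this paper: the helper names (<monospace>tokenize</monospace>, <monospace>project</monospace>, <monospace>violates_ejective_bigram</monospace>) are hypothetical, and an ASCII apostrophe stands in for the ejective diacritic. It projects a form onto a tier on which only ejectives are visible and then checks whether the bigram *[+cg][+cg] is matched there.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
EJECTIVE_MARK = "'"  # assumption: ASCII apostrophe marks an ejective


def tokenize(word):
    """Split a transcription into segments, attaching "'" to the preceding stop."""
    segments = []
    for ch in word:
        if ch == EJECTIVE_MARK and segments:
            segments[-1] += ch
        else:
            segments.append(ch)
    return segments


def is_ejective(segment):
    """A segment counts as [+cg] here if it carries the ejective mark."""
    return segment.endswith(EJECTIVE_MARK)


def project(segments, is_visible):
    """Keep only the segments that are visible on the projection."""
    return [s for s in segments if is_visible(s)]


def violates_ejective_bigram(word):
    """True if two ejectives are adjacent on the ejective projection."""
    tier = project(tokenize(word), is_ejective)
    return any(is_ejective(a) and is_ejective(b) for a, b in zip(tier, tier[1:]))
```
]]></preformat>
<p>Under these assumptions, the three unattested shapes cited above, [k&#8217;ap&#8217;i], [k&#8217;amp&#8217;i] and [k&#8217;amip&#8217;a], are all flagged by the same bigram check, because the intervening material is invisible on the projection.</p>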
<p>Hayes and Wilson&#8217;s (<xref ref-type="bibr" rid="B28">2008</xref>) inductive phonotactic learner allows the analyst to define nonlocal projections so the model can learn nonlocal phonology. We propose that nonlocal projections can be learned inductively by analyzing the constraints in a baseline grammar without projections. In languages with nonlocal phonology, the baseline grammar will sometimes include trigram constraints of the form *A-any_segment-B. A trigram of this sort is a clue to the learner that natural classes A and B interact nonlocally, and that the nature of the intervening material is irrelevant. Our original model builds projections based on these trigram constraints, and in this paper we expand the procedure to also induce projections from morpheme boundary trigrams: A-[&#8211;mb]-B, where [&#8211;mb] is the class of all segments but not the morpheme boundary symbol. Intuitively, these constraints will arise in a language where the segments A and B cannot cooccur inside a morpheme (*A-any_segment-B), but occur with some frequency at morpheme boundaries (&#10003;A+B). Constraints of this form indicate that natural classes A and B interact nonlocally, but strictly tautomorphemically. The simulations reported in this paper show that the morphologically sensitive, nonlocal restrictions in Aymara are observable as morpheme boundary trigrams in a parsed corpus, despite the presence of exceptions. We further show that these restrictions cannot be discovered in an unparsed corpus without morphological information, suggesting that Aymara learners may acquire these phonotactic restrictions later in their learning trajectory, only after substantial morphological learning has been accomplished.</p>
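<p>The step from trigram constraints to projections can be sketched schematically as follows. This is a simplified illustration under stated assumptions, not the implementation used in our simulations: constraints are represented as tuples of natural-class labels, the labels <monospace>ANY_SEGMENT</monospace> and <monospace>NON_BOUNDARY</monospace> and the <monospace>Projection</monospace> record are hypothetical, and a projection is simply recorded as the pair of flanking classes.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
from typing import NamedTuple

ANY_SEGMENT = "[]"      # hypothetical label: matches any segment
NON_BOUNDARY = "[-mb]"  # any segment except the morpheme boundary symbol


class Projection(NamedTuple):
    classes: frozenset         # natural classes visible on the projection
    tautomorphemic_only: bool  # True when induced from an A-[-mb]-B trigram


def induce_projections(constraints):
    """Propose a projection for each trigram whose middle term is a placeholder class."""
    projections = set()
    for constraint in constraints:
        if len(constraint) != 3:
            continue  # only trigram constraints are treated as clues
        left, middle, right = constraint
        if middle == ANY_SEGMENT:
            projections.add(Projection(frozenset({left, right}), False))
        elif middle == NON_BOUNDARY:
            projections.add(Projection(frozenset({left, right}), True))
    return projections
```
]]></preformat>
<p>In this sketch, a baseline constraint such as <monospace>("plain stop", "[-mb]", "[+cg]")</monospace> would yield a projection flagged as holding only within morphemes, whereas a bigram constraint yields no projection at all.</p>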
<p>The paper is structured as follows. Section 2 summarizes the laryngeal constraints that hold of Aymara words&#8212;we cover the descriptive generalizations in the literature on the language, and present our own study of a web corpus of Aymara. Section 3 presents two experimental studies with Aymara speakers, which test their knowledge of nonlocal phonotactics that hold of morphologically simple words as opposed to complex ones in a nonce word repetition experiment. Section 4 presents our computational model and a simulation that induces nonlocal projections from the web corpus described in Section 2. Section 5 offers some general discussion, and Section 6 concludes the paper.</p>
</sec>
<sec>
<title>2 Laryngeal restrictions in Aymara</title>
<sec>
<title>2.1 Background</title>
<p>The consonant inventory of Aymara contains fifteen stops, exhibiting a three-way laryngeal contrast between plain (voiceless unaspirated), ejective and aspirate at five places of articulation. The full inventory is shown in Table <xref ref-type="table" rid="T1">1</xref> (<xref ref-type="bibr" rid="B31">MacEachern 1997</xref>; <xref ref-type="bibr" rid="B26">Hardman 2001</xref>).</p>
<table-wrap id="T1">
<label>Table 1</label>
<caption>
<p>Consonant inventory of Aymara.</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;"></th>
<th align="left" style="background-color:#f3f3f4;">labial</th>
<th align="left" style="background-color:#f3f3f4;">dental</th>
<th align="left" style="background-color:#f3f3f4;">postalveolar</th>
<th align="left" style="background-color:#f3f3f4;">velar</th>
<th align="left" style="background-color:#f3f3f4;">uvular</th>
<th align="left" style="background-color:#f3f3f4;">glottal</th>
</tr>
<tr>
<td colspan="7"><hr/></td>
</tr>
<tr>
<td align="left">plain</td>
<td align="left">p</td>
<td align="left">t</td>
<td align="left">&#679;</td>
<td align="left">k</td>
<td align="left">q</td>
<td align="left"></td>
</tr>
<tr>
<td align="left">ejective</td>
<td align="left">p&#8217;</td>
<td align="left">t&#8217;</td>
<td align="left">&#679;&#8217;</td>
<td align="left">k&#8217;</td>
<td align="left">q&#8217;</td>
<td align="left"></td>
</tr>
<tr>
<td align="left">aspirate</td>
<td align="left">p&#688;</td>
<td align="left">t&#688;</td>
<td align="left">&#679;&#688;</td>
<td align="left">k&#688;</td>
<td align="left">q&#688;</td>
<td align="left"></td>
</tr>
<tr>
<td align="left">fricative</td>
<td align="left"></td>
<td align="left">s</td>
<td align="left">&#643;</td>
<td align="left"></td>
<td align="left">&#967;</td>
<td align="left">h</td>
</tr>
<tr>
<td align="left">nasal</td>
<td align="left">m</td>
<td align="left">n</td>
<td align="left">&#626;</td>
<td align="left"></td>
<td align="left"></td>
<td align="left"></td>
</tr>
<tr>
<td align="left">liquid</td>
<td align="left"></td>
<td align="left">l &#638;</td>
<td align="left">&#654;</td>
<td align="left"></td>
<td align="left"></td>
<td align="left"></td>
</tr>
<tr>
<td align="left">glide</td>
<td align="left">w</td>
<td align="left"></td>
<td align="left">j</td>
<td align="left"></td>
<td align="left"></td>
<td align="left"></td>
</tr>
</table>
</table-wrap>
<p>The distribution of ejective and aspirate stops is restricted within morphemes, both inside suffixes and in roots, which we exemplify in (3). As shown in (3a), ejectives and aspirates may appear in either initial or medial position of roots, which are primarily CV(C)CV. Both ejectives and aspirates are rare in roots with an initial plain stop, however (see (3b)). Pairs of heterorganic ejectives are also rare, though forms with identical ejectives are attested (see (3c)). Other combinations of ejectives and aspirates are attested (see (3d)), though see Section 4.4.4 below for further details. Examples are from de Lucca (<xref ref-type="bibr" rid="B17">1987</xref>), and these and other patterns are also discussed in detail in MacEachern (<xref ref-type="bibr" rid="B31">1997</xref>) and Bennett (<xref ref-type="bibr" rid="B8">2013</xref>).</p>
<table-wrap>
<table content-type="example">
<tbody>
<tr>
<td>(3)</td>
<td colspan="5"><italic>Aymara ejective and aspirate distribution</italic></td>
</tr>
<tr>
<td>&#160;</td>
<td>a.</td>
<td>&#160;&#160;&#679;<sup>h</sup>aku</td>
<td>&#8216;coarse&#8217;</td>
<td>&#160;&#160;k&#8217;a&#679;a</td>
<td>&#8216;voice&#8217;</td>
</tr>
<tr>
<td>&#160;</td>
<td>&#160;</td>
<td>&#160;&#160;laq<sup>h</sup>a</td>
<td>&#8216;darkness&#8217;</td>
<td>&#160;&#160;hajp&#8217;u</td>
<td>&#8216;evening&#8217;</td>
</tr>
<tr>
<td>&#160;</td>
<td>b.</td>
<td>*paq<sup>h</sup>a</td>
<td>&#160;</td>
<td>*kajp&#8217;u</td>
<td>&#160;</td>
</tr>
<tr>
<td>&#160;</td>
<td>c.</td>
<td>&#160;&#160;p&#8217;ap&#8217;i</td>
<td>&#8216;roasted fish&#8217;</td>
<td>*k&#8217;ap&#8217;i</td>
<td>&#160;</td>
</tr>
<tr>
<td>&#160;</td>
<td>d.</td>
<td>&#160;&#160;p<sup>h</sup>u&#654;&#679;&#8217;u</td>
<td>&#8216;bag&#8217;</td>
<td>&#160;&#160;t<sup>h</sup>ak<sup>h</sup>i</td>
<td>&#8216;road&#8217;</td>
</tr>
<tr>
<td>&#160;</td>
<td>&#160;</td>
<td>&#160;&#160;k&#8217;amp<sup>h</sup>i</td>
<td>&#8216;tip over&#8217;</td>
<td>&#160;</td>
<td>&#160;</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The three laryngeal combinations that are underattested inside morphemes &#8211; plain-ejective, plain-aspirate and ejective-ejective &#8211; are attested in words. These combinations arise when suffixes with an ejective or aspirate consonant combine with roots with a plain or ejective stop. Some examples are given in (4), from work with a native speaker consultant in El Alto, Bolivia. These examples involve three verbal suffixes, [-t&#8217;a] &#8216;about to&#8217;, [-&#679;&#8217;uki] &#8216;carefully, continuously&#8217;, and [-t&#688;api] &#8216;finish&#8217;. All three of these suffixes trigger syncope (deletion of the root-final vowel).<xref ref-type="fn" rid="n2">2</xref></p>
<table-wrap>
<table content-type="example">
<tbody>
<tr>
<td>(4)</td>
<td colspan="5"><italic>Ejectives and aspirates in morphologically complex words in Aymara</italic></td>
</tr>
<tr>
<td>&#160;</td>
<td>a.</td>
<td>pa&#654;+t&#8217;a+&#626;a</td>
<td>&#8216;about to choose&#8217;</td>
<td>ti&#626;+&#679;&#8217;uki+&#626;a</td>
<td>&#8216;to color carefully&#8217;</td>
</tr>
<tr>
<td>&#160;</td>
<td>&#160;</td>
<td>taw+t&#8217;a+&#626;a</td>
<td>&#8216;about to row&#8217;</td>
<td>pump+&#679;&#8217;uki+&#626;a</td>
<td>&#8216;to mix carefully&#8217;</td>
</tr>
<tr>
<td>&#160;</td>
<td>b.</td>
<td>&#679;a&#654;m+t<sup>h</sup>api+&#626;a</td>
<td>&#8216;to finish chewing&#8217;</td>
<td>qaq+t<sup>h</sup>api+&#626;a</td>
<td>&#8216;to finish scratching&#8217;</td>
</tr>
<tr>
<td>&#160;</td>
<td>c.</td>
<td>&#679;&#8217;um+t&#8217;a+&#626;a</td>
<td>&#8216;about to drain&#8217;</td>
<td>t&#8217;isn+&#679;&#8217;uki+&#626;a</td>
<td>&#8216;to thread carefully&#8217;</td>
</tr>
<tr>
<td>&#160;</td>
<td>&#160;</td>
<td>q&#8217;e&#967;+t&#8217;a+&#626;a</td>
<td>&#8216;about to whip&#8217;</td>
<td>k&#8217;u&#626;+&#679;&#8217;uki+&#626;a</td>
<td>&#8216;to bend over continuously&#8217;</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>There are several exceptions to the restrictions on tautomorphemic ejective-ejective, plain-ejective and plain-aspirate combinations. These are given in (5). Some of these exceptions occur at the level of a trigram on the linear string &#8211; important for our model &#8211; while others occur across more intervening material and would be noticeable only when looking at a nonlocal projection. Additionally, there are four combinations of plain-ejective and two combinations of plain-aspirate that occur in clusters. These are reported by our consultant to be monomorphemic forms, though Hardman claims that root-internal stop codas are not found. The morphological structure of these forms is thus in question. An additional observation is that several of the exceptions end in the sequence [t&#8217;a], just like the productive suffix.</p>
<table-wrap>
<table content-type="example">
<tbody>
<tr>
<td>(5)</td>
<td colspan="6"><italic>Exceptions to tautomorphemic restrictions</italic></td>
</tr>
<tr>
<td>&#160;</td>
<td>a.</td>
<td>tap<sup>h</sup>ijala</td>
<td>&#8216;earthen wall&#8217;</td>
<td>q<sup>h</sup>a&#679;q<sup>h</sup>a</td>
<td colspan="2">&#8216;rough to the touch&#8217;</td>
</tr>
<tr>
<td>&#160;</td>
<td>&#160;</td>
<td>kawk<sup>h</sup>a</td>
<td>&#8216;where&#8217;</td>
<td>&#679;<sup>h</sup>ap&#679;<sup>h</sup>a</td>
<td colspan="2">&#8216;mediocre&#8217;</td>
</tr>
<tr>
<td>&#160;</td>
<td>b.</td>
<td>pist&#8217;a</td>
<td>&#8216;scarcity&#8217;</td>
<td>&#654;upt&#8217;a</td>
<td>&#8216;bribery&#8217;</td>
<td>lupt&#8217;a</td>
<td>&#8216;when it is very hot&#8217;</td>
</tr>
<tr>
<td>&#160;</td>
<td>&#160;</td>
<td>qa&#679;&#8217;i</td>
<td>&#8216;type of potato&#8217;</td>
<td>loqt&#8217;a</td>
<td>&#8216;scope&#8217;</td>
<td>uk&#679;&#8217;a</td>
<td>&#8216;height&#8217;</td>
</tr>
<tr>
<td>&#160;</td>
<td>c.</td>
<td>q&#8217;ewt&#8217;a</td>
<td>&#8216;curve, angle&#8217;</td>
<td>&#160;</td>
<td>&#160;</td>
<td>&#160;</td>
<td>&#160;</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec>
<title>2.2 Descriptive lexical statistics</title>
<p>To assess the statistical evidence available to an inductive learner trying to acquire these restrictions, we looked at the observed combinations of all three series of stops. Our data set is a morphologically segmented word list. This list was compiled by taking an unsegmented web corpus of 88,728 forms, collected from 438 webpages by An Cr&#250;bad&#225;n (<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://crubadan.org/">http://crubadan.org/</ext-link>).<xref ref-type="fn" rid="n3">3</xref> The corpus was converted to lowercase and cleaned to remove numbers, non-alphanumeric characters and English or Spanish forms. Forms were also removed if they contained stray apostrophes, hyphens or other typos that we could not interpret. This list was then crossed with a list of 1846 roots, derived from the de Lucca (<xref ref-type="bibr" rid="B17">1987</xref>) dictionary with the help of a native speaker consultant, and a list of 50 suffixes and their allomorphs from the Hardman (<xref ref-type="bibr" rid="B26">2001</xref>) grammar and the de Lucca (<xref ref-type="bibr" rid="B17">1987</xref>) dictionary. There were 46,164 forms that were divisible into a known root and known suffixes, and these forms comprise the corpus we use in this paper.</p>
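<p>The crossing of the word list with the root and suffix lists can be pictured with the sketch below. It is a minimal illustration, not the script used to build the corpus: the function <monospace>segment</monospace> is hypothetical, it returns only the first parse it finds, it accepts a root followed by zero or more suffixes, and it handles syncope only insofar as vowel-less root shapes are listed as separate entries, which is an assumption. The toy forms are ASCII stand-ins.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
def segment(form, roots, suffixes):
    """Return [root, suffix, ...] if form is a known root plus known suffixes, else None."""

    def parse_suffixes(rest):
        # An empty remainder means the whole form has been accounted for.
        if not rest:
            return []
        for suffix in suffixes:
            if rest.startswith(suffix):
                tail = parse_suffixes(rest[len(suffix):])
                if tail is not None:
                    return [suffix] + tail
        return None

    for root in roots:
        if form.startswith(root):
            tail = parse_suffixes(form[len(root):])
            if tail is not None:
                return [root] + tail
    return None


# Toy lists (assumptions): "qaq" is listed as a syncopated shape of "qaqa".
toy_roots = ["qaqa", "qaq"]
toy_suffixes = ["thapi", "nya"]
parse = segment("qaqthapinya", toy_roots, toy_suffixes)
```
]]></preformat>
<p>Forms for which the function returns <monospace>None</monospace> correspond to forms that are not divisible into a known root and known suffixes and would be excluded from the corpus.</p>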
<p>Before looking at combinations of stops directly, we report on the distribution of stops by position in our word corpus, comparing the number of each class of stops in root initial position, root medial position, and in a suffix. Table <xref ref-type="table" rid="T2">2</xref> gives the raw counts on the left (e.g., there are 9,113 plain stops which are in root-initial position in the word corpus), as well as the probability of a stop from the given class in the given position (e.g., 20% of our 1,846 roots begin with a plain stop, 7% with an aspirate and 7% with an ejective; the remaining 66% of roots begin with a vowel, fricative or sonorant consonant). These numbers show that plain stops are frequent in both roots and suffixes, while aspirates and ejectives are both much more frequent in roots than in suffixes.</p>
<table-wrap id="T2">
<label>Table 2</label>
<caption>
<p>Observed occurrences and probability of stops in three positions.</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;"></th>
<th align="center" style="background-color:#f3f3f4;">root initial</th>
<th align="center" style="background-color:#f3f3f4;">root medial</th>
<th align="center" style="background-color:#f3f3f4;">suffix</th>
</tr>
<tr>
<td colspan="4"><hr/></td>
</tr>
<tr>
<td align="left">plain</td>
<td align="right">9,113 (0.20)</td>
<td align="right">17,441 (0.12)</td>
<td align="right">65,148 (0.23)</td>
</tr>
<tr>
<td align="left">aspirate</td>
<td align="right">3,007 (0.07)</td>
<td align="right">2,871 (0.02)</td>
<td align="right">817 (&lt;0.01)</td>
</tr>
<tr>
<td align="left">ejective</td>
<td align="right">3,037 (0.07)</td>
<td align="right">2,622 (0.02)</td>
<td align="right">1,901 (&lt;0.01)</td>
</tr>
</table>
</table-wrap>
<p>Tables <xref ref-type="table" rid="T3">3</xref> and <xref ref-type="table" rid="T4">4</xref> report the observed counts for stop combinations in tautomorphemic and heteromorphemic strings, on both the baseline and a nonlocal projection containing only stops. For tautomorphemic sequences, we looked at the cooccurrence of nonadjacent stops in a baseline trigram &#8211; that is, of a C<sub>1</sub>XC<sub>2</sub> string, where X can be any segment except the morpheme boundary symbol &#8211; and for adjacent bigrams on the stop projection. For heteromorphemic sequences, we looked at trigrams where the medial gram was the morpheme boundary on the baseline and the stop projection. In all tables, &#8220;ejective-ejective&#8221; refers to counts made over only non-identical combinations; all other combinations represent both identical (where applicable) and non-identical combinations.</p>
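<p>The baseline trigram counts can be sketched as follows. This is an illustrative reconstruction of the counting scheme just described, not the script that produced the tables: words are assumed to be pre-tokenized lists of segments with <monospace>+</monospace> as the morpheme boundary symbol, the ASCII labels (<monospace>ch</monospace> for the postalveolar stop, a final <monospace>h</monospace> for aspirates, a final apostrophe for ejectives) are stand-ins, and the function names and toy words are hypothetical.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
from collections import Counter

BOUNDARY = "+"

# Map each stop label to its laryngeal class (ASCII stand-ins, an assumption).
STOP_CLASS = {}
for place in ["p", "t", "ch", "k", "q"]:
    STOP_CLASS[place] = "plain"
    STOP_CLASS[place + "h"] = "aspirate"
    STOP_CLASS[place + "'"] = "ejective"


def combination(c1, c2):
    """Label a pair of stops; None for non-stops and for identical ejectives."""
    if c1 not in STOP_CLASS or c2 not in STOP_CLASS:
        return None
    label = STOP_CLASS[c1] + "-" + STOP_CLASS[c2]
    if label == "ejective-ejective" and c1 == c2:
        return None  # only non-identical ejective pairs are counted
    return label


def count_baseline_trigrams(words):
    """Count C1-X-C2 trigrams, split by whether X is the morpheme boundary."""
    taut, heter = Counter(), Counter()
    for segs in words:
        for c1, mid, c2 in zip(segs, segs[1:], segs[2:]):
            label = combination(c1, c2)
            if label is None:
                continue
            if mid == BOUNDARY:
                heter[label] += 1
            else:
                taut[label] += 1
    return taut, heter


toy_words = [
    ["q", "a", "q", "+", "th", "a", "p", "i", "+", "ny", "a"],
    ["p", "a", "t", "a"],
    ["p'", "a", "p'", "i"],
]
taut, heter = count_baseline_trigrams(toy_words)
```
]]></preformat>
<p>In the first toy word, the plain-aspirate pair is counted as heteromorphemic because the boundary symbol itself is the medial gram, while the aspirate-plain pair inside the suffix is counted as tautomorphemic.</p>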
<table-wrap id="T3">
<label>Table 3</label>
<caption>
<p>Corpus counts for tautomorphemic and heteromorphemic stop combinations in a baseline trigram. Examples are schematic and ellipses represent any additional material.</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;"></th>
<th align="center" style="background-color:#f3f3f4;">tautomorph.</th>
<th align="left" style="background-color:#f3f3f4;"><italic>example</italic></th>
<th align="center" style="background-color:#f3f3f4;">heteromorph.</th>
<th align="left" style="background-color:#f3f3f4;"><italic>example</italic></th>
</tr>
<tr>
<td colspan="5"><hr/></td>
</tr>
<tr>
<td align="left">plain-aspirate</td>
<td align="right">0</td>
<td align="left">&#8230;pat&#688;a&#8230;</td>
<td align="right">149</td>
<td align="left">&#8230;lip+t&#688;a&#8230;</td>
</tr>
<tr>
<td align="left">plain-ejective</td>
<td align="right">0</td>
<td align="left">&#8230;pat&#8217;a&#8230;</td>
<td align="right">659</td>
<td align="left">&#8230;lip+t&#8217;a&#8230;</td>
</tr>
<tr>
<td align="left">ejective-ejective (non-identical)</td>
<td align="right">0</td>
<td align="left">&#8230;p&#8217;at&#8217;a&#8230;</td>
<td align="right">1</td>
<td align="left">&#8230;lip&#8217;+t&#8217;a&#8230;</td>
</tr>
<tr>
<td align="left">plain-plain</td>
<td align="right">3532</td>
<td align="left">&#8230;pata&#8230;</td>
<td align="right">3673</td>
<td align="left">&#8230;lip+ta&#8230;</td>
</tr>
<tr>
<td align="left">aspirate-plain</td>
<td align="right">683</td>
<td align="left">&#8230;p&#688;ata&#8230;</td>
<td align="right">30</td>
<td align="left">&#8230;lip&#688;+ta&#8230;</td>
</tr>
<tr>
<td align="left">ejective-plain</td>
<td align="right">765</td>
<td align="left">&#8230;p&#8217;ata&#8230;</td>
<td align="right">46</td>
<td align="left">&#8230;lip&#8217;+ta&#8230;</td>
</tr>
<tr>
<td align="left">aspirate-aspirate</td>
<td align="right">613</td>
<td align="left">&#8230;p&#688;at&#688;a&#8230;</td>
<td align="right">0</td>
<td align="left">&#8230;lip&#688;+t&#688;a&#8230;</td>
</tr>
<tr>
<td align="left">ejective-aspirate</td>
<td align="right">466</td>
<td align="left">&#8230;p&#8217;at&#688;a&#8230;</td>
<td align="right">4</td>
<td align="left">&#8230;lip&#8217;+t&#688;a&#8230;</td>
</tr>
<tr>
<td align="left">aspirate-ejective</td>
<td align="right">38</td>
<td align="left">&#8230;p&#688;at&#8217;a&#8230;</td>
<td align="right">2</td>
<td align="left">&#8230;lip&#688;+t&#8217;a&#8230;</td>
</tr>
</table>
</table-wrap>
<table-wrap id="T4">
<label>Table 4</label>
<caption>
<p>Corpus counts for tautomorphemic and heteromorphemic stop combinations on a stop projection. Examples represent cases where stops are adjacent on a projection but non-adjacent on the baseline.</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;"></th>
<th align="center" style="background-color:#f3f3f4;">tautomorph.</th>
<th align="left" style="background-color:#f3f3f4;"><italic>example</italic></th>
<th align="center" style="background-color:#f3f3f4;">heteromorph.</th>
<th align="left" style="background-color:#f3f3f4;"><italic>example</italic></th>
</tr>
<tr>
<td colspan="5"><hr/></td>
</tr>
<tr>
<td align="left">plain-aspirate</td>
<td align="right">61</td>
<td align="left">&#8230;past&#688;a&#8230;</td>
<td align="right">261</td>
<td align="left">&#8230;pas+t&#688;a&#8230;</td>
</tr>
<tr>
<td align="left">plain-ejective</td>
<td align="right">68</td>
<td align="left">&#8230;past&#8217;a&#8230;</td>
<td align="right">668</td>
<td align="left">&#8230;pas+t&#8217;a&#8230;</td>
</tr>
<tr>
<td align="left">ejective-ejective (non-identical)</td>
<td align="right">4</td>
<td align="left">&#8230;p&#8217;ast&#8217;a&#8230;</td>
<td align="right">7</td>
<td align="left">&#8230;p&#8217;as+t&#8217;a&#8230;</td>
</tr>
<tr>
<td align="left">plain-plain</td>
<td align="right">5389</td>
<td align="left">&#8230;pasta&#8230;</td>
<td align="right">23519</td>
<td align="left">&#8230;pas+ta&#8230;</td>
</tr>
<tr>
<td align="left">aspirate-plain</td>
<td align="right">943</td>
<td align="left">&#8230;p&#688;asta&#8230;</td>
<td align="right">1262</td>
<td align="left">&#8230;p&#688;as+ta&#8230;</td>
</tr>
<tr>
<td align="left">ejective-plain</td>
<td align="right">906</td>
<td align="left">&#8230;p&#8217;asta&#8230;</td>
<td align="right">1781</td>
<td align="left">&#8230;p&#8217;as+ta&#8230;</td>
</tr>
<tr>
<td align="left">aspirate-aspirate</td>
<td align="right">712</td>
<td align="left">&#8230;p&#688;ast&#688;a&#8230;</td>
<td align="right">1</td>
<td align="left">&#8230;p&#688;as+t&#688;a&#8230;</td>
</tr>
<tr>
<td align="left">ejective-aspirate</td>
<td align="right">476</td>
<td align="left">&#8230;p&#8217;ast&#688;a&#8230;</td>
<td align="right">22</td>
<td align="left">&#8230;p&#8217;as+t&#688;a&#8230;</td>
</tr>
<tr>
<td align="left">aspirate-ejective</td>
<td align="right">38</td>
<td align="left">&#8230;p&#688;ast&#8217;a&#8230;</td>
<td align="right">29</td>
<td align="left">&#8230;p&#688;as+t&#8217;a&#8230;</td>
</tr>
</table>
</table-wrap>
<p>Table <xref ref-type="table" rid="T3">3</xref> shows that, within a baseline trigram, the restricted combinations are unattested tautomorphemically. Ejective-ejective combinations are also nearly unattested in heteromorphemic contexts, but plain-aspirate and plain-ejective combinations are more frequent across a morpheme boundary. The bottom portion of the table shows that other combinations of stops are either frequent in both heteromorphemic and tautomorphemic contexts, or are more frequent tautomorphemically than heteromorphemically (note that while ejectives and aspirates may occur in suffixes, the numbers here reflect the rarity of such suffixes in the corpus as a whole). As noted in Section 2.1 above, vowels in both roots and suffixes syncopate via affixation, creating consonant clusters across morpheme boundaries. This is crucial to the distinction between tautomorphemic and heteromorphemic contexts in Table <xref ref-type="table" rid="T3">3</xref>. The plain-aspirate, plain-ejective and ejective-ejective combinations that occur in a heteromorphemic trigram are actually adjacent in the linear string, since the morpheme boundary symbol constitutes the medial gram in the trigram. If Aymara did not have syncope, heteromorphemic combinations would only be noticeable in a tetragram: compare actual [qa<bold>q+t&#688;</bold>api+&#626;a] &#8216;to finish scratching&#8217; to hypothetical [qa<bold>qa+t&#688;</bold>api+&#626;a]. We will return to this point below.</p>
<p>Table <xref ref-type="table" rid="T4">4</xref> gives the counts on a stop projection. Here, both [<bold>p&#8217;</bold>a<bold>t&#8217;</bold>] and [<bold>p&#8217;</bold>an<bold>t&#8217;</bold>] would count as ejective-ejective combinations. Plain-aspirate and plain-ejective combinations are still much more frequent in heteromorphemic than tautomorphemic contexts, though there are substantially more exceptions tautomorphemically than are observable in a baseline trigram. Ejective-ejective combinations are again rare in both contexts. The bottom portion of the table shows that most combinations (plain-plain, aspirate-plain, ejective-plain) are well attested in both morphological contexts, while aspirate-ejective combinations are somewhat rare in both contexts. Aspirate-aspirate and ejective-aspirate combinations are both more frequent in tautomorphemic contexts, again due to the general rarity of aspirates in suffixes.</p>
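<p>The counting procedure on a stop projection can be illustrated with a minimal sketch. It is not the script used to produce the counts reported here: the ASCII transcription (an apostrophe for ejectives, a following &#8220;h&#8221; for aspirates, &#8220;ny&#8221; for the palatal nasal), the function names and the toy word list are all assumptions made for illustration, and the sketch does not separate identical from non-identical ejective pairs.</p>
<preformat>
```python
from collections import Counter

# Hypothetical ASCII transcription: "p" = plain, "p'" = ejective, "ph" = aspirate.
PLAIN = {"p", "t", "ch", "k", "q"}
EJECTIVE = {s + "'" for s in PLAIN}
ASPIRATE = {s + "h" for s in PLAIN}
BOUNDARY = "+"


def laryngeal_class(segment):
    """Return the laryngeal class of a stop, or None for any other segment."""
    if segment in EJECTIVE:
        return "ejective"
    if segment in ASPIRATE:
        return "aspirate"
    if segment in PLAIN:
        return "plain"
    return None


def stop_pairs(segments):
    """Yield (pair label, context) for successive stops on a stop projection.

    Non-stop segments are skipped. A pair is heteromorphemic if at least
    one boundary symbol lies between the two stops.
    """
    previous = None
    boundary_seen = False
    for segment in segments:
        if segment == BOUNDARY:
            boundary_seen = True
            continue
        current = laryngeal_class(segment)
        if current is None:
            continue
        if previous is not None:
            context = "heteromorphemic" if boundary_seen else "tautomorphemic"
            yield (previous + "-" + current, context)
        previous = current
        boundary_seen = False


def count_pairs(words):
    """Tally stop pairs by laryngeal combination and morphological context."""
    counts = Counter()
    for segments in words:
        counts.update(stop_pairs(segments))
    return counts


# Toy input (assumed segmentation, not corpus data).
toy_words = [
    ["p'", "a", "n", "t'", "a"],
    ["p", "a", "s", "+", "t'", "a"],
    ["q", "a", "q", "+", "th", "a", "p", "i", "+", "ny", "a"],
]
```
</preformat>
<p>Calling count_pairs(toy_words) tallies each successive pair of stops once, so the third toy word contributes a tautomorphemic plain-plain pair, a heteromorphemic plain-aspirate pair and a tautomorphemic aspirate-plain pair.</p>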
<p>The numbers here show that the restrictions on plain-ejective and plain-aspirate combinations in descriptive grammars are supported in counts over a word corpus, at both the baseline trigram level and on a nonlocal projection including only stops. The restrictions on ejective-ejective combinations are more difficult to assess, due to the rarity of heteromorphemic combinations (though our consultant work shows that these are possible, if not frequent in the corpus). The other six combinations of stops are reported to be licit in all morphological contexts, and this appears to be essentially true in our corpus as well, though certain combinations are unattested or nearly unattested (see Section 4.4.4 for discussion of other restricted combinations).</p>
<p>The experiments presented in the next section look at how native speakers of Aymara treat forms with plain-ejective and ejective-ejective combinations, in both heteromorphemic and tautomorphemic contexts. We then present the results of our computational model in Section 4, showing how the counts presented above are reflected in an inductive phonotactic grammar.</p>
</sec>
</sec>
<sec>
<title>3 Experimental work</title>
<p>The two experiments reported below provide behavioral evidence that speakers of Aymara have learned the laryngeal restrictions on ejectives, and that treatment of these phonotactic structures is sensitive to morphological structure.</p>
<sec>
<title>3.1 Experiment 1: sensitivity to restrictions on ejectives</title>
<p>Experiment 1 presents Aymara speakers with simple disyllabic forms, which could be interpreted as pseudo-nouns, that contain plain-ejective or ejective-ejective combinations. Speakers&#8217; errors in repeating such forms are evaluated to assess whether speakers have internalized phonotactic restrictions on these combinations.</p>
<sec>
<title>3.1.1 Participants</title>
<p>The participants were 21 native Aymara speakers, all Spanish bilinguals. All were college educated and resided in El Alto, Bolivia, and most were students at Universidad P&#250;blica de El Alto. There were seven male and fourteen female participants, aged 19&#8211;30. Fifteen participants reported that they had been speaking Aymara since birth, five learned Aymara between the ages of 4 and 7, and one at 12.</p>
</sec>
<sec sec-type="methods">
<title>3.1.2 Methods</title>
<p><italic>Stimuli</italic> The stimuli were disyllabic C<sub>1</sub>VC<sub>2</sub>V nonce forms. Control items had a phonotactically legal ejective in C<sub>2</sub> and a fricative or sonorant in C<sub>1</sub>. Ejective-ejective items contained a putative phonotactic violation by having heterorganic ejectives in C<sub>1</sub> and C<sub>2</sub>, and plain-ejective items contained a putatively restricted combination of a plain stop in C<sub>1</sub> and an ejective in C<sub>2</sub>. There were fifteen items of each type, and an additional fifteen phonotactically legal fillers with a plain stop in C<sub>2</sub>, for a total of 60 items. The complete list of stimuli is shown in Table <xref ref-type="table" rid="T5">5</xref>.</p>
<table-wrap id="T5">
<label>Table 5</label>
<caption>
<p>Stimuli for Experiment 1.</p>
</caption>
<table>
<tr>
<th align="center" style="background-color:#f3f3f4;" colspan="3">control</th>
<th align="center" style="background-color:#f3f3f4;" colspan="3">ejective-ejective</th>
<th align="center" style="background-color:#f3f3f4;" colspan="3">plain-ejective</th>
<th align="center" style="background-color:#f3f3f4;" colspan="3">filler</th>
</tr>
<tr>
<td colspan="12"><hr/></td>
</tr>
<tr>
<td align="left">lap&#8217;a</td>
<td align="left">saq&#8217;o</td>
<td align="left">nut&#8217;a</td>
<td align="left">k&#8217;it&#8217;a</td>
<td align="left">p&#8217;ik&#8217;a</td>
<td align="left">k&#8217;ap&#8217;u</td>
<td align="left">tip&#8217;i</td>
<td align="left">tuk&#8217;i</td>
<td align="left">kip&#8217;a</td>
<td align="left">t&#8217;apu</td>
<td align="left">kupa</td>
<td align="left">lipu</td>
</tr>
<tr>
<td align="left">jup&#8217;a</td>
<td align="left">moq&#8217;o</td>
<td align="left">yap&#8217;i</td>
<td align="left">p&#8217;it&#8217;a</td>
<td align="left">t&#8217;oq&#8217;e</td>
<td align="left">k&#8217;ut&#8217;a</td>
<td align="left">kut&#8217;a</td>
<td align="left">toq&#8217;e</td>
<td align="left">kap&#8217;i</td>
<td align="left">k&#8217;api</td>
<td align="left">tipu</td>
<td align="left">napu</td>
</tr>
<tr>
<td align="left">juk&#8217;u</td>
<td align="left">lip&#8217;u</td>
<td align="left">lik&#8217;a</td>
<td align="left">q&#8217;ap&#8217;i</td>
<td align="left">t&#8217;ap&#8217;u</td>
<td align="left">k&#8217;up&#8217;i</td>
<td align="left">pit&#8217;a</td>
<td align="left">kup&#8217;a</td>
<td align="left">tip&#8217;a</td>
<td align="left">k&#8217;ati</td>
<td align="left">kipi</td>
<td align="left">natu</td>
</tr>
<tr>
<td align="left">juk&#8217;a</td>
<td align="left">nap&#8217;u</td>
<td align="left">luk&#8217;a</td>
<td align="left">k&#8217;ip&#8217;a</td>
<td align="left">k&#8217;ip&#8217;i</td>
<td align="left">p&#8217;uk&#8217;a</td>
<td align="left">kip&#8217;u</td>
<td align="left">kap&#8217;a</td>
<td align="left">puk&#8217;i</td>
<td align="left">k&#8217;upi</td>
<td align="left">kapu</td>
<td align="left">japi</td>
</tr>
<tr>
<td align="left">seq&#8217;a</td>
<td align="left">nat&#8217;u</td>
<td align="left">maq&#8217;o</td>
<td align="left">q&#8217;at&#8217;a</td>
<td align="left">q&#8217;op&#8217;i</td>
<td align="left">t&#8217;aq&#8217;o</td>
<td align="left">qat&#8217;i</td>
<td align="left">tip&#8217;u</td>
<td align="left">taq&#8217;e</td>
<td align="left">p&#8217;uka</td>
<td align="left">puki</td>
<td align="left">luka</td>
</tr>
</table>
</table-wrap>
<p>The stimuli were made from recordings of a native Aymara-speaking consultant reading phonotactically legal nonce words. Each stimulus was created by splicing together C<sub>1</sub>V and C<sub>2</sub>V during the closure of the second stop, e.g., [lap&#8217;a] was made by splicing [lapa] and [map&#8217;a] together during the labial closure. All stimuli were normalized for amplitude, but were otherwise unmodified.</p>
<p><italic>Procedure</italic> Participants were seated in front of a laptop computer wearing AudioTechnica noise cancelling headphones. The stimuli were presented using PsyScope (<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://psy.ck.sissa.it/">http://psy.ck.sissa.it/</ext-link>). On each trial, the audio stimulus was played once and participants were asked to repeat what they heard as precisely as possible. Participants were told that the words they would hear were not real words of Aymara, though they would contain sounds familiar from Aymara. Once participants had repeated the item, they pressed any key on the keyboard to move on to the next trial. No orthographic representation of the stimuli was given.</p>
<p><italic>Analysis</italic> The audio recordings of participants&#8217; responses were transcribed, and coded for accuracy and type of error, if any. Errors on ejective-ejective and plain-ejective items were then further classified as repairs or non-repairs, depending on whether they removed the putative phonotactic violation or not.</p>
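<p>The repair versus non-repair coding can be made concrete with a short sketch. It assumes a hypothetical ASCII transcription (an apostrophe for ejectives, a following &#8220;h&#8221; for aspirates) and treats a response as containing a violation if two successive stops form a plain-ejective, plain-aspirate or non-identical ejective-ejective pair. The function names are illustrative; the coding reported here was done from transcriptions, not with this code.</p>
<preformat>
```python
# Hypothetical ASCII transcription: "k" = plain, "k'" = ejective, "kh" = aspirate.
STOPS = {"p", "t", "ch", "k", "q"}


def parse_stop(segment):
    """Return (place, laryngeal class) for a stop, or None for other segments."""
    if segment.endswith("'") and segment[:-1] in STOPS:
        return (segment[:-1], "ejective")
    if segment.endswith("h") and segment[:-1] in STOPS:
        return (segment[:-1], "aspirate")
    if segment in STOPS:
        return (segment, "plain")
    return None


def has_violation(segments):
    """True if any two successive stops form a restricted combination."""
    stops = [s for s in map(parse_stop, segments) if s is not None]
    for (place1, lar1), (place2, lar2) in zip(stops, stops[1:]):
        # Plain stop followed by an ejective or an aspirate.
        if lar1 == "plain" and lar2 in ("ejective", "aspirate"):
            return True
        # Two ejectives at different places of articulation.
        if lar1 == lar2 == "ejective" and place1 != place2:
            return True
    return False


def code_response(target, response):
    """Code a response as 'correct', 'repair' or 'non-repair'."""
    if response == target:
        return "correct"
    return "non-repair" if has_violation(response) else "repair"
```
</preformat>
<p>Under these assumptions, a response [k&#8217;apu] to the target [k&#8217;ap&#8217;u] is coded as a repair, while a response [kap&#8217;u] to the same target is coded as a non-repair, since it still contains a restricted plain-ejective pair.</p>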
</sec>
<sec>
<title>3.1.3 Results</title>
<p><italic>Accuracy</italic> Overall accuracy differed between control, ejective-ejective and plain-ejective items, as shown in Figure <xref ref-type="fig" rid="F1">1</xref>. Accuracy on control items was very high (97%), while accuracy on ejective-ejective and plain-ejective items was lower, consistent with a difference in phonotactic legality. Items with an ejective-ejective combination were repeated accurately more often than items with a plain-ejective combination (50% vs. 32%).</p>
<fig id="F1">
<label>Figure 1</label>
<caption>
<p>Accuracy on control, ejective-ejective and plain-ejective forms in Experiment 1. Open circles indicate an individual participant&#8217;s performance; boxplots show summary statistics across all participants.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="/article/id/5136/file/65072/"/>
</fig>
<p>All trials were coded for accuracy (correct or incorrect), and a binomial, linear mixed model was then fit with accuracy as the dependent variable, a ternary predictor of type, a random effect of type by participant and a random slope by participant. Ejective-ejective was set as the baseline to which the other two factor levels, plain-ejective and control, were compared. The model was fit using the lmer function in the <italic>lme4</italic> package (<xref ref-type="bibr" rid="B4">Bates et al. 2014</xref>) for R (<xref ref-type="bibr" rid="B39">R Development Core Team 2018</xref>, <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.r-project.org/">https://www.r-project.org/</ext-link>). Both comparisons were significant. Accuracy on control is significantly higher than accuracy on ejective-ejective (<italic>&#946;</italic> = 4.81, SE = 1.02, z = 4.72, p &lt; 0.0001), and accuracy on plain-ejective stimuli is significantly lower than on ejective-ejective stimuli (<italic>&#946;</italic> = &#8211;1.03, SE = 0.44, z = &#8211;2.31, p = 0.02).</p>
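<p>For readers who wish to relate the reported coefficients to the test statistics, the following sketch assumes that each z value is a Wald statistic (the estimate divided by its standard error) evaluated against a standard normal distribution. That is an assumption about how the statistics were obtained, and the function names are illustrative.</p>
<preformat>
```python
import math


def wald_z(beta, se):
    """Wald statistic: coefficient estimate divided by its standard error."""
    return beta / se


def two_sided_p(z):
    """Two-sided p-value for z under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Estimates as reported in the text (already rounded to two decimals).
z_control = wald_z(4.81, 1.02)
z_plain_ejective = wald_z(-1.03, 0.44)
p_plain_ejective = two_sided_p(-2.31)
```
</preformat>
<p>With the rounded estimates, 4.81/1.02 is approximately 4.72, matching the reported value, whereas &#8211;1.03/0.44 is approximately &#8211;2.34 and not the reported &#8211;2.31; a discrepancy of this size is expected when the statistic is recomputed from rounded estimates and standard errors.</p>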
<p>To allow comparison with Experiment 2 below, the results of Experiment 1 were also analyzed for an effect of place of articulation. Plain-ejective and ejective-ejective trials were coded for whether the medial ejective was dental or not (labial, velar or uvular), and a model was fit to accuracy (correct or incorrect) with type (plain-ejective or ejective-ejective), place (dental or not) and their interaction as predictors. The model had a random intercept by participant and a random slope for place (a model with random slopes for place and type failed to converge). The interaction between type and place was not significant, so it was removed from the model. In the model without the interaction, the main effect of type was again significant (accuracy on plain-ejective stimuli is lower than on ejective-ejective stimuli, <italic>&#946;</italic> = &#8211;0.90, SE = 0.19, z = &#8211;4.83, p &lt; 0.0001), and the model also found a main effect of place. Accuracy on forms with a dental ejective in C<sub>2</sub> is higher than on forms with a non-dental (<italic>&#946;</italic> = 0.67, SE = 0.29, z = &#8211;2.30, p = 0.02).</p>
<p><italic>Errors</italic> The frequency of different errors on ejective-ejective and plain-ejective stimuli is summarized in Table <xref ref-type="table" rid="T6">6</xref> and Table <xref ref-type="table" rid="T7">7</xref>, distinguishing between repair and non-repair errors.<xref ref-type="fn" rid="n4">4</xref></p>
<table-wrap id="T6">
<label>Table 6</label>
<caption>
<p>Errors on ejective-ejective stimuli.</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;"></th>
<th align="left" style="background-color:#f3f3f4;">error</th>
<th align="left" style="background-color:#f3f3f4;">example</th>
<th align="center" style="background-color:#f3f3f4;">% of responses</th>
<th align="center" style="background-color:#f3f3f4;">(n)</th>
</tr>
<tr>
<td colspan="5"><hr/></td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="5">repair</td>
<td align="left">C<sub>2</sub> de-ejectivization</td>
<td align="left">k&#8217;ap&#8217;u &#8594; k&#8217;apu</td>
<td align="right">39%</td>
<td align="right">(121)</td>
</tr>
<tr>
<td align="left">C<sub>1</sub> and C<sub>2</sub> de-ejectivization</td>
<td align="left">k&#8217;ap&#8217;u &#8594; kapu</td>
<td align="right">2%</td>
<td align="right">(6)</td>
</tr>
<tr>
<td align="left">C<sub>1</sub> deletion</td>
<td align="left">k&#8217;ap&#8217;u &#8594; ap&#8217;u</td>
<td align="right">0.5%</td>
<td align="right">(1)</td>
</tr>
<tr>
<td align="left">C<sub>2</sub> aspiration</td>
<td align="left">k&#8217;ap&#8217;u &#8594; k&#8217;ap&#688;u</td>
<td align="right">1%</td>
<td align="right">(4)</td>
</tr>
<tr>
<td align="left"><italic>total</italic></td>
<td align="left"></td>
<td align="right"><italic>42.5%</italic></td>
<td align="right"></td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="3">non-repair</td>
<td align="left">C<sub>1</sub> de-ejectivization</td>
<td align="left">k&#8217;ap&#8217;u &#8594; kap&#8217;u</td>
<td align="right">4%</td>
<td align="right">(12)</td>
</tr>
<tr>
<td align="left">labial &#8594; dental</td>
<td align="left">k&#8217;ap&#8217;u &#8594; k&#8217;at&#8217;u</td>
<td align="right">3.5%</td>
<td align="right">(11)</td>
</tr>
<tr>
<td align="left"><italic>total</italic></td>
<td align="left"></td>
<td align="right"><italic>7.5%</italic></td>
<td align="right"></td>
</tr>
</table>
</table-wrap>
<table-wrap id="T7">
<label>Table 7</label>
<caption>
<p>Errors on plain-ejective stimuli.</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;"></th>
<th align="left" style="background-color:#f3f3f4;">error</th>
<th align="left" style="background-color:#f3f3f4;">example</th>
<th align="center" style="background-color:#f3f3f4;">% of responses</th>
<th align="center" style="background-color:#f3f3f4;">(n)</th>
</tr>
<tr>
<td colspan="5"><hr/></td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="4">repair</td>
<td align="left">C<sub>2</sub> de-ejectivization</td>
<td align="left">kap&#8217;u &#8594; kapu</td>
<td align="right">12%</td>
<td align="right">(37)</td>
</tr>
<tr>
<td align="left">ejective reassociation</td>
<td align="left">kap&#8217;u &#8594; k&#8217;apu</td>
<td align="right">28%</td>
<td align="right">(86)</td>
</tr>
<tr>
<td align="left">C<sub>1</sub> change</td>
<td align="left">kap&#8217;u &#8594; hap&#8217;u</td>
<td align="right">5%</td>
<td align="right">(16)</td>
</tr>
<tr>
<td align="left"><italic>total</italic></td>
<td align="left"></td>
<td align="right"><italic>45%</italic></td>
<td align="right"></td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="4">non-repair</td>
<td align="left">ejective doubling</td>
<td align="left">kap&#8217;u &#8594; k&#8217;ap&#8217;u</td>
<td align="right">22%</td>
<td align="right">(66)</td>
</tr>
<tr>
<td align="left">labial &#8594; dental</td>
<td align="left">kap&#8217;u &#8594; kat&#8217;u</td>
<td align="right">1%</td>
<td align="right">(3)</td>
</tr>
<tr>
<td align="left">C<sub>2</sub> aspiration</td>
<td align="left">kap&#8217;u &#8594; kap&#688;u</td>
<td align="right">1%</td>
<td align="right">(2)</td>
</tr>
<tr>
<td align="left"><italic>total</italic></td>
<td align="left"></td>
<td align="right"><italic>24%</italic></td>
<td align="right"></td>
</tr>
</table>
</table-wrap>
<p>Errors on both types of stimuli remove the putative phonotactic violation more often than not, though ejective doubling errors are quite common on plain-ejective stimuli. Looking at the distribution of errors, we can see that the difference in accuracy between ejective-ejective and plain-ejective stimuli does not stem from a difference in how often these structures are repaired; repair rates for the two stimulus types are comparable. Instead, the lower accuracy on plain-ejective forms overall is driven by the greater rate of non-repair errors; see Gallagher (<xref ref-type="bibr" rid="B20">2016</xref>) for further discussion of this type of error. The relevance of errors that map a labial ejective to a dental will be discussed in conjunction with the results of Experiment 2 in Section 3.3.</p>
</sec>
<sec>
<title>3.1.4 Discussion</title>
<p>The results of Experiment 1 support the status of both ejective-ejective and plain-ejective combinations as synchronically restricted in Aymara. Forms that violate these restrictions are repeated significantly less accurately than phonotactically legal controls.</p>
<p>There is also an effect of place of articulation, with higher accuracy on forms with medial dental ejectives. As described above, Aymara has a productive suffix [t&#8217;a] that may result in plain-ejective or ejective-ejective combinations at the word level. While the stimuli in Experiment 1 had the shape of bare roots &#8211; as opposed to the morphologically complex forms in Experiment 2 &#8211; the greater accuracy on forms with a dental ejective reflects the likelihood of a dental in C<sub>2</sub> when plain-ejective and ejective-ejective combinations occur at the word level. The independent roles of place of articulation and morphological structure will be discussed further below, by comparing the results of Experiment 1 and Experiment 2.</p>
</sec>
</sec>
<sec>
<title>3.2 Experiment 2: Laryngeal restrictions and morphological structure</title>
<p>The goal of Experiment 2 is to test whether speakers&#8217; errors on ejective-ejective and plain-ejective combinations are influenced by morphological structure, and whether these restrictions hold across more than a single intervening vowel. Experiment 2 presents participants with the same kinds of phonotactically illegal structures &#8211; ejective-ejective and plain-ejective pairs &#8211; as in Experiment 1, but in Experiment 2 the nonce words are pseudo-verbs ending in the infinitival suffix [-&#626;a]. The experiment compares performance on stimuli where the illegal ejective in C<sub>2</sub> must be interpreted as part of the root vs. stimuli where the illegal ejective in C<sub>2</sub> may be interpreted as part of a productive suffix [t&#8217;a]. Additionally, the pseudo-roots in Experiment 2 all contain a coda consonant, so the interacting consonants are separated by a VC sequence as opposed to the single V in Experiment 1.</p>
<sec>
<title>3.2.1 Participants</title>
<p>The participants were 20 of the participants from Experiment 1 (data from one participant were accidentally not recorded). Participants were balanced as to whether they completed Experiment 1 or Experiment 2 first.</p>
</sec>
<sec sec-type="methods">
<title>3.2.2 Methods</title>
<p><italic>Stimuli</italic> The stimuli were trisyllabic pseudo-verbs, all ending in the infinitival suffix [-&#626;a]. The pseudo-verb stem was C<sub>1</sub>V<sub>1</sub>CC<sub>2</sub>V<sub>2</sub>, where C<sub>1</sub> was either a plain stop or an ejective and C<sub>2</sub>V<sub>2</sub> was either [p&#8217;a] or [t&#8217;a]. Forms with [t&#8217;a] were plausibly polymorphemic, while forms with [p&#8217;a] were not, since [t&#8217;a] is a productive verbal suffix and there is no suffix [-p&#8217;a]. All forms had a coda consonant in the first syllable, because in real words the suffix [-t&#8217;a] triggers deletion of a root vowel and thus forms a cluster with the final root consonant.</p>
<p>The test forms just described fell into one of four categories, based on the laryngeal restriction that was violated and the place of articulation of C<sub>2</sub> (as a stand-in for implied morphological complexity): plain-ejective-labial, plain-ejective-dental, ejective-ejective-labial, ejective-ejective-dental. There were ten tokens in each test category and 40 filler items, which had a plain stop in C<sub>2</sub> and a plain stop, fricative or sonorant in C<sub>1</sub>, for a total of 80 items. The test items are given in Table <xref ref-type="table" rid="T8">8</xref>.</p>
<table-wrap id="T8">
<label>Table 8</label>
<caption>
<p>Stimuli for Experiment 2 (not including fillers).</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;">ej-ej-labial</th>
<th align="left" style="background-color:#f3f3f4;">ej-ej-dental</th>
<th align="left" style="background-color:#f3f3f4;">pl-ej-labial</th>
<th align="left" style="background-color:#f3f3f4;">pl-ej-dental</th>
</tr>
<tr>
<td colspan="4"><hr/></td>
</tr>
<tr>
<td align="left">k&#8217;asp&#8217;a+&#626;a</td>
<td align="left">k&#8217;as+t&#8217;a+&#626;a</td>
<td align="left">kasp&#8217;a+&#626;a</td>
<td align="left">kas+t&#8217;a+&#626;a</td>
</tr>
<tr>
<td align="left">k&#8217;isp&#8217;a+&#626;a</td>
<td align="left">k&#8217;is+t&#8217;a+&#626;a</td>
<td align="left">kisp&#8217;a+&#626;a</td>
<td align="left">kis+t&#8217;a+&#626;a</td>
</tr>
<tr>
<td align="left">k&#8217;a&#654;p&#8217;a+&#626;a</td>
<td align="left">k&#8217;a&#654;+t&#8217;a+&#626;a</td>
<td align="left">ka&#654;p&#8217;a+&#626;a</td>
<td align="left">ka&#654;+t&#8217;a+&#626;a</td>
</tr>
<tr>
<td align="left">k&#8217;u&#654;p&#8217;a+&#626;a</td>
<td align="left">k&#8217;u&#654;+t&#8217;a+&#626;a</td>
<td align="left">ku&#654;p&#8217;a+&#626;a</td>
<td align="left">ku&#654;+t&#8217;a+&#626;a</td>
</tr>
<tr>
<td align="left">&#679;&#8217;imp&#8217;a+&#626;a</td>
<td align="left">&#679;&#8217;in+t&#8217;a+&#626;a</td>
<td align="left">&#679;imp&#8217;a+&#626;a</td>
<td align="left">&#679;in+t&#8217;a+&#626;a</td>
</tr>
<tr>
<td align="left">&#679;&#8217;amp&#8217;a+&#626;a</td>
<td align="left">&#679;&#8217;an+t&#8217;a+&#626;a</td>
<td align="left">&#679;amp&#8217;a+&#626;a</td>
<td align="left">&#679;an+t&#8217;a+&#626;a</td>
</tr>
<tr>
<td align="left">&#679;&#8217;u&#654;p&#8217;a+&#626;a</td>
<td align="left">&#679;&#8217;u&#654;+t&#8217;a+&#626;a</td>
<td align="left">&#679;u&#654;p&#8217;a+&#626;a</td>
<td align="left">&#679;u&#654;+t&#8217;a+&#626;a</td>
</tr>
<tr>
<td align="left">&#679;&#8217;i&#654;p&#8217;a+&#626;a</td>
<td align="left">&#679;&#8217;i&#654;+t&#8217;a+&#626;a</td>
<td align="left">&#679;i&#654;p&#8217;a+&#626;a</td>
<td align="left">&#679;i&#654;+t&#8217;a+&#626;a</td>
</tr>
<tr>
<td align="left">q&#8217;asp&#8217;a+&#626;a</td>
<td align="left">q&#8217;as+t&#8217;a+&#626;a</td>
<td align="left">qasp&#8217;a+&#626;a</td>
<td align="left">qas+t&#8217;a+&#626;a</td>
</tr>
<tr>
<td align="left">q&#8217;o&#654;p&#8217;a+&#626;a</td>
<td align="left">q&#8217;o&#654;+t&#8217;a+&#626;a</td>
<td align="left">qo&#654;p&#8217;a+&#626;a</td>
<td align="left">qo&#654;+t&#8217;a+&#626;a</td>
</tr>
</table>
</table-wrap>
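<p>The two-by-two design behind Table <xref ref-type="table" rid="T8">8</xref> can be summarized in a short sketch that rebuilds the forty test items from ten root frames. This is a reconstruction of the pattern visible in the table, not a description of how the items were originally created. The ASCII transcription (&#8220;ch&#8221; for the affricate, &#8220;ly&#8221; for the palatal lateral, &#8220;ny&#8221; for the palatal nasal, &#8220;N&#8221; for a nasal coda that matches the place of the following stop) and all names are assumptions.</p>
<preformat>
```python
# Root frames (C1, V1, coda) read off Table 8, in a hypothetical ASCII
# transcription. "N" is a nasal coda realized as m before p' and n before t'.
FRAMES = [
    ("k", "a", "s"), ("k", "i", "s"), ("k", "a", "ly"), ("k", "u", "ly"),
    ("ch", "i", "N"), ("ch", "a", "N"), ("ch", "u", "ly"), ("ch", "i", "ly"),
    ("q", "a", "s"), ("q", "o", "ly"),
]
INFINITIVE = "+nya"


def build_item(frame, c1_ejective, c2_place):
    """Assemble one pseudo-verb from a root frame and the two design factors."""
    c1, v1, coda = frame
    onset = c1 + "'" if c1_ejective else c1
    if c2_place == "labial":
        coda = "m" if coda == "N" else coda
        # No boundary before p'a: there is no suffix of that shape.
        stem = onset + v1 + coda + "p'a"
    else:
        coda = "n" if coda == "N" else coda
        # Boundary before t'a: the form can be parsed as root plus suffix.
        stem = onset + v1 + coda + "+t'a"
    return stem + INFINITIVE


def build_design():
    """Return the four test categories, each with one item per root frame."""
    design = {}
    for c1_ejective in (True, False):
        violation = "ej-ej" if c1_ejective else "pl-ej"
        for c2_place in ("labial", "dental"):
            items = [build_item(f, c1_ejective, c2_place) for f in FRAMES]
            design[violation + "-" + c2_place] = items
    return design
```
</preformat>
<p>Each of the four categories contains ten items, giving the 40 test items that combine with the 40 fillers for a total of 80.</p>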
<p>The stimuli were made from recordings of a native Aymara speaker producing phonotactically legal nonce words, using the same splicing method and normalization as described for Experiment 1.</p>
<p><italic>Procedure</italic> The procedure was identical to Experiment 1.</p>
<p><italic>Analysis</italic> The analysis was identical to Experiment 1.</p>
</sec>
<sec>
<title>3.2.3 Results</title>
<p>Results from three participants were removed from further analysis because they had a low accuracy rate on filler items (15%, 32% and 64%), showing that they struggled with the task as a whole. The following discussion reflects the results of the remaining 17 participants.</p>
<p><italic>Accuracy</italic> As shown in Figure <xref ref-type="fig" rid="F2">2</xref>, forms with [t&#8217;a], which are plausibly polymorphemic, were repeated accurately more often than forms with [p&#8217;a], where the plain-ejective or ejective-ejective combination must be interpreted as monomorphemic (74% vs. 35%). This distinction held for both ejective-ejective and plain-ejective combinations.</p>
<fig id="F2">
<label>Figure 2</label>
<caption>
<p>Accuracy in Experiment 2. Open circles indicate an individual participant&#8217;s performance; boxplots show summary statistics across all participants.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="/article/id/5136/file/65073/"/>
</fig>
<p>A binomial, linear mixed model was fit to accuracy with predictors of place of articulation (labial or dental), violation type (ejective-ejective or plain-ejective), and their interaction, along with by-participant random slopes for place and type (a model with the interaction as a random by-participant slope failed to converge) and a random intercept for participant. The model finds a main effect of place, with lower accuracy on labial forms than dental forms (<italic>&#946;</italic> = &#8211;1.37, SE = 0.31, t = &#8211;4.39, p &lt; 0.0001). The model also revealed a significant interaction between place and violation type (<italic>&#946;</italic> = &#8211;1.52, SE = 0.62, t = &#8211;2.46, p = 0.02). While overall accuracy only marginally differs between ejective-ejective and plain-ejective violations (<italic>&#946;</italic> = 1.11, SE = 0.59, t = 1.89, p = 0.06), the direction of the effect differs depending on place. For labials, accuracy on ejective-ejective is slightly higher than on plain-ejective (38.5% vs. 32%), while for dentals, plain-ejective accuracy is higher than ejective-ejective accuracy (82% vs. 67%).</p>
<p><italic>Errors</italic> In Experiment 2, a high number of errors involved changing a labial ejective to a dental ejective, thereby repairing the phonotactic violation by separating the combining stops with a morpheme boundary. For example, in [k&#8217;asp&#8217;a&#626;a], the pair of ejectives must be interpreted as cooccurring within a root (*[k&#8217;asp&#8217;a+&#626;a]), while in [k&#8217;ast&#8217;a&#626;a] the pair of ejectives may be interpreted as cooccurring across a morpheme boundary (*[k&#8217;ast&#8217;a+&#626;a] and [k&#8217;as+t&#8217;a+&#626;a] are both possible parses). The frequency of different errors is summarized in Tables <xref ref-type="table" rid="T9">9</xref> and <xref ref-type="table" rid="T10">10</xref>. In each table, errors in the top section remove the phonotactic violation entirely. In the second section, errors change the plausible morphological structure of the pseudo-verb by changing a labial to a dental, and in the third section errors do not repair the violation. Virtually all errors are much more frequent for labial forms than dental forms, and place errors are only attested for labial forms.</p>
<table-wrap id="T9">
<label>Table 9</label>
<caption>
<p>Errors on cooccurrence stimuli. Top: errors that remove the phonotactic violation, middle: errors that change place and morphological structure, bottom: non-repair errors. Percentages indicate the total rate of errors out of all responses (e.g., 60.5% of labial stimuli were produced with errors, and 38.5% were produced without errors).</p>
</caption>
<table>
<tr>
<th align="left" valign="top" style="background-color:#f3f3f4;" rowspan="3"></th>
<th align="left" valign="top" style="background-color:#f3f3f4;" rowspan="3">error</th>
<th align="left" valign="top" style="background-color:#f3f3f4;" rowspan="3">example</th>
<th align="center" valign="top" style="background-color:#f3f3f4;" colspan="2">labial</th>
<th align="center" valign="top" style="background-color:#f3f3f4;" colspan="2">dental</th>
</tr>
<tr>
<th colspan="4"><hr/></th>
</tr>
<tr>
<th align="center" style="background-color:#f3f3f4;">%</th>
<th align="center" style="background-color:#f3f3f4;">(n)</th>
<th align="center" style="background-color:#f3f3f4;">%</th>
<th align="center" style="background-color:#f3f3f4;">(n)</th>
</tr>
<tr>
<td colspan="7"><hr/></td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="3">lar. repairs</td>
<td align="left">C<sub>2</sub> de-ej.</td>
<td align="left">k&#8217;asp&#8217;a&#626;a &#8594; k&#8217;aspa&#626;a</td>
<td align="right">18.5</td>
<td align="right">(31)</td>
<td align="right">7</td>
<td align="right">(12)</td>
</tr>
<tr>
<td align="left">C<sub>1</sub> &amp; C<sub>2</sub> de-ej.</td>
<td align="left">k&#8217;asp&#8217;a&#626;a &#8594; kaspa&#626;a</td>
<td align="right">4</td>
<td align="right">(6)</td>
<td align="right">1</td>
<td align="right">(2)</td>
</tr>
<tr>
<td align="left">C<sub>1</sub> change</td>
<td align="left">k&#8217;asp&#8217;a&#626;a &#8594; asp&#8217;a&#626;a</td>
<td align="right">0.5</td>
<td align="right">(1)</td>
<td align="right">2</td>
<td align="right">(3)</td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="2">pl. repairs</td>
<td align="left">C<sub>1</sub> de-ej &amp; place</td>
<td align="left">k&#8217;asp&#8217;a&#626;a &#8594; kast&#8217;a&#626;a</td>
<td align="right">12</td>
<td align="right">(20)</td>
<td align="right">0</td>
<td align="right">(0)</td>
</tr>
<tr>
<td align="left">place change</td>
<td align="left">k&#8217;asp&#8217;a&#626;a &#8594; k&#8217;ast&#8217;a&#626;a</td>
<td align="right">18.5</td>
<td align="right">(31)</td>
<td align="right">0</td>
<td align="right">(0)</td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="2">non-repairs</td>
<td align="left">C<sub>1</sub> de-ej.</td>
<td align="left">k&#8217;asp&#8217;a&#626;a &#8594; kasp&#8217;a&#626;a</td>
<td align="right">7</td>
<td align="right">(12)</td>
<td align="right">23</td>
<td align="right">(39)</td>
</tr>
<tr>
<td align="left"><italic>total</italic></td>
<td align="left"></td>
<td align="right"><italic>60.5</italic></td>
<td align="right"></td>
<td align="right"><italic>33</italic></td>
<td align="right"></td>
</tr>
</table>
</table-wrap>
<table-wrap id="T10">
<label>Table 10</label>
<caption>
<p>Errors on ordering stimuli. Top: errors that remove the phonotactic violation, middle: errors that change place and morphological structure, bottom: non-repair errors. Percentages indicate the total rate of errors out of all responses (e.g., 18% of dental stimuli were produced with errors, and 82% were produced without errors).</p>
</caption>
<table>
<tr>
<th align="left" valign="top" style="background-color:#f3f3f4;" rowspan="3"></th>
<th align="left" valign="top" style="background-color:#f3f3f4;" rowspan="3">error</th>
<th align="left" valign="top" style="background-color:#f3f3f4;" rowspan="3">example</th>
<th align="center" valign="top" style="background-color:#f3f3f4;" colspan="2">labial</th>
<th align="center" valign="top" style="background-color:#f3f3f4;" colspan="2">dental</th>
</tr>
<tr>
<th colspan="4"><hr/></th>
</tr>
<tr>
<th align="center" valign="top" style="background-color:#f3f3f4;">%</th>
<th align="center" valign="top" style="background-color:#f3f3f4;">(n)</th>
<th align="center" valign="top" style="background-color:#f3f3f4;">%</th>
<th align="center" valign="top" style="background-color:#f3f3f4;">(n)</th>
</tr>
<tr>
<td colspan="7"><hr/></td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="3">lar. repairs</td>
<td align="left">C<sub>2</sub> de-ej.</td>
<td align="left">kasp&#8217;a&#626;a --&gt; kaspa&#626;a</td>
<td align="right">15</td>
<td align="right">(28)</td>
<td align="right">10</td>
<td align="right">(17)</td>
</tr>
<tr>
<td align="left">ej. reassociation</td>
<td align="left">kasp&#8217;a&#626;a --&gt; k&#8217;aspa&#626;a</td>
<td align="right">6</td>
<td align="right">(11)</td>
<td align="right">0.5</td>
<td align="right">(1)</td>
</tr>
<tr>
<td align="left">C<sub>1</sub> change</td>
<td align="left">kasp&#8217;a&#626;a --&gt; asp&#8217;a&#626;a</td>
<td align="right">0</td>
<td align="right">(0)</td>
<td align="right">0.5</td>
<td align="right">(2)</td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="2">pl. repairs</td>
<td align="left">place change</td>
<td align="left">kasp&#8217;a&#626;a --&gt; kast&#8217;a&#626;a</td>
<td align="right">40</td>
<td align="right">(74)</td>
<td align="right">0</td>
<td align="right">(0)</td>
</tr>
<tr>
<td align="left">ej. double &amp; pl.</td>
<td align="left">kasp&#8217;a&#626;a --&gt; k&#8217;ast&#8217;a&#626;a</td>
<td align="right">2</td>
<td align="right">(4)</td>
<td align="right">0</td>
<td align="right">(0)</td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="2">non-repairs</td>
<td align="left">ejective double</td>
<td align="left">kasp&#8217;a&#626;a --&gt; k&#8217;asp&#8217;a&#626;a</td>
<td align="right">5</td>
<td align="right">(9)</td>
<td align="right">7</td>
<td align="right">(11)</td>
</tr>
<tr>
<td align="left"><italic>total</italic></td>
<td align="left"></td>
<td align="right"><italic>68</italic></td>
<td align="right"></td>
<td align="right"><italic>18</italic></td>
<td align="right"></td>
</tr>
</table>
</table-wrap>
</sec>
<sec>
<title>3.2.4 Discussion</title>
<p>Experiment 2 shows that speakers&#8217; treatment of plain-ejective and ejective-ejective combinations is sensitive to the inferred morphological structure of the forms in which they occur. When these consonant combinations must be interpreted as being tautomorphemic, there is a higher error rate than when the combination may be interpreted as heteromorphemic.</p>
</sec>
</sec>
<sec>
<title>3.3 Summary and comparison of Experiments 1 and 2</title>
<p>Together, Experiments 1 and 2 show that Aymara speakers have learned morphologically sensitive restrictions on ejective-ejective and plain-ejective combinations. Comparison between the two experiments supports a role for both morphological structure and phonetic category in the place of articulation effect found in both studies.</p>
<p>Error rates on forms with dental ejectives were lower than error rates on forms with non-dental ejectives in both experiments. This difference is at least partly an effect of phonetic category: because of the frequency of the suffix [-t&#8217;a], dental ejectives are much more common as the second consonant in a plain-ejective or ejective-ejective sequence than ejectives at other places of articulation (97% of such combinations in our corpus have [t&#8217;] as C<sub>2</sub>). We can thus conclude that speakers are sensitive to the distribution of [t&#8217;], which differs from that of other ejectives.</p>
<p>To see if there is also a contribution of morphological structure, we need to consider the increased accuracy on dental ejective forms between Experiments 1 and 2. In Experiment 1, all forms, including those with dental ejectives, have a monomorphemic structure. Some of the dental ejective stimuli in Experiment 1 cannot be decomposed into suffixes (e.g., [qat&#8217;i] &#8211; there is no suffix [t&#8217;i]). Stimuli with [t&#8217;a], such as [kat&#8217;a] and [nut&#8217;a], are unlikely to be analyzed as root-suffix because the suffix [t&#8217;a] always attaches to bases that are at least CVC (e.g., [taw+t&#8217;a+&#626;a] &#8216;about to row&#8217;).<xref ref-type="fn" rid="n5">5</xref> The other reason for doubting that the dental ejective stimuli in Experiment 1 are morphologically decomposed is that [t&#8217;a] is an aspectual suffix that is usually followed by tense and person marking. There are only six words in the corpus where [-t&#8217;a] is the last and only suffix, and none of them have the shape CVt&#8217;a.</p>
<p>On the other hand, in Experiment 2, forms with a dental ejective have a polymorphemic structure, since they contain the two verbal suffixes [t&#8217;a] and [&#626;a]. To test for an effect of morphological structure, an additional, post-hoc statistical model was run. Responses to ejective-ejective and plain-ejective stimuli from the two experiments were pooled and a binomial generalized linear mixed model was fitted. The dependent variable was accuracy, and the independent variables were experiment (Experiment 1 or Experiment 2) and place (dental or not). There was a random intercept for participant and a random by-participant slope for place (a model with a random slope for experiment failed to converge). The model found a significant interaction between place and experiment (<italic>&#946;</italic> = 1.14, SE = 0.28, t = 4.15, p &lt; 0.0001), revealing that the effect of place differs between the two experiments. The difference between accuracy on dental and non-dental forms is larger in Experiment 2 (74% vs. 35%, a 39 point difference), where stimuli have a polymorphemic structure, than in Experiment 1 (52% vs. 37%, a 15 point difference), where stimuli have a monomorphemic structure. While the experiments were not originally designed to be compared in this way, the raw differences in accuracy and the low p-value of the interaction term support an effect of morphological structure above and beyond place of articulation.</p>
<p>The types of errors also differ between the two experiments, further showing the importance of morphological structure for repetition accuracy. While place of articulation errors are quite rare in Experiment 1, they are very frequent in Experiment 2: errors that map a labial ejective to a dental make up just 4.5% of errors in Experiment 1 but 73% in Experiment 2. Participants&#8217; responses in Experiment 2 thus track the polymorphemic, pseudo-verb structure of the stimuli: changing place of articulation, and thereby morphological structure, yields a phonotactically legal form.</p>
<p>In sum, the experiments here provide behavioral evidence that the laryngeal restrictions on ejectives and their interaction with morphology are part of speakers&#8217; synchronic grammars.</p>
</sec>
</sec>
<sec>
<title>4 Learning simulations</title>
<p>Having presented the corpus and behavioral evidence for the restrictions in Aymara, we move on to modeling the learning of these patterns from our corpus. Our model starts with a parsed corpus of word forms, notices the need for nonlocal projections, and induces a set of constraints that capture the local and nonlocal phonology of the language. We show how our model fits our experimental results as well as a broader range of restrictions in the literature on Aymara.</p>
<sec>
<title>4.1 A model of learning projections from baseline phonology</title>
<sec>
<title>4.1.1 A brief description of the learner</title>
<p>In this section, we present a brief description of our learning model, which is described in more detail in Gouskova and Gallagher (<xref ref-type="bibr" rid="B24">to appear</xref>). The implementation of the model is available on GitHub at <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://github.com/gouskova/inductive_projection_learner">https://github.com/gouskova/inductive_projection_learner</ext-link>.</p>
<p>The model builds on Hayes and Wilson&#8217;s (<xref ref-type="bibr" rid="B28">2008</xref>) UCLA Phonotactic Learner (UCLAPL). The first stage of the learning procedure constructs a phonotactic grammar based on a list of phonological words and features describing each segment (see (8)). The model proceeds to construct a set of constraints against unattested and underattested sequences (see (9)). These constraints are formulated in terms of natural classes, and they are given weights using a Maximum Entropy procedure, which seeks to maximize the probability of the learning data. The resulting grammar can be used to assign harmony scores to test words, so that its fit to the data can be compared against, e.g., experimental data from human speakers (see <xref ref-type="bibr" rid="B23">Goldwater and Johnson 2003</xref>; <xref ref-type="bibr" rid="B15">Daland et al. 2011</xref>; <xref ref-type="bibr" rid="B9">Berent et al. 2012</xref>; <xref ref-type="bibr" rid="B29">Hayes and White 2013</xref>; <xref ref-type="bibr" rid="B44">Wilson and Gallagher 2018</xref>).</p>
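<p>As a rough illustration of how such a grammar assigns scores, the following Python sketch computes a harmony score as the weighted sum of constraint violations and converts it to a MaxEnt value, exp(&#8211;harmony), following the general scheme in Hayes and Wilson (<xref ref-type="bibr" rid="B28">2008</xref>). The natural classes, the two constraints, and their weights are invented for illustration and are not output of the learner; ejectives are written with an ASCII apostrophe.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import math

# Hypothetical toy natural classes, written as sets of segments.
EJECTIVE = {"p'", "t'", "k'"}
STOP = {"p", "t", "k"} | EJECTIVE
WORD_EDGE = {"#"}

# Each constraint is (sequence of natural classes, weight).
# The weights are made up for illustration; they were not learned.
GRAMMAR = [
    ((EJECTIVE, EJECTIVE), 3.0),  # *[+cg][+cg]: no adjacent ejectives
    ((STOP, WORD_EDGE), 1.5),     # *stop-#: no word-final stops
]


def violations(word, classes):
    """Count the windows of `word` that match the class sequence."""
    n = len(classes)
    return sum(
        all(seg in cls for seg, cls in zip(word[i:i + n], classes))
        for i in range(len(word) - n + 1)
    )


def harmony(word, grammar=GRAMMAR):
    """Weighted sum of violations; 0 means the word incurs no penalty."""
    return sum(w * violations(word, classes) for classes, w in grammar)


def maxent_value(word, grammar=GRAMMAR):
    """exp(-harmony), which is proportional to the word's probability."""
    return math.exp(-harmony(word, grammar))


legal = ["#", "p", "a", "t", "a", "#"]
bad = ["#", "a", "p'", "t'", "a", "#"]
print(harmony(legal), harmony(bad))
```
]]></preformat>
<p>Under these assumed weights, a form with no violations has a harmony of 0, and every additional violation lowers the form&#8217;s MaxEnt value.</p>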
</sec>
<sec>
<title>4.1.2 Building projections from cues in the baseline grammar</title>
<p>The original version of the learner has the capability to posit constraints on autosegmental projections provided by the analyst: for example, Hayes and Wilson (<xref ref-type="bibr" rid="B28">2008</xref>) demonstrate that their learner can find constraints enforcing vowel harmony in Shona verbs when it is given a projection that includes only vowels; the learner can also capture the stress pattern of Wargamay when given the appropriate projections for segments that bear primary and secondary stress. We exploit this capability in our extension to the learner, which finds projections and/or modifies the training data automatically when certain cues are present in the baseline grammar. For the simulations reported in this paper, two such cues are instrumental:</p>
<table-wrap>
<table content-type="example">
<tbody>
<tr>
<td>(6)</td>
<td>Segmental placeholder trigrams: constraints of the form *X-any_segment-Y, where X and Y are part of a natural class Z. When the learner finds such trigrams, it adds a projection Z to its search space of constraints.</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap>
<table content-type="example">
<tbody>
<tr>
<td>(7)</td>
<td>Morpheme-boundary trigrams: constraints of the form *X-non_morpheme_boundary-Y, where X and Y are part of a natural class Z. When the learner finds such trigrams, it adds a projection Z to its search space of constraints.</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Hayes and Wilson&#8217;s learner automatically adds the feature [&#177;word boundary] to every feature set in order to capture word edge phonotactics. The non-boundary segments of each language are then automatically part of the largest natural class, [&#8211;word boundary]. Since Hayes and Wilson&#8217;s learner has a bias toward broad natural classes, the learner will identify the constraints that refer to this class relatively early compared to other trigrams, provided the language offers support for them. Intuitively, the presence of constraints whose middle segment can be any of the segments in the language, *X-any_segment-Y, is a cue to the learner that the segments to either side of &#8220;any segment&#8221; interact nonlocally. The logic for non-morpheme-boundary trigrams is similar. If the grammar includes a constraint *X-non_morpheme_boundary-Y, this tells us that X and Y are permitted across a morpheme boundary but not across any intervening segment, so *X-any_segment-Y holds tautomorphemically.</p>
<p>Our extension of the learner uses constraints of this type to posit a projection consisting of the smallest natural class that includes both X and Y. For example, if the learner finds that [l] and [r] cannot co-occur across any intervening segment (as in an idealized version of Latin, <xref ref-type="bibr" rid="B40">Steriade 1987</xref>; <xref ref-type="bibr" rid="B13">Cser 2010</xref>), it will posit a projection of liquids. More specifically, in the simulation of Quechua reported in Gouskova and Gallagher (<xref ref-type="bibr" rid="B24">to appear</xref>), the learner&#8217;s baseline includes the constraint *[&#8211;cont, &#8211;son][&#8211;wb][+cg], or *stop-any_seg-ejective. The smallest class that includes all stops and ejectives is the natural class of stops, so the stop projection is added to the grammar and searched for constraints. This is schematically illustrated in (8)&#8211;(11).</p>
<table-wrap>
<table content-type="example">
<tbody>
<tr>
<td>(8)</td>
<td colspan="2"><italic>Input to the learner</italic></td>
</tr>
<tr>
<td>&#160;</td>
<td>a.</td>
<td>training data: {&lt;#pata#&gt;, &lt;#p&#8217;ata#&gt;, &lt;#p&#688;ata#&gt;, &lt;#t&#8217;ampa#&gt;, &lt;#map&#8217;a#&gt;, &lt;#lama#&gt;, &#8230;}</td>
</tr>
<tr>
<td>&#160;</td>
<td>b.</td>
<td>feature set</td>
</tr>
<tr>
<td>&#160;</td>
<td colspan="2"><inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="/article/id/5136/file/65078/"/></td>
</tr>
</tbody>
</table>
</table-wrap>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(9)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p><italic>Stage 1 output of the learner: baseline grammar</italic></p></list-item>
<list-item><p><inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="/article/id/5136/file/65079/"/></p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(10)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p><italic>Learner posits a projection for the smallest natural class that includes [&#8211;son, &#8211;cont] and [+cg]</italic>:</p></list-item>
<list-item><p><inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="/article/id/5136/file/65080/"/></p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<list list-type="gloss">
<list-item>
<list list-type="wordfirst">
<list-item><p>(11)</p></list-item>
</list>
</list-item>
<list-item>
<list list-type="sentence-gloss">
<list-item>
<list list-type="final-sentence">
<list-item><p><italic>Stage 2 output of the learner: projection grammar</italic></p></list-item>
<list-item><p><inline-graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="/article/id/5136/file/65081/"/></p></list-item>
</list>
</list-item>
</list>
</list-item>
</list>
<p>Occasionally, the learner identifies more than one placeholder constraint, in which case we allow it to search each of the resulting projections for constraints if the natural classes they entail are distinct.</p>
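<p>The step from placeholder trigrams to projections can be sketched in a few lines of Python. This is a minimal illustration, not the released implementation: the feature table is a hypothetical, heavily abbreviated one, the constraint format (a triple of feature specification, middle label, feature specification) is invented here, and the smallest natural class containing a set of segments is computed as the class picked out by the feature values that all of those segments share.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
# Hypothetical, abbreviated feature table; unspecified features are omitted.
FEATURES = {
    "p":  {"son": "-", "cont": "-", "plain": "+", "lab": "+"},
    "t":  {"son": "-", "cont": "-", "plain": "+", "dent": "+"},
    "p'": {"son": "-", "cont": "-", "cg": "+", "lab": "+"},
    "t'": {"son": "-", "cont": "-", "cg": "+", "dent": "+"},
    "s":  {"son": "-", "cont": "+", "dent": "+"},
    "m":  {"son": "+", "nas": "+", "lab": "+"},
    "a":  {"son": "+", "syll": "+"},
}


def extension(spec):
    """All segments that carry every feature value in `spec`."""
    return {seg for seg, feats in FEATURES.items()
            if all(feats.get(f) == v for f, v in spec.items())}


def smallest_class(segments):
    """Shared feature values of `segments` and the class they pick out."""
    shared = None
    for seg in segments:
        items = set(FEATURES[seg].items())
        shared = items if shared is None else shared & items
    spec = dict(shared or set())
    return spec, extension(spec)


def projections_from_cues(constraints, placeholder="any_segment"):
    """One projection per distinct class cued by *X-placeholder-Y."""
    found = []
    for x, middle, y in constraints:
        if middle != placeholder:
            continue  # an ordinary local trigram is not a cue
        _, cls = smallest_class(extension(x) | extension(y))
        if frozenset(cls) not in found:
            found.append(frozenset(cls))
    return found


# Toy baseline grammar: two placeholder trigrams and one local trigram.
baseline = [
    ({"son": "-", "cont": "-"}, "any_segment", {"cg": "+"}),
    ({"cg": "+"}, "any_segment", {"cg": "+"}),
    ({"cg": "+"}, "nasal", {"cg": "+"}),
]
for projection in projections_from_cues(baseline):
    print(sorted(projection))
```
]]></preformat>
<p>In this toy grammar the first trigram cues a stop projection and the second an ejective projection; because the two classes are distinct, both are returned, and the local trigram is ignored.</p>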
<p>If the learner is trained on morphologically parsed data, it may detect constraints of the form *X-non_morpheme_boundary-Y. In our feature sets, all segments are [&#8211;morpheme boundary], but word edges and morpheme boundaries are [+morpheme boundary]. If the learner posits a constraint whose middle gram is [&#8211;morpheme boundary], this means that segmental trigrams of the form X-any_segment-Y are underattested, but the trigram X-morpheme_boundary-Y is attested often enough to exclude it from the formulation of the constraint. This situation will arise in a language like Aymara that has few stop-any_segment-ejective trigrams but a fair number of stop-morpheme_boundary-ejective trigrams. The phonotactic restriction is cancelled in heteromorphemic contexts &#8211; the occurrence of an ejective in close proximity to a stop is a boundary signal (<italic>Grenzsignal</italic>) in the sense of Trubetzkoy (<xref ref-type="bibr" rid="B43">1939</xref>). Constraints of this type also cue our learner to construct a projection. The only difference is that when the learner is looking at morphologically parsed words, the morpheme boundary symbol will also be present on all projections.<xref ref-type="fn" rid="n6">6</xref> If morpheme boundaries can be present on a projection, they will separate the segments that would otherwise form a bigram, as shown in (12). For compactness, we will write morpheme boundaries as &#8220;+&#8221; rather than [+morpheme boundary], and the non-morpheme boundary class will be [&#8211;mb].</p>
<table-wrap>
<table content-type="example">
<tbody>
<tr>
<td>(12)</td>
<td colspan="5"><italic>Projections with morpheme boundaries</italic></td>
</tr>
<tr>
<td>&#160;</td>
<td>a.</td>
<td>Projection</td>
<td>b.</td>
<td>What is visible</td>
<td>&#160;</td>
</tr>
<tr>
<td>&#160;</td>
<td>&#160;</td>
<td>baseline/default</td>
<td>&#160;</td>
<td>t a m p&#8217; a</td>
<td>p a n + t&#8217; a</td>
</tr>
<tr>
<td>&#160;</td>
<td>&#160;</td>
<td>[&#8211;son, &#8211;cont]</td>
<td>&#160;</td>
<td>t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;p&#8217;</td>
<td>p&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;+ t&#8217;</td>
</tr>
</tbody>
</table>
</table-wrap>
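<p>The effect shown in (12) can be reproduced with a short Python sketch. The function names and the abbreviated stop class are assumptions made for illustration; words are given as space-separated segments with ejectives written with an ASCII apostrophe.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
# Abbreviated stand-in for the [-son, -cont] class (illustrative only).
STOPS = {"p", "t", "k", "q", "p'", "t'", "k'", "q'"}
BOUNDARY = "+"


def project(word, natural_class, keep_boundaries=True):
    """Keep only the segments in `natural_class`, plus morpheme boundaries."""
    visible = []
    for seg in word.split():
        if seg in natural_class or (keep_boundaries and seg == BOUNDARY):
            visible.append(seg)
    return visible


def bigrams(tier):
    """Adjacent pairs on a projection."""
    return list(zip(tier, tier[1:]))


for word in ["t a m p' a", "p a n + t' a"]:
    tier = project(word, STOPS)
    print(tier, bigrams(tier))
```
]]></preformat>
<p>On the stop projection, the monomorphemic form yields the bigram (t, p&#8217;), whereas in the suffixed form the boundary symbol intervenes, so the plain stop and the ejective never form a bigram.</p>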
</sec>
</sec>
<sec>
<title>4.2 Parameters manipulated by the analyst</title>
<p>There are several parameters that affect the learner&#8217;s ability to find generalizations. First, the segmental features determine whether the learner can group the segments into the right natural classes. The learner is sensitive to the size of the classes, as well as their overall number (see <xref ref-type="bibr" rid="B28">Hayes and Wilson 2008</xref>; <xref ref-type="bibr" rid="B24">Gouskova and Gallagher to appear</xref>). Since we were primarily interested in laryngeal restrictions, we selected a feature set that uses mostly privative features (specifically, [plain], [cg] and [sg]). Hayes and Wilson&#8217;s learner favors constraints whose natural classes mention as few features as possible, so privative features allow certain classes to be picked out more easily; in the Aymara case, this means that plain stops can be picked out as [+plain] instead of as [&#8211;cg, &#8211;sg]. See Section 4.3 for the full feature set.</p>
<p>The other parameters have an effect on the number of constraints induced, the length of segmental strings they scope over, and how closely the learner fits the grammar to the data. We were generous in the number of constraints we allowed the learner to discover, since this learner stops when it cannot identify any constraints that pass the selection criterion. The length of constraint strings on the segmental projection ranged from 1 (as in *[+cg]) to 3 (as in *[&#8211;syllabic][&#8211;syllabic][&#8211;syllabic], &#8220;no CCC clusters&#8221;); the length of constraint strings on higher projections ranged from 2 to 3. In addition to these two fairly simple parameters, there are several parameters that affect the fit of the model in various ways.</p>
<p>The first parameter is gain (<xref ref-type="bibr" rid="B16">Della Pietra et al. 1997</xref>; <xref ref-type="bibr" rid="B44">Wilson and Gallagher 2018</xref>; <xref ref-type="bibr" rid="B24">Gouskova and Gallagher to appear</xref>), which replaces the O/E threshold criterion in the version of the learner described in Hayes and Wilson (<xref ref-type="bibr" rid="B28">2008</xref>). The gain of a constraint C is proportional to the reduction in the Kullback-Leibler divergence between the current grammar and the grammar with C added, with the weights of all the other constraints held unchanged. Put differently, gain is higher when the probability distributions in the learning data are closer to those generated by the grammar if the constraint were added. A constraint can only be added if its gain exceeds the threshold; the higher the gain threshold, the harder it is to add new constraints. We have found, moreover, that the gain threshold can be lower when the training data sets are small and each datum is relatively informative, but larger data sets yield more sensible grammars when the threshold is higher.</p>
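<p>A toy calculation may make the notion of gain more concrete. The Python sketch below assumes one common formulation, in the spirit of Della Pietra et al. (<xref ref-type="bibr" rid="B16">1997</xref>): gain is the largest reduction in the Kullback-Leibler divergence of the empirical distribution from the grammar that can be obtained by adding a single constraint with a non-negative weight. The candidate space, the data, the two constraints, and the grid search over weights are all invented for illustration and do not reproduce the learner&#8217;s actual procedure.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import math
from itertools import product

SEGMENTS = ["p", "p'", "a"]
EJECTIVES = {"p'"}
# Toy candidate space: every two-segment string.
OMEGA = list(product(SEGMENTS, repeat=2))


def violates_double_ejective(form):
    return int(form[0] in EJECTIVES and form[1] in EJECTIVES)


def violates_initial_a(form):
    return int(form[0] == "a")


def empirical(data):
    """Relative frequency of each candidate in the data."""
    return {x: data.count(x) / len(data) for x in OMEGA}


def model(constraint, weight, base):
    """Reweight `base` by exp(-weight * violations) and renormalize."""
    scores = {x: base[x] * math.exp(-weight * constraint(x)) for x in OMEGA}
    z = sum(scores.values())
    return {x: s / z for x, s in scores.items()}


def kl(p, q):
    """KL divergence of p from q, with 0 * log(0) treated as 0."""
    return sum(p[x] * math.log(p[x] / q[x]) for x in OMEGA if p[x] > 0)


def gain(constraint, data, base, max_weight=10.0, step=0.01):
    """Best reduction in KL divergence over a grid of non-negative weights."""
    p = empirical(data)
    before = kl(p, base)
    best = 0.0
    for i in range(int(max_weight / step) + 1):
        after = kl(p, model(constraint, i * step, base))
        best = max(best, before - after)
    return best


# Every candidate is attested once except the ejective-ejective form.
DATA = [x for x in OMEGA if x != ("p'", "p'")]
UNIFORM = {x: 1 / len(OMEGA) for x in OMEGA}  # grammar with no constraints
print(gain(violates_double_ejective, DATA, UNIFORM))
print(gain(violates_initial_a, DATA, UNIFORM))
```
]]></preformat>
<p>In this toy setting, the constraint against the unattested ejective-ejective form has a clearly positive gain, while a constraint against a well-attested pattern has a gain of essentially zero and would not pass any positive threshold.</p>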
<p>The second parameter we manipulated is gamma. This parameter affects how the objective function of the learner is calculated each time a constraint is added &#8211; it scales the harmony score relative to the negative log probability, with the effect of increasing the impact of constraint violations by individual candidates. Increasing gamma makes it less likely that constraints with very low weights will be learned (NB: both <xref ref-type="bibr" rid="B16">Della Pietra et al. 1997</xref> and <xref ref-type="bibr" rid="B44">Wilson and Gallagher 2018</xref> use &#947; to refer to the gain threshold; this is distinct from the gamma parameter). There are additional parameters that can be manipulated, such as the Laplace regularizer &#955;, whose function is to penalize constraints with large weights (<xref ref-type="bibr" rid="B44">Wilson and Gallagher 2018: 615</xref>). We set &#955; to a small constant, 0.00001.</p>
</sec>
<sec>
<title>4.3 Learning data and features</title>
<p>The corpus we used was based on the Aymara wordlist on the An Cr&#250;bad&#225;n project website (<ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://crubadan.org">http://crubadan.org</ext-link>), described in Section 2. We created two versions of the corpus: an unsegmented list of phonological words (transcribed on the basis of the transparent orthography), and a segmented list with morpheme boundaries. Recall that we only used those words in the An Cr&#250;bad&#225;n list that contained the roots that also occur in the de Lucca (<xref ref-type="bibr" rid="B17">1987</xref>) dictionary. Since morpheme boundaries add to the overall length of each string, we had to filter the segmented word list to exclude words above a certain length.<xref ref-type="fn" rid="n7">7</xref> This left us with 46,164 words each in the segmented and unsegmented lists.</p>
<p>The feature set we used for all simulations is shown in Table <xref ref-type="table" rid="T11">11</xref>. In addition to the phonemes of Aymara, the feature set contains the morpheme boundary &#8220;+&#8221;, the word boundary, and a special &#8220;copy&#8221; segment X, which has just one privative feature, [+copy]. As explained in Gouskova and Gallagher (<xref ref-type="bibr" rid="B24">to appear</xref>), this copy notation is necessary because the learner does not implement algebraic notation in its constraint language (<xref ref-type="bibr" rid="B9">Berent et al. 2012</xref>). Thus, to allow the learner to distinguish between the allowed identical pairs of ejectives and the disallowed non-identical ones (recall Section 2.1), we transcribe words such as [t&#8217;ant&#8217;a] as [t&#8217;anXa].</p>
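<p>The copy transcription can be illustrated with a small Python sketch. The rule implemented here is an assumption made for illustration: an ejective that is identical to the most recent preceding ejective in the word is rewritten as X. The treatment of morpheme boundaries and the exact procedure used to prepare the corpus are not specified here, and the ejective inventory is abbreviated.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
# Abbreviated ejective inventory, written with an ASCII apostrophe.
EJECTIVES = {"p'", "t'", "k'", "q'"}
COPY = "X"


def mark_copies(segments):
    """Rewrite an ejective identical to the preceding ejective as COPY."""
    out = []
    last_ejective = None
    for seg in segments:
        if seg in EJECTIVES:
            if seg == last_ejective:
                out.append(COPY)  # identical pair: second member becomes X
                continue
            last_ejective = seg
        out.append(seg)
    return out


print(mark_copies(["t'", "a", "n", "t'", "a"]))
print(mark_copies(["k'", "a", "s", "p'", "a"]))
```
]]></preformat>
<p>Under this assumed rule, the identical pair in [t&#8217;ant&#8217;a] is rewritten with X, while a non-identical pair of ejectives is left unchanged, so constraints on ejective pairs can target the latter without also penalizing the former.</p>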
<table-wrap id="T11">
<label>Table 11</label>
<caption>
<p>Features of Aymara segments for computational simulations.</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;"></th>
<th align="center" style="background-color:#f3f3f4;">long</th>
<th align="center" style="background-color:#f3f3f4;">syll</th>
<th align="center" style="background-color:#f3f3f4;">son</th>
<th align="center" style="background-color:#f3f3f4;">cont</th>
<th align="center" style="background-color:#f3f3f4;">cg</th>
<th align="center" style="background-color:#f3f3f4;">sg</th>
<th align="center" style="background-color:#f3f3f4;">plain</th>
<th align="center" style="background-color:#f3f3f4;">lab</th>
<th align="center" style="background-color:#f3f3f4;">dent</th>
<th align="center" style="background-color:#f3f3f4;">pal</th>
<th align="center" style="background-color:#f3f3f4;">vel</th>
<th align="center" style="background-color:#f3f3f4;">uv</th>
<th align="center" style="background-color:#f3f3f4;">rhot</th>
<th align="center" style="background-color:#f3f3f4;">lat</th>
<th align="center" style="background-color:#f3f3f4;">nas</th>
<th align="center" style="background-color:#f3f3f4;">lo</th>
<th align="center" style="background-color:#f3f3f4;">bk</th>
<th align="center" style="background-color:#f3f3f4;">hi</th>
</tr>
<tr>
<td colspan="19"><hr/></td>
</tr>
<tr>
<td align="left">p</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">t</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">&#679;</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">k</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">q</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">p&#8217;</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">t&#8217;</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">&#679;&#8217;</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">k&#8217;</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">q&#8217;</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">p&#688;</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">t&#688;</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">&#679;&#688;</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">k&#688;</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">q&#688;</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">s</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">&#643;</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">&#967;</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">h</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">m</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">&#8211;</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">n</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">&#8211;</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">&#626;</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">&#8211;</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">r</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">&#8211;</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">l</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">&#8211;</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">&#654;</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">&#8211;</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">j</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
</tr>
<tr>
<td align="left">w</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">+</td>
</tr>
<tr>
<td align="left">i</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
</tr>
<tr>
<td align="left">u</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">+</td>
</tr>
<tr>
<td align="left">a</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
</tr>
<tr>
<td align="left">e</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
</tr>
<tr>
<td align="left">o</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">&#8211;</td>
</tr>
<tr>
<td align="left">i&#720;</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
</tr>
<tr>
<td align="left">u&#720;</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">+</td>
</tr>
<tr>
<td align="left">a&#720;</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">+</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
</tr>
<tr>
<td align="left">e&#720;</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
<td align="right">&#8211;</td>
</tr>
<tr>
<td align="left">o&#720;</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">+</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">&#8211;</td>
<td align="right">+</td>
<td align="right">&#8211;</td>
</tr>
</table>
</table-wrap>
</sec>
<sec>
<title>4.4 The simulations</title>
<sec>
<title>4.4.1 Baseline grammar trained on a corpus of segmented words</title>
<p>We trained the learner on the corpus of morphologically segmented words, since we expected it to be able to identify the generalizations about tautomorphemic stops when it was supplied with the crucial information about morpheme boundaries. The simulation we report here had a gain of 400 and gamma of 150, and the grammar included 120 constraints. The baseline grammar contains three morphological trigram constraints (see Table <xref ref-type="table" rid="T12">12</xref>), corresponding to the three restricted laryngeal combinations. Plain-ejective, plain-aspirate and ejective-ejective combinations are found across a morpheme boundary but are underattested when the intervening element is any segment other than a morpheme boundary.</p>
<table-wrap id="T12">
<label>Table 12</label>
<caption>
<p>Morpheme-boundary constraints in the baseline grammar trained on segmented words.</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;" colspan="2">Constraint</th>
<th align="center" style="background-color:#f3f3f4;">Weight</th>
<th align="left" style="background-color:#f3f3f4;">Sequences penalized</th>
</tr>
<tr>
<td colspan="4"><hr/></td>
</tr>
<tr>
<td align="left">a.</td>
<td align="left">*[+plain][&#8211;mb][+cg]</td>
<td align="right">13.214</td>
<td align="left">[p t &#679; k q]-seg-[p&#8217; t&#8217; &#679;&#8217; k&#8217; q&#8217;]</td>
</tr>
<tr>
<td align="left">b.</td>
<td align="left">*[+plain][&#8211;mb][&#8211;cont, +sg]</td>
<td align="right">13.039</td>
<td align="left">[p t &#679; k q]-seg-[p<sup>h</sup> t<sup>h</sup> &#679;<sup>h</sup> k<sup>h</sup> q<sup>h</sup>]</td>
</tr>
<tr>
<td align="left">c.</td>
<td align="left">*[+cg][&#8211;mb][+cg]</td>
<td align="right">12.886</td>
<td align="left">[p&#8217; t&#8217; &#679;&#8217; k&#8217; q&#8217;]-seg-[p&#8217; t&#8217; &#679;&#8217; k&#8217; q&#8217;]</td>
</tr>
</table>
</table-wrap>
<p>These three constraints act as cues to the creation of two projections, shown in Table <xref ref-type="table" rid="T13">13</xref>. The oral stop projection is motivated by constraints (a) and (b), since the smallest natural class grouping over plain stops and aspirates or over plain stops and ejectives is the class of all oral stops. The ejective projection is motivated by constraint (c).</p>
<table-wrap id="T13">
<label>Table 13</label>
<caption>
<p>Projections posited from morpheme-boundary trigrams.</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;">Projection</th>
<th align="left" style="background-color:#f3f3f4;">Defining features</th>
<th align="left" style="background-color:#f3f3f4;">What is visible</th>
</tr>
<tr>
<td colspan="3"><hr/></td>
</tr>
<tr>
<td align="left">oral stops</td>
<td align="left">[&#8211;son, &#8211;cont]</td>
<td align="left">p t k &#679; q, p&#8217; t&#8217; k&#8217; &#679;&#8217; q&#8217;, p&#688; t&#688; k&#688; &#679;&#688; q&#688;, +</td>
</tr>
<tr>
<td align="left">ejectives</td>
<td align="left">[+cg]</td>
<td align="left">p&#8217; t&#8217; k&#8217; &#679;&#8217; q&#8217;, +</td>
</tr>
</table>
</table-wrap>
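<p>As a rough illustration of what each projection makes visible, the following Python sketch filters a morphologically segmented string down to the symbols listed in Table 13. It is not the code used for the simulations. The function name <monospace>project</monospace>, the list-of-symbols input format, the ASCII apostrophe and the letter h standing in for the ejective and aspirate diacritics, and the example form are all hypothetical choices made for this illustration.</p>
<preformat preformat-type="code">
```python
# Illustrative sketch only: names, transcription and input format are assumptions.
PLACES = ["p", "t", "\u02a7", "k", "q"]   # labial, dental, palatal, velar, uvular
PLAIN = set(PLACES)
EJECTIVE = {c + "'" for c in PLACES}      # ASCII apostrophe stands in for the ejective mark
ASPIRATE = {c + "h" for c in PLACES}      # "h" stands in for superscript h
BOUNDARY = "+"                            # morpheme boundary symbol

# Symbols visible on each projection, following Table 13.
PROJECTIONS = {
    "oral stops": PLAIN | EJECTIVE | ASPIRATE | {BOUNDARY},
    "ejectives": EJECTIVE | {BOUNDARY},
}


def project(symbols, projection):
    """Return only the symbols that are visible on the named projection."""
    visible = PROJECTIONS[projection]
    return [s for s in symbols if s in visible]


# Hypothetical segmented form: t a p' a + t' a
word = ["t", "a", "p'", "a", "+", "t'", "a"]
print(project(word, "oral stops"))   # ['t', "p'", '+', "t'"]
print(project(word, "ejectives"))    # ["p'", '+', "t'"]
```
</preformat>
<p>On the oral stop projection the two ejectives and the preceding plain stop become adjacent (apart from the boundary symbol), which is what allows bigram and trigram constraints to refer to stops that are separated by vowels in the full string.</p>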
</sec>
<sec>
<title>4.4.2 The full grammar with projections</title>
<p>In the next step of the learning procedure, the same training data set is revisited with the two projections identified in the baseline simulation, in addition to the default projection. Learning in this stage starts from scratch &#8211; it does not include any of the constraints learned on the default projection in the first stage of learning.</p>
<p>Table <xref ref-type="table" rid="T14">14</xref> shows all of the constraints that the learner posited on the two nonlocal projections, grouped by projection. Within each projection, the constraints are shown in the order they were added to the grammar &#8211; following Hayes and Wilson&#8217;s heuristics, bigram constraints are considered first, since there are fewer of them than trigrams. There are several constraints on the classes of ejectives and aspirates, as well as constraints on individual ejective and aspirate segments. There are also some constraints on place of articulation combinations. The constraints that capture the restrictions on plain-ejective and ejective-ejective combinations are given in bold.</p>
<table-wrap id="T14">
<label>Table 14</label>
<caption>
<p>Part of the final grammar induced from training on the full segmented corpus of Aymara: Constraints discovered on the nonlocal projections.</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;" colspan="2">Projection</th>
<th align="left" style="background-color:#f3f3f4;">Constraint</th>
<th align="center" style="background-color:#f3f3f4;">Weight</th>
<th align="left" style="background-color:#f3f3f4;">Sequences penalized</th>
</tr>
<tr>
<td colspan="5"><hr/></td>
</tr>
<tr>
<td align="left"><bold>a.</bold></td>
<td align="left"><bold>[+cg]</bold></td>
<td align="left"><bold>*[&#8211;syll][&#8211;syll]</bold></td>
<td align="right"><bold>4.899</bold></td>
<td align="left"><bold>ejective&#8230;ejective</bold></td>
</tr>
<tr>
<td align="left">b.</td>
<td align="left">[+cg]</td>
<td align="left">*[&#8211;wb][+palatal][+wb]</td>
<td align="right">11.938</td>
<td align="left">ejective/+&#8230;[&#679;&#8217;]&#8230;#</td>
</tr>
<tr>
<td align="left">c.</td>
<td align="left">[+cg]</td>
<td align="left">*[&#8211;wb][+uvular][+wb]</td>
<td align="right">11.324</td>
<td align="left">ejective/+&#8230;[q&#8217;]&#8230;#</td>
</tr>
<tr>
<td align="left"><bold>d.</bold></td>
<td align="left"><bold>[&#8211;son, &#8211;cont]</bold></td>
<td align="left"><bold>*[&#8211;wb][+cg,+labial]</bold></td>
<td align="right"><bold>12.694</bold></td>
<td align="left"><bold>stop/+&#8230;[p&#8217;]</bold></td>
</tr>
<tr>
<td align="left"><bold>e.</bold></td>
<td align="left"><bold>[&#8211;son, &#8211;cont]</bold></td>
<td align="left"><bold>*[&#8211;syll][+cg,+velar]</bold></td>
<td align="right"><bold>11.629</bold></td>
<td align="left"><bold>stop&#8230;[k&#8217;]</bold></td>
</tr>
<tr>
<td align="left"><bold>f.</bold></td>
<td align="left"><bold>[&#8211;son, &#8211;cont]</bold></td>
<td align="left"><bold>*[&#8211;syll][+cg,+uvular]</bold></td>
<td align="right"><bold>11.274</bold></td>
<td align="left"><bold>stop&#8230;[q&#8217;]</bold></td>
</tr>
<tr>
<td align="left">g.</td>
<td align="left">[&#8211;son, &#8211;cont]</td>
<td align="left">*[+dental][+palatal]</td>
<td align="right">12.616</td>
<td align="left">[t t&#8217; t&#688;]&#8230;[&#679; &#679;&#8217; &#679;&#688;]</td>
</tr>
<tr>
<td align="left">h.</td>
<td align="left">[&#8211;son, &#8211;cont]</td>
<td align="left">*[+velar][+uvular]</td>
<td align="right">11.673</td>
<td align="left">[k k&#8217; k&#688;]&#8230;[q q&#8217; q&#688;]</td>
</tr>
<tr>
<td align="left">i.</td>
<td align="left">[&#8211;son, &#8211;cont]</td>
<td align="left">*[+plain,+labial][+sg]</td>
<td align="right">12.352</td>
<td align="left">[p]&#8230;aspirate</td>
</tr>
<tr>
<td align="left">j.</td>
<td align="left">[&#8211;son, &#8211;cont]</td>
<td align="left">*[+plain,+uvular][+sg]</td>
<td align="right">11.298</td>
<td align="left">[q]&#8230;aspirate</td>
</tr>
<tr>
<td align="left">k.</td>
<td align="left">[&#8211;son, &#8211;cont]</td>
<td align="left">*[+plain,+dental][+sg]</td>
<td align="right">11.768</td>
<td align="left">[t]&#8230;aspirate</td>
</tr>
<tr>
<td align="left">l.</td>
<td align="left">[&#8211;son, &#8211;cont]</td>
<td align="left">*[+uvular][+velar]</td>
<td align="right">5.629</td>
<td align="left">[q q&#8217; q&#688;]&#8230;[k k&#8217; k&#688;]</td>
</tr>
<tr>
<td align="left">m.</td>
<td align="left">[&#8211;son, &#8211;cont]</td>
<td align="left">*[&#8211;wb,+mb][+sg,+palatal]</td>
<td align="right">12.242</td>
<td align="left">+&#8230;[&#679;&#688;]</td>
</tr>
<tr>
<td align="left">n.</td>
<td align="left">[&#8211;son, &#8211;cont]</td>
<td align="left">*[&#8211;wb][&#8211;syll][+sg]</td>
<td align="right">11.366</td>
<td align="left">stop/+&#8230;stop&#8230;aspirate</td>
</tr>
<tr>
<td align="left"><bold>o.</bold></td>
<td align="left"><bold>[&#8211;son, &#8211;cont]</bold></td>
<td align="left"><bold>*[+plain][+cg][+wb]</bold></td>
<td align="right"><bold>6.351</bold></td>
<td align="left"><bold>plain stop&#8230;ejective&#8230;#</bold></td>
</tr>
<tr>
<td align="left">p.</td>
<td align="left">[&#8211;son, &#8211;cont]</td>
<td align="left">*[&#8211;syll][&#8211;syll][+uvular]</td>
<td align="right">11.594</td>
<td align="left">stop&#8230;stop&#8230;[q q&#8217; q&#688;]</td>
</tr>
<tr>
<td align="left">q.</td>
<td align="left">[&#8211;son, &#8211;cont]</td>
<td align="left">*[&#8211;syll][&#8211;syll][+palatal]</td>
<td align="right">11.628</td>
<td align="left">stop&#8230;stop&#8230;[&#679; &#679;&#8217; &#679;&#688;]</td>
</tr>
<tr>
<td align="left">r.</td>
<td align="left">[&#8211;son, &#8211;cont]</td>
<td align="left">*[+cg][&#8211;syll][&#8211;syll]</td>
<td align="right">11.832</td>
<td align="left">ejective&#8230;stop&#8230;stop</td>
</tr>
</table>
</table-wrap>
<p>To assess how these constraints capture the phonological restrictions in Aymara, we test how the grammar rates the nonce words from the repetition studies in Section 3 and then go on to look at a broader set of structures.</p>
</sec>
<sec>
<title>4.4.3 Testing the model against the experimental results</title>
<p>The grammar makes many of the same distinctions among nonce words that Aymara speakers made in the repetition experiment. The correlations between repetition accuracy and the score assigned by the grammar are plotted in Figures <xref ref-type="fig" rid="F3">3</xref> and <xref ref-type="fig" rid="F4">4</xref>.</p>
<fig id="F3">
<label>Figure 3</label>
<caption>
<p>Harmony scores for stimuli from repetition experiment 1, assigned by the final grammar trained on the segmented word corpus. Each point in the plot is labeled according to stimulus type: &#8220;CT&#8221; is control, &#8220;PE&#8221; is plain-ejective, &#8220;EE&#8221; is ejective-ejective.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="/article/id/5136/file/65074/"/>
</fig>
<fig id="F4">
<label>Figure 4</label>
<caption>
<p>Harmony scores for stimuli from repetition experiment 2, assigned by the final grammar trained on the segmented word corpus. Each point in the plot is labeled according to stimulus type: &#8220;EL&#8221; is ejective-ejective-labial, &#8220;ED&#8221; is ejective-ejective-dental, &#8220;PL&#8221; is plain-ejective-labial and &#8220;PD&#8221; is plain-ejective-dental.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="/article/id/5136/file/65075/"/>
</fig>
<p>Figure <xref ref-type="fig" rid="F3">3</xref> pools accuracy averages across participants for each nonce word in Experiment 1; the regression line shows the overall correlation between repetition accuracy and harmony scores (the shaded region is the 95% confidence interval). Each data point in the plot represents an average accuracy score for a specific word in the experiment, labeled according to the type of stimulus: &#8220;CT&#8221; is control, &#8220;PE&#8221; is plain-ejective, &#8220;EE&#8221; is ejective-ejective. Like participants in the experiment, the model distinguishes control words from forms with the restricted plain-ejective and ejective-ejective combinations. The model assigns a score of &#8211;6 to all control words, compared to an average score of &#8211;22 to plain-ejective or ejective-ejective forms.</p>
<p>The model also reflects the distinction between dental ejectives and other ejectives. The grammar includes constraints that penalize all ejective-ejective or plain-ejective combinations, but it also includes specific constraints on stop&#8230;[p&#8217;], stop&#8230;[q&#8217;], and stop&#8230;[k&#8217;] combinations, which further penalize forms with non-dental ejectives. Plain-ejective and ejective-ejective combinations with a dental ejective receive a higher average score of &#8211;12, while these same laryngeal combinations with other medial ejectives receive a lower average score of &#8211;25.</p>
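<p>The score differences just described follow from the way harmony adds up: every violated constraint contributes its weight. The Python sketch below is a hand-worked illustration rather than the learner&#8217;s own code. It copies two weights from Table 14, takes the score of &#8211;6 that the model assigns to control words as an assumed baseline, and stipulates by hand which constraints each form violates. The function name and both violation profiles are hypothetical.</p>
<preformat preformat-type="code">
```python
# Illustrative sketch only: violation profiles are supplied by hand, not computed.
# Weights copied from Table 14 (constraints e and o).
WEIGHTS = {
    "*[-syll][+cg,+velar] (stop projection)": 11.629,
    "*[+plain][+cg][+wb] (stop projection)": 6.351,
}

BASELINE = -6.0  # assumption: the score the model assigns to control words


def harmony(violations, baseline=BASELINE):
    """Baseline minus the weighted sum of violation counts."""
    penalty = sum(WEIGHTS[name] * count for name, count in violations.items())
    return baseline - penalty


# Hypothetical monomorphemic plain-ejective form with a dental ejective:
# only the general plain-ejective constraint is assumed to apply.
dental = harmony({"*[+plain][+cg][+wb] (stop projection)": 1})

# Hypothetical monomorphemic plain-ejective form with a velar ejective:
# the constraint on stop...[k'] is assumed to apply as well.
velar = harmony({
    "*[+plain][+cg][+wb] (stop projection)": 1,
    "*[-syll][+cg,+velar] (stop projection)": 1,
})

print(round(dental, 3), round(velar, 3))
```
</preformat>
<p>Under these assumptions the dental form scores about &#8211;12.4 and the velar form about &#8211;24.0, in the neighborhood of the averages of &#8211;12 and &#8211;25 reported above.</p>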
<p>In the experiment, participants&#8217; accuracy was somewhat higher for ejective-ejective forms (50%) than for plain-ejective forms (32%) &#8211; a distinction that is not reflected in the model (both categories have an average score of &#8211;22). This difference was small, though significant, in the behavioral data, and it was inconsistent across the two experiments, so we cannot draw firm conclusions about differences in grammaticality between these two types of combinations. The correlation between the model&#8217;s harmony scores and the Aymara speakers&#8217; average accuracy in repeating the words is fairly high (Kendall&#8217;s <italic>&#964;</italic> = 0.68, Spearman&#8217;s <italic>&#961;</italic> = 0.83). We report only non-parametric correlations because the harmony scores are not normally distributed.</p>
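<p>For readers unfamiliar with the two rank-based statistics, the Python sketch below shows one way they can be computed. It is not the analysis script behind the reported values. The tie-corrected variant of Kendall&#8217;s <italic>&#964;</italic> (often called tau-b) is an assumption, chosen because many stimuli share the same harmony score, and the six data points are made up purely for illustration.</p>
<preformat preformat-type="code">
```python
# Illustrative sketch only: toy data, and the tau-b tie correction is an assumption.
from math import sqrt


def sign(d):
    """Return 1, 0 or -1 according to the sign of d."""
    return (d > 0) - (0 > d)


def kendall_tau_b(xs, ys):
    """Kendall's tau-b: concordant minus discordant pairs, corrected for ties."""
    n = len(xs)
    score = tied_x = tied_y = 0
    for i in range(n):
        for j in range(i + 1, n):
            sx, sy = sign(xs[i] - xs[j]), sign(ys[i] - ys[j])
            score += sx * sy
            tied_x += sx == 0
            tied_y += sy == 0
    pairs = n * (n - 1) // 2
    return score / sqrt((pairs - tied_x) * (pairs - tied_y))


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start != len(order):
        end = start
        while end + 1 != len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up harmony scores and repetition accuracies, for illustration only.
harmony_scores = [-6, -6, -12, -12, -24, -25]
accuracy = [0.9, 0.8, 0.7, 0.5, 0.3, 0.2]
print(kendall_tau_b(harmony_scores, accuracy), spearman_rho(harmony_scores, accuracy))
```
</preformat>
<p>Both statistics depend only on the ordering of the scores, not on their spacing, which is why they remain appropriate when the harmony scores cluster at a few values.</p>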
<p>Figure <xref ref-type="fig" rid="F4">4</xref> plots the nonce words tested in Experiment 2. The stimulus types are labeled as follows: ED: ejective-ejective-dental, EL: ejective-ejective-labial, PD: plain-ejective-dental, PL: plain-ejective-labial. The model assigns higher scores to forms with a dental ejective (&#8211;12 on average for both ejective-dental and plain-dental combinations), which are polymorphemic, than to monomorphemic forms with a labial ejective (&#8211;33 for ejective-labial and &#8211;28 for plain-labial combinations), reflecting the main effect of place of articulation in the experiment (overall accuracy on forms with dental ejectives: 74%; overall accuracy on forms with labial ejectives: 35%). Again, the small, inconsistent differences between ejective-ejective and plain-ejective combinations observed in the experiments are not reflected in the model. The correlations between harmony scores assigned by the model and the Aymara speakers&#8217; average accuracy in this experiment are <italic>&#964;</italic> = 0.52, <italic>&#961;</italic> = 0.72.</p>
<p>The model captures the distinction between labial and dental forms in two ways. First, forms with labials have a tautomorphemic plain-ejective or ejective-ejective combination, so they violate the constraints *[+plain][+cg][+wb] on the stop projection or *[&#8211;syll][&#8211;syll] on the ejective projection. Forms with dental ejectives, on the other hand, have a morpheme boundary intervening between the stops and thus escape a violation of these constraints. Second, the model contains a constraint on labial ejectives that are not the first stop in the word (*[&#8211;wb][+cg,+labial]), but no such constraint on dental ejectives, resulting in lower scores for forms with a labial ejective than for forms with a dental ejective, regardless of morphological structure.</p>
<p>As discussed in Section 3.3, the independent roles of place of articulation and morphological structure can be teased apart by comparing the results of Experiment 1 and Experiment 2. In the behavioral data, there were fewer errors on forms with dental ejectives in both experiments, but the difference between forms with dental and non-dental ejectives was larger in Experiment 2, where morphological structure was also at play. Our model shows the same qualitative pattern, though the difference is slight: the difference in scores between dental and non-dental forms is 13 points for Experiment 1 but 18 points for Experiment 2.</p>
</sec>
<sec>
<title>4.4.4 Beyond the experimental data: Evaluating the full range of stop combinations</title>
<p>We have focused thus far on just two underattested structures in Aymara: plain-ejective and ejective-ejective combinations. There are other underattested stop combinations in Aymara, however, and in this section we look at how our model reflects these other restrictions.</p>
<p>As discussed in Section 2, plain-aspirate sequences are underattested. Our model includes four constraints that penalize plain-aspirate combinations. There are constraints against three specific plain stops followed by the class of aspirates &#8211; [p]-aspirate, [t]-aspirate and [q]-aspirate &#8211; as well as the more general constraint *[&#8211;wb][&#8211;syll][+sg]. This latter constraint penalizes all stop-aspirate combinations, not just plain-aspirate ones, but only when they are preceded by another stop or a morpheme boundary.</p>
<p>To assess the grammar, we constructed a small set of targeted test words. All test words had a CaCa structure, and contained two stops. Table <xref ref-type="table" rid="T15">15</xref> shows the scores that our model assigns to words with several different laryngeal configurations: four (a)&#8211;(d) that are described as unrestricted in the literature (ejective-aspirate and aspirate-ejective combinations are discussed below), and three that are restricted (e)&#8211;(g). The scores reported in this and subsequent tables are averaged over all CaCa forms that contained the relevant combination of consonants, and did not violate any other restriction described in this section (to allow assessment of each restriction individually). For example, the score for &#8220;pl-pl&#8221; was averaged over 22 of the 25 hypothetical CaCa forms with two plain stops ([papa], [pata], [pa&#679;a], [paka], [paqa], [tapa], etc.); combinations of dental-palatal ([ta&#679;a]), uvular-velar ([qaka]) or velar-uvular ([kaqa]) were excluded since these are subject to additional restrictions, discussed below.</p>
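<p>The count of 22 out of 25 plain-plain forms can be checked by enumeration. The Python sketch below is only an illustration of that bookkeeping: the function name <monospace>caca_forms</monospace> and the data layout are hypothetical, and the excluded place pairs are the three named in the text.</p>
<preformat preformat-type="code">
```python
# Illustrative sketch only: reproduces the count of plain-plain CaCa test forms.
from itertools import product

# Plain stops and their place of articulation ("\u02a7" is the palatal affricate).
PLAIN_STOPS = {
    "p": "labial",
    "t": "dental",
    "\u02a7": "palatal",
    "k": "velar",
    "q": "uvular",
}

# Place pairs excluded because they are subject to additional restrictions.
EXCLUDED = {("dental", "palatal"), ("uvular", "velar"), ("velar", "uvular")}


def caca_forms(stops=PLAIN_STOPS, excluded=EXCLUDED):
    """Return (all forms, retained forms) for C1aC2a with two plain stops."""
    all_forms, kept = [], []
    for c1, c2 in product(stops, repeat=2):
        form = c1 + "a" + c2 + "a"
        all_forms.append(form)
        if (stops[c1], stops[c2]) not in excluded:
            kept.append(form)
    return all_forms, kept


all_forms, kept = caca_forms()
print(len(all_forms), len(kept))   # 25 22
```
</preformat>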
<table-wrap id="T15">
<label>Table 15</label>
<caption>
<p>Harmony scores assigned by our final model to a small test set of nonce words assessing laryngeal restrictions.</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;" colspan="2">Laryngeal combination</th>
<th align="left" style="background-color:#f3f3f4;">Description in literature</th>
<th align="left" style="background-color:#f3f3f4;">Score</th>
<th align="left" style="background-color:#f3f3f4;">Constraints violated</th>
</tr>
<tr>
<td colspan="5"><hr/></td>
</tr>
<tr>
<td align="left">a.</td>
<td align="left">pl-pl</td>
<td align="left" valign="middle" rowspan="4">unrestricted</td>
<td align="right">&#8211;6</td>
<td align="left">none</td>
</tr>
<tr>
<td align="left">b.</td>
<td align="left">ej-pl</td>
<td align="right">&#8211;6</td>
<td align="left">none</td>
</tr>
<tr>
<td align="left">c.</td>
<td align="left">asp-pl</td>
<td align="right">&#8211;6</td>
<td align="left">none</td>
</tr>
<tr>
<td align="left">d.</td>
<td align="left">asp-asp</td>
<td align="right">&#8211;8</td>
<td align="left">*[&#8211;high][+sg, +palatal] (default)</td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="4">e.</td>
<td align="left" valign="middle" rowspan="4">pl-ej</td>
<td align="left" valign="middle" rowspan="11">restricted</td>
<td align="right" valign="middle" rowspan="4">&#8211;20</td>
<td align="left">*[&#8211;wb][+cg, +labial] (stop)</td>
</tr>
<tr>
<td align="left">*[&#8211;syll][+cg, +velar] (stop)</td>
</tr>
<tr>
<td align="left">*[&#8211;syll][+cg, +uvular] (stop)</td>
</tr>
<tr>
<td align="left">*[+plain][+cg][+wb] (stop)</td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="3">f.</td>
<td align="left" valign="middle" rowspan="3">pl-asp</td>
<td align="right" valign="middle" rowspan="3">&#8211;16</td>
<td align="left">*[+plain, +labial][+sg] (stop)</td>
</tr>
<tr>
<td align="left">*[+plain, +uvular][+sg] (stop)</td>
</tr>
<tr>
<td align="left">*[+plain, +uvular][+dental] (stop)</td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="4">g.</td>
<td align="left" valign="middle" rowspan="4">ej-ej</td>
<td align="right" valign="middle" rowspan="4">&#8211;23</td>
<td align="left">*[&#8211;syll][&#8211;syll] (ejective)</td>
</tr>
<tr>
<td align="left">*[&#8211;wb][+cg, +labial] (stop)</td>
</tr>
<tr>
<td align="left">*[&#8211;syll][+cg, +velar] (stop)</td>
</tr>
<tr>
<td align="left">*[&#8211;syll][+cg, +uvular] (stop)</td>
</tr>
</table>
</table-wrap>
<p>Table <xref ref-type="table" rid="T15">15</xref> shows that the grammar clearly distinguishes between unrestricted laryngeal combinations, which receive high scores (&#8211;6, or &#8211;8 for aspirate-aspirate), and the restricted combinations, which receive much lower scores. Note that &#8211;6 is the highest score given to any four-segment form by the grammar, because the model includes a constraint against any segment, *[], which penalizes longer words. The plain-ejective and ejective-ejective combinations receive lower scores than plain-aspirate combinations because not all plain-aspirate combinations are penalized by the grammar. The model only penalizes forms with three of the plain stops, [p t q], followed by aspirates, but assigns a score of &#8211;6, comparable to unrestricted combinations, to forms with [&#679;]-aspirate or [k]-aspirate combinations. In this case, the model works around the exceptions to the restriction, positing several more specific but more accurate constraints on individual plain-aspirate combinations rather than a single more general but less accurate constraint covering all plain-aspirate combinations.</p>
<p>Aymara shows place cooccurrence restrictions as well as laryngeal cooccurrence restrictions. Combinations of dorsals (uvulars and velars) and combinations of coronals (dentals and palatals) are infrequent. Our model includes the constraints *[dental][palatal], *[velar][uvular] and *[uvular][velar] on the stop projection, which penalize three of the four possible combinations. The distinction between dental-palatal and palatal-dental combinations seems reasonable: there are 84 palatal-dental sequences in our corpus compared to 24 dental-palatal sequences. For dorsals, there is just 1 uvular-velar combination and 0 velar-uvular combinations in our corpus, so both constraints are warranted. Table <xref ref-type="table" rid="T16">16</xref> shows that dental-palatal, velar-uvular and uvular-velar combinations receive low scores, while palatal-dental combinations receive higher scores comparable to the unrestricted laryngeal combinations.</p>
<table-wrap id="T16">
<label>Table 16</label>
<caption>
<p>Harmony scores assigned by our final model to a small test set of nonce words assessing place cooccurrence restrictions.</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;" colspan="2">Combination</th>
<th align="center" style="background-color:#f3f3f4;">Harmony score</th>
<th align="left" style="background-color:#f3f3f4;">Constraints violated</th>
</tr>
<tr>
<td colspan="4"><hr/></td>
</tr>
<tr>
<td align="left">a.</td>
<td align="left">dental-palatal</td>
<td align="right">&#8211;24</td>
<td align="left">*[+dental][+palatal] (stop)</td>
</tr>
<tr>
<td align="left">b.</td>
<td align="left">palatal-dental</td>
<td align="right">&#8211;6</td>
<td align="left">none</td>
</tr>
<tr>
<td align="left">c.</td>
<td align="left">velar-uvular</td>
<td align="right">&#8211;18</td>
<td align="left">*[+velar][+uvular] (stop)</td>
</tr>
<tr>
<td align="left">d.</td>
<td align="left">uvular-velar</td>
<td align="right">&#8211;14</td>
<td align="left">*[+uvular][+velar] (stop)</td>
</tr>
</table>
</table-wrap>
<p>A set of quite complicated restrictions applies to ejective-aspirate and aspirate-ejective pairs. While both of these laryngeal combinations are attested, not all individual combinations of segments occur. The attested combinations of ejectives and aspirates in our corpus are shown in Table <xref ref-type="table" rid="T17">17</xref>. As described in MacEachern (<xref ref-type="bibr" rid="B31">1997</xref>), which stop is ejective and which is aspirate is predictable based on place of articulation. If the initial consonant is a labial or a uvular, it will be aspirated (see (j)&#8211;(l)); otherwise, it is ejective (see (a)&#8211;(h)). In uvular-labial pairs, the uvular is ejective (see (i)). Any combination not shown in the table is unattested.</p>
<table-wrap id="T17">
<label>Table 17</label>
<caption>
<p>Counts of ejective-aspirate and aspirate-ejective combinations.</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;" colspan="2">ejective-aspirate</th>
<th align="center" style="background-color:#f3f3f4;">observed</th>
<th align="left" style="background-color:#f3f3f4;" colspan="2">ejective-aspirate</th>
<th align="center" style="background-color:#f3f3f4;">observed</th>
<th align="left" style="background-color:#f3f3f4;" colspan="2">aspirate-ejective</th>
<th align="center" style="background-color:#f3f3f4;">observed</th>
</tr>
<tr>
<td colspan="9"><hr/></td>
</tr>
<tr>
<td align="left">a.</td>
<td align="left">t&#8217;&#8230;p&#688;</td>
<td align="right">2</td>
<td align="left">f.</td>
<td align="left">&#679;&#8217;&#8230;q&#688;</td>
<td align="right">5</td>
<td align="left">j.</td>
<td align="left">p&#688;&#8230;t&#8217;</td>
<td align="right">7</td>
</tr>
<tr>
<td align="left">b.</td>
<td align="left">t&#8217;&#8230;k&#688;</td>
<td align="right">11</td>
<td align="left">g.</td>
<td align="left">k&#8217;&#8230;p&#688;</td>
<td align="right">15</td>
<td align="left">k.</td>
<td align="left">p&#688;&#8230;&#679;&#8217;</td>
<td align="right">24</td>
</tr>
<tr>
<td align="left">c.</td>
<td align="left">t&#8217;&#8230;q&#688;</td>
<td align="right">334</td>
<td align="left">h.</td>
<td align="left">k&#8217;&#8230;t&#688;</td>
<td align="right">11</td>
<td align="left">l.</td>
<td align="left">q&#688;&#8230;t&#8217;</td>
<td align="right">7</td>
</tr>
<tr>
<td align="left">d.</td>
<td align="left">&#679;&#8217;&#8230;p&#688;</td>
<td align="right">29</td>
<td align="left">i.</td>
<td align="left">q&#8217;&#8230;p&#688;</td>
<td align="right">13</td>
<td align="right" colspan="3"></td>
</tr>
<tr>
<td align="left">e.</td>
<td align="left">&#679;&#8217;&#8230;k&#688;</td>
<td align="right">56</td>
<td align="left" colspan="3"></td>
<td align="right" colspan="3"></td>
</tr>
<tr>
<td align="left" colspan="2"><italic>Total</italic></td>
<td align="right"></td>
<td align="left" colspan="2"></td>
<td align="right"><italic>476</italic></td>
<td align="left" colspan="2"><italic>Total</italic></td>
<td align="right"><italic>38</italic></td>
</tr>
</table>
</table-wrap>
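<p>The place-based generalization can be checked mechanically against the counts in Table <xref ref-type="table" rid="T17">17</xref>. The following Python sketch is one possible encoding of the rule as stated above, under our reading that the uvular-labial exception applies when the uvular comes first; the function name <monospace>predicted_order</monospace> and the dictionary layout are hypothetical and introduced only for illustration.</p>
<preformat>
```python
# Counts from Table 17, keyed by (place of first stop, place of second stop).
EJECTIVE_ASPIRATE = {
    ("dental", "labial"): 2,
    ("dental", "velar"): 11,
    ("dental", "uvular"): 334,
    ("palatal", "labial"): 29,
    ("palatal", "velar"): 56,
    ("palatal", "uvular"): 5,
    ("velar", "labial"): 15,
    ("velar", "dental"): 11,
    ("uvular", "labial"): 13,
}
ASPIRATE_EJECTIVE = {
    ("labial", "dental"): 7,
    ("labial", "palatal"): 24,
    ("uvular", "dental"): 7,
}


def predicted_order(first_place, second_place):
    """Predict the laryngeal order from place, following the rule in the text."""
    # Exception: in uvular-labial pairs the uvular is the ejective.
    if (first_place, second_place) == ("uvular", "labial"):
        return "ejective-aspirate"
    # Initial labials and uvulars are otherwise aspirated.
    if first_place in ("labial", "uvular"):
        return "aspirate-ejective"
    # All other initial stops are ejective.
    return "ejective-aspirate"


def consistent_with_rule():
    """True if every attested place pair has the laryngeal order the rule predicts."""
    ok_ea = all(predicted_order(*pair) == "ejective-aspirate" for pair in EJECTIVE_ASPIRATE)
    ok_ae = all(predicted_order(*pair) == "aspirate-ejective" for pair in ASPIRATE_EJECTIVE)
    return ok_ea and ok_ae


print(consistent_with_rule())  # True
print(sum(EJECTIVE_ASPIRATE.values()), sum(ASPIRATE_EJECTIVE.values()))  # 476 38
```
</preformat>
<p>Note that the rule only predicts the laryngeal order for a given place pair; it does not by itself say which place pairs are attested at all.</p>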
<p>Table <xref ref-type="table" rid="T18">18</xref> shows the harmony scores assigned to nonce words with attested and unattested combinations of ejectives and aspirates. The model correctly distinguishes between attested and unattested aspirate-ejective sequences, via the three place-specific constraints (*[&#8211;wb][+labial, +cg], *[&#8211;wb][+uvular, +cg] and *[&#8211;wb][+velar, +cg]) which penalize non-initial labial, velar and uvular ejectives that are preceded by another stop. The grammar does not include any constraints on ejective-aspirate sequences, and attested and unattested combinations are only weakly distinguished by the model. This small difference arises from an orthogonal bigram constraint against [aeo][&#679;&#688;] sequences.</p>
<table-wrap id="T18">
<label>Table 18</label>
<caption>
<p>Harmony scores assigned by our final model to nonce words for aspirate/ejective combinations broken down by place.</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;"></th>
<th align="left" style="background-color:#f3f3f4;">Combination</th>
<th align="center" style="background-color:#f3f3f4;" >Harmony score</th>
<th align="left" style="background-color:#f3f3f4;">Constraints violated</th>
</tr>
<tr>
<td colspan="4"><hr/></td>
</tr>
<tr>
<td align="left">a.</td>
<td align="left">aspirate-ejective, attested</td>
<td align="right">&#8211;6</td>
<td align="left">none</td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="3">b.</td>
<td align="left" valign="middle" rowspan="3">aspirate-ejective, unattested</td>
<td align="right" valign="middle" rowspan="3">&#8211;16</td>
<td align="left">*[&#8211;wb][+labial, +cg] (stop)</td>
</tr>
<tr>
<td align="left">*[&#8211;wb][+uvular, +cg] (stop)</td>
</tr>
<tr>
<td align="left">*[&#8211;wb][+velar, +cg] (stop)</td>
</tr>
<tr>
<td align="left">c.</td>
<td align="left">ejective-aspirate, attested</td>
<td align="right">&#8211;7</td>
<td align="left">*[-high][+sg, +palatal] (default)</td>
</tr>
<tr>
<td align="left">d.</td>
<td align="left">ejective-aspirate, unattested</td>
<td align="right">&#8211;10</td>
<td align="left">*[-high][+sg, +palatal] (default)</td>
</tr>
</table>
</table-wrap>
<p>Hayes and Wilson&#8217;s model has a preference for more general constraints, stated over larger natural classes. To completely match the distribution of stops in the language, the model would have to include many constraints on individual segmental combinations. While the grammar does include several constraints that refer to classes of a single segment, in other cases such specific constraints are not learned. We leave it to future experimental work to identify what generalizations Aymara speakers have learned about ejective-aspirate and aspirate-ejective combinations, and how the model might need to be modified to match speaker behavior.</p>
<p>Finally, pairs of segments that differ only in laryngeal features (e.g., [t&#8217;&#8230;t&#688;], [p&#8217;&#8230;p], [k&#688;&#8230;k], etc.) are also reported to be restricted and are nearly absent in our corpus. Our model does not include constraints on any of these combinations, again because such constraints would refer to individual segments and the learner typically does not learn such constraints.</p>
</sec>
</sec>
<sec>
<title>4.5 An unsegmented corpus</title>
<p>To further establish the role of morphological information in the success of our model, we ran learning simulations on the same word corpus, but with morpheme boundary markers removed. Recall from Table <xref ref-type="table" rid="T3">3</xref> that there are many instances of plain-aspirate, plain-ejective and ejective-ejective combinations in the corpus as a whole, but they are mostly found across a morpheme boundary. We tested whether these sequences were frequent enough to obscure the restrictions if morpheme boundaries are not represented. To start, we look at the number of plain-ejective, plain-aspirate and ejective-ejective combinations that occur in a baseline trigram in the unparsed data set in Table <xref ref-type="table" rid="T19">19</xref>; for comparison, we repeat the numbers for tautomorphemic and heteromorphemic sequences in the parsed data set from Table <xref ref-type="table" rid="T3">3</xref>. For all three restricted combinations, there are exceptions in the unparsed data set, and plain-aspirate combinations in particular are quite frequent. It is worth noting that the heteromorphemic trigram combinations in the parsed data set are actually bigrams in the unparsed data set (for example, [ati+<bold>p+t&#8217;</bold>a+&#626;] in the parsed data set appears as [atipt&#8217;a&#626;] in the unparsed data set), so these forms do not introduce exceptions at the trigram level in the unparsed data set. Instead, many of the exceptions that we see in the unparsed data are actually tetra- or penta-grams in the parsed data set that appear as trigrams once morpheme boundaries are removed. For example, the form [huk&#8217;am<bold>pst&#8217;</bold>a&#626;] appears in the unparsed data set but is [huk&#8217;a+m+<bold>p+s+t&#8217;</bold>a+&#626;] in the parsed data set.</p>
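<p>The effect of boundary removal on n-gram size can be illustrated with the two example words above. The Python sketch below is hypothetical and not part of our simulation pipeline: each word is represented as a list of symbols, "+" marks a morpheme boundary, "ny" is an ASCII stand-in for the palatal nasal, and the helper names <monospace>strip_boundaries</monospace> and <monospace>span</monospace> are introduced only for illustration.</p>
<preformat>
```python
def strip_boundaries(segments):
    """Drop morpheme boundary symbols, yielding the unparsed form."""
    return [seg for seg in segments if seg != "+"]


def span(segments, first, second):
    """Size of the n-gram running from `first` through the next `second`, inclusive."""
    start = segments.index(first)
    end = segments.index(second, start + 1)
    return end - start + 1


# Parsed forms from the text; "ny" stands in for the palatal nasal.
atiptany = ["a", "t", "i", "+", "p", "+", "t'", "a", "+", "ny"]
hukampstany = ["h", "u", "k'", "a", "+", "m", "+", "p", "+", "s", "+", "t'", "a", "+", "ny"]

for word in (atiptany, hukampstany):
    parsed_size = span(word, "p", "t'")
    unparsed_size = span(strip_boundaries(word), "p", "t'")
    print("".join(strip_boundaries(word)), parsed_size, unparsed_size)
# atipt'any 3 2
# hukampst'any 5 3
```
</preformat>
<p>The first word shows a parsed trigram collapsing to a bigram; the second shows a parsed pentagram surfacing as a trigram exception in the unparsed data.</p>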
<table-wrap id="T19">
<label>Table 19</label>
<caption>
<p>Number of observed combinations appearing in a trigram configuration in the parsed (tautomorphemic and heteromorphemic columns) and unparsed data sets, for the three restricted stop combinations.</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;"></th>
<th align="center" style="background-color:#f3f3f4;">tautomorphemic</th>
<th align="center" style="background-color:#f3f3f4;">heteromorphemic</th>
<th align="center" style="background-color:#f3f3f4;">unparsed</th>
</tr>
<tr>
<td colspan="4"><hr/></td>
</tr>
<tr>
<td align="left">plain-aspirate</td>
<td align="right">0</td>
<td align="right">149</td>
<td align="right">434</td>
</tr>
<tr>
<td align="left">plain-ejective</td>
<td align="right">0</td>
<td align="right">659</td>
<td align="right">17</td>
</tr>
<tr>
<td align="left">ejective-ejective (non-identical)</td>
<td align="right">0</td>
<td align="right">1</td>
<td align="right">5</td>
</tr>
</table>
</table-wrap>
<p>We then examined whether the learner found placeholder trigram constraints when trained on the unparsed corpus (we call this the &#8220;Induced Unparsed Model&#8221;, in contrast to the model we presented in Section 4.4.3).<xref ref-type="fn" rid="n8">8</xref> We tested a range of gain and gamma combinations, and compared the placeholder trigrams found in the baseline model when trained on the parsed and unparsed data sets. All runs of the learner were asked to find a maximum of 200 constraints. We report on a representative sample of the numerous combinations we tried in Table <xref ref-type="table" rid="T20">20</xref>. Morpheme boundary trigram constraints corresponding to at least two of the three restricted combinations are found for the parsed data under almost all settings, and settings with a higher gain or gamma allow the model to detect all three restrictions in the baseline grammar. For the unparsed data, placeholder trigram constraints are only found with a very low gamma of 1. Even with gamma this low, the Induced Unparsed Model only finds a single placeholder trigram on ejectives; this model never finds a placeholder trigram corresponding to the plain-aspirate restriction and may miss the ejective-ejective restriction as well.</p>
<table-wrap id="T20">
<label>Table 20</label>
<caption>
<p>Morpheme-boundary trigram constraints found at various settings in the parsed and the unparsed versions of the Aymara corpus.</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;">gain</th>
<th align="center" style="background-color:#f3f3f4;">gamma</th>
<th align="left" style="background-color:#f3f3f4;">parsed corpus</th>
<th align="left" style="background-color:#f3f3f4;">unparsed corpus</th>
</tr>
<tr>
<td colspan="4"><hr/></td>
</tr>
<tr>
<td align="left">100</td>
<td align="right">1</td>
<td align="left"><italic>none</italic></td>
<td align="left">*[+plain][][+cg]</td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="2">400</td>
<td align="right" valign="middle" rowspan="2">1</td>
<td align="left">*[+plain][&#8211;mb][&#8211;continuant,+sg]</td>
<td align="left" valign="middle" rowspan="2">*[+plain][][+cg]</td>
</tr>
<tr>
<td align="left">*[&#8211;sonorant, &#8211;continuant][&#8211;mb][+cg]</td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="2">800</td>
<td align="right" valign="middle" rowspan="2">1</td>
<td align="left">*[+plain][&#8211;mb][&#8211;continuant,+sg]</td>
<td align="left" valign="middle" rowspan="2">*[&#8211;son, &#8211;cont][][+cg]</td>
</tr>
<tr>
<td valign="top" align="left">*[&#8211;sonorant, &#8211;continuant][&#8211;mb][+cg]</td>
</tr>
<tr>
<td valign="middle" align="left" rowspan="2">100</td>
<td valign="middle" align="right" rowspan="2">50</td>
<td valign="top" align="left">*[+plain][&#8211;mb][+cg]</td>
<td valign="middle" align="left" rowspan="2"><italic>none</italic></td>
</tr>
<tr>
<td align="left">*[+plain][&#8211;mb][+sg, &#8211;cont]</td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="2">400</td>
<td align="right" valign="middle" rowspan="2">50</td>
<td align="left">*[+plain][&#8211;mb][+cg]</td>
<td align="left" valign="middle" rowspan="2"><italic>none</italic></td>
</tr>
<tr>
<td align="left">*[+plain][&#8211;mb][+sg, &#8211;cont]</td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="3">800</td>
<td align="right" valign="middle" rowspan="3">50</td>
<td align="left">*[+cg][&#8211;mb][+cg]</td>
<td align="left" valign="middle" rowspan="3"><italic>none</italic></td>
</tr>
<tr>
<td align="left">*[+plain][&#8211;mb][+cg]</td>
</tr>
<tr>
<td align="left">*[+plain][&#8211;mb][+sg, &#8211;cont]</td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="3">100</td>
<td align="right" valign="middle" rowspan="3">150</td>
<td align="left">*[+cg][&#8211;mb][+cg]</td>
<td align="left" valign="middle" rowspan="3"><italic>none</italic></td>
</tr>
<tr>
<td align="left">*[+plain][&#8211;mb][+cg]</td>
</tr>
<tr>
<td align="left">*[+plain][&#8211;mb][+sg, &#8211;cont]</td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="3">400</td>
<td align="right" valign="middle" rowspan="3">150</td>
<td align="left">*[+cg][&#8211;mb][+cg]</td>
<td align="left" valign="middle" rowspan="3"><italic>none</italic></td>
</tr>
<tr>
<td align="left">*[+plain][&#8211;mb][+cg]</td>
</tr>
<tr>
<td align="left">*[+plain][&#8211;mb][+sg, &#8211;cont]</td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="3">800</td>
<td align="right" valign="middle" rowspan="3">150</td>
<td align="left">*[+cg][&#8211;mb][+cg]</td>
<td align="left" valign="middle" rowspan="3"><italic>none</italic></td>
</tr>
<tr>
<td align="left">*[+plain][&#8211;mb][+cg]</td>
</tr>
<tr>
<td align="left">*[+plain][&#8211;mb][+sg, &#8211;cont]</td>
</tr>
</table>
</table-wrap>
<p>One of the grammars built after training on the unparsed corpus is investigated in more detail in Figure <xref ref-type="fig" rid="F5">5</xref> and Table <xref ref-type="table" rid="T21">21</xref>. This grammar&#8217;s gain is 500, and gamma = 1. The baseline grammar constraint *[+plain]-any_seg-[+cg] motivates the [&#8211;son, &#8211;cont] projection, on which all restrictions could be correctly stated. But the final grammar achieves a poor fit to the experimental data, as shown by the nearly vertical lines in Figure <xref ref-type="fig" rid="F5">5</xref>. The correlations for Experiment 1 are <italic>&#964;</italic> = 0.22 and <italic>&#961;</italic> = 0.35; the correlations for Experiment 2 are <italic>&#964;</italic> = 0.43 and <italic>&#961;</italic> = 0.60.</p>
<fig id="F5">
<label>Figure 5</label>
<caption>
<p>The Induced Unparsed Model: a grammar with induced projections built from the unsegmented corpus, tested on Aymara experimental data. Data points are labeled according to stimulus type. Exp. 1: &#8220;CT&#8221; is control, &#8220;PE&#8221; is plain-ejective, &#8220;EE&#8221; is ejective-ejective. Exp. 2: &#8220;ED&#8221; is ejective-ejective-dental, &#8220;EL&#8221; is ejective-ejective-labial, &#8220;PD&#8221; is plain-ejective-dental, &#8220;PL&#8221; is plain-ejective-labial.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="/article/id/5136/file/65076/"/>
</fig>
<table-wrap id="T21">
<label>Table 21</label>
<caption>
<p>Constraints on the stop projection discovered after training on a corpus of unparsed words (Induced Unparsed Model).</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;" colspan="2">Constraint on &#8211;son, &#8211;cont projection</th>
<th align="center" style="background-color:#f3f3f4;">weight</th>
<th align="left" style="background-color:#f3f3f4;">violated by</th>
</tr>
<tr>
<td colspan="4"><hr/></td>
</tr>
<tr>
<td align="left">a.</td>
<td align="left">*[&#8211;wb][+cg]</td>
<td align="right">1.494</td>
<td align="left">stop&#8230;[p&#8217; t&#8217; k&#8217; q&#8217; &#679;&#8217;]</td>
</tr>
<tr>
<td align="left">b.</td>
<td align="left">*[+cg][+wb]</td>
<td align="right">1.998</td>
<td align="left">[p&#8217; t&#8217; k&#8217; q&#8217; &#679;&#8217;] &#8230; #</td>
</tr>
<tr>
<td align="left">c.</td>
<td align="left">*[+plain,+uvular][+wb]</td>
<td align="right">0.989</td>
<td align="left">[q q&#688; q&#8217;] &#8230; #</td>
</tr>
<tr>
<td align="left">d.</td>
<td align="left">*[+palatal][+wb]</td>
<td align="right">1.154</td>
<td align="left">[&#679; &#679;&#688; &#679;&#8217;] &#8230; #</td>
</tr>
<tr>
<td align="left">e.</td>
<td align="left">*[+cg,+labial][]</td>
<td align="right">2.068</td>
<td align="left">[p p&#688; p&#8217;]&#8230; stop</td>
</tr>
<tr>
<td align="left">f.</td>
<td align="left">*[+velar][+uvular]</td>
<td align="right">1.73</td>
<td align="left">[k k&#688; k&#8217;] &#8230; [q q&#688; q&#8217;]</td>
</tr>
<tr>
<td align="left">g.</td>
<td align="left">*[+plain][+sg,+uvular]</td>
<td align="right">4.31</td>
<td align="left">[p t k q &#679;] &#8230; [q&#688;]</td>
</tr>
<tr>
<td align="left">h.</td>
<td align="left">*[+cg,+uvular][&#8211;wb]</td>
<td align="right">0.551</td>
<td align="left">[q&#8217;] &#8230; stop</td>
</tr>
<tr>
<td align="left">i.</td>
<td align="left">*[+wb][+cg,+dental][+palatal]</td>
<td align="right">2.731</td>
<td align="left"># [t&#8217;] [&#679; &#679;&#688; &#679;&#8217;]</td>
</tr>
<tr>
<td align="left">j.</td>
<td align="left">*[+cg,+palatal][&#8211;wb]</td>
<td align="right">0.997</td>
<td align="left">[&#679;&#8217;] &#8230; stop</td>
</tr>
<tr>
<td align="left">k.</td>
<td align="left">*[+cg,+velar][][+wb]</td>
<td align="right">1.981</td>
<td align="left">[k&#8217;] &#8230; stop &#8230; #</td>
</tr>
</table>
</table-wrap>
<p>The Induced Unparsed Model&#8217;s lack of success at representing phonologically meaningful underattestations in the unparsed data becomes clear when we look at the constraints it posits on the stop projection (see Table <xref ref-type="table" rid="T21">21</xref>). Their low weights are due in part to the gamma setting; almost all of them are violated frequently by Aymara words. While these constraints penalize some restricted combinations, they don&#8217;t capture the full extent of the restrictions nor are their weights high enough to distinguish restricted from unrestricted structures.</p>
<p>With this low gamma setting, the Induced Unparsed Model does not succeed in distinguishing meaningful underattestations in the data, and instead learns many low-weighted constraints with numerous exceptions. Our more successful Induced Parsed Model in Section 4.4.3 has higher gamma and gain (in addition to access to morpheme boundaries), which makes the Induced Parsed Model more selective and a better fit to the phonological distinctions supported by traditional phonological analysis and experimental work with native speakers. Given these same settings (400 gain, 150 gamma), the Induced Unparsed Model&#8217;s grammar fails to include any placeholder trigram constraints from which it could posit nonlocal projections.</p>
<p>We also considered whether it was possible to capture phonological distinctions on the stop projection with unparsed data, when the learner was given a higher gain and gamma and we supplied the stop projection manually. This is the Manual Unparsed Model. As shown in Figure <xref ref-type="fig" rid="F6">6</xref>, this model&#8217;s grammar achieves a better fit to Aymara speakers&#8217; performance in the repetition experiments than the grammar in Figure <xref ref-type="fig" rid="F5">5</xref>. But there are interesting differences in the details.</p>
<fig id="F6">
<label>Figure 6</label>
<caption>
<p>The Manual Unparsed Model: a grammar built from the unsegmented corpus, tested on Aymara experimental data, with manually supplied projections. Data points are labeled according to stimulus type. Exp. 1: &#8220;CT&#8221; is control, &#8220;PE&#8221; is plain-ejective, &#8220;EE&#8221; is ejective-ejective. Exp. 2: &#8220;ED&#8221; is ejective-ejective-dental, &#8220;EL&#8221; is ejective-ejective-labial, &#8220;PD&#8221; is plain-ejective-dental, &#8220;PL&#8221; is plain-ejective-labial.</p>
</caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="/article/id/5136/file/65077/"/>
</fig>
<p>The Manual Unparsed Model, trained on unparsed data, captures the distinction between dental and labial ejectives via constraints on everything but dentals (in Table <xref ref-type="table" rid="T22">22</xref>, (c), (e)&#8211;(g), (k)&#8211;(m)). But this model fails to make a distinction between control stimuli such as [lap&#8217;a] (100% correct) and [p&#8217;it&#8217;a] (67% correct): both receive a harmony score of &#8211;6. This is because this grammar does not include a general constraint against ejectives in second position on the stop projection. Constraint (c) on the ejective projection is too specific. This grammar is overfitting, learning overly specific constraints to accommodate the exceptions in the data.</p>
<table-wrap id="T22">
<label>Table 22</label>
<caption>
<p>Constraints on ejective and stop projections induced from a corpus without morpheme boundaries.</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;"></th>
<th align="left" style="background-color:#f3f3f4;">Projection</th>
<th align="left" style="background-color:#f3f3f4;">Constraint</th>
<th align="center" style="background-color:#f3f3f4;">Weight</th>
<th align="left" style="background-color:#f3f3f4;">Seq&#8217;s penalized</th>
</tr>
<tr>
<td colspan="5"><hr/></td>
</tr>
<tr>
<td align="left">a.</td>
<td align="left">+cg</td>
<td align="left">*[&#8211;wb][+labial]</td>
<td align="right">8.06</td>
<td align="left">ejective&#8230;[p&#8217;]</td>
</tr>
<tr>
<td align="left">b.</td>
<td align="left">+cg</td>
<td align="left">*[&#8211;wb][+palatal]</td>
<td align="right">5.531</td>
<td align="left">ejective&#8230;[&#679;&#8217;]</td>
</tr>
<tr>
<td align="left">c.</td>
<td align="left">+cg</td>
<td align="left">*[+dental][&#8211;wb]</td>
<td align="right">14.559</td>
<td align="left">[t&#8217;]&#8230;ejective</td>
</tr>
<tr>
<td align="left">d.</td>
<td align="left">-son-cont</td>
<td align="left">*[][&#8211;wb,+mb]</td>
<td align="right">17.915</td>
<td align="left">stop &#8230; +</td>
</tr>
<tr>
<td align="left">e.</td>
<td align="left">-son-cont</td>
<td align="left">*[&#8211;wb][+cg,+uvular]</td>
<td align="right">6.023</td>
<td align="left">stop &#8230; [q&#8217;]</td>
</tr>
<tr>
<td align="left">f.</td>
<td align="left">-son-cont</td>
<td align="left">*[&#8211;wb][+cg,+velar]</td>
<td align="right">6.265</td>
<td align="left">stop &#8230; [k&#8217;]</td>
</tr>
<tr>
<td align="left">g.</td>
<td align="left">-son-cont</td>
<td align="left">*[&#8211;wb][+cg,+labial]</td>
<td align="right">6.942</td>
<td align="left">stop&#8230; [p&#8217;]</td>
</tr>
<tr>
<td align="left">h.</td>
<td align="left">-son-cont</td>
<td align="left">*[+plain][+sg,+uvular]</td>
<td align="right">4.974</td>
<td align="left">[p t k q &#679;]&#8230; [q&#688;]</td>
</tr>
<tr>
<td align="left">i.</td>
<td align="left">-son-cont</td>
<td align="left">*[+dental][+sg,+palatal]</td>
<td align="right">6.33</td>
<td align="left">[t t&#688; t&#8217;] &#8230; [&#679;&#688;]</td>
</tr>
<tr>
<td align="left">j.</td>
<td align="left">-son-cont</td>
<td align="left">*[+dental][+cg,+palatal]</td>
<td align="right">13.976</td>
<td align="left">[t t&#688; t&#8217;] &#8230; [&#679;&#8217;]</td>
</tr>
<tr>
<td align="left">k.</td>
<td align="left">-son-cont</td>
<td align="left">*[+plain,+labial][+cg,+palatal]</td>
<td align="right">13.507</td>
<td align="left">[p]&#8230;[&#679;&#8217;]</td>
</tr>
<tr>
<td align="left">l.</td>
<td align="left">-son-cont</td>
<td align="left">*[+palatal][+cg,+palatal]</td>
<td align="right">5.633</td>
<td align="left">[&#679; &#679;&#688; &#679;&#8217;]&#8230;[&#679;&#8217;]</td>
</tr>
<tr>
<td align="left">m.</td>
<td align="left">-son-cont</td>
<td align="left">*[+plain,+uvular][+cg,+palatal]</td>
<td align="right">12.905</td>
<td align="left">[q]&#8230;[&#679;&#8217;]</td>
</tr>
<tr>
<td align="left">n.</td>
<td align="left">-son-cont</td>
<td align="left">*[+cg,+labial][+sg]</td>
<td align="right">5.42</td>
<td align="left">[p&#8217;]&#8230;aspirate</td>
</tr>
<tr>
<td align="left">o.</td>
<td align="left">-son-cont</td>
<td align="left">*[+plain][+sg][+cg]</td>
<td align="right">13.224</td>
<td align="left">plain&#8230;aspirate &#8230;ejective</td>
</tr>
<tr>
<td align="left">p.</td>
<td align="left">-son-cont</td>
<td align="left">*[+plain][+sg][+sg]</td>
<td align="right">13.625</td>
<td align="left">plain&#8230;aspirate&#8230;aspirate</td>
</tr>
<tr>
<td align="left">q.</td>
<td align="left">-son-cont</td>
<td align="left">*[&#8211;wb][][+cg,+palatal]</td>
<td align="right">13.283</td>
<td align="left">stop&#8230;stop&#8230;[&#679;&#8217;]</td>
</tr>
<tr>
<td align="left">r.</td>
<td align="left">-son-cont</td>
<td align="left">*[+sg][+plain][+sg,+labial]</td>
<td align="right">12.555</td>
<td align="left">aspirate&#8230;plain&#8230;[p&#688;]</td>
</tr>
<tr>
<td align="left">s.</td>
<td align="left">-son-cont</td>
<td align="left">*[+sg][+cg][+sg]</td>
<td align="right">12.723</td>
<td align="left">aspirate&#8230;ejective&#8230;aspirate</td>
</tr>
</table>
</table-wrap>
<p>When it comes to Experiment 2, however, the Manual Unparsed Model&#8217;s fit to behavioral data is comparable to the model we reported in Section 4.4.3 (although the differences between &#8220;good&#8221; and &#8220;bad&#8221; forms are smaller in Experiment 2 &#8211; the opposite of the parsed grammar). The reasons for this have to do with the abundance of dental ejectives in Aymara suffixes; by positing place-specific constraints against non-dental ejectives in second position on the stop projection, the model manages to approximate the same generalizations.</p>
<p>For a quantitative comparison, Table <xref ref-type="table" rid="T23">23</xref> summarizes the non-parametric correlations between the harmony scores each model assigns to the experimental stimuli and the averaged accuracy in the repetition experiments with Aymara speakers. Model (a), which uses parsed data both for inducing the projections and for the final grammar, has the best correlations with Experiment 1, and the best correlations overall. Model (b), which includes no morphological information at all, achieves the lowest correlations across the board. The third model, (c), does worse than model (a) on the first experiment and slightly better on the second experiment, but its overall correlations with behavioral data are not as good as those of model (a).</p>
<table-wrap id="T23">
<label>Table 23</label>
<caption>
<p>Correlations between harmony scores assigned by the three models and accuracy in repetition experiments with Aymara speakers.</p>
</caption>
<table>
<tr>
<th align="left" style="background-color:#f3f3f4;" colspan="2" rowspan="3"></th>
<th align="center" style="background-color:#f3f3f4;" colspan="2">Experiment 1</th>
<th align="center" style="background-color:#f3f3f4;" colspan="2">Experiment 2</th>
<th align="center" style="background-color:#f3f3f4;" colspan="2">Overall</th>
</tr>
<tr>
<th colspan="6"><hr/></th>
</tr>
<tr>
<th align="center" style="background-color:#f3f3f4;"><italic>&#964;</italic></th>
<th align="center" style="background-color:#f3f3f4;"><italic>&#961;</italic></th>
<th align="center" style="background-color:#f3f3f4;"><italic>&#964;</italic></th>
<th align="center" style="background-color:#f3f3f4;"><italic>&#961;</italic></th>
<th align="center" style="background-color:#f3f3f4;"><italic>&#964;</italic></th>
<th align="center" style="background-color:#f3f3f4;"><italic>&#961;</italic></th>
</tr>
<tr>
<td colspan="8"><hr/></td>
</tr>
<tr>
<td align="left">a.</td>
<td align="left">Induced Parsed Model (Figs. <xref ref-type="fig" rid="F3">3</xref>, <xref ref-type="fig" rid="F4">4</xref>)</td>
<td align="right">0.68</td>
<td align="right">0.83</td>
<td align="right">0.52</td>
<td align="right">0.72</td>
<td align="right">0.59</td>
<td align="right">0.77</td>
</tr>
<tr>
<td align="left">b.</td>
<td align="left">Induced Unparsed Model (Fig. <xref ref-type="fig" rid="F5">5</xref>)</td>
<td align="right">0.22</td>
<td align="right">0.35</td>
<td align="right">0.43</td>
<td align="right">0.60</td>
<td align="right">0.31</td>
<td align="right">0.42</td>
</tr>
<tr>
<td align="left">c.</td>
<td align="left">Manual Unparsed Model (Fig. <xref ref-type="fig" rid="F6">6</xref>)</td>
<td align="right">0.49</td>
<td align="right">0.67</td>
<td align="right">0.55</td>
<td align="right">0.74</td>
<td align="right">0.52</td>
<td align="right">0.69</td>
</tr>
</table>
</table-wrap>
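<p>The kind of rank correlation reported in Table <xref ref-type="table" rid="T23">23</xref> can be illustrated with a short sketch. This is not the analysis script used for the paper: it assumes that <italic>&#964;</italic> and <italic>&#961;</italic> are Kendall&#8217;s and Spearman&#8217;s coefficients, the function names are ours, the tau variant is the simple tau-a, and the scores and accuracies are invented toy values rather than experimental data.</p>
<preformat preformat-type="code"><![CDATA[
```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(xs, ys)), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(xs)
    return (concordant - discordant) / (n * (n - 1) / 2)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation computed on ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Invented toy values, NOT data from the experiments.
toy_scores = [0.0, 1.5, 3.2, 7.8, 9.1]          # hypothetical model penalties
toy_accuracy = [0.95, 0.90, 0.92, 0.60, 0.40]   # hypothetical repetition accuracy

print(round(kendall_tau(toy_scores, toy_accuracy), 3))
print(round(spearman_rho(toy_scores, toy_accuracy), 3))
```
]]></preformat>
<p>In this toy set, larger penalties pair with lower accuracy, so both coefficients come out negative; the sign of a reported correlation depends on how the scores are oriented.</p>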
<p>Zooming out from laryngeal cooccurrence restrictions, the Manual Unparsed Model is not quite right in other ways. Aymara morphemes obey an exceptionless constraint against CCC clusters (0 of them in the corpus), but such clusters are created by syncope at morpheme boundaries (recall Section 2.1). The right constraint to capture this would be *[&#8211;syll][&#8211;syll][&#8211;syll]. The Induced Parsed Model (a) contains such a constraint, and gives it a high weight of 14.506. The Manual Unparsed Model cannot motivate such a constraint &#8211; there are 3322 words in the Aymara corpus that have such clusters. What this model does instead is posit many specific constraints, sometimes with rather low weights, against various CCC clusters that it sees few examples of &#8211; e.g., *[&#8211;son, &#8211;cont][+labial][&#8211;cont], with a weight of 6.778. This is just one example of a morpheme structure constraint that a morphology-agnostic learner cannot capture.</p>
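<p>The contrast between parsed and unparsed representations for *[&#8211;syll][&#8211;syll][&#8211;syll] can be sketched as follows. This is an illustration under stated assumptions, not the learner&#8217;s implementation: the vowel inventory is simplified, the boundary symbol is assumed not to match [&#8211;syll], the function names are ours, and the toy form is invented for illustration rather than taken from the corpus. Only the weight 14.506 comes from the text above.</p>
<preformat preformat-type="code"><![CDATA[
```python
VOWELS = {"a", "i", "u", "e", "o"}  # [+syll] segments (simplified inventory)
BOUNDARY = "+"                      # morpheme boundary symbol in parsed forms


def is_nonsyllabic(segment):
    """True for [-syll] segments; the boundary symbol is assumed not to match."""
    return segment != BOUNDARY and segment not in VOWELS


def count_ccc(segments):
    """Count violations of *[-syll][-syll][-syll] over adjacent trigrams."""
    return sum(
        all(is_nonsyllabic(s) for s in segments[i:i + 3])
        for i in range(len(segments) - 2)
    )


def penalty(segments, weight):
    """Contribution of the constraint to a form's score: weight x violations."""
    return weight * count_ccc(segments)


# Invented toy form (not an attested Aymara word), given as a list of segments
# so that ejectives such as "t'" count as single units.
parsed = ["l", "i", "p", "+", "t'", "+", "t", "a"]
unparsed = [s for s in parsed if s != BOUNDARY]

print(count_ccc(parsed))          # boundaries break up the cluster
print(count_ccc(unparsed))        # the same word without boundaries
print(penalty(unparsed, 14.506))  # weight reported for the parsed model
```
]]></preformat>
<p>With boundaries in the string, the cluster never forms an uninterrupted trigram of consonants, so the general constraint is surface-true; with boundaries removed, the same word violates it.</p>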
</sec>
</sec>
<sec>
<title>5 Discussion</title>
<sec>
<title>5.1 Morpheme boundaries and place of articulation</title>
<p>The experimental and modeling work shows that the distinction between tautomorphemic and heteromorphemic stop-ejective sequences is likely both a direct effect of morphology and also an effect of place of articulation. Participants in Experiment 1 made slightly fewer errors on stimuli with dental ejectives than with other ejectives, reflecting the frequency of dental ejectives in non-initial position. This effect was exaggerated in Experiment 2, where the structure of nonce words favored a polymorphemic parse. Similarly, the constraints in the grammar are sensitive to both morphological structure and place of articulation. The model includes constraints on tautomorphemic but not heteromorphemic laryngeal combinations and it also includes more specific constraints on individual segments; due to their presence in suffixes, [t&#8217;] and [&#679;&#8217;] are more frequent than other ejectives in non-initial position, and the constraints in the grammar reflect this asymmetry.</p>
<p>One element of the experimental results not directly captured by the model is that while participants were more accurate on dental ejectives than labial ejectives in Experiment 2, they still made more errors on dental ejective forms than on filler items. In contrast, our model predicts forms like [k&#8217;ast&#8217;a&#626;a] to be fully grammatical. Errors on dental ejectives likely reflect the two available morphological parses for these forms. Since these were nonce words that were not associated with any meaning, speakers did not know for certain that [t&#8217;a] in these forms was the verbal suffix. There are also roots in the language that end in [t&#8217;a]. Errors in the repetition task are predicted if participants occasionally parse a form like [k&#8217;ast&#8217;a&#626;a] as [k&#8217;ast&#8217;a+&#626;a], while accurate repetitions are expected if the [k&#8217;as+t&#8217;a+&#626;a] parse is hypothesized.</p>
</sec>
<sec>
<title>5.2 Phonotactic learning and segmentation</title>
<p>Our modeling simulations showed that while some pieces of the restrictions can be detected in an unsegmented corpus, morpheme boundaries are necessary to fully capture the patterns. This means that infants and children learning Aymara cannot have a complete phonotactic grammar until they have learned enough morphology to segment the speech stream. The prediction is that the trajectory of phonotactic awareness of laryngeal restrictions should be different in Aymara learners than in learners of a language where laryngeal restrictions are categorical and hold at the word level, like Quechua.</p>
<p>Even though languages do not mark every morpheme boundary phonotactically, speakers of languages such as Finnish and Dutch have been shown to use phonotactic knowledge to segment speech in experimental settings (<xref ref-type="bibr" rid="B41">Suomi et al. 1997</xref>; <xref ref-type="bibr" rid="B35">McQueen 1998</xref>). The model we presented makes a simplifying assumption that at some point, the learner examines a fully parsed corpus with boundaries. A more realistic approach would use phonotactics to deduce where boundaries are located (as in the StaGe model of <xref ref-type="bibr" rid="B1">Adriaans and Kager 2010</xref>). StaGe uses bigram probabilities to posit word boundaries. Aymara would lend itself to such an approach: ejectives and aspirates are most common root-/word-initially (recall Table <xref ref-type="table" rid="T2">2</xref>), so the distribution of plain-ejective bigrams would be a clue to boundaries even for a learner that does not yet have detailed morphological segmentation information. We leave for future work an implementation of a more complete model that deduces both where morpheme/word boundaries are and whether they lead to nonlocal projections.</p>
</sec>
<sec>
<title>5.3 Typological considerations: Syncope and morpheme boundary projections</title>
<p>In our baseline grammar for Aymara, the nonlocal restrictions on stop combinations are reflected in morpheme boundary trigrams like *[+plain][&#8211;mb][+sg, &#8211;continuant]. The pattern of syncope in Aymara is crucial to these constraints being found, because syncope allows stops to appear adjacent to one another across an intervening morpheme boundary.<xref ref-type="fn" rid="n9">9</xref> Without syncope, a hypothetical form like /lipa+t&#8217;a/ would be realized as [li<bold>pa+t&#8217;</bold>a] (as opposed to [li<bold>p+t&#8217;</bold>a]), with the interacting stops only appearing in a tetragram; syncope creates a consonant-morpheme_boundary-consonant trigram. In a language without syncope, the restrictions on stop combinations may still be observable in a baseline trigram, but they would be reflected in a placeholder trigram (in which the medial gram is &#8216;any segment&#8217;, e.g., *[+plain][][+sg, &#8211;continuant]) as opposed to in a morpheme boundary trigram. Under our proposal about how nonlocal projections are induced, both placeholder trigrams and morpheme boundary trigrams trigger the learner to add a nonlocal projection to their search space of constraints, so syncope should not be crucial to the learning of a nonlocal restriction.</p>
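<p>The effect of syncope on the size of the window containing the two stops can be sketched as follows. This is a minimal illustration, not the learner itself: the stop inventory is a simplified subset, the function names are ours, and the two forms are the hypothetical /lipa+t&#8217;a/ example from the paragraph above, written as lists of segments.</p>
<preformat preformat-type="code"><![CDATA[
```python
# Simplified subset of stops, assumed for illustration only.
STOPS = {"p", "t", "k", "q", "p'", "t'", "k'", "q'"}
BOUNDARY = "+"


def window_size(segments, first, second):
    """Number of positions spanned from `first` to the next `second`."""
    i = segments.index(first)
    j = segments.index(second, i + 1)
    return j - i + 1


def project(segments, keep):
    """Keep only the segments that belong to the projection."""
    return [s for s in segments if s in keep]


no_syncope = ["l", "i", "p", "a", "+", "t'", "a"]   # [lipa+t'a]
with_syncope = ["l", "i", "p", "+", "t'", "a"]      # [lip+t'a]

# On the default (all-segment) representation, the two stops fit in a
# trigram only when syncope has removed the vowel.
print(window_size(no_syncope, "p", "t'"))
print(window_size(with_syncope, "p", "t'"))

# On a stop projection that also keeps the boundary symbol, both forms
# reduce to the same stop-boundary-stop sequence.
stop_projection = STOPS | {BOUNDARY}
print(project(no_syncope, stop_projection))
print(project(with_syncope, stop_projection))
```
]]></preformat>
<p>The sketch shows why a trigram-limited baseline grammar sees the interacting stops together only in the syncopated form, while a projection makes them adjacent in either case.</p>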
<p>A learner does not know in advance whether including morpheme boundaries on projections will lead to better generalizations. We showed in Gouskova and Gallagher (<xref ref-type="bibr" rid="B24">to appear</xref>) that in languages like Quechua, for example, it is possible to discover the nonlocal interactions between stops from phonological words alone, and presumably Quechua learners acquire this knowledge before they are morphologically aware, since the restrictions are categorical within words. It is important in Quechua that the nonlocal projection include all stops but <italic>not</italic> include morpheme boundaries, since the relevant restrictions hold both within morphemes and across morpheme boundaries.</p>
<p>We hypothesize that in both Quechua-type languages and Aymara-type languages, learning starts on unparsed words, and if any projections are discovered, they include word but not morpheme boundaries. When words are morphologically segmented, the phonotactic grammar is reassessed, in case any generalizations were missed in the unparsed grammar. In a language like Aymara, we showed that the nonlocal laryngeal restrictions are not noticeable from the unparsed data, so a learner of Aymara would not be able to learn these restrictions until they had acquired some morphological structure, at which point nonlocal projections with the morpheme boundary symbol and constraints on this projection would be added to the grammar. In a language like Quechua, where similar laryngeal restrictions are not morphologically sensitive, the laryngeal restrictions should be learned earlier and represented on a nonlocal projection that does not include morpheme boundaries. When the Quechua learner returns to phonotactic learning with morphological information, the learner should not uncover any new placeholder trigrams on stops, since the distribution of stop combinations is already fully accounted for on the nonlocal stop projection without the morpheme boundary symbol. We verified that this is in fact how things work for Quechua. We first trained a baseline model on an unparsed corpus (about 10k words, described in <xref ref-type="bibr" rid="B24">Gouskova and Gallagher to appear</xref>), from which the learner built a nonlocal stop projection. We then trained a model with this projection on the parsed data, and indeed, the learner found constraints on the nonlocal stop projection but did not learn any new placeholder trigrams on the default projection that would motivate adding the morpheme boundary symbol to the stop projection. We leave it to future work to examine in more detail how morphologically insensitive generalizations can be incorporated into a later stage of learning where morphological structure is represented.</p>
</sec>
<sec>
<title>5.4 Morphologically sensitive phonotactics and the subset problem</title>
<p>Patterns such as those of Aymara present two types of subset problem (<xref ref-type="bibr" rid="B3">Baker 1979</xref>; <xref ref-type="bibr" rid="B10">Bowerman 1988</xref>). First, phonotactic learning in general requires the learner to err on the side of assuming more restrictive grammars and to construct its own negative evidence. In order to posit these more restrictive grammars, an Optimality-Theoretic learner with innate constraints requires a bias to keep faithfulness constraints ranked low; the negative evidence comes from the theory&#8217;s Gen component (<xref ref-type="bibr" rid="B27">Hayes 2004</xref>; <xref ref-type="bibr" rid="B37">Prince and Tesar 2004</xref>). If the learner induces constraints from data instead, it constructs its own negative evidence by comparing the attested data to plausible phonotactic distributions generated at random (<xref ref-type="bibr" rid="B28">Hayes and Wilson 2008</xref>). But even such a learner will not notice nonlocal interactions &#8211; it must either be given the nonlocal representations a priori (as in Hayes and Wilson&#8217;s proposal), or it must be nudged in the direction of looking for nonlocal representations. In our proposal, the learner does a second pass of phonotactic learning in response to generalizations that it may have missed on the first pass, signaled by placeholder trigram constraints. This kind of bias is designed to alert the learner to the need for more restrictive constraints.</p>
<p>Second, if the learner assumes that the sequences allowed in words are also allowed in morphemes, then this morphologically agnostic learner will learn the superset grammar. For a language such as English, the superset learner hears words with [md] clusters and assumes, incorrectly, that such clusters are allowed anywhere, not just at morpheme boundaries. Our experiments suggest that Aymara speakers make the more conservative assumption. We suggest that in order to learn the right level of generalization, an Aymara learner has to revisit phonotactic learning once morphological information is available. We implemented this by supplying morpheme boundaries and using sequences at boundaries as a clue that nonlocal interactions are present.</p>
<p>The alternative we did not discuss is to split the learning data into morphemes, and learn phonotactics over these morphemes. For Aymara, a plausible learning data set that would reveal the right regularities would be a corpus of roots. This is the learning data we used in Gouskova and Gallagher (<xref ref-type="bibr" rid="B24">to appear</xref>). The benefit of using roots as learning data is that the learner may use just one simple cue, segmental placeholder trigrams, without attending to morpheme boundaries or reifying them to the representational level of segments with feature values. This would essentially use a <italic>sublexicon</italic> to learn morpheme-level phonotactics that hold over just a subset of the language&#8217;s forms (<xref ref-type="bibr" rid="B25">Gouskova and Becker 2013</xref>; <xref ref-type="bibr" rid="B6">Becker and Gouskova 2016</xref>). The main reason we did not use a sublexicon model here is that it is not clear how to define phonotactics over bound roots; verbal roots in Aymara are obligatorily suffixed. The application of phonotactic learning to words is straightforward to implement, whether they are embedded in connected speech or taken as a lexicon-like list. On the other hand, bound roots and suffixes are not complete, pronounceable phonological objects, and learning over such entities would be one level of abstraction removed from a realistic learning scenario.</p>
<p>An anonymous reviewer suggests an alternative to using morpheme boundaries: marking segments of the root with a [&#177;root] feature. This would effectively allow the learner to capture the root-internal cooccurrence restriction by including [+root] in the constraint. One problem with this move is computational implementation: the addition of this binary feature would mean a larger set of natural classes (and therefore many more constraints) for the learner to analyze. As for capturing the right level of generalization, this would be fitting for a language like Quechua, where affixes do not have laryngeals at all (motivating *[&#8211;root, +cg] and *[&#8211;root, +sg]). But in the case of Aymara, the restrictions do hold both inside roots and inside suffixes, so this would probably not be sufficient &#8211; Aymara requires something more along the lines of McCarthy&#8217;s (<xref ref-type="bibr" rid="B34">1989</xref>) planar separation for morphemes, so they dwell on different planes and escape cooccurrence restrictions.</p>
</sec>
<sec>
<title>5.5 Cue-based learning</title>
<p>Our approach uses properties of the learning data to detect other properties of the language, and as such, it can be considered to be an example of cue-based learning (<xref ref-type="bibr" rid="B19">Dresher and Kaye 1990</xref>; <xref ref-type="bibr" rid="B21">Gibson and Wexler 1994</xref>; <xref ref-type="bibr" rid="B18">Dresher 1999</xref>). A critique of cue-based learning is that it assumes a lot of learning machinery specific to language (<xref ref-type="bibr" rid="B36">Nazarov and Jarosz 2017</xref>). We do believe that the problem of boundary-sensitive phonotactics is a fairly language-specific one, and it is not immediately clear how one would approach it without recognizing morphemes as separate pieces with boundaries.</p>
<p>The logic of inducing nonlocal interactions from trigrams is not strictly phonological: one could argue for a domain-general status of the deduction that if A and B cannot cooccur nearby in a configuration A-X-B, it is worthwhile to check whether A and B can cooccur at longer distances. Our learner uses an independent criterion to check whether A and B can interact at all &#8211; they must be part of a natural class, as defined by the language&#8217;s phonological contrasts and alternation system. This requirement, along with the requirement that constraints in the grammar must receive robust statistical support, minimizes the learner&#8217;s ability to notice nonlocal interactions between unrelated segments, which most linguists would describe as accidental.</p>
</sec>
</sec>
<sec>
<title>6 Conclusion</title>
<p>This paper has examined a set of morphologically sensitive, nonlocal restrictions in Aymara in a corpus study, behavioral experiments, and a computational model. Our corpus results showed that the restricted combinations are not categorical, even tautomorphemically, though there is an asymmetry between tautomorphemic and heteromorphemic combinations. We showed that both the nonlocality and the morphological sensitivity of the restrictions are observable from trigram constraints in a grammar trained on just the linear string of segments. By building projections based on these morpheme boundary trigram constraints, our model captures the range of restrictions reported in the language. The induced grammar succeeds through a combination of general constraints on relatively large classes, like the class of stops or the class of ejectives, and more specific constraints on individual segments.</p>
<p>Our experimental work supports the traditional description of the phonotactics of the language, and our modeling work shows that nonlocal restrictions are learnable inductively, by attuning to properties of the phonotactics of the linear string. By comparing a phonotactic grammar trained on parsed vs. unparsed data, we saw that the patterns are largely obscured by the exceptions found across morpheme boundaries &#8211; only one of the restricted combinations is observable as a baseline trigram in the unparsed corpus, and only at an extremely low gamma. With these settings, the model is not generally distinguishing between phonologically meaningful gaps and accidental gaps and achieves a poor fit to the data. We hope that future work will incorporate the learning of morphological boundaries and phonotactics into a single model.</p>
</sec>
<sec sec-type="supplementary-material">
<title>Additional Files</title>
<p>The additional files for this article can be found as follows:</p>
<list list-type="bullet">
<list-item><p>Materials for the experiments in Section 3 are available at <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://hdl.handle.net/2451/43661">http://hdl.handle.net/2451/43661</ext-link>.</p></list-item>
<list-item><p>Materials for the computational learning simulations in Section 4 are available at <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://github.com/gouskova/inductive_projection_learner">https://github.com/gouskova/inductive_projection_learner</ext-link>.</p></list-item>
</list>
</sec>
</body>
<back>
<fn-group>
<fn id="n1"><p>The difference between &#8220;tier&#8221; and &#8220;projection&#8221; has to do with representational assumptions: the term <italic>tier</italic> has historically meant a level of a structured autosegmental representation, whereas a <italic>projection</italic> is merely a representation that includes all and only the members of some class, e.g., all the vowels in a word on a [+syllabic] projection. We adopt the latter term following Hayes and Wilson (<xref ref-type="bibr" rid="B28">2008</xref>), see also Clements (<xref ref-type="bibr" rid="B12">1976</xref>); Goldsmith (<xref ref-type="bibr" rid="B22">1976</xref>); McCarthy (<xref ref-type="bibr" rid="B33">1979</xref>); Archangeli (<xref ref-type="bibr" rid="B2">1985</xref>); McCarthy (<xref ref-type="bibr" rid="B34">1989</xref>) and many others.</p></fn>
<fn id="n2"><p>Mid vowels here and throughout are allophonic, triggered by the presence of a preceding or following uvular consonant. While these examples show a suffix with an ejective or affricate attaching directly to a root, these suffixes may attach after other suffixes as well (indeed, this is more frequent than attachment to a root in our corpus).</p></fn>
<fn id="n3"><p>The corpus posted on the website is cut off at 50,000 forms. The full version of the corpus was obtained via personal communication with the developers.</p></fn>
<fn id="n4"><p>An anonymous reviewer reports that in Peruvian Aymara, forms with two identical ejectives can variably be produced with just an initial ejective, e.g., [t&#8217;ant&#8217;a] &#126; [t&#8217;anta]. If this is also true in Bolivian Aymara, knowledge of this alternation may influence participants to choose C<sub>2</sub> de-ejectivization as a strategy to repair pairs of non-identical ejectives.</p></fn>
<fn id="n5"><p>There are several roots that do not undergo syncope (recall 2.1) when suffixed with [-t&#8217;a] in our corpus: [hawi], [ana], [qo&#654;a], and [wajka].</p></fn>
<fn id="n6"><p>Morpheme boundary symbols are the simplest implementation for this, but there are of course alternatives. For a discussion of theoretical and learnability issues, see Pyle (<xref ref-type="bibr" rid="B38">1972</xref>); McCarthy (<xref ref-type="bibr" rid="B34">1989</xref>); Beckman (<xref ref-type="bibr" rid="B7">1997</xref>); Adriaans and Kager (<xref ref-type="bibr" rid="B1">2010</xref>); Becker and Allen (<xref ref-type="bibr" rid="B5">submitted</xref>); Kastner and Adriaans (<xref ref-type="bibr" rid="B30">2018</xref>); and others.</p></fn>
<fn id="n7"><p>The Java learner has a technical limitation: it cannot handle strings longer than about 40 characters. The maximum string length is used in generating the &#8220;sample salad&#8221; of phoneme strings that the learner compares to the learning data in figuring out what is missing from the learning data, and it must do so in finite time, so the shorter the words in the learning data, the better. See Daland (<xref ref-type="bibr" rid="B14">2015</xref>) for more.</p></fn>
<fn id="n8"><p>An anonymous reviewer asks whether the morpheme boundary marker was included in the feature set for the Induced Unparsed Model, despite not being present in the learning data. The presence of a segment in the feature file could influence the learning process, since the model uses the feature set to randomly sample the expected distribution of the given segments and compare that to the observed distribution in the data. We ran models under both conditions, and got similar results. Table <xref ref-type="table" rid="T20">20</xref> reports models with the morpheme boundary symbol in the feature set.</p></fn>
<fn id="n9"><p>We thank two anonymous reviewers for pointing out the importance of syncope, leading to the discussion in this section.</p></fn>
</fn-group>
<ack>
<title>Acknowledgements</title>
<p>For feedback on this and related work, we would like to thank audiences at NYU, AMP 6 at UC San Diego, and NELS 49 at Cornell. We would like to thank Ildik&#243; Emese Szab&#243; for assistance with building the corpus, and Colin Wilson for sharing the code for the gain-based version of the MaxEnt Phonotactic Learner.</p>
</ack>
<sec>
<title>Funding Information</title>
<p>This work was funded by NSF BCS-1724753 to the first two authors.</p>
</sec>
<sec>
<title>Competing Interests</title>
<p>The authors have no competing interests to declare.</p>
</sec>
<ref-list>
<ref id="B1"><label>1</label><mixed-citation publication-type="journal"><string-name><surname>Adriaans</surname>, <given-names>Frans</given-names></string-name> &amp; <string-name><given-names>Ren&#233;</given-names> <surname>Kager</surname></string-name>. <year>2010</year>. <article-title>Adding generalization to statistical learning: The induction of phonotactics from continuous speech</article-title>. <source>Journal of Memory and Language</source> <volume>62</volume>. <fpage>311</fpage>&#8211;<lpage>331</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/j.jml.2009.11.007</pub-id></mixed-citation></ref>
<ref id="B2"><label>2</label><mixed-citation publication-type="journal"><string-name><surname>Archangeli</surname>, <given-names>Diana</given-names></string-name>. <year>1985</year>. <article-title>Yokuts harmony: Evidence for coplanar representation in nonlinear phonology</article-title>. <source>Linguistic Inquiry</source> <volume>16</volume>. <fpage>335</fpage>&#8211;<lpage>372</lpage>.</mixed-citation></ref>
<ref id="B3"><label>3</label><mixed-citation publication-type="journal"><string-name><surname>Baker</surname>, <given-names>C. L.</given-names></string-name> <year>1979</year>. <article-title>Syntactic theory and the projection problem</article-title>. <source>Linguistic Inquiry</source> <volume>10</volume>. <fpage>533</fpage>&#8211;<lpage>581</lpage>.</mixed-citation></ref>
<ref id="B4"><label>4</label><mixed-citation publication-type="webpage"><string-name><surname>Bates</surname>, <given-names>Douglas</given-names></string-name>, <string-name><given-names>Martin</given-names> <surname>Maechler</surname></string-name>, <string-name><given-names>Ben</given-names> <surname>Bolker</surname></string-name> &amp; <string-name><given-names>Steven</given-names> <surname>Walker</surname></string-name>. <year>2014</year>. <article-title>lme4: Linear mixed-effects models using S4 classes</article-title>. <uri>http://CRAN.R-project.org/package=lme4</uri>, R package version 1.7.</mixed-citation></ref>
<ref id="B5"><label>5</label><mixed-citation publication-type="webpage"><string-name><surname>Becker</surname>, <given-names>Michael</given-names></string-name> &amp; <string-name><given-names>Blake</given-names> <surname>Allen</surname></string-name>. Submitted. <article-title>Learning alternations from surface forms with sublexical phonology</article-title>. <source>Phonology</source>. <uri>http://ling.auf.net/lingbuzz/002503</uri>.</mixed-citation></ref>
<ref id="B6"><label>6</label><mixed-citation publication-type="journal"><string-name><surname>Becker</surname>, <given-names>Michael</given-names></string-name> &amp; <string-name><given-names>Maria</given-names> <surname>Gouskova</surname></string-name>. <year>2016</year>. <article-title>Source-oriented generalizations as grammar inference in Russian vowel deletion</article-title>. <source>Linguistic Inquiry</source> <volume>47</volume>. <fpage>391</fpage>&#8211;<lpage>425</lpage>. DOI: <pub-id pub-id-type="doi">10.1162/LING_a_00217</pub-id></mixed-citation></ref>
<ref id="B7"><label>7</label><mixed-citation publication-type="journal"><string-name><surname>Beckman</surname>, <given-names>Jill</given-names></string-name>. <year>1997</year>. <article-title>Positional faithfulness, positional neutralization, and Shona vowel harmony</article-title>. <source>Phonology</source> <volume>14</volume>. <fpage>1</fpage>&#8211;<lpage>46</lpage>. DOI: <pub-id pub-id-type="doi">10.1017/S0952675797003308</pub-id></mixed-citation></ref>
<ref id="B8"><label>8</label><mixed-citation publication-type="thesis"><string-name><surname>Bennett</surname>, <given-names>William</given-names></string-name>. <year>2013</year>. <source>Dissimilation, consonant harmony and surface correspondence</source>. <publisher-loc>Rutgers, NJ</publisher-loc>: <publisher-name>Rutgers University</publisher-name> dissertation.</mixed-citation></ref>
<ref id="B9"><label>9</label><mixed-citation publication-type="journal"><string-name><surname>Berent</surname>, <given-names>Iris</given-names></string-name>, <string-name><given-names>Colin</given-names> <surname>Wilson</surname></string-name>, <string-name><given-names>Gary</given-names> <surname>Marcus</surname></string-name> &amp; <string-name><given-names>Doug</given-names> <surname>Bemis</surname></string-name>. <year>2012</year>. <article-title>On the role of variables in phonology: Remarks on Hayes and Wilson (2008)</article-title>. <source>Linguistic Inquiry</source> <volume>43</volume>. <fpage>97</fpage>&#8211;<lpage>119</lpage>. DOI: <pub-id pub-id-type="doi">10.1162/LING_a_00075</pub-id></mixed-citation></ref>
<ref id="B10"><label>10</label><mixed-citation publication-type="book"><string-name><surname>Bowerman</surname>, <given-names>Melissa</given-names></string-name>. <year>1988</year>. <chapter-title>The &#8220;no negative evidence&#8221; problem: How do children avoid constructing an overly general grammar</chapter-title>. In <string-name><given-names>John A.</given-names> <surname>Hawkins</surname></string-name> (ed.), <source>Explaining language universals</source>, <fpage>73</fpage>&#8211;<lpage>101</lpage>. <publisher-loc>Oxford, UK</publisher-loc>: <publisher-name>Basil Blackwell</publisher-name>.</mixed-citation></ref>
<ref id="B11"><label>11</label><mixed-citation publication-type="book"><string-name><surname>Chomsky</surname>, <given-names>Noam</given-names></string-name> &amp; <string-name><given-names>Morris</given-names> <surname>Halle</surname></string-name>. <year>1968</year>. <source>The sound pattern of English</source>. <publisher-loc>New York</publisher-loc>: <publisher-name>Harper &amp; Row</publisher-name>.</mixed-citation></ref>
<ref id="B12"><label>12</label><mixed-citation publication-type="confproc"><string-name><surname>Clements</surname>, <given-names>George N.</given-names></string-name> <year>1976</year>. <article-title>Palatalization: Linking or assimilation?</article-title> In <string-name><given-names>Salikoko S.</given-names> <surname>Mufwene</surname></string-name>, <string-name><given-names>Carol A.</given-names> <surname>Walker</surname></string-name> &amp; <string-name><given-names>Sanford B.</given-names> <surname>Steever</surname></string-name> (eds.), <conf-name>Papers from Chicago Linguistic Society</conf-name> <volume>12</volume>, <fpage>96</fpage>&#8211;<lpage>109</lpage>. <conf-loc>Chicago, IL</conf-loc>: <conf-sponsor>Chicago Linguistic Society</conf-sponsor>.</mixed-citation></ref>
<ref id="B13"><label>13</label><mixed-citation publication-type="confproc"><string-name><surname>Cser</surname>, <given-names>Andr&#225;s</given-names></string-name>. <year>2010</year>. <article-title>The alis/aris allomorphy revisited</article-title>. In <string-name><given-names>Franz</given-names> <surname>Rainer</surname></string-name>, <string-name><given-names>Wolfgang</given-names> <surname>Dressler</surname></string-name>, <string-name><given-names>Dieter</given-names> <surname>Kastovsky</surname></string-name> &amp; <string-name><given-names>Hans</given-names> <surname>Luschuetzky</surname></string-name> (eds.), <conf-name>Variation and change in morphology: Selected papers from the 13th international morphology meeting, Vienna</conf-name>, <fpage>33</fpage>&#8211;<lpage>51</lpage>. <conf-loc>Amsterdam and Philadelphia</conf-loc>: <conf-sponsor>John Benjamins</conf-sponsor>. DOI: <pub-id pub-id-type="doi">10.1075/cilt.310.02cse</pub-id></mixed-citation></ref>
<ref id="B14"><label>14</label><mixed-citation publication-type="journal"><string-name><surname>Daland</surname>, <given-names>Robert</given-names></string-name>. <year>2015</year>. <article-title>Long words in maximum entropy phonotactic grammars</article-title>. <source>Phonology</source> <volume>32</volume>. <fpage>353</fpage>&#8211;<lpage>383</lpage>. DOI: <pub-id pub-id-type="doi">10.1017/S0952675715000251</pub-id></mixed-citation></ref>
<ref id="B15"><label>15</label><mixed-citation publication-type="journal"><string-name><surname>Daland</surname>, <given-names>Robert</given-names></string-name>, <string-name><given-names>Bruce</given-names> <surname>Hayes</surname></string-name>, <string-name><given-names>James</given-names> <surname>White</surname></string-name>, <string-name><given-names>Marc</given-names> <surname>Garellek</surname></string-name>, <string-name><given-names>Andrea</given-names> <surname>Davis</surname></string-name> &amp; <string-name><given-names>Ingrid</given-names> <surname>Norrmann</surname></string-name>. <year>2011</year>. <article-title>Explaining sonority projection effects</article-title>. <source>Phonology</source> <volume>28</volume>. <fpage>197</fpage>&#8211;<lpage>234</lpage>. DOI: <pub-id pub-id-type="doi">10.1017/S0952675711000145</pub-id></mixed-citation></ref>
<ref id="B16"><label>16</label><mixed-citation publication-type="journal"><string-name><surname>Della Pietra</surname>, <given-names>Stephen</given-names></string-name>, <string-name><given-names>Vincent</given-names> <surname>Della Pietra</surname></string-name> &amp; <string-name><given-names>John</given-names> <surname>Lafferty</surname></string-name>. <year>1997</year>. <article-title>Inducing features of random fields</article-title>. <source>IEEE transactions on pattern analysis and machine intelligence</source> <volume>19</volume>. <fpage>380</fpage>&#8211;<lpage>393</lpage>. DOI: <pub-id pub-id-type="doi">10.1109/34.588021</pub-id></mixed-citation></ref>
<ref id="B17"><label>17</label><mixed-citation publication-type="book"><string-name><surname>de Lucca</surname>, <given-names>Manuel</given-names></string-name>. <year>1987</year>. <source>Diccionario pr&#225;ctico Aymara-Espa&#241;ol, Espa&#241;ol Aymara</source>. <publisher-loc>Cochabamba, Bolivia</publisher-loc>: <publisher-name>Los Amigos del Libro</publisher-name>.</mixed-citation></ref>
<ref id="B18"><label>18</label><mixed-citation publication-type="journal"><string-name><surname>Dresher</surname>, <given-names>Elan</given-names></string-name>. <year>1999</year>. <article-title>Charting the learning path: Cues to parameter setting</article-title>. <source>Linguistic Inquiry</source> <volume>30</volume>. <fpage>27</fpage>&#8211;<lpage>67</lpage>. DOI: <pub-id pub-id-type="doi">10.1162/002438999553959</pub-id></mixed-citation></ref>
<ref id="B19"><label>19</label><mixed-citation publication-type="journal"><string-name><surname>Dresher</surname>, <given-names>Elan</given-names></string-name> &amp; <string-name><given-names>Jonathan</given-names> <surname>Kaye</surname></string-name>. <year>1990</year>. <article-title>A computational learning model for metrical phonology</article-title>. <source>Cognition</source> <volume>34</volume>. <fpage>137</fpage>&#8211;<lpage>195</lpage>. DOI: <pub-id pub-id-type="doi">10.1016/0010-0277(90)90042-I</pub-id></mixed-citation></ref>
<ref id="B20"><label>20</label><mixed-citation publication-type="journal"><string-name><surname>Gallagher</surname>, <given-names>Gillian</given-names></string-name>. <year>2016</year>. <article-title>Asymmetries in the representation of categorical phonotactics</article-title>. <source>Language</source> <volume>92</volume>. <fpage>557</fpage>&#8211;<lpage>590</lpage>. DOI: <pub-id pub-id-type="doi">10.1353/lan.2016.0048</pub-id></mixed-citation></ref>
<ref id="B21"><label>21</label><mixed-citation publication-type="journal"><string-name><surname>Gibson</surname>, <given-names>Edward</given-names></string-name> &amp; <string-name><given-names>Kenneth</given-names> <surname>Wexler</surname></string-name>. <year>1994</year>. <article-title>Triggers</article-title>. <source>Linguistic Inquiry</source> <volume>25</volume>. <fpage>407</fpage>&#8211;<lpage>454</lpage>.</mixed-citation></ref>
<ref id="B22"><label>22</label><mixed-citation publication-type="thesis"><string-name><surname>Goldsmith</surname>, <given-names>John</given-names></string-name>. <year>1976</year>. <source>Autosegmental phonology</source>. <publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>Massachusetts Institute of Technology</publisher-name> dissertation.</mixed-citation></ref>
<ref id="B23"><label>23</label><mixed-citation publication-type="confproc"><string-name><surname>Goldwater</surname>, <given-names>Sharon</given-names></string-name> &amp; <string-name><given-names>Mark</given-names> <surname>Johnson</surname></string-name>. <year>2003</year>. <article-title>Learning OT constraint rankings using a maximum entropy model</article-title>. In <string-name><given-names>Jennifer</given-names> <surname>Spenader</surname></string-name>, <string-name><given-names>Anders</given-names> <surname>Eriksson</surname></string-name> &amp; <string-name><given-names>&#214;sten</given-names> <surname>Dahl</surname></string-name> (eds.), <conf-name>Proceedings of the Stockholm workshop on variation within Optimality Theory</conf-name>, <fpage>111</fpage>&#8211;<lpage>120</lpage>. <conf-loc>Stockholm</conf-loc>: <conf-sponsor>Stockholm University</conf-sponsor>.</mixed-citation></ref>
<ref id="B24"><label>24</label><mixed-citation publication-type="journal"><string-name><surname>Gouskova</surname>, <given-names>Maria</given-names></string-name> &amp; <string-name><given-names>Gillian</given-names> <surname>Gallagher</surname></string-name>. To appear. <article-title>Inducing nonlocal constraints from baseline phonotactics</article-title>. <source>Natural Language and Linguistic Theory</source>.</mixed-citation></ref>
<ref id="B25"><label>25</label><mixed-citation publication-type="journal"><string-name><surname>Gouskova</surname>, <given-names>Maria</given-names></string-name> &amp; <string-name><given-names>Michael</given-names> <surname>Becker</surname></string-name>. <year>2013</year>. <article-title>Nonce words show that Russian yer alternations are governed by the grammar</article-title>. <source>Natural Language and Linguistic Theory</source> <volume>31</volume>. <fpage>735</fpage>&#8211;<lpage>765</lpage>. DOI: <pub-id pub-id-type="doi">10.1007/s11049-013-9197-5</pub-id></mixed-citation></ref>
<ref id="B26"><label>26</label><mixed-citation publication-type="book"><string-name><surname>Hardman</surname>, <given-names>Martha James</given-names></string-name>. <year>2001</year>. <source>Aymara</source>. <publisher-loc>Munich</publisher-loc>: <publisher-name>Lincom Europa</publisher-name>.</mixed-citation></ref>
<ref id="B27"><label>27</label><mixed-citation publication-type="book"><string-name><surname>Hayes</surname>, <given-names>Bruce</given-names></string-name>. <year>2004</year>. <chapter-title>Phonological acquisition in Optimality Theory: The early stages</chapter-title>. In <string-name><given-names>Ren&#233;</given-names> <surname>Kager</surname></string-name>, <string-name><given-names>Joe</given-names> <surname>Pater</surname></string-name> &amp; <string-name><given-names>Wim</given-names> <surname>Zonneveld</surname></string-name> (eds.), <source>Fixing priorities: Constraints in phonological acquisition</source>, <fpage>158</fpage>&#8211;<lpage>203</lpage>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</mixed-citation></ref>
<ref id="B28"><label>28</label><mixed-citation publication-type="journal"><string-name><surname>Hayes</surname>, <given-names>Bruce</given-names></string-name> &amp; <string-name><given-names>Colin</given-names> <surname>Wilson</surname></string-name>. <year>2008</year>. <article-title>A maximum entropy model of phonotactics and phonotactic learning</article-title>. <source>Linguistic Inquiry</source> <volume>39</volume>. <fpage>379</fpage>&#8211;<lpage>440</lpage>. DOI: <pub-id pub-id-type="doi">10.1162/ling.2008.39.3.379</pub-id></mixed-citation></ref>
<ref id="B29"><label>29</label><mixed-citation publication-type="journal"><string-name><surname>Hayes</surname>, <given-names>Bruce</given-names></string-name> &amp; <string-name><given-names>James</given-names> <surname>White</surname></string-name>. <year>2013</year>. <article-title>Phonological naturalness and phonotactic learning</article-title>. <source>Linguistic Inquiry</source> <volume>44</volume>. <fpage>45</fpage>&#8211;<lpage>75</lpage>. DOI: <pub-id pub-id-type="doi">10.1162/LING_a_00119</pub-id></mixed-citation></ref>
<ref id="B30"><label>30</label><mixed-citation publication-type="journal"><string-name><surname>Kastner</surname>, <given-names>Itamar</given-names></string-name> &amp; <string-name><given-names>Frans</given-names> <surname>Adriaans</surname></string-name>. <year>2018</year>. <article-title>Linguistic constraints on statistical word segmentation: The role of consonants in Arabic and English</article-title>. <source>Cognitive Science</source> <volume>42</volume>. <fpage>494</fpage>&#8211;<lpage>518</lpage>. DOI: <pub-id pub-id-type="doi">10.1111/cogs.12521</pub-id></mixed-citation></ref>
<ref id="B31"><label>31</label><mixed-citation publication-type="thesis"><string-name><surname>MacEachern</surname>, <given-names>Margaret</given-names></string-name>. <year>1997</year>. <source>Laryngeal cooccurrence restrictions</source>. <publisher-loc>Los Angeles, CA</publisher-loc>: <publisher-name>University of California, Los Angeles</publisher-name> dissertation.</mixed-citation></ref>
<ref id="B32"><label>32</label><mixed-citation publication-type="thesis"><string-name><surname>Martin</surname>, <given-names>Andrew</given-names></string-name>. <year>2007</year>. <source>The evolving lexicon</source>. <publisher-loc>Los Angeles, CA</publisher-loc>: <publisher-name>University of California, Los Angeles</publisher-name> dissertation.</mixed-citation></ref>
<ref id="B33"><label>33</label><mixed-citation publication-type="thesis"><string-name><surname>McCarthy</surname>, <given-names>John</given-names></string-name>. <year>1979</year>. <source>Formal problems in Semitic phonology and morphology</source>. <publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>Massachusetts Institute of Technology</publisher-name> dissertation.</mixed-citation></ref>
<ref id="B34"><label>34</label><mixed-citation publication-type="journal"><string-name><surname>McCarthy</surname>, <given-names>John</given-names></string-name>. <year>1989</year>. <article-title>Linear order in phonological representation</article-title>. <source>Linguistic Inquiry</source> <volume>20</volume>. <fpage>71</fpage>&#8211;<lpage>99</lpage>.</mixed-citation></ref>
<ref id="B35"><label>35</label><mixed-citation publication-type="journal"><string-name><surname>McQueen</surname>, <given-names>James</given-names></string-name>. <year>1998</year>. <article-title>Segmentation of continuous speech using phonotactics</article-title>. <source>Journal of Memory and Language</source> <volume>39</volume>. <fpage>21</fpage>&#8211;<lpage>46</lpage>. DOI: <pub-id pub-id-type="doi">10.1006/jmla.1998.2568</pub-id></mixed-citation></ref>
<ref id="B36"><label>36</label><mixed-citation publication-type="book"><string-name><surname>Nazarov</surname>, <given-names>Alexei</given-names></string-name> &amp; <string-name><given-names>Gaja</given-names> <surname>Jarosz</surname></string-name>. <year>2017</year>. <chapter-title>Learning parametric stress without domain-specific mechanisms</chapter-title>. In <string-name><given-names>Karen</given-names> <surname>Jesney</surname></string-name>, <string-name><given-names>Charlie</given-names> <surname>O&#8217;Hara</surname></string-name>, <string-name><given-names>Caitlin</given-names> <surname>Smith</surname></string-name> &amp; <string-name><given-names>Rachel</given-names> <surname>Walker</surname></string-name> (eds.), <source>Proceedings of the Annual Meeting on Phonology 2016</source>. <publisher-name>Linguistic Society of America</publisher-name>. DOI: <pub-id pub-id-type="doi">10.3765/amp.v4i0.4010</pub-id></mixed-citation></ref>
<ref id="B37"><label>37</label><mixed-citation publication-type="book"><string-name><surname>Prince</surname>, <given-names>Alan</given-names></string-name> &amp; <string-name><given-names>Bruce</given-names> <surname>Tesar</surname></string-name>. <year>2004</year>. <chapter-title>Learning phonotactic distributions</chapter-title>. In <string-name><given-names>Ren&#233;</given-names> <surname>Kager</surname></string-name>, <string-name><given-names>Joe</given-names> <surname>Pater</surname></string-name> &amp; <string-name><given-names>Wim</given-names> <surname>Zonneveld</surname></string-name> (eds.), <source>Fixing priorities: Constraints in phonological acquisition</source>, <fpage>245</fpage>&#8211;<lpage>291</lpage>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</mixed-citation></ref>
<ref id="B38"><label>38</label><mixed-citation publication-type="confproc"><string-name><surname>Pyle</surname>, <given-names>Charles</given-names></string-name>. <year>1972</year>. <article-title>On eliminating BMs</article-title>. In <string-name><given-names>Paul</given-names> <surname>Peranteau</surname></string-name>, <string-name><given-names>Judith</given-names> <surname>Levi</surname></string-name> &amp; <string-name><given-names>Gloria</given-names> <surname>Phares</surname></string-name> (eds.), <conf-name>Papers from Chicago Linguistic Society</conf-name> <volume>8</volume>, <fpage>516</fpage>&#8211;<lpage>532</lpage>. <conf-loc>Chicago, IL</conf-loc>: <conf-sponsor>Chicago Linguistic Society</conf-sponsor>.</mixed-citation></ref>
<ref id="B39"><label>39</label><mixed-citation publication-type="webpage"><collab>R Development Core Team</collab>. <year>2018</year>. <chapter-title>R: A language and environment for statistical computing</chapter-title>. <publisher-loc>Vienna, Austria</publisher-loc>. <uri>http://www.R-project.org</uri>.</mixed-citation></ref>
<ref id="B40"><label>40</label><mixed-citation publication-type="confproc"><string-name><surname>Steriade</surname>, <given-names>Donca</given-names></string-name>. <year>1987</year>. <article-title>Redundant values</article-title>. In <string-name><given-names>Anna</given-names> <surname>Bosch</surname></string-name>, <string-name><given-names>Barbara</given-names> <surname>Need</surname></string-name> &amp; <string-name><given-names>Eric</given-names> <surname>Schiller</surname></string-name> (eds.), <conf-name>Papers from Chicago Linguistic Society</conf-name> <volume>23</volume>, <fpage>339</fpage>&#8211;<lpage>362</lpage>. <conf-loc>Chicago, IL</conf-loc>: <conf-sponsor>Chicago Linguistic Society</conf-sponsor>.</mixed-citation></ref>
<ref id="B41"><label>41</label><mixed-citation publication-type="journal"><string-name><surname>Suomi</surname>, <given-names>Kari</given-names></string-name>, <string-name><given-names>James</given-names> <surname>McQueen</surname></string-name> &amp; <string-name><given-names>Anne</given-names> <surname>Cutler</surname></string-name>. <year>1997</year>. <article-title>Vowel harmony and speech segmentation in Finnish</article-title>. <source>Journal of Memory and Language</source> <volume>36</volume>. <fpage>422</fpage>&#8211;<lpage>444</lpage>. DOI: <pub-id pub-id-type="doi">10.1006/jmla.1996.2495</pub-id></mixed-citation></ref>
<ref id="B42"><label>42</label><mixed-citation publication-type="thesis"><string-name><surname>Suzuki</surname>, <given-names>Keiichiro</given-names></string-name>. <year>1998</year>. <source>A typological investigation of dissimilation</source>. <publisher-loc>Tucson, AZ</publisher-loc>: <publisher-name>University of Arizona, Tucson</publisher-name> dissertation.</mixed-citation></ref>
<ref id="B43"><label>43</label><mixed-citation publication-type="book"><string-name><surname>Trubetzkoy</surname>, <given-names>Nikolai</given-names></string-name>. <year>1939</year>. <source>Grundz&#252;ge der Phonologie</source> [Foundations of phonology]. <publisher-loc>Prague</publisher-loc>: <publisher-name>Travaux du cercle linguistique de Prague</publisher-name>.</mixed-citation></ref>
<ref id="B44"><label>44</label><mixed-citation publication-type="journal"><string-name><surname>Wilson</surname>, <given-names>Colin</given-names></string-name> &amp; <string-name><given-names>Gillian</given-names> <surname>Gallagher</surname></string-name>. <year>2018</year>. <article-title>Accidental gaps and surface-based phonotactic learning: A case study of South Bolivian Quechua</article-title>. <source>Linguistic Inquiry</source> <volume>49</volume>. <fpage>610</fpage>&#8211;<lpage>623</lpage>. DOI: <pub-id pub-id-type="doi">10.1162/ling_a_00285</pub-id></mixed-citation></ref>
</ref-list>
</back>
</article>