Overabundance and inflectional classification: Quantitative evidence from Czech

Overabundance is the situation where two or more distinct word forms fill the same cell in an inflectional paradigm (Thornton 2011). While this topic has received renewed attention in recent years, there are still several open questions regarding its properties and status. In this paper we present a new take on the matter. On the basis of a case study of the locative singular and instrumental plural of Czech nouns, we argue that there are at least two kinds of overabundance phenomena which should be distinguished, depending on whether overabundant behavior integrates in the inflection system or is orthogonal to it. The evidence for the distinction comes from a quantitative study of the way phonological, morphosyntactic, semantic, and sociolinguistic factors contribute to partially predicting whether a lexeme is overabundant and which form is used in different contexts.


Introduction
Overabundance is the situation where two or more distinct wordforms fill the same cell in an inflectional paradigm (Thornton 2011). In (1) we see examples for the imperfective subjunctive in Spanish, which can be realized by either -se or -ra markers (DeMello 1993). While the phenomenon is well known and documented in many if not all languages with inflectional morphology, overabundance was mostly ignored by theoretical morphologists until the pioneering work of Thornton; it is telling that prominent theoretical works such as Anderson (1992) and Stump (2001) define architectures for inflectional morphology that presuppose overabundance not to exist, without any explicit discussion (Bonami & Boyé 2010). Although Thornton's efforts in the last decade (Thornton 2011; 2019a; b) succeeded in putting the problem on the agenda, leading to a number of theoretical discussions (Stump 2016; Bonami & Crysmann 2018; Guzmán Naranjo 2019; Beniamine 2021) and renewed interest in detailed empirical studies (see among many others Bošnjak Botica & Hržica 2016; Cappellaro 2013; Lečić 2015; Rosemeyer & Schwenter 2019; Santilli 2014; Thornton 2012), some more general questions still remain unanswered. The clarified empirical landscape allowed Thornton (2019b) to start laying out a typology of overabundance. She identifies four main dimensions of variation in how overabundance manifests itself, which we may describe as follows.

(2) a. Lexical prevalence: an overabundance phenomenon may affect a set of lexemes of any size, from a single lexeme to all members of the same part of speech.
b. Paradigmatic prevalence: an overabundance phenomenon may affect a set of paradigm cells of any size, from a single cell to the whole paradigm.
c. Balance: the statistical distribution of rival forms may vary anywhere from a balanced distribution to a situation where the use of one of the two forms is barely attested.
d. Conditions: the use of rival forms may be subject to various kinds of conditions:
(i) Usage conditions: geographical, sociolinguistic, and/or stylistic factors affect the preference for one or the other form.
(ii) Grammatical conditions: the semantic, syntactic, morphological and/or phonological environment affects the preference for one or the other form.

1
Thornton's typology is stated in terms of canonical criteria (Corbett 2007; Brown, Chumakina & Corbett 2013), and focuses on endpoints of the dimensions rather than describing the dimensions directly. We took the liberty of rephrasing Thornton's distinctions in terms that highlight the gradual nature of the scales rather than the endpoints.

2
This discussion is affected by what exactly one calls a single 'overabundance phenomenon'. A strict definition classifies two instances of overabundance as the same phenomenon only if they exhibit the same form alternation, modulo regular morphophonology. Under this definition, Czech LOC.SG pairs listu~listě 'page' and bazénu~bazéně 'swimming pool' are instances of the same phenomenon, but the pair hostu~hostovi 'host' represents a distinct phenomenon. Using this strict definition, paradigmatic prevalence will generally be low, because it is rare for the same alternations to occur in multiple cells. Thornton, however, uses a more permissive definition when discussing paradigmatic prevalence, and just counts how many cells in the paradigm of the same lexeme are overabundant, whether the alternation is the same or not. This is clearly an area where the typology would benefit from being refined. In this paper we will alternate between these two definitions depending on context, hoping that it will make the text more readable without introducing much confusion.

Guzmán Naranjo and Bonami Glossa: a journal of general linguistics DOI: 10.5334/gjgl.1626

One aspect of the typology of overabundance that Thornton does not discuss in detail is its interaction with the system of inflectional classification. Inflectional systems of any complexity exhibit differential inflectional behavior, where lexemes of the same part of speech use different marking strategies to contrast the forms filling cells of their inflectional paradigm. Systems of inflection classes are the tool of choice to explicate such variability, and recent research has highlighted how such systems are organized (Corbett & Fraser 1993; Dressler & Thornton 1996; Brown & Hippisley 2012; Beniamine, Bonami & Sagot 2017; Beniamine 2021) and how they tend to be partially but not fully motivated by other lexical properties (Aronoff 1994; Baayen & Moscoso del Prado Martín 2005; Guzmán Naranjo 2019). Overabundance may interact with inflectional classification in a variety of ways. In the extreme case of systematic overabundance in Spanish imperfective subjunctives illustrated in (1), there is no interaction to speak of, since all lexemes are overabundant and overabundance manifests itself through the use of the exact same exponents across the lexicon. However, this is not the only possibility. Even where overabundance is systematic, it may rely on different marking strategies depending on the inflection class. Where overabundance is found with a restricted set of lexemes, it interacts by definition with inflectional classification (it leads to differential inflectional behavior), but there are different conceivable ways in which it may do so.
In particular, we may ask whether overabundant classes have the usual properties of inflection classes in terms of partial motivation.
In this paper we present a case study of two situations of overabundance in Czech nominal declension: occasional overabundance in the locative singular, and systematic overabundance in the instrumental plural. We deploy various quantitative techniques applied to lexical and corpus data to show how overabundance is embedded in the inflection class system in the first case, but orthogonal to that system in the second.
The structure of the paper is as follows. In Section 2 we present background information on the Czech declension system and how it is affected by overabundance. Section 3 presents a first study arguing for a qualitative difference between the two cases of overabundance: building on previous work on inflectional classification, we show that overabundant lexemes exhibit a specific pattern of partial motivation in the locative singular, suggesting that overabundant lexemes constitute a mixed class sharing properties with two classes of non-overabundant lexemes. By contrast, no such effect can be found in the instrumental plural. Section 4 presents a complementary study of the relationship between overabundance and case government in the locative singular. We document the fact that governing prepositions have preferences as to which variant of an overabundant lexeme they combine with, although no such effect can be found with non-overabundant lexemes. This indicates that, despite their mixed status in terms of motivation, overabundant lexemes form a class whose properties are not reducible to those of its non-overabundant neighbors. Hence they constitute a robust member of the inflection class system. Section 5 concludes the paper.

The nominal declension system
For the purposes of this paper, we will follow the description of Czech inflection in Cvrček et al. (2010), a careful revision of traditional descriptions based on extensive corpus evidence. This grammar uses evidence from corpora of edited text vs. spoken corpora to document in parallel the two language standards otherwise known as 'Literary Czech', mostly used in formal writing, and 'Common Czech', mostly used in speech or informal contexts. These differences are only marginal in nominal declension, except in the case of the instrumental plural, as discussed below. The Czech nominal system distinguishes four grammatical genders (masculine inanimate, masculine animate, feminine and neuter), seven cases (nominative, accusative, genitive, dative, vocative, locative, and instrumental) and two numbers (singular and plural). Nouns are divided up into declension classes which characterize distinct inflectional behaviors. Table 1 illustrates the 12 most prominent classes of Czech nouns according to Cvrček et al. (2010). The […]

3
Sometimes also called 'Standard Czech', e.g. by Bermel (2000). See this monograph for a useful history of the codification of the distinction, and discussion of its complex relationship to actual sociolinguistic variation.

4
Masculine animate and masculine inanimate are separate genders since they trigger different agreement patterns; cf. vidím star-ého muže 'I see an old man' vs. vidím star-ý kříž 'I see an old cross'. Whether they should be considered subgenders of a superordinate masculine gender, in the sense of Corbett (1991), is a separate issue. We note that the evidence for this is weaker than in other Slavonic languages, with multiple case-number combinations in agreement targets distinguishing masculine animate from masculine inanimate (acc.sg for all adjectives, nom.pl and voc.pl for hard adjectives, pl for past verbs), and, in those cells, systematic syncretism between masculine inanimate and feminine and/or neuter. Hence, while masculine inanimate agreement is more similar to masculine animate agreement than to feminine or neuter agreement, it is not entirely dissimilar to those, a fact that is not captured by the notion of a subgender.

5
Deciding on an exact number of inflection classes depends on the details of criteria for inflection class membership, a notoriously thorny issue; see Beniamine, Bonami & Sagot (2017) for recent discussion. In particular Czech has a number of alternation phenomena straddling the morphology-phonology interface (epenthetic e insertion, ů~o alternations, different varieties of palatalization) whose treatment within or outside the inflection system affects the number of postulated classes.

[…] sense: /t/ is clearly a hard consonant, and i-stem neuters do not end in a consonant: we will follow their lead and take these two inflection classes to be outside the hard/soft opposition. In addition, the traditional bipartition does not exhaust inflection class distinctions; see e.g. the distinct behavior of host 'host' and táta 'dad' in the masculine animate, or růže 'rose' and krádež 'theft' in the feminine. Overall, inflection class fully determines gender (no two nouns of different genders inflect in exactly the same fashion), and correlates strongly with stem-final consonant identity, but inflection class assignment is not fully predictable from gender, morphophonology, or a combination of the two. As a result of these and other observations, the Czech inflection class system is not readily describable in terms of an inheritance tree, and is best viewed as a multiple inheritance hierarchy; see Beniamine & Bonami (submitted) for elaboration of this point. Table 1 already illustrates the pervasive presence of overabundance in Czech declension, with 6 out of 14 paradigm cells having multiple forms for at least some nouns in this very small sample. It also illustrates the important fact that overabundant cells typically exploit case-number suffixes also found with non-overabundant lexemes.
For instance, the dative singular of host has two forms, combining the inflection strategies independently found with táta on the one hand (-ovi) and most on the other hand (-u). Finally, it is worth noting that while some cases of overabundance are inflection class dependent, overabundance is systematic in the instrumental plural: all nouns exhibit two distinct marking strategies, one of them involving the vowel /i/ (written as <y> or <i>) potentially preceded by some material, the other the sequence -ma, also potentially preceded by some material.

Overabundance
To get a better grasp of the importance of the phenomenon, we quantified the overall lexical prevalence of overabundance using attestations in corpus data. We used version 4 of the SYN corpus (Křen et al. 2016), a tagged and lemmatized 4.3 billion token corpus of edited text published between 1989 and 2014; see Hnátková et al. (2014) for a detailed description. Note that, this being a corpus of edited text, more informal Common Czech forms are underrepresented in the corpus, although by no means absent, as we will discuss in Section 2.4.

6
The SYN corpus is the concatenation of a number of smaller corpora that are either representative of the overall production of Czech publishers in a given time period (SYN2000, SYN2005, SYN2010, SYN2015) or exclusively journalistic (SYN2006PUB, SYN2009PUB, SYN2013PUB). Unlike e.g. Bermel & Knittl (2012a) we chose to use the larger, less balanced corpus in the interest of a larger coverage, which is important if we want to be able to assess small proportions of use of alternate forms for a large number of lexemes.

We proceeded as follows. First, for each paradigm cell, we collected all wordforms attested in the corpus and tagged as belonging in that cell, and we noted which lemma they correspond to and the token frequency of that wordform filling that cell of that lemma. Table 2 reports as 'attested' the number of lexemes that are attested at least twice in the relevant cell. Second, we used simple pattern matching to identify the case-number suffix in each word (if any), relying on the description of exponence provided by Cvrček et al. (2010). A lexeme is counted as 'overabundant' in a cell if at least two wordforms are found in that cell ending in different suffixes (including the zero suffix). The proportion of overabundant lexemes among attested lexemes arguably provides a lower bound on the actual lexical prevalence of overabundance. We thus find evidence for overabundance in all paradigm cells, with a proportion of overabundant lexemes varying between 0.25% in the ins.sg and 12.17% in the ins.pl. It is also striking that overabundant lexemes number in the thousands for 8 out of 14 paradigm cells, with the lowest numbers coinciding with paradigm cells that are also the least frequently attested in the corpus (e.g. vocative is barely found in a corpus that contains little dialogue). We conclude that overabundance is amply attested in Czech nominal declension, which dispels any suspicion that we are dealing with a minor phenomenon.
It might be that Czech is unusual in that respect, and that the high prevalence of overabundance is linked to the language's particular diglossic history (see Bermel 2000 for a useful discussion). However, since to our knowledge the prevalence of overabundance has never been evaluated on a large scale for any other language at this date, there is currently no evidence to support such a claim.
With these very general observations in hand, we turn to two more specific case studies.
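
The counting procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script used for the study: the suffix list `LOC_SG_SUFFIXES`, the function names, and the token frequencies in `records` are hypothetical, and a real suffix inventory would have to be derived from the description in Cvrček et al. (2010). The wordforms themselves are taken from examples cited in this paper.

```python
from collections import defaultdict

# Hypothetical, deliberately tiny suffix inventory for the loc.sg cell,
# listed longest first so that "-ovi" is tried before "-u".
LOC_SG_SUFFIXES = ["ovi", "u", "ě"]

def find_suffix(wordform, suffixes):
    """Return the first listed suffix that ends the wordform, else '' (zero suffix)."""
    for suffix in suffixes:
        if wordform.endswith(suffix):
            return suffix
    return ""

def overabundance_rate(records, suffixes, min_tokens=2):
    """records: iterable of (lemma, wordform, token_frequency) for one paradigm cell.

    Returns (number of attested lexemes, number of overabundant lexemes, proportion).
    """
    tokens = defaultdict(int)   # lemma -> total tokens in the cell
    seen = defaultdict(set)     # lemma -> set of suffixes observed in the cell
    for lemma, wordform, freq in records:
        tokens[lemma] += freq
        seen[lemma].add(find_suffix(wordform, suffixes))
    # A lexeme counts as attested only if it has at least min_tokens tokens.
    attested = [lemma for lemma, n in tokens.items() if n >= min_tokens]
    # It counts as overabundant if at least two distinct suffixes were observed.
    overabundant = [lemma for lemma in attested if len(seen[lemma]) >= 2]
    proportion = len(overabundant) / len(attested) if attested else 0.0
    return len(attested), len(overabundant), proportion

# Toy input: the frequencies are invented for illustration only.
records = [
    ("úřad", "úřadu", 7), ("úřad", "úřadě", 3),
    ("dub", "dubu", 5),
    ("dům", "domě", 9),
    ("host", "hostu", 1), ("host", "hostovi", 1),
    ("list", "listu", 1),   # attested once only, hence ignored
]
print(overabundance_rate(records, LOC_SG_SUFFIXES))
```

Because a lexeme seen with a single suffix may still be overabundant in actual usage, the proportion returned by such a count is a lower bound, as noted above.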

Hard masculine inanimate nouns in the loc.sg
The locative singular is home to a number of overabundance phenomena. We focus our presentation on the situation of hard masculine inanimate nouns, although similar points could be made about other parts of the system, and we will present some relevant analysis in section 3. Hard masculine inanimate nouns may use two different endings in the loc.sg: -u or -ě. Some nouns are attested with both, and are hence overabundant.

(3) a. dub 'oak tree', loc.sg dubu
b. dům 'house', loc.sg domě
c. úřad 'office', loc.sg úřadu~úřadě

In the SYN corpus we find that, among masculine inanimate nouns attested at least twice, 15959 nouns only appear with -u, 1056 only appear with -ě, and 2041 are found with both endings. Hence about 10% of the relevant nouns are indisputably overabundant. Sampling accidents may have led to finding attestations of only one of the two forms for other lexemes that are in fact overabundant: hence this 10% proportion should be taken as a lower bound on the true proportion of overabundant lexemes.

7
Our calculations are conservative in at least two ways. First, we only considered (lexeme,cell) pairs with at least two attestations, because if a pair is attested only once, there is no possibility of it having been seen in two distinct forms. But still, the relative frequency of alternants in cases of overabundance is very variable (see Bermel & Knittl 2012a and below). For lexemes for which we have enough attestations to document this, the typical situation is that one of the alternants is much more frequent than the other. As a result, lexemes with a smaller number of attestations that are actually overabundant are likely to be found with only one form in the corpus. Given Zipf's law, this situation is expected to be very common. Second, we purposefully refrained from counting as overabundant all (lexeme,cell) pairs found with two distinct wordforms, and used the more restrictive condition of having distinct suffixes. If we had done the former, we would have counted as cases of overabundance many instances of minute orthographic variation (e.g. the ins.sg of analýza 'analysis' spelled either analýzou or analyzou) that are unlikely to be morphologically relevant, and many of which are just spelling errors. Our reliance on the latter strategy avoids that pitfall, but may also lead to excluding some true cases of overabundance involving stem allomorphy.

8
More precisely, one of the options for the exponence of loc.sg is a morphophonological process that (i) palatalizes the stem-final consonant if that consonant enters palatalization alternations; and (ii) suffixes /e/. Since most consonants end up being palatalized, and Czech orthography mostly notes /e/ preceded by a palatalized consonant as <ě>, a -ě ending is the most frequent orthographic reflex of the relevant morphophonological process; the ending may also be -e, e.g. after a non-palatalizable consonant (e.g. kostel 'church', loc.sg kostele), or where orthography notes palatalization on the consonant rather than the vowel (e.g. jazyk 'tongue, language', loc.sg jazyce). All these cases are taken into account below, and for simplicity will be labelled as instances of the -ě ending.

9
A reviewer points out the interesting connection between the Czech situation and the phenomenon of second locatives in Russian (Brown 2007; Corbett 2012). While most Russian nouns have a single locative (also known as prepositional) form, some class 1 nouns have a form in -ú in addition to their ordinary locative in -e, with specialization as to which preposition selects which of the two locative cases. Although the two phenomena may have a common origin and have a strong family resemblance, it is worth pointing out the crucial differences: in Czech, both -ě and -u are the single available exponent for some nouns of the relevant subclass of hard masculine inanimates; and, where both forms are available with a single noun, there is no complementary distribution in terms of prepositional government, although there are interesting tendencies that we will discuss in Section 4.

To get a better grasp of this situation of overabundance, we examine how the proportion of use of -u vs. -ě varies across lexemes. Figure 1 shows the distribution of these proportions for lexemes attested at least 100 times in the corpus in the loc.sg, and at least once with each of the two exponents.
The distribution is strikingly U-shaped: the vast majority of overabundant lexemes exhibit a strong preference for either -u or -ě. In fact, about half of the relevant lexemes are found 95% of the time or more with one of the two endings, and only 108 (about 5%) have no strong preference, with a proportion of -u between 40% and 60%.
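
For concreteness, the by-lexeme proportions and the two summary counts just mentioned can be computed along the following lines. This is a minimal sketch, assuming a hypothetical input format (a dictionary from lexeme to its token counts with -u and with -ě); the function names and the toy counts are invented for illustration and are not corpus data.

```python
def u_proportions(counts, min_tokens=100):
    """counts: dict mapping lexeme -> (tokens with -u, tokens with -ě).

    Keeps lexemes with at least min_tokens loc.sg tokens and at least one
    token of each ending, and returns their proportion of use of -u.
    """
    return {
        lexeme: n_u / (n_u + n_e)
        for lexeme, (n_u, n_e) in counts.items()
        if n_u > 0 and n_e > 0 and n_u + n_e >= min_tokens
    }

def summarize(proportions):
    """Count lexemes with a strong preference (95% or more for one ending)
    and lexemes with a balanced distribution (40% to 60% of -u)."""
    strong = sum(1 for p in proportions.values() if p <= 0.05 or p >= 0.95)
    balanced = sum(1 for p in proportions.values() if 0.40 <= p <= 0.60)
    return {"lexemes": len(proportions), "strong": strong, "balanced": balanced}

# Invented toy counts with placeholder lexeme names.
toy_counts = {
    "lexeme_a": (990, 10),   # 99% -u: strong preference
    "lexeme_b": (3, 297),    # 1% -u: strong preference
    "lexeme_c": (50, 50),    # 50% -u: balanced
    "lexeme_d": (700, 300),  # 70% -u: neither
    "lexeme_e": (200, 0),    # not overabundant in the sample: excluded
    "lexeme_f": (30, 20),    # fewer than 100 tokens: excluded
}
print(summarize(u_proportions(toy_counts)))
```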
What could be the source of this distribution? Two alternative hypotheses need to be considered. First, the corpus distribution could reflect true lexical variability: each lexeme has its own probabilistic level of preference for -u or -ě, with most lexemes exhibiting a strong preference for one or the other, whatever the cause of that preference. Alternatively, it could be that the observed distribution is a consequence of noisy data. Although we will conclude that the former is true, it is necessary to take the time to examine the latter hypothesis.
Suppose that each relevant noun truly has a single loc.sg form, but production errors introduce a bit of random noise: sometimes a speaker will incorrectly inflect a noun with the wrong suffix. For concreteness let us assume that such errors happen 1% of the time. Under that scenario, the observed proportion of use of -u for each lexeme corresponds to a different sample of one of two underlying processes. For each of the two processes, most samples will exhibit a proportion of use of -u close to the true proportion (by hypothesis, either 1% or 99%), but a few will by chance end up containing a disproportionate share of the 'wrong' form.
Such a story is appealing, as it explains away apparent overabundance. However, it makes a clear prediction that happens to be falsified. If the hypothesis were true, then the likelihood of a lexeme being seen with a balanced distribution of forms should decrease with the frequency of the lexeme. In other words, lexemes with balanced proportions of -u and -ě should have a markedly lower frequency than lexemes at the extremes of the distribution. As Figure 2 shows, this is not the case: the median frequency of lexemes with a more balanced distribution is not noticeably lower than that of lexemes with an imbalanced distribution.
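
The prediction of the noise hypothesis can be made explicit with a small binomial calculation. The sketch below assumes, as in the scenario above, an error rate of 1%, and takes 'balanced' to mean a share of the 'wrong' form between 40% and 60%; the function name and parameters are ours and purely illustrative. It computes the exact probability that a lexeme with a given number of tokens shows a balanced distribution by chance alone.

```python
from math import exp, lgamma, log

def prob_balanced(n_tokens, error_rate=0.01, low_pct=40, high_pct=60):
    """Probability that the share of 'wrong' forms among n_tokens independent
    tokens lies between low_pct% and high_pct%, if each token is an error
    with probability error_rate (binomial model)."""
    total = 0.0
    for k in range(n_tokens + 1):
        # Integer comparison avoids floating-point issues at the boundaries.
        if low_pct * n_tokens <= 100 * k <= high_pct * n_tokens:
            # Binomial probability computed in log space to avoid overflow.
            log_p = (lgamma(n_tokens + 1) - lgamma(k + 1) - lgamma(n_tokens - k + 1)
                     + k * log(error_rate) + (n_tokens - k) * log(1 - error_rate))
            total += exp(log_p)
    return total

for n in (10, 100, 1000):
    print(n, prob_balanced(n))
```

Under these assumptions the probability is already tiny for ten tokens and shrinks rapidly as the number of tokens grows, which is why the noise hypothesis predicts that balanced lexemes should be concentrated among low-frequency items.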
In addition to this corpus evidence, Bermel & Knittl (2012a) provide experimental evidence for the same conclusion. In their experiment, speakers were asked to rate the acceptability within a series of syntactic contexts of the two locative singular forms, for nouns with different proportions of -u in a corpus. They found that the acceptability of an ending correlates positively with its proportion of use. In particular, lexemes with a balanced use of -u and -ě do not exhibit a marked preference in acceptability for one or the other ending.

Figure 1
Histogram of the by-lexeme proportion of use of -u in the locative singular for overabundant hard masculine inanimate nouns with a token frequency of 100 or more, in the SYN corpus. Proportions of exactly 0 and exactly 1 are excluded; the first bar (resp. the last bar) hence shows the number of lexemes with strictly more than 0% and at most 1% (resp. at least 99% and strictly less than 100%).

Both macroscopic corpus evidence and microscopic experimental evidence thus lead us to conclude that the U-shaped distribution documented in Figure 1 reflects true lexical preferences: most overabundant lexemes have a marked preference for one or the other suffix, but some have a more balanced distribution. Note that, by saying that lexemes have lexical preferences, we do not exclude the possibility that these follow at least in part from general tendencies. The literature on Czech is replete with observations on phonological, morphological, syntactic, semantic, and sociolinguistic factors purported to have an influence; see Cummins (1995) for a review and Bermel & Knittl (2012a; b) as well as Bermel, Knittl & Russell (2015) and Bermel, Knittl & Russell (2018) for empirical evidence. In fact, the remainder of this paper will further document such conditioning. The important conclusion for the time being is that overabundance cannot be explained away as a consequence of such factors: there is a robust class of lexemes exhibiting variable inflectional behavior in the locative singular.

The instrumental plural
In Czech, all nouns may occur in two forms in the instrumental plural. Examples of both forms can be seen in (4):

(4) a. muž 'man': muži~mužema
b. město 'town': městy~městama

As Table 1 indicates, actual endings vary quite a bit, but can always be distinguished on the basis of whether the ending contains the sequence -ma (full ending -ama, -ema, or -ma) or not (-y, -i, -ami, -emi or -mi). For simplicity we will collectively refer to those as the -ma and non-ma endings. Cummins (2005) provides a useful overview of the historical causes of that situation, from the emergence of -ma endings in dialects of Czech in the sixteenth century (a reanalysis of an old dual ending), through its condemnation by early normative grammars and nineteenth century language revivalists, to its role in the codification of Literary Czech and Common Czech in the twentieth century. The alternation between -ma and non-ma forms is clearly sociolinguistically conditioned. The -ma form is felt as informal, unexpected in writing, and frowned upon in school; it is not listed in most resources providing declension tables, including the Internetová jazyková příručka maintained by the Czech Language Institute of the Academy of Sciences of the Czech Republic. On the other hand, the non-ma form is felt as formal, bookish in speech, and the preferred form in schooling. Cvrček et al. (2010) label the former as spoken forms and the latter as written forms.

The distribution of overabundant instrumental forms in the SYN corpus follows the pattern that one would expect given these general observations. Remember that this is a corpus of edited text, comprising press, nonfiction books, and literature. This leads to four expectations. First, we expect -ma forms to be rare in that corpus. This is clearly borne out: the overall token frequency of -ma forms in the corpus (73,255) is two orders of magnitude lower than that of non-ma forms (17,273,831). Second, we expect most lexemes to be found much more frequently with non-ma forms.
Again, this is clearly borne out: only 3% of lexemes use the -ma form more than 1% of the time, leading to an L-shaped distribution of the proportion of use of the -ma forms, shown in Figure 3, that contrasts sharply with the U-shaped distribution found in the locative singular (see Figure 1).
Third, we expect the proportion of use of -ma forms to correlate with lexeme-level sociolinguistic properties: lexemes that are more likely to be used in an informal context are also more likely to be used in a -ma form. Testing this prediction in detail is beyond the scope of this paper. However, it is striking to look at the few lexemes with more than 1000 attestations and a proportion of use of the -ma forms above 10%. Of these 11 lexemes, five are frequent colloquialisms (kluk 'boy', holka 'girl', chlap 'man', ženská 'woman', kámoš 'friend'), two are polysemous terms whose relevant attestations have a colloquial secondary meaning (prášek 'powder', also colloquial term for 'pill'; koza 'goat', also colloquial term for 'breast'), two refer to concepts overwhelmingly discussed in informal settings (chlup 'body hair', škvarek 'greaves'), and one is a false positive (schod 'step': the vast majority of attestations of schodama are in the collocation Galerie pod schodama, lit. 'gallery below the steps', a fixed proper name). In the end, chvilka 'short moment' is the only case where the higher proportion of -ma forms does not obviously relate to a lexically-conditioned restriction to informal contexts.
Fourth, we expect the proportion of use of -ma forms to correlate with textual genres. Our reference corpus gives us limited access to relevant information in the form of a broad classification of texts. The breakdown is shown in Table 3. As one might expect, literary texts give rise by far to the highest proportion of -ma forms, as these may contain dialogue and/or writing in an informal or speech-like style. Nonfiction, which in this corpus consists mostly of academic writing, is at the other end of the spectrum. Press stands in the middle, with magazines more informal than daily newspapers.

Overall then, the broad distribution of instrumental plural forms in the corpus confirms the observations from the literature.

Summing up
After establishing that overabundance is highly prevalent in Czech nominal declension, we have focused on two particular cases that contrast in multiple dimensions. In the locative singular of hard masculine inanimate nouns, a minority of nouns are overabundant, while in the instrumental plural all nouns are overabundant. Proportions of use of the two forms in the corpus follow a U-shape in the former and an L-shape in the latter. This is linked to the fact that the choice of form in the instrumental plural is clearly subject to sociolinguistic conditioning, while this is not obviously the case in the locative singular.
In the remainder of this paper we turn to our main topic: how does overabundance interact with inflectional classification? In section 3, we examine the predictability of overabundance: we show that, in the locative singular, the overabundant character of a noun is predictable from its stem shape and distribution, while this is not the case in the instrumental plural. In section 4, we examine the relationship between syntactic usage and overabundance: we show that overabundant locative singular nouns exhibit distinctive properties that are not found with their non-overabundant counterparts. Both studies lead to the conclusion that some, but not all, overabundance phenomena should be treated in terms of the postulation of a specific overabundant inflection class.

Predicting overabundance
A well-established property of inflection class systems is that they tend to be partially motivated: while the postulation of inflection classes is justified by the fact that it is not strictly predictable which lexeme will belong to which class (Aronoff 1994), there are typically striking correlations between inflection class assignments and phonological, (morpho)syntactic, and semantic properties of lexemes. This is evident in the traditional description of the Czech declension system above, where stem phonology and grammatical gender were seen as partial predictors of inflection class, with grammatical gender itself being partially predicted by semantic properties such as animacy and social gender.
In this section we rely on this property to explore whether classes of overabundant lexemes should be considered to constitute inflection classes. The reasoning is the following: if the existence of variation (vs. absence of variation) between two exponents for a particular lexeme can be partially predicted from examination of the lexeme's stem phonology, this counts as evidence for this lexeme belonging to a distinct inflection class, as this is the behaviour that is usually seen for non-overabundant classes. We will examine three classes of predictors: aspects of the phonology of the stem, aspects of the distribution of the word in a corpus, and, where relevant, grammatical gender.
Analogical classification consists in finding the class of some new item, based on the surface similarity of that item to other items whose class is known. The basic idea is that items that look similar on the surface belong to the same class (Blevins, Milin & Ramscar 2017).
From a computational perspective, there are several different techniques one could use for analogical classification. Although these have considerable mathematical differences, and may show better or worse performance on different types of data, the final product is conceptually the same: an analogical classifier sees a set of lexemes and their class, and tries to learn the regularities in the surface form of those lexemes which best correlate with each lexeme's class.
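
To make the idea of analogical classification concrete, here is a deliberately simple sketch: a nearest-neighbour classifier that assigns a new stem the majority class of the known stems with which it shares the longest final string. This is not the method used in this study; the function names, the nonce stems, and the class labels "A" and "B" are all invented for illustration.

```python
from collections import Counter

def shared_suffix_length(a, b):
    """Number of identical characters at the right edge of two strings."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def classify(stem, training, k=3):
    """training: list of (stem, class) pairs with known classes.

    Returns the majority class among the k training stems that share the
    longest final string with the new stem.
    """
    ranked = sorted(
        training,
        key=lambda item: shared_suffix_length(stem, item[0]),
        reverse=True,
    )
    votes = Counter(cls for _, cls in ranked[:k])
    return votes.most_common(1)[0][0]

# Nonce stems and arbitrary class labels, for illustration only.
training = [
    ("patak", "A"), ("milak", "A"), ("dorak", "A"),
    ("sotin", "B"), ("belin", "B"), ("rumin", "B"),
]
print(classify("zelak", training))
print(classify("kadin", training))
```

More elaborate classifiers differ in how similarity is measured and how evidence from many items is combined, but they exploit the same kind of surface regularity.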
In this study we make use of Extreme Gradient Boosting Trees with the package XGBoost (Chen & Guestrin 2016). A boosting tree classifier fits many weak tree classifiers (similar to decision trees) and then combines them to form a stronger classifier. The principle is similar to that of Random Forests (Breiman 2001), but while a Random Forest fits many small classifiers randomly, a boosting tree classifier fits many small tree classifiers in a guided manner, trying to achieve the best accuracy possible.
Our choice of classification method is purely pragmatic. Alternatives like Analogical Modeling (Arndt-Lappe 2011; Skousen 1989), the Minimal Generalization Learner (Albright & Hayes 2002; 2003; Albright 2009), or other machine learning frameworks such as neural networks (Bechtel & Abrahamsen 2002; Churchland 1989; Rumelhart et al. 1986) could be used to the same effect. On a practical level though, boosting trees have several advantages. The main advantage is that boosting trees of this kind can easily handle the type of data in this problem, i.e., categorical predictors with a large number of different levels, while at the same time being computationally efficient. Simpler models like logistic regression tend to over- or underestimate the importance of low frequency levels in this kind of data.
Because we want to know whether the model can predict new items instead of just remembering the items it has seen during training, we perform ten-fold cross-validation on every model. This is done by first splitting the dataset into ten groups. The general model is then fitted using nine of the groups as training data, and testing the predictions of the model on the group not used for fitting it. The process is repeated for each of the ten subgroups. This way prediction on all the datapoints is examined while preventing any kind of overfitting (Kohavi 1995).
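The cross-validation procedure just described can be sketched as follows. This is a minimal illustration in plain Python, not the code used for the study: the function names, the random seed, the toy labels, and the majority-class stand-in for a real classifier are all assumptions made for the example.

```python
import random
from collections import Counter


def k_fold_splits(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k groups of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validated_predictions(n_items, fit, predict, k=10, seed=0):
    """Return one held-out prediction per item, aggregated over all folds."""
    predictions = [None] * n_items
    for train, test in k_fold_splits(n_items, k, seed):
        model = fit(train)  # the model never sees the held-out group
        for idx in test:
            predictions[idx] = predict(model, idx)
    return predictions


# Hypothetical labels, for illustration only.
labels = ["u"] * 60 + ["e"] * 25 + ["both"] * 15


def fit_majority(train):
    """Toy stand-in for a classifier: remember the most frequent training class."""
    return Counter(labels[i] for i in train).most_common(1)[0][0]


def predict_majority(model, idx):
    return model


preds = cross_validated_predictions(len(labels), fit_majority, predict_majority)
accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
print(f"held-out accuracy of the majority baseline: {accuracy:.2f}")
```

Because every item is held out exactly once, the aggregated predictions cover the whole dataset while no prediction is made by a model that was fitted on the item in question.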
Although it is possible to look inside the models and see what each predictor is doing with respect to the output classes, this is a tedious process that is not crucial to our purposes in this paper. We are more interested in knowing how well we can predict the inflection classes of the items, rather than exactly knowing which segments correlate with which classes and how.
Instead, we focus on three metrics to evaluate the models: accuracy, no information rate, and kappa score. These metrics are calculated based on a confusion matrix of the model. As an example consider the fictional confusion matrix in Table 4, exhibiting the performance of a classifier on a dataset of 67 items belonging to three classes A, B, and C. The confusion matrix compares predictions of the classifier to the actual, reference classification, by indicating how many members of each actual class (in columns) were predicted to belong to which class (in rows). So for instance, the table reports that, among the twelve items that are truly members of class A, 10 were correctly classified, while 2 were incorrectly classified in B and 1 was incorrectly classified in C.
Accuracy is the number of correct predictions (the sum of the numbers in the diagonal) divided by the total number of items: in our example the accuracy is $51/67 \approx 0.76$. The No Information Rate (NIR) is equal to the proportion of the data that belongs to the largest class: it is the best guess one could make in the absence of any predictive information. In our example the largest class is class B with 28 members, hence the NIR is $28/67 \approx 0.42$. Comparing the accuracy and NIR is crucial to assessing performance: the same accuracy value may be very impressive if it is much higher than the NIR, or not at all if it is close to (or even smaller than) the NIR. For a statistically meaningful comparison, we report a 95% uncertainty interval for the accuracy value, 12 which reflects uncertainty about the estimation of accuracy related to the size of the dataset: for a given accuracy value, the larger the dataset, the smaller the uncertainty interval. In our example, the uncertainty interval for accuracy is (0.64, 0.84): hence while we should not be confident about the accuracy value up to a percentage point, we can be confident that it is higher than the NIR, that is, that the classifier performs better than chance.
12 We calculated all uncertainty intervals with a Bayesian Binomial model with mildly informative priors, using Stan (Carpenter et al. 2017;Gelman, Lee & Guo 2015) and the brms interface (Bürkner et al. 2017). Uncertainty intervals (also called credible intervals) are similar to confidence intervals but their interpretation is more straightforward and intuitive: the 95% uncertainty interval is the interval within which the value of a parameter of interest (in this case the accuracy of the classifier in the whole population) falls with 95% probability.
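To give a feel for how such an uncertainty interval behaves, the following sketch approximates a 95% credible interval for the accuracy of the worked example (51 correct out of 67). It is not the Stan/brms model used in the paper: as a simplifying assumption it uses a conjugate Beta posterior with a flat Beta(1, 1) prior and Monte Carlo sampling from the Python standard library, so the bounds can differ slightly from those reported in the text. The function name and its parameters are hypothetical.

```python
import random


def beta_binomial_interval(correct, total, prior=(1.0, 1.0), level=0.95,
                           draws=20000, seed=0):
    """Monte Carlo credible interval for a proportion under a Beta prior."""
    # Conjugate update: Beta(a, b) prior + binomial data -> Beta posterior.
    a = prior[0] + correct
    b = prior[1] + (total - correct)
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lower = samples[int(tail * draws)]
    upper = samples[int((1.0 - tail) * draws) - 1]
    return lower, upper


low, high = beta_binomial_interval(correct=51, total=67)
nir = 28 / 67
print(f"95% interval: ({low:.2f}, {high:.2f}); NIR = {nir:.2f}")
```

The point of the comparison is whether the lower bound of the interval lies above the NIR; increasing `total` while keeping the same proportion of correct predictions narrows the interval.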

The kappa statistic measures the performance of a classifier by comparing the observed accuracy with the expected accuracy (under random chance): it equals 1 for perfect classification and 0 for chance-level performance, and it becomes negative if the classifier does worse than chance. The reason for using kappa in addition to raw accuracy is that accuracy can be skewed in cases with unbalanced classes. In our example, the kappa is 0.64, indicating good though by no means perfect performance of our classifier. 13 Since our aim is not to test a hypothesis regarding any specific predictor, we do not perform any sort of significance testing. The evaluation metrics we use tell us how well the model performs as a whole, and not whether any specific predictor had a measurable impact. Comparing the observed accuracy to the no information rate lets us know that our model is performing above simply guessing the largest class, and the kappa statistic tells us how much better than random chance our model is doing.
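The three metrics can be computed from a confusion matrix as sketched below. The matrix used here is hypothetical: it was chosen so that its totals match the figures quoted in the text (67 items, 51 on the diagonal, 28 items in the largest class), but it is not a copy of Table 4. Kappa is computed with the standard Cohen's kappa formula, with expected accuracy derived from the row and column totals.

```python
def classification_metrics(matrix):
    """Accuracy, no information rate, and Cohen's kappa from a confusion matrix.

    matrix[i][j] counts items of actual class j predicted as class i
    (rows are predictions, columns are reference classes).
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    accuracy = sum(matrix[i][i] for i in range(n_classes)) / total
    # Best guess without predictors: always pick the largest actual class.
    nir = max(col_totals) / total
    # Agreement expected by chance, given the row and column totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return accuracy, nir, kappa


# Hypothetical matrix for classes A, B, C (not the actual Table 4).
hypothetical = [
    [10, 6, 4],
    [2, 21, 2],
    [1, 1, 20],
]
accuracy, nir, kappa = classification_metrics(hypothetical)
print(f"accuracy={accuracy:.2f}  NIR={nir:.2f}  kappa={kappa:.2f}")
```

With this matrix the three values round to 0.76, 0.42, and 0.64 respectively.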

Predictors
Our goal is to assess whether and how a lexeme's inflection class can be predicted from other properties of that lexeme. In this paper we use two types of predictors: stem phonology, and distributional vectors.
There are many different aspects of stem phonology that could be used as predictors for our classifiers, and many different ways of coding them up. For instance, we could imagine that identification of initial or final segments, initial or final syllables, word length, or the makeup of the word in terms of biphones or triphones (Baayen, Chuang & Blevins 2018) are possibly relevant. In this paper we take a pragmatic approach to the issue, and rely on prior knowledge of the Czech system to guide a choice of simple predictors. First, we rely on orthography rather than an explicit phonemic transcription. This should not lead to any major loss in accuracy, given that the grapheme-to-phoneme relation is fairly transparent in Czech. 14 Second, as segmental predictors we only use the three last characters of the orthographic stem. This is certain to capture the expected main effects of final consonants, and keeps the number of predictors at a manageable size. Note that stems were obtained by cutting off case-number suffixes as documented by Cvrček et al. (2010) from the words under examination. As a result, the stem allomorph used in the word under examination was considered, rather than the stem allomorph of the citation form, where these differ. This should have no major effect on the results, since stem allomorphy is fairly limited in Czech. Finally, in addition to segmental predictors, word length in syllables was approximated by the number of vowels in the stem. Again, this is a fairly reasonable approximation, as diphthongs are not very prevalent, and no vowel is coded in the orthography as a digraph.
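The shape predictors described above can be sketched as follows. This is an illustration under stated assumptions rather than the extraction code used for the study: the function name, the feature names, the padding symbol for stems shorter than three characters, and the example stems are ours, and the sketch assumes that the case-number suffix has already been removed.

```python
# Vowel letters of Czech orthography (short and long).
CZECH_VOWELS = set("aáeéěiíoóuúůyý")


def shape_predictors(stem, n_final=3, pad="#"):
    """Return the final characters of an orthographic stem and its vowel count."""
    stem = stem.lower()
    # Pad short stems on the left so that every stem yields n_final characters
    # (the padding symbol is an assumption of this sketch).
    padded = pad * max(0, n_final - len(stem)) + stem
    final = padded[-n_final:]
    features = {f"final_{n_final - i}": ch for i, ch in enumerate(final)}
    # Syllable count is approximated by the number of vowel letters.
    features["n_vowels"] = sum(ch in CZECH_VOWELS for ch in stem)
    return features


for stem in ["hrad", "jazyk", "les"]:
    print(stem, shape_predictors(stem))
```

Each of the three final characters is treated as a separate categorical predictor, which is the kind of many-levelled categorical input that tree-based classifiers handle well. As noted in the text, counting vowel letters only approximates syllable count, for instance with diphthongs.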
In addition to stem phonology, we used distributional vectors to provide information about the context of use of words of interest. Distributional vectors provide a multidimensional representation of the distribution of words in a corpus, such that words with a similar distribution have similar vectors, and different dimensions of the vectors represent different aspects of distributional similarity. Advances in corpus size, computing power, and inference algorithms have made distributional vector spaces a standard tool of the trade in computational linguistics, allowing various systems to take into account lexical properties in a generic manner (Camacho-Collados & Pilehvar 2020). In the context of general linguistics, distributional vectors are typically used as a way of approximating lexical semantics, in accordance with the distributional hypothesis (see Boleda 2020 and references therein), according to which words with similar distributions are semantically similar. In particular, a growing subliterature uses distributional vectors to study the semantic effects of derivational morphology (see e.g. Marelli & Baroni 2015;Varvara 2017;Lapesa et al. 2018;Huyghe & Wauquier 2020). However, in the present context, it is important to remember that lexical semantics stricto sensu is only part of what distributional vectors capture. In particular, when using vectors based on wordforms, morphosyntactic contrasts such as grammatical gender or case government will have an effect on what the vectors look like (Bonami & Paperno 2018): for instance, the grammatical gender of nouns will be coded by distributional vectors, even where it has no semantic reflex, because gender will trigger agreement and hence a different distributional environment for the noun. Likewise, sociolinguistic contrasts between lexemes are likely to lead to contrasting vectors, as words used by different speakers in different circumstances are likely to co-occur with other words subject to the same sociolinguistic restrictions.
13 In the following, all metrics are calculated on the aggregated results of all cross-validation steps.
14 Rare opacities result from recent borrowings whose orthography was not adapted, e.g. e-mail, pronounced [iːmɛjl] instead of the expected [ɛmajl]. Note that Czech orthography makes use of digraphs (for instance ch notes [x], and palatalization of consonants is often noted on the following vowel), but this does not lead to opacity in the grapheme-to-phoneme direction. Also note that there is a significant amount of opacity in the phoneme-to-grapheme direction, most prominently because of the use of the two letters <i> and <y>, which note the same sounds (short [ɪ] or long [iː]). After some consonants, <i> indicates palatalization of the preceding consonant, but this is not systematic. As a result there are many pairs of words that are orthographically distinct but phonetically indistinguishable, such as masculine and feminine plurals of past verb forms, e.g. mluvili 'they (masc.) spoke' vs. mluvily 'they (fem.) spoke'. This is not of concern to us here as we are approximating phonology by orthography rather than the other way around.

Guzmán Naranjo and Bonami
Glossa: a journal of general linguistics DOI: 10.5334/gjgl.1626
For the purposes of this paper, we derived a 300 dimension distributional vector space from version 4 of the SYN corpus also used for all other aspects of our study. We used the Gensim (Řehůřek 2010) implementation of the SkipGram variant of the word2vec algorithm (Mikolov et al. 2013), using the following hyperparameters: 9 training epochs, 20 negative samples, window size 20. Importantly, our vectors are based on lexemes rather than individual wordforms: we used the lemmatization provided with the corpus to derive a version of the corpus where individual words are replaced by their lemmas, and then built the vector space on the basis of that version of the corpus, abstracting away from inflectional variation. This is appropriate in the present context: we hope our classifiers to be able to predict whether a lexeme is overabundant, and overabundance is inherently a property involving multiple wordforms, so that it would make little sense to predict that from properties of a single word. However, it is important to keep in mind that a side effect of this decision is to eliminate part of the distributional variation. For instance, the effects of grammatical gender on vectors for nouns will be dampened, as different forms of agreeing adjectives and verbs will be lumped into a single lemma; on the other hand, broader consequences of semantically-motivated gender assignment leading to collocation with different content words may still be captured. Likewise, the vectors cannot capture directly distributional differences between forms of the same lexeme typically used in a formal vs. informal context (as is expected for the contrasting instrumental plural forms), as these will be mapped to the same vector; but they can still capture differences between lexemes that are on the whole used more in collocation with other lexemes that are markers of formality or informality. 

Hard masculine inanimate locative singulars
Table 5 shows the type frequency of hard masculine inanimate nouns attested in the corpus at least 20 times with the -u ending, the -ě ending, or both. Two remarks are in order about these figures. First, there is a strong imbalance between classes, with the -u class an order of magnitude larger than the other two. This is a problem for modeling: if classes are too imbalanced, the model will tend to rely on raw frequency rather than predictor variables to make predictions. Second, for lexemes only found in one of the two forms, we cannot be certain that the other form is impossible. The likelihood of such errors is high, given that, as we saw in Figure 1, most overabundant lexemes have a strong preference for one or the other variant. To take an extreme example, if the true proportion of use of -u for a lexeme is 90% and we have only two occurrences in our corpus, there is an 81% probability that both will be in -u, despite the fact that the lexeme is overabundant. To mitigate these two problems, we selected the 600 most frequent lexemes for each class.
We fit three distinct models to this dataset: a model with just the phonological predictors, collectively labelled 'shape' predictors, a model with just the distributional vectors as predictors, and a model with both. Table 6 reports the performance of the three models. Note that the No Information Rate (NIR) is 1/3 in all cases, as by design the three classes have the exact same type frequency of 600.
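The sampling argument for lexemes attested with only one of the two endings can be checked numerically. The sketch below assumes that tokens of a lexeme are independent draws with a fixed probability of choosing -u, which is a simplification; the function name is hypothetical. It reproduces the 81% figure for two tokens and shows how quickly the risk of missing a variant shrinks as the number of attestations grows.

```python
def prob_single_variant_only(p_u, n_tokens):
    """Probabilities that n_tokens independent uses are all -u, or all -ě."""
    only_u = p_u ** n_tokens
    only_e = (1 - p_u) ** n_tokens
    return only_u, only_e


only_u, only_e = prob_single_variant_only(0.9, 2)
print(f"P(both tokens in -u) = {only_u:.2f}")

for n in (2, 20, 100):
    u, e = prob_single_variant_only(0.9, n)
    print(f"n={n:3d}: P(only one variant attested) = {u + e:.4f}")
```

Under these assumptions, a lexeme with a 90% preference for -u still has a non-negligible chance of surfacing only with -u even at 20 tokens, which is why frequent lexemes are the safer basis for assigning classes.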
The overall observation is that all three models perform considerably better than chance, although they do not reach a spectacular level of accuracy. Hence it is clear that assignment of lexemes to one of the three classes is not fully arbitrary. It does not look to be fully predictable either: at least the predictors used in this study are far from ensuring fully accurate prediction. We are thus in the typical grey zone of inflection class assignment being partially predictable.
The 'shape only' and 'distribution only' models reach comparable levels of performance. Combining the two sets of predictors leads to a barely measurable increase in accuracy compared to having one set of predictors only. These observations strongly suggest that, while phonology and distribution both contribute to predicting inflectional behavior, they do not tend to make complementary contributions, where one set of predictors helps when the other fails.
To get a more detailed look at what is going on, we now examine the confusion matrix for the combined model, shown in the left hand part of Table 7. Two observations are in order here. First, performance is highest on the -ě class (95% correctly classified), followed by the -u class (88%), followed by the overabundant class (76%). This suggests that lexemes forming their locative in -ě only are more cohesive in their phonological and distributional properties than those that can or must use -u. Second, most of the confusion arises between the overabundant class and the two other ones: there are very few situations where the model wrongly assigns -u as a unique exponent instead of -ě (<1%) or the other way around (0%). The model also rarely assigns to the overabundant class a lexeme found only with -ě in the corpus (<4%). However, about 25% of lexemes found only with -u in the corpus are wrongly assigned to the overabundant class; and a sizeable subset of undisputably overabundant lexemes are wrongly associated by the model with only -ě (2%) or only -u (22%).
Examination of the confusion matrices for the two other models reveals a broadly similar picture. Only two differences are worth mentioning. First, the two kinds of predictors seem to differ in how they deal with lexemes that are truly overabundant (middle column): the model based on phonology alone has more of a tendency to conclude that they are instances of -ě only, while the model based on distributional vectors alone has more of a tendency to conclude that they are instances of -u alone. The combined model manages to build on both kinds of predictors to achieve better performance on this part of the dataset.
How can we explain the patterns of errors we just observed? For this we must distinguish errors on the middle row from errors on the middle column. On the middle row, the errors correspond to cases where the model predicts a lexeme to be overabundant, while it is found only in one form in the corpus. As we discussed above, that situation is likely partly due to sampling accidents: by chance, some lexemes that are truly overabundant are only found with one of their two forms in the corpus. Unfortunately, there is no direct way of testing what proportion of the errors is due to such accidents: we just do not have a larger sample to make that evaluation. Importantly however, such an explanation does not hold for items in the middle column: for these we do have attestations for both forms, and hence have no hesitation as to what class they belong to. Hence the fact that there is a nontrivial amount of error here is revealing about the nature of the system. We submit that this pattern justifies seeing the relevant class of overabundant lexemes as a mixed inflection class: an inflection class that is distinct from both the -u class and the -ě class, but that still has properties that are intermediate between those of its two corresponding single-exponent classes. This is not a new idea: see in particular Beniamine (2021), Bonami & Crysmann (2018) and Guzmán Naranjo (2019) for different takes on overabundant inflection classes as mixes of other classes. What is specific to this study is that we argue for this mixed inflection class status on the basis of partial motivation: overabundant lexemes stand between two inflection classes in terms of predictability of their inflectional behavior from their phonological and distributional properties.

The complete system of locative singulars
We now turn to an examination of the complete system of 37656 lexemes attested at least 20 times in the locative singular. The full set of exponents that we expect to encounter is as indicated in (5).
(5) c. No exponent with undeclinable nouns.
As the description suggests, there are multiple situations of potential overabundance: between -u and -ě, -u and -ovi, -i and -ovi for ordinary nouns; between -ém and -ým, -é and -ý for converted adjectives; and finally, a small number of nouns exhibit fluidity between genders, hard or soft status, or declinable vs. undeclinable status. As a result, we find evidence in the corpus for 8 non-overabundant behaviors as well as 12 overabundant behaviors, as indicated in Table 8.
A detailed analysis of this complex and heterogeneous dataset is beyond the scope of this paper. However, it is worth examining the overall performance of a classifier applied to this 20 class system. We thus fit classifiers with similar characteristics to those discussed in Section 3.2.1, with two differences. First, we did not perform any type frequency normalization, as the smaller classes just do not have enough members for that to be possible. And second, we also included gender as a predictor, which was irrelevant as long as we were focussing on a subclass of masculine inanimates. Taking into account all possible combinations of three sets of predictors gives us seven models in total.
As indicated in Table 9, the performance of these classifiers is remarkably high, given the number of different classes. First, any of the three sets of predictors is highly relevant on its own, moving the accuracy from a baseline of 0.24 to at least 0.56. Second, when taken separately, stem shape is clearly the most relevant predictor, followed by gender and then distribution. Third, shape and gender together allow a very high level of predictability, with a classification accuracy of 0.85. This is what we expect given the traditional description of the inflection class system. However, the addition of distribution to these two predictors still allows for a measurable 0.3 increase in accuracy. This is a strong indication that either lexical semantics or other lexical characteristics reflected in distribution do contribute to predicting inflectional behavior. Overall, the performance of analogical classification on this intricate system of 20 classes confirms inflection class assignment to be highly, although not categorically, predictable.
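The count of seven models follows from taking every non-empty subset of the three predictor sets. A minimal sketch, in which the string labels are ours and stand for the shape, gender, and distribution predictors:

```python
from itertools import combinations

predictor_sets = ["shape", "gender", "distribution"]

# Every non-empty combination of predictor sets defines one model.
model_specs = [
    combo
    for size in range(1, len(predictor_sets) + 1)
    for combo in combinations(predictor_sets, size)
]

for spec in model_specs:
    print(" + ".join(spec))
```

Three single-set models, three two-set models, and one model with all three sets give the seven models compared in this section.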
Returning to overabundance, we confirm on a larger scale the results already highlighted with masculine inanimates. The full 20 × 20 confusion table for the best classifier can be found in the appendix. However that table is quite hard to read given the number of classes and the diversity of errors made by the classifier. Instead, we extracted from this table the numbers corresponding to the three major overabundant behaviors, with a type frequency above 200 in the corpus. These are shown in Table 10. As the reader can check, we see again that there is very little confusion between the two classes with a single exponent, whereas there is a sizable amount of confusion between the overabundant class and each of the other two. Hence in all three cases, the classifier is very efficient at distinguishing two classes of lexemes using a single exponent; it also identifies a mixed class (since presence in this class is predictable above chance), but has a harder time distinguishing overabundant lexemes from non-overabundant ones. 15
15 In the full table, all three-way comparisons between an overabundant class and the two corresponding non-overabundant ones lead to qualitatively identical results, except for -é~-ý and -ém~-ým, which illustrate the same type of sociolinguistically-conditioned overabundance discussed in greater detail for the instrumental plural below, inherited by converted adjectives from their adjectival source.

Instrumental plurals
We now turn to the instrumental plural. Grammatical descriptions lead us to expect finding the following set of exponents.
b. Nouns converted from adjectives: (i) In formal contexts: -ými for hard nouns, -mi for soft nouns. (ii) In informal contexts: -ýma for hard nouns, -ma for soft nouns.
c. With undeclinable nouns: (i) No exponent in formal contexts.
Given that there are 7 formal exponents, 4 informal ones, and only one informal strategy corresponding to each formal one, we expect a maximum of 11 behaviors involving a single exponent and 7 involving two, for a maximum of 18 possible classes. The left-hand part of Table 11 lists lexemes found with only one exponent in the corpus. We find such lexemes for all 7 formal exponents, but only for the most frequent of the 4 informal exponents, namely -ama. This is unsurprising: given the general makeup of our corpus, it is unlikely that a lexeme will be found only with an informal variant. The middle part of the table lists lexemes found in each of the 7 expected combinations of an informal and a formal variant. Finally, the right-hand part lists the few unexpected situations of overabundance, due to hesitations on gender, softness, or declinability. These make up less than 0.4% of the dataset under examination.

Given the general description of overabundance in the instrumental plural above, we do not expect these cases of overabundance to interact with the inflection class system: whether a noun is found in one or two forms in the instrumental plural depends on whether that noun is found in the corpus in both formal and informal contexts. Although this might be predictable to some extent from the noun's distribution, as there are distributional cues to formality, we do not expect gender or stem phonology to have any predictive power.
To test for this empirically, we fit three separate types of models. The first series of models mimics those shown in Section 3.2.2 for the locative singular, and attempts to predict each of the 19 inflectional behaviors found in the corpus from various combinations of stem shape, gender, and distribution. Accuracy is reported in Table 12. Accuracy is a lot lower than in the locative singular, although it is clearly above chance, and each predictor does make a contribution when added to any other combination of predictors.
Examination of the full confusion matrix is crucial to making sense of these numbers. Table 13 shows the confusion matrix for the most accurate model, with rows and columns arranged so that each expected overabundant class is next to the class corresponding to a single, formal exponent. It should be clear from the table that the vast majority of the errors are due to the model being unable to predict whether a lexeme will be found with only a formal exponent (e.g. -y) or also with the matching informal exponent (e.g. -y~-ama): the model seems to be quite accurate at predicting which formal and which informal exponent can be used with a given lexeme, but quite inaccurate at predicting whether multiple forms are attested in the corpus.
To confirm this, we ran two other series of models that aim at separating these two aspects of prediction. First, we constructed a dataset that neutralizes the effects of overabundance by lumping together lexemes found with only one formal exponent and those found with that exponent and the matching informal exponent. For this experiment we dropped the cases of erratic overabundance documented on the right hand side of Table 11, as these cannot be naturally grouped with other classes. Table 14 reports the accuracy of the models, and the confusion matrix for the most accurate model can be found in the appendix.
We find that all models including stem shape as a predictor are very accurate; gender and distribution also have predictive value, although gender is not as strong a predictor as one might have expected. Be that as it may, the high level of accuracy reached by this model confirms that predicting which pair of exponents are available for a given lexeme in the instrumental plural is not hard, while it may be hard to predict whether the two members of the pair are attested or just one of the two.
Our final series of models tests exactly that. This time, instead of grouping lexemes in terms of which formal exponent they may take, the lexemes were grouped according to whether they are found in the corpus only with an exponent of the formal family or with both types of exponents. Accuracy of the relevant models is reported in Table 15, and the confusion matrix for the most accurate model is in the appendix. The performance of this family of models is much lower than the previous one, despite the fact that the number of classes is lower. In addition, gender and shape have no predictive power whatsoever, although distribution does have some.
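The two regroupings of the data used in these last two series of models can be sketched as simple relabelings of the behavior labels. The sketch assumes labels of the form used in this section, such as -y and -y~-ama, where the formal exponent is written before the tilde; the function names are hypothetical, and lexemes attested only with an informal exponent are not handled.

```python
def formal_exponent(behavior):
    """Neutralize overabundance: map '-y' and '-y~-ama' to the same class '-y'."""
    # Assumption: the formal exponent is the part before the tilde.
    return behavior.split("~")[0]


def attestation_type(behavior):
    """Keep only the contrast between one attested exponent type and two."""
    # Assumption: labels without a tilde are formal-only behaviors.
    return "formal and informal" if "~" in behavior else "formal only"


behaviors = ["-y", "-y~-ama", "-ami", "-ami~-ama"]
for b in behaviors:
    print(f"{b:10s} -> {formal_exponent(b):5s} | {attestation_type(b)}")
```

The first relabeling yields the target classes for the models that predict which pair of exponents is available; the second yields the binary target for the models that predict whether both members of the pair are attested.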
This result gives strong confirmation to the hypothesis that overabundance in the instrumental plural is orthogonal to the inflection class system, and conditioned solely by sociolinguistic factors. As we suggested above, different lexemes have, either because of their lexical semantics or axiological import, different likelihoods of being used in an informal context, and formality levels are expected to be reflected in a word's distribution, inasmuch as that word's syntagmatic neighbours are subject to the same usage effects. Hence the sociolinguistic conditioning hypothesis does predict that a lexeme's distribution should be predictive of whether it is found in the corpus with informal exponents. On the other hand, potential lexical predictors that are orthogonal to formality distinctions, namely stem phonology and gender, do not have the predictive power we would have expected if overabundance was integrated in the inflection class system.

Taking stock
In this section we developed a sustained argument to the effect that overabundance in the locative singular and instrumental plural interact in different ways with the inflection class system: in the locative singular, there are distinct classes of overabundant lexemes, and these classes are mixed inflection classes; 16 in the instrumental plural, overabundance is orthogonal to the inflection class system, and fully conditioned by sociolinguistic factors.
The exact nature of overabundant locative singular classes remains somewhat confusing at this point: we have argued that they are first class citizens of the inflection class system, but that, in terms of class predictability, they mix and match properties of pairs of other classes. In the next section we develop an independent argument to the effect that overabundant locative singulars have properties of their own, irreducible to those of the neighboring non-overabundant classes.

Motivation
The preceding section has shown that overabundance in the locative singular integrates with the inflection class systems. In the case of hard masculine inanimate nouns, we showed that it was partially predictable on the basis of stem phonology and distribution which nouns are overabundant, and further showed that overabundant nouns occupied an intermediate space between '-u only' and '-ě only' nouns in terms of motivation: on average they share properties with both, which makes them easily confusable with both. We concluded that we should think of these overabundant nouns as belonging to a mixed class. We then generalized this result to other cases of overabundance in the locative singular.
In this section we explore in more detail the nature of mixed classes, and start with a quick review of relevant theoretical concepts in the literature. There is a consensus across frameworks in theoretical morphology that inflection class systems should be conceptualized as inheritance hierarchies, where classes may have different levels of specificity, and more specific subclasses inherit properties of their less specific superclasses (see among many others Corbett & Fraser 1993;Dressler & Thornton 1996;Koenig 1999;Beniamine, Bonami & Sagot 2017). Beniamine (2021) further argues that inflection class systems are best modeled as monotonic multiple inheritance hierarchies of the kind familiar from Head-Driven Phrase Structure Grammar (Pollard & Sag 1994). 17 Under this view, class systems are not trees, but lattices, where a node may have more than one parent. Beniamine's central argument rests on the existence and pervasiveness of heteroclite classes, which have an inflectional behavior intermediate between those of two other classes (Stump 2006). 18 Czech neuter nouns of the class of kuře [...]
16 Except in the case of derived adjectives, which exhibit the same properties found in the instrumental plural.
17 Although, to the best of our knowledge, this was never discussed in print, this is also implicitly the position adopted in early versions of Paradigm Function Morphology (Stump 2001), where any collection of lexemes may count as an inflection class.
18 Ironically for present purposes, Stump (2006) uses the Czech masculine inanimate noun pramen 'spring' as his primary example of heteroclisis, while closer examination shows that this is an instance of overabundance instead.
Bonami & Crysmann (2018) propose that overabundant classes should be analyzed as joins in an inflection class hierarchy. This is illustrated in Figure 5. The intuition here is that overabundant lexemes belong to a class that underspecifies the distinction between the two corresponding non-overabundant classes, and is the mirror image of a heteroclite class. The way this is captured is by associating the inflection rules respectively introducing the suffixes -u and -ě to the two more specific classes, and assigning lexemes to classes as indicated in the figure. The architecture of IbM then ensures that overabundant lexemes will be compatible with both exponents, because any concrete use of an overabundant lexeme has to pick one of the subtypes of the overabundant class. 19
On the other hand, both Guzmán Naranjo (2019) and Beniamine (2021) propose that overabundant classes be analyzed as meets in an inflection class hierarchy. This is illustrated in Figure 6. The two authors converge on this solution for different reasons. For Beniamine, this is a consequence of deriving the inflection class lattice from individual properties of exponence exhibited by lexemes using formal concept analysis (Ganter & Wille 1998): in this framework, meets represent shared features, while joins represent the absence of features. For Guzmán Naranjo, it derives from the decision that meet nodes in the hierarchy inherit all inflection strategies exhibited by their parents. Note that, under this line of analysis, heteroclite and overabundant classes are both represented by meet nodes in the hierarchy, but contrast as to whether the two parents of the meet contribute complementary or competing inflectional strategies.
It is worth noting that, although they are conceptually distinct, both views of overabundant inflection classes are compatible with our observations on the motivation of mixed classes: in both cases, we expect overabundant classes to exhibit intermediate behavior between their non-overabundant counterparts, and a higher confusability between overabundant classes and the others than among non-overabundant classes.
There is however one area in which the two approaches make different predictions. Under the join approach, overabundant classes can't have positive properties that are not to some extent shared by their single exponent counterparts. This is a consequence of the monotonic flow of information in the inheritance hierarchy: a higher node in the hierarchy can't have properties that are not shared by its descendants. By contrast, under the meet approach, nothing precludes overabundant classes from having some idiosyncratic properties not shared by higher nodes.
Figure 5 Overabundant classes as joins in an inflection class hierarchy (Bonami & Crysmann 2018).
Guzmán Naranjo and Bonami, Glossa: a journal of general linguistics, DOI: 10.5334/gjgl.1626
In this section we document exactly one such situation: we show that prepositions governing the locative exhibit differential preferences for one or the other exponent of overabundant nouns, a behavior that has no parallel among non-overabundant nouns. Hence overabundant nouns have irreducible properties, which is not compatible with the join approach to overabundant classes.
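The difference in predictions between the join and meet approaches can be illustrated with a toy inheritance hierarchy. The Python sketch below is an illustration under assumptions of our own: all class names and the preposition_sensitive attribute are hypothetical labels, and Python class inheritance merely stands in for the typed hierarchies of the formal proposals cited above.

```python
# Toy illustration only: Python classes stand in for nodes of an
# inflection class hierarchy. All names below are hypothetical.

# --- Join approach: the overabundant class is a common supertype ---
class JoinOverabundant:
    exponents = {"-u", "-ě"}      # underspecified: either exponent

class JoinUOnly(JoinOverabundant):
    exponents = {"-u"}            # subtype narrows the choice

class JoinEOnly(JoinOverabundant):
    exponents = {"-ě"}

# Information flows monotonically downward: a property stated on the
# join node is necessarily inherited by both single-exponent subtypes.
JoinOverabundant.preposition_sensitive = True

# --- Meet approach: the overabundant class is a common subtype ---
class UStrategy:
    exponents = {"-u"}

class EStrategy:
    exponents = {"-ě"}

class MeetOverabundant(UStrategy, EStrategy):
    # The meet inherits the strategies of both parents ...
    exponents = UStrategy.exponents | EStrategy.exponents
    # ... and may carry a property that no higher node has.
    preposition_sensitive = True

def has_property(cls):
    """True if the class carries the hypothetical collocational property."""
    return getattr(cls, "preposition_sensitive", False)
```

In the join configuration the property cannot be confined to the overabundant class, whereas in the meet configuration it can.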

Hypothesis: Prepositions exhibit preferences for exponents of overabundant lexemes
In the previous sections we established that there is a class of Czech masculine inanimate nouns that can take both the -u and the -ě ending in the locative singular. From this it does not follow that the choice of one or the other is entirely free: as discussed at length by Thornton (2019b), there can be both usage and grammatical conditions on overabundance. The existence and strength of such conditions is a recurring topic in the description of the Czech locative singular, reviewed by both Cummins (1995) and Bermel & Knittl (2012a; b). While no factor or combination of factors comes close to predicting categorically which of the two exponents will be used in what context, anecdotal evidence can be found for influences of noun polysemy (different senses of a noun having different preferences) and preposition polysemy (different senses of the governing preposition leading to different preferences), as well as individual idiosyncratic preferences of particular (preposition, noun) collocations. Bermel & Knittl (2012a; b) provide more compelling evidence from corpus and judgment data that types of syntactic environments have an influence: all other things being equal, -u is most preferred where the preposition heads a locative adverbial, and least likely when it is an empty preposition governed by a verb.
Elaborating on this literature, we study collocational preferences between prepositions and case-number exponents for overabundant nouns. We start from the observation that different governing prepositions seem to have different preferences as to which locative exponent is used. Table 17 shows the distribution of the two locative singular forms of the two nouns most 'bridge' and úřad 'office' in the SYN corpus, when they are immediately preceded by one of the five main prepositions governing the locative: na 'on, at', o 'near, about', po 'towards, after', při 'at, around' and v 'in'. Three important observations are in order. First, and unsurprisingly, different lexemes have different preferences in terms of combinability with prepositions: for instance úřad 'office' is much more likely to be found in combination with v 'in' than most 'bridge'. Second, proportions of use of the two variants vary widely across lexemes (most has a preference for the -ě form, while úřad prefers the -u form), and across combinations of lexemes and prepositions: at one extreme, the -u form of most is used 6% of the time with po; at the other extreme, the -u form of úřad is used 98% of the time with při. But third and most importantly for us, despite lexeme-dependent variation, there still seem to be tendencies as to which prepositions prefer to co-occur with each exponent: across these two nouns, o and při have a strong preference for -u when compared to the other three prepositions na, po and v. Our goal in this section is to establish whether such contrasts are general.

Model 1: Predicting exponent preference from preposition for overabundant nouns
To this end, we collected from the SYN corpus all 27,768,583 occurrences of a hard masculine inanimate noun in the locative singular immediately preceded by a preposition. We then tabulated how many tokens of each inflectional variant (-u vs. -ě) were found for each of the 21,830 relevant lexemes in combination with each preposition. Then, among the 1,733 overabundant lexemes in the dataset, we selected for study the 481 lexemes that are attested in collocation with all five prepositions under examination.
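The tabulation and selection steps just described can be sketched as follows. This is a minimal Python illustration, not the pipeline actually used: the token format, the function names, and the placeholder lexeme labels are hypothetical, the toy tokens are invented, and we assume for the sake of the example that a lexeme counts as overabundant as soon as it is attested with both exponents.

```python
from collections import Counter, defaultdict

PREPOSITIONS = ("na", "o", "po", "při", "v")

def tabulate(tokens):
    """Count tokens per (lexeme, preposition, exponent) triple."""
    return Counter(tokens)

def select_lexemes(counts, prepositions=PREPOSITIONS):
    """Return overabundant lexemes attested with every preposition.

    Assumption: 'overabundant' means attested with both -u and -ě.
    """
    exponents = defaultdict(set)
    preps = defaultdict(set)
    for (lexeme, prep, exponent), n in counts.items():
        if n > 0:
            exponents[lexeme].add(exponent)
            preps[lexeme].add(prep)
    return sorted(
        lex for lex in exponents
        if exponents[lex] >= {"-u", "-ě"} and preps[lex] >= set(prepositions)
    )

# Invented toy tokens with placeholder lexeme labels (not corpus data).
toy_tokens = (
    [("lex_A", p, "-ě") for p in PREPOSITIONS]
    + [("lex_A", "na", "-u")]                        # both exponents, all prepositions
    + [("lex_B", p, "-ě") for p in PREPOSITIONS]     # only -ě: not overabundant
    + [("lex_C", "v", "-u"), ("lex_C", "v", "-ě")]   # overabundant, one preposition only
)
counts = tabulate(toy_tokens)
selected = select_lexemes(counts)
```

Only lex_A survives both filters in this toy example, mirroring the way the 481 lexemes under study are a subset of the overabundant lexemes.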
Our goal is to establish whether the likelihood of using a locative singular in -ě vs. -u varies across governing prepositions. To approach this question we built a Bayesian binomial model using Stan (Carpenter et al. 2017; Gelman, Lee & Guo 2015) and the brms interface (Bürkner et al. 2017). The predicted variable was the proportion of use of -u among uses of a locative singular, and the predictor variable was the identity of the preposition. Figure 7 shows the conditional effects of the model, with whiskers representing 95% uncertainty intervals; the fact that the whiskers are barely distinguishable shows these intervals to be very narrow and hence uncertainty very low.
The model very confidently establishes that each preposition has specific preferences: despite variability among noun lexemes, at the level of the system it is very clear that the -u form is more likely to be used in combination with o and při, while -ě is more likely to be used in combination with na, po and v. 20 This result indicates that overabundant nouns exhibit properties that can only be found with such nouns: by definition, non-overabundant nouns use only one form, and hence cannot exhibit differential exponence properties in combination with different prepositions.
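To give a concrete sense of what a binomial model with preposition as its sole predictor estimates, the following Python sketch uses a simplified stand-in: an independent conjugate Beta-binomial posterior for each preposition, with a flat Beta(1, 1) prior. This is not the Stan/brms model reported above, the prior is our own assumption, and the token counts are invented for illustration.

```python
import random

def posterior_summary(u_tokens, all_tokens, draws=20000, seed=0):
    """Posterior mean and 95% interval for P(-u) under a flat Beta(1, 1) prior.

    The interval is approximated by Monte Carlo sampling from the
    conjugate Beta posterior.
    """
    rng = random.Random(seed)
    a = 1 + u_tokens
    b = 1 + all_tokens - u_tokens
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    lower = samples[int(0.025 * draws)]
    upper = samples[int(0.975 * draws) - 1]
    return a / (a + b), lower, upper

# Invented counts (tokens in -u, all locative singular tokens) per preposition.
toy_counts = {
    "na": (300, 1000),
    "o": (800, 1000),
    "po": (250, 1000),
    "při": (850, 1000),
    "v": (200, 1000),
}
summaries = {prep: posterior_summary(k, n) for prep, (k, n) in toy_counts.items()}
```

With large token counts the intervals become very narrow, which is why whiskers can be barely visible in a plot of conditional effects.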

Model 2: Predicting preposition preference from locative singular exponent
It is tempting to see the differential collocational preferences we just documented as a property characterizing the class of overabundant nouns. Before reaching such a conclusion, however, we must eliminate an alternative hypothesis. Above we have reasoned in terms of properties of overabundant lexemes, as opposed to the individual wordforms that realize these lexemes. But it is conceivable that the observed behavior is a consequence of collocational preferences linking prepositions and the individual locative singular exponents -ě and -u: perhaps the preposition v is more likely to be collocated with a -ě form than a -u form, irrespective of whether that form belongs to an overabundant lexeme or not. To test for that possibility, we need to examine how likely one is to use each preposition, depending on both the identity of the exponent and whether the lexeme is overabundant or not.
20 Note that our model does not directly take into account the preferences of individual nominal lexemes for -ě vs. -u. Unfortunately, because of strong collinearity between predictors, models using both prepositions and nominal lexemes as predictors on the whole dataset consistently fail to converge. A model based on a smaller sample of 100 overabundant nouns gives results that are qualitatively consistent with what we report here, although the effects are not as clearcut.
To address this question, we sampled from the dataset described in the previous subsection 400 locative singular noun forms: 100 nouns from the '-u only' class, 100 nouns from the '-ě only' class, 100 -u forms of overabundant nouns, and 100 -ě forms of overabundant nouns. 21 We then built a Bayesian multinomial model predicting proportion of use of each of the five prepositions from the class of nouns. Figure 8 reports the conditional effects of the model, with each subplot corresponding to one of the subclasses of nouns.
Observation of the conditional effects suggests the existence of various tendencies grouping the data in both relevant dimensions. On the one hand, the proportion of collocation with three prepositions (o, po and při) is higher for -u nouns than for -ě nouns, irrespective of whether we are talking about one of the two forms of an overabundant lexeme or the only form of a non-overabundant one. On the other hand, overabundant nouns exhibit a more or less balanced propensity to combine with na and v, while non-overabundant ones have a marked preference for v.
Although making sense of the details of the distribution is well beyond the scope of this paper, this model clearly establishes that collocation preferences with prepositions are partially predicted by the overabundant or non-overabundant character of the noun, and cannot be solely reduced to preferences of collocation with locative singular exponents.
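The logic of a multinomial model predicting preposition use from noun class can be illustrated with a simplified conjugate analogue. The Python sketch below estimates, for each of two noun classes, the posterior mean proportion of each preposition under a flat Dirichlet prior. It is a stand-in under our own assumptions rather than the model reported above: the class labels are hypothetical and the counts are invented.

```python
import random

PREPOSITIONS = ("na", "o", "po", "při", "v")

def dirichlet_posterior_means(counts, draws=5000, seed=0):
    """Monte Carlo posterior means of preposition proportions.

    Assumes a flat Dirichlet(1, ..., 1) prior; each Dirichlet draw is
    obtained by normalizing independent Gamma variates.
    """
    rng = random.Random(seed)
    alphas = [1 + counts[p] for p in PREPOSITIONS]
    totals = [0.0] * len(alphas)
    for _ in range(draws):
        gammas = [rng.gammavariate(a, 1.0) for a in alphas]
        norm = sum(gammas)
        for i, g in enumerate(gammas):
            totals[i] += g / norm
    return {p: t / draws for p, t in zip(PREPOSITIONS, totals)}

# Invented preposition counts for two hypothetical noun classes.
toy_classes = {
    "u_only": {"na": 200, "o": 100, "po": 100, "při": 100, "v": 500},
    "overabundant_u": {"na": 350, "o": 150, "po": 100, "při": 100, "v": 300},
}
means = {name: dirichlet_posterior_means(c) for name, c in toy_classes.items()}
```

Comparing the estimated proportions across classes, rather than across exponents alone, is what allows effects of the overabundant status of the noun to be separated from effects of the exponent.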

Discussion
At the beginning of this section we set out to establish whether mixed classes such as that of overabundant hard masculine inanimate nouns should be conceptualized as joins or meets in the inflection class system. Under the first hypothesis, the mixed class is the superclass of two non-overabundant classes, and should hence exhibit underspecified characteristics: its properties are the disjunction of properties of the non-overabundant classes. Under the second hypothesis, mixed classes share a parent with each of the non-overabundant classes. As such, they will exhibit some properties in common with each of their sister classes, but may also have properties of their own. We then presented evidence for the existence of such properties specific to mixed classes. We documented differential collocational preferences between exponents of locative singular and governing prepositions, and showed these not to be reducible to a more general preference of prepositions for one or the other exponent which would also manifest itself for non-overabundant classes: for instance, although the -u form of overabundant nouns is barely ever used with v, -u only nouns show no sign of reluctance to combine with v. This behavior provides an argument for conceptualizing mixed classes as meets. Irrespective of whether one sees the relevant collocational requirements as following from selectional requirements of the preposition, as reverse-selection of the governor by its governee (Bonami 2015), or as a non-directional phenomenon, the statement of these requirements needs to reference the class of overabundant lexemes without the relevant property being inherited by their non-overabundant counterparts. This is precisely what is allowed by seeing the mixed class as a meet node, which inherits from higher nodes but may add properties of its own, rather than a join node with the non-mixed classes as descendants. In this instance, external motivation, in the form of collocational preferences, provides crucial evidence for the proper internal organization of the inflectional system.
21 The nouns were chosen so that each noun was attested in combination with at least three distinct prepositions, and at least 100 times in combination with at least one of these.

Conclusion
As is the case for many variation phenomena in other areas of grammar, variation in inflectional behavior is largely uncharted territory for linguistic theory. Descriptive and typological efforts spearheaded by Thornton in the last decade have led to a recognition of the widespread character of the phenomenon, and to explicit proposals to accommodate the general phenomenon in formal models of inflection (Stump 2016; Bonami & Crysmann 2018). However, the traditional toolkit of formal linguistics is arguably ill-equipped to do justice to the richness of the phenomenon: a statement of variation between inflectional strategies is a correct but blunt approach to the question, which does not capture the gradient conditioning of that variation.
In this paper we attempted to improve the state of the art in this area, both by exploring in depth how alternate inflection strategies interact within a single system, and by relying on quantitative modeling to explore the fine properties of overabundance phenomena. We reached two main conclusions.
First, we established a qualitative difference between two types of overabundance. The Czech locative singular nominal declension exemplifies multiple cases where overabundance is embedded in the inflection class system, with overabundant lexemes forming distinct, mixed classes which contrast with their non-overabundant neighbors. These contrast with the situation found in the instrumental plural, where overabundance is fully orthogonal to the inflection class system: exponents come in pairs, and each lexeme is compatible with a pair of distinct exponents. Our arguments in favor of this conclusion are based on the external motivation of inflection classes. Following previous literature, we started from the assumption that inflection class assignment can be partially motivated by inflection-external (phonological, morphosyntactic, and semantic) properties of lexemes. We then showed that overabundant locative singulars exhibit external properties that are intermediate between those of their non-overabundant counterparts, while no such effect is found with overabundant instrumental plurals.
Second, we argued that mixed classes should be conceptualized as truly intermediate between two non-mixed classes, rather than unspecific or underspecified; technically, they should be seen as meet nodes rather than join nodes in the inflection class hierarchy. Our argument again rests on observations on external motivation, but of a different kind. We showed that, where overabundance provides two different locative singular forms for a hard masculine inanimate noun, the two forms exhibit different collocational preferences with governing prepositions. Crucially, these collocational preferences are distinct from those witnessed with non-overabundant nouns; hence they need to be stated as properties of the mixed class that are not shared with their non-overabundant neighbors, which contradicts the underspecification view.
We end by going back to the typology of overabundance phenomena. In this paper we examined exactly two cases of overabundance in just one language. On the basis of quantitative evidence from external motivation, we were able to provide a rather detailed account of the commonalities and differences between these two. These two cases are both informative on the overall typology, each in its own way. On the one hand, the locative singular nicely exemplifies overabundance conditioned by morphological factors, as the possibility of overabundance is linked to the structure of the inflection class system, a strictly morphological notion (Aronoff 1994). Morphological conditioning is a possibility that Thornton (2019b: 248) anticipated, but did not provide a clearcut example of. On the other hand, the instrumental plural situation highlights the fact that conditions on overabundance may associate with whole series of exponents rather than individual ones: each Czech noun has two possible forms for the instrumental plural, and the (mostly sociolinguistic) conditions are the same for all nouns, but the identity of the exponents varies depending on the inflection class. This illustrates the deeply paradigmatic nature of the phenomenon of overabundance.
That being said, we make no claim as to typological, or even language-internal generality. As a case in point, consider the fact that, in the dataset under consideration, we observed a combination of three contrasts: overabundance in the loc.sg is lexically restricted to some corners of the inflection class systems, whereas it is lexically general in the ins.pl. It is subject to grammatical conditions in the loc.sg and not in the ins.pl; conversely it is subject to usage conditions in the ins.pl and not in the loc.sg. However, we have no reason to assume that the three contrasts align in this way. Detailed examination of the Czech system already provides evidence that things are not so simple. As we briefly commented in Section 3, in the particular case of nouns derived by conversion from adjectives, we do find sociolinguistic conditioning in the loc.sg that is exactly parallel to what we documented in the ins.pl, and hence this is a situation of lexically restricted overabundance subject to usage but to no grammatical conditions.
While we make no claim as to typological generality, we submit that the set of computational methods deployed in this paper constitutes a crucial toolkit to explore the typology of overabundance, by providing operational ways of exploring the graded dimensions of this typology that are largely free of language-particular descriptive biases. We hope this paper will be a useful first step in that direction.