1 Introduction

One of the most interesting and difficult questions in research on language lies in formally characterizing the class of possible grammars. One aspect of this challenge asks whether there are constraints on grammars of a general, abstract nature, and in turn, whether these constraints are specific to language or instantiations of even broader, domain-general constraints on cognitive systems, with manifestations observable elsewhere. For example, some progress has been made in syntax on the basis of Formal Language Theory and the Chomsky Hierarchy (Chomsky 1956) for the analysis of sets of string sequences. We aim to contribute to the development of a similarly general perspective for morphology, particularly with respect to morphological features, i.e. the features that underlie the variation in how different concepts are grouped across languages as evidenced by exponence by the same form (syncretism). The architecture of feature-based morphological systems predicts that only certain patterns of variation are possible. In this paper, we address *ABA generalizations from this perspective. We show that a class of *ABA-type generalizations can be derived from the feature-based architecture in conjunction with a minimality assumption. We furthermore argue that such a derivation may be plausible for some cases of an *ABA generalization, but not for others.

The term *ABA generalization refers to morphological patterns in which, given some arrangement of the relevant forms in a structured sequence, the first and third may share some property “A” only if the middle member shares that property as well. If the middle member is distinct from the first, then the third member of the sequence must also be distinct. Bobaljik (2012) demonstrates that a *ABA generalization holds for adjectival suppletion in the sequence positive-comparative-superlative: across a large cross-linguistic sample, one finds ABB patterns such as good-better-best, where the comparative and superlative share a root be(t)- distinct from the positive, but what is not found is an ABA pattern: *good-better-goodest, in which the positive and superlative share a root, distinct from the comparative. Similar *ABA effects have been noted in extensive studies of case syncretism (Caha 2009), suppletion for both case and number in pronouns (Smith et al. 2016), Germanic verbs and participles (see Wiese 2008 on German, and class material cited by Starke 2009 on English), and in other domains.

In one way or another, almost all existing accounts of these generalizations have argued that the *ABA effect arises as a result of nesting or containment relations among features, along with the assumption that linguistic rules are arranged such that a more specific rule takes precedence over (bleeds) a more general one, the so-called Elsewhere or Pāṇinian ordering (Kiparsky 1973; 1979). For the example above, Bobaljik argues that the representation of the superlative properly contains the representation of the comparative, which in turn properly contains the basic form of the adjective, as in (1).

(1) a. Positive: [ADJECTIVE]
  b. Comparative: [[ADJECTIVE] COMPARATIVE]
  c. Superlative: [[[ADJECTIVE] COMPARATIVE] SUPERLATIVE]

If a language has a rule of suppletion such as GOOD → be(t)- / __ COMPARATIVE, that rule will block the basic root good in both the comparative and the superlative, in virtue of being the most specific rule compatible with the context. Nothing forces the comparative and superlative to share a root: Latin uses an ABC pattern (bonus-melior-optimus), with a distinct root in each of the three grades. But the containment relation in (1) ensures that the ABA pattern is underivable (except as a case of accidental homophony).1
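The blocking logic just described can be sketched in Python. This is an illustration only: the set-based encoding of the structures in (1), the labels ADJ, CMPR and SPRL, and the function names are assumptions made for this sketch, and context size is used as a stand-in for specificity, which is adequate here only because the contexts are nested.

```python
# Hypothetical encoding of the containment structures in (1):
# each grade is the set of heads it contains.
GRADES = {
    "positive": {"ADJ"},
    "comparative": {"ADJ", "CMPR"},
    "superlative": {"ADJ", "CMPR", "SPRL"},
}

def choose_root(rules, structure):
    """Pick the root of the most specific rule whose context fits the structure."""
    applicable = [(context, root) for context, root in rules if context <= structure]
    # Larger context = more specific (assumption: contexts are nested).
    context, root = max(applicable, key=lambda rule: len(rule[0]))
    return root

def paradigm(rules):
    """Roots for the positive, comparative and superlative, in that order."""
    order = ("positive", "comparative", "superlative")
    return [choose_root(rules, GRADES[grade]) for grade in order]

# Each rule pairs a context with a root; the empty context is the default.
ENGLISH = [(frozenset(), "good"), (frozenset({"CMPR"}), "be(t)-")]
# Full Latin forms are used as stand-ins for the three distinct roots.
LATIN = [
    (frozenset(), "bonus"),
    (frozenset({"CMPR"}), "melior"),
    (frozenset({"CMPR", "SPRL"}), "optimus"),
]

print(paradigm(ENGLISH))  # ['good', 'be(t)-', 'be(t)-']   (ABB)
print(paradigm(LATIN))    # ['bonus', 'melior', 'optimus'] (ABC)
```

Because every structure that contains SPRL also contains CMPR, any rule conditioned on the comparative also applies in the superlative unless a still more specific rule overrides it, so no rule set of this form returns the default root in the superlative alone.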

In this paper, we discuss some results of an ongoing project studying the combinatorial properties of rule systems that describe syncretism in morphological paradigms. Although that project did not set out to examine *ABA patterns per se, it turns out that *ABA-like restrictions emerge as a quite general prediction from the assumption that Universal Grammar selects the minimal feature inventories needed to generate a paradigm of a given size. We call this the assumption of Minimality. We present both a general and a narrow version of this restriction. The narrow, more specific prediction arises if we assume that feature intersection is permitted in the formulation of rules of exponence. We believe this narrow result is particularly interesting, since the *ABA restriction emerges without the containment/nesting hypothesis that characterizes other accounts. Intuitively, *ABA emerges when a three-element sequence is the product of two overlapping features and their intersection: in the sequence (“paradigm”) x,y,z, if x and y share a feature, and y and z share a feature, but x and z do not share a feature, then even without a total containment relation among the features, it follows that the patterns ABC, ABB, and AAB are generable, but ABA is excluded. Most of the paper is devoted to showing that this state of affairs is not only formally possible, but is in fact forced in some contexts by plausible minimal assumptions about feature logics. While this approach seems implausible for some *ABA patterns (we think there are good reasons independent of suppletion to assume that superlatives contain comparatives), we wish to bring this to the table as a possible alternative in other instances. The broader result is that the assumption of minimality, with or without feature intersection, and with or without a limitation to Pāṇinian ordering, yields a class of restrictions on the distribution of paradigm types, of which *ABA is a special case.

Although we have identified the *ABA result as an important point of contact with other current theoretical morphosyntax work, a significant portion of this paper will be devoted to presentation of a framework where classes of morphological models can be formally discussed, and where the effects of individual assumptions can be explicitly computed, for example, in terms of their restrictiveness. Alongside the *ABA result, we also discuss the effect of imposing Pāṇinian ordering on feature models, and show that its effects are comparatively weak in certain classes of models.2

2 Paradigms: Partitions, Features, and Sequences

2.1 Paradigms

Paradigms represent information about the pairing of grammatical properties and linguistic forms. Thus, a paradigm can be seen as a list of cells representing an inventory of linguistic forms ⟨x,y,z,…⟩, each paired with a unique property or combination of properties or features (Stump 2016). As already mentioned in the introduction above, we explore an approach where the order in which the cells are presented plays no role in the morphology. That is, our proposal applies to any set of cells, regardless of whether a one-dimensional linear order of the cells is assumed or some multidimensional arrangement of the cells. For presentational purposes, we assume there is a conventional linear order for a given inventory of cells, and thus for the corresponding paradigms (lists of forms). As a simple illustration, partial case paradigms for selected German pronouns can be given as in (3), where the cases are presented in the order in (2):

(2) NOMINATIVE, ACCUSATIVE, DATIVE,…
(3) a. 1SG: ⟨ich, mich, mir⟩
  b. 1PL: ⟨wir, uns, uns⟩
  c. 3PL: ⟨sie, sie, ihr⟩

The German pronouns further illustrate a property that is central to the study of paradigms, namely, syncretism, the many-to-one mapping from features or properties (like case) to exponents (phonological forms) seen in (3b)–(3c). Where the first person singular pronoun is characterized by a three-way contrast, the first and third person plural forms each show only a two-way distinction in form, corresponding to a three-way distinction in grammatical properties. Describing these patterns as syncretic constitutes a claim that the identity of form is represented grammatically, and is thus distinct from accidental homophony (see Harbour 2008; Sauerland & Bobaljik 2013). In the notation of the previous section, a syncretic pattern ABB, as in (3b), has only two listed forms (A=wir, B=uns), and the grammar codes the fact that the B form occupies the second and third cells of the paradigm. Accidental homophony is the state of affairs where the grammar lists three forms (ABC), but two of those forms simply happen to have the same phonology. Our investigation concerns only syncretism, although we recognize that it is in practice a notorious analytical challenge to identify a sharp dividing line for the analysis of specific data points.3

It is widely held that syncretism is central to investigating the nature and inventory of features as in (2). That the accusative and dative forms of the 1PL pronoun are syncretic suggests that those two cells share a feature, a property not represented in (2).4 In this manner, comparing the range of attested and unattested syncretisms over some sufficiently large sample may reveal the underlying inventory and organization of the relevant features.

When we abstract away from the details of particular forms, the study of syncretic patterns is at its core the study of partitions of a set, a well-defined mathematical concept. The number of distinct partitions for an n-celled paradigm is the Bell number: Bn. For a three-celled paradigm, the B3=5 distinct partitions are listed in (4a). The same information is displayed graphically in (4b) to emphasize that the absolute values (A, B, etc.) are not relevant; all that matters is sameness or difference of cell contents:

(4) a. AAA, AAB, ABA, ABB, ABC
  b. [Graphical display of the five partitions in (4a); figure not reproduced here]

The Bell number grows very fast. There are B8 = 4,140 logically possible partitions of an 8-cell paradigm, and a 10-celled paradigm space (arguably, the number of cases in Russian, see Corbett 2008) has 115,975 possible partitions. In medium- to large-scale studies of syncretism (Cysouw 2003; Bobaljik 2012; Baerman et al. 2005; Harbour 2016), it is commonly observed that only a subset, often only a very small subset, of the theoretically distinct partitions is attested. For example, Cysouw (2003; 2011) considers a sample of person paradigms drawn from more than 250 languages, characterized as an 8-cell paradigm space,5 but finds only some 60 distinct partitions from among the more than 4,000 logical possibilities. The *ABA generalizations, mentioned above, make the same point: over some sizeable range of data, where five patterns are possible, only four are actually found in the world’s languages: AAA, AAB, ABB, and ABC, but not ABA. Typically, studies of syncretism seek explanations for such typological patterns, i.e. they develop theories that predict only a subset of partitions to be possible. In what follows, we address exactly this problem but one level of generality higher: we investigate how general assumptions about morphological analysis restrict which subsets of partitions can arise as typological predictions. For example, we show that a restriction to the partition set {AAA, ABB, ABC} cannot be derived solely within our general assumptions, while the *ABA condition can be derived. In this way, we hope to make new contributions to the formal investigation of the restrictiveness of competing models.
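The figures cited above can be checked with a short Python sketch. It is an illustration only: the function names bell and patterns are chosen here, and the Bell triangle is one standard way among several to compute Bell numbers.

```python
def bell(n):
    """Bell number B_n, computed with the Bell triangle."""
    row = [1]
    for _ in range(n):
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the entry above-left to its left neighbour.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]

def patterns(n):
    """All partitions of an n-celled paradigm, written as strings like 'ABA'."""
    results = [""]
    for _ in range(n):
        extended = []
        for prefix in results:
            used = len(set(prefix))
            # The next cell either reuses an existing form or introduces a new one.
            for k in range(used + 1):
                extended.append(prefix + chr(ord("A") + k))
        results = extended
    return results

print(bell(3), bell(8), bell(10))  # 5 4140 115975
print(patterns(3))                 # ['AAA', 'AAB', 'ABA', 'ABB', 'ABC']
```

Writing each partition with letters assigned in order of first appearance guarantees that two paradigms receive the same string exactly when they show the same pattern of sameness and difference across cells.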

2.2 Features

In order to make headway on these issues, we propose to start by presenting a largely theory-neutral means for representing features. Our notation allows us to express any of numerous competing assumptions about features and feature-logics, allowing us to then compare them directly. We start by recognizing that at its most basic, a feature is a name for individual cells or sets of cells in a paradigm. With reference to an n-celled paradigm, we write a feature as f indexed with a binary vector, where 1 indicates the cell or cells that feature names. Thus, one way of naming features that generate a 3-celled paradigm is as in (5), with a unique feature naming each cell.

(5) a. f100
  b. f010
  c. f001

We define a model of a given paradigm as having two components: an inventory of features, and rules of exponence, which relate features to form. Alongside the simple feature inventory in (5), we may state the rules of exponence in (6):

(6) a. f100 ↦ A
  b. f010 ↦ B
  c. f001 ↦ C

Each model is a grammar (fragment), generating one paradigm. In the trivial example just considered, the model comprising (5) and (6) generates an ABC partition: a three-celled paradigm that is maximally differentiated, i.e., in which each cell has a distinct form. Any number of examples of such an approach can be found in morphological descriptions. The description of the German 1SG pronouns in (3a) could be expressed in these terms. Three unanalyzable case features are assumed: f100 = “nominative”, f010 = “accusative”, etc., and each one is associated with exactly one exponent, yielding the maximally differentiated paradigm. Analyses of personal pronouns that use three unanalyzed features like “first person” (f100), “second person” (f010), and “third person” (f001) also instantiate this schema.6 Any maximally differentiated paradigm can be expressed in these terms. In fact, from the inventory in (5), the maximally differentiated partition ABC is the only complete partition that may be generated. (By complete, we mean that a phonological form is assigned to every cell of the paradigm space.) Using only rules of exponence of the format in (6), only maximal differentiation is possible, because the cells share no features.

But as we have already seen, maximal differentiation is by no means the only way in which an n-celled paradigm space may be partitioned. Thus a feature inventory that is restricted to generating the maximally differentiated partition provides no purchase for an account of syncretic patterns, such as ABB, AAB and the like, other than via accidental homophony.7

Characterizing syncretic partitions such as the ABB pattern seen in (3b) thus requires features that name (contain) more than one cell of the paradigm such as f011. Consider, from this perspective, the inventory in (7), which represents the standard approach to *ABA generalizations in the notation used here:

(7) a. f001
  b. f011
  c. f111

This encodes the same relationship among paradigm cells as (1). One feature is shared by all three cells ((7c)—this constitutes the default), one by two, and one is unique to a single element. On the assumption that the feature inventory in (7) remains constant across languages, but that the rules of exponence may vary from language to language or even from lexeme to lexeme, a variety of different paradigms (partitions) may be generated from this single, shared inventory of features. Rules of exponence for two models sharing the inventory in (7) are given in (8) and (9).

(8) a. f001 ↦ C
  b. f011 ↦ B
  c. f111 ↦ A
(9) a. f011 ↦ B
  b. f111 ↦ A

As the reader may verify, the model in (7) + (8) derives a maximally differentiated, ABC paradigm, one in which each cell is distinct from the others. The model consisting of (7) + (9) derives an ABB paradigm, with syncretism of the last two cells. In these models, rules of exponence are ordered sequentially (read by convention from top to bottom), and disjunctively – the first rule of exponence specified for any given cell must apply to that cell, and once one rule has applied, no other rule may apply. In both (8) and (9), the final exponent (A) is the default – in principle it is compatible with all three cells – but it does not appear in any but the first cell because the rule introducing the default is ‘blocked’ by the application of the more specific rules.

Returning to our German pronoun example, the feature inventory in (7) (unlike the one in (5)) could then support the description of each of the pronouns in (3); the 1SG pronoun using rules of exponence corresponding to (8) and the 1PL to (9). We may represent this outcome more compactly, by listing the features realized by rules of exponence in the feature-based morphological analysis of a given data set as a sequence of features (ordered left-to-right, rather than top-to-bottom for compactness). (10a) represents the ordered rules in (8) as a sequence and (10b) that in (9) ((10c) derives (3c) from the same feature inventory). The string to the right of each sequence characterizes the partition defined by that sequence.8

(10) a. ⟨f001, f011, f111⟩: ABC
  b. ⟨f011, f111⟩: ABB
  c. ⟨f001, f111⟩: AAB

The presentation in (10) expresses the fact that several different partitions are derivable from the common feature inventory in (7), by invoking different sequences of features in the rules of exponence.
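The ordered, disjunctive application of rules of exponence described above can be sketched in a few lines of Python. This is an illustration only, not the supplementary code mentioned in the paper: writing a feature such as f011 as the bit string "011" and the function name partition_of are assumptions made for this sketch.

```python
def partition_of(sequence):
    """Apply rules in order; each cell is exponed by the first feature naming it.

    `sequence` is a list of features written as bit strings, e.g. "011" for f011.
    Cells that no rule reaches are shown as "_".
    """
    n = len(sequence[0])
    owner = [None] * n  # index of the rule that expones each cell
    for rule_index, feature in enumerate(sequence):
        for cell, bit in enumerate(feature):
            if bit == "1" and owner[cell] is None:
                owner[cell] = rule_index
    # Name the exponents A, B, C, ... in order of first appearance across cells,
    # so that only sameness and difference of cells is recorded.
    names = {}
    pattern = ""
    for rule_index in owner:
        if rule_index is None:
            pattern += "_"
        else:
            names.setdefault(rule_index, chr(ord("A") + len(names)))
            pattern += names[rule_index]
    return pattern

print(partition_of(["001", "011", "111"]))  # ABC, as in (10a)
print(partition_of(["011", "111"]))         # ABB, as in (10b)
print(partition_of(["001", "111"]))         # AAB, as in (10c)
```

Note that the letters in the output label cells from left to right, so they need not match the exponent letters used in individual rules such as (8); only the partition itself is compared.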

2.3 Partition Sets

This now gives us the tools we need to introduce the main object of inquiry, namely partition sets:

(11) For any feature inventory I, the Partition Set of I, PS(I), is the set of all partitions that may be generated from I.

As we have seen, the partition set of (5) is trivial: PS(5)={ABC}. That is, the inventory in (5) will generate all and only maximally differentiated partitions. The partition set of (7) is more interesting, and (10) represents only a subset. To see this, consider all the sequences generable from (7). Since there are 3 features, there are 3! = 6 (total) sequences to consider, as in (12) (we explain the use of blue font presently):

(12) a. ⟨f001, f011, f111⟩: ABC
  b. ⟨f001, f111, f011⟩: AAB
  c. ⟨f011, f001, f111⟩: ABB
  d. ⟨f011, f111, f001⟩: ABB
  e. ⟨f111, f001, f011⟩: AAA
  f. ⟨f111, f011, f001⟩: AAA

Collecting the derivable partitions, we find that PS(7)={AAA,ABB,AAB,ABC}. Notably, of the B3 = 5 possible partitions of the three-celled space, one is missing: ABA is not included in the partition set of (7).
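The enumeration in (12) can be reproduced mechanically. The following Python sketch is illustrative only; the bit-string encoding of features and the function names partition_of and partition_set are assumptions made here. It considers total orderings of the inventory only.

```python
from itertools import permutations

def partition_of(sequence):
    """Partition generated by an ordered sequence of features (bit strings)."""
    n = len(sequence[0])
    # For each cell, find the first feature in the sequence that names it.
    owner = [next((f for f in sequence if f[c] == "1"), None) for c in range(n)]
    names = {}
    return "".join(
        "_" if f is None else names.setdefault(f, chr(ord("A") + len(names)))
        for f in owner
    )

def partition_set(inventory):
    """All partitions derivable from total orderings of the inventory."""
    return {partition_of(seq) for seq in permutations(inventory)}

print(sorted(partition_set(["100", "010", "001"])))  # ['ABC']
print(sorted(partition_set(["001", "011", "111"])))  # ['AAA', 'AAB', 'ABB', 'ABC']
```

The second result contains four of the five partitions of a three-celled space; ABA does not appear.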

A different feature inventory may yield a different partition set. For example, adding the default feature f111 to the inventory in (5) renders the inventory unrestrictive: any logically possible partition may be derived. The following partial list of sequences, from among the 4!=24 possibilities, demonstrates this point:

(13) a. ⟨f001, f010, f100, f111⟩: ABC
  b. ⟨f001, f111, f010, f100⟩: AAB
  c. ⟨f010, f111, f001, f100⟩: ABA
  d. ⟨f100, f111, f010, f001⟩: ABB
  e. ⟨f111, f001, f010, f100⟩: AAA
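A witness sequence for each derivable partition can also be found by brute-force search, as in the following Python sketch. The encoding of features as bit strings and the helper names are assumptions made for illustration; the particular witnesses found depend on enumeration order and need not coincide with those listed in (13).

```python
from itertools import permutations

def partition_of(sequence):
    """Partition generated by an ordered sequence of features (bit strings)."""
    n = len(sequence[0])
    owner = [next((f for f in sequence if f[c] == "1"), None) for c in range(n)]
    names = {}
    return "".join(
        "_" if f is None else names.setdefault(f, chr(ord("A") + len(names)))
        for f in owner
    )

def witnesses(inventory):
    """Map each derivable partition to the first total ordering that derives it."""
    found = {}
    for seq in permutations(inventory):
        found.setdefault(partition_of(seq), seq)
    return found

# Inventory (5) plus the default feature f111.
result = witnesses(["100", "010", "001", "111"])
for pattern in sorted(result):
    print(pattern, result[pattern])
print(len(result))  # 5: every partition of the three-celled space is derivable
```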

In this way, we see the general logic that relates typological generalizations to conclusions about features in Universal Grammar. The data we have are the attested partition sets in some domain—the range of partitions that are (un)attested cross-linguistically. The explanans is then the feature inventory: Domains in which the *ABA generalization holds are domains in which one logically possible partition is not attested. The gap is explained if Universal Grammar admits only the feature inventory in (7)—as (12) shows, the unattested partition is not in the partition set of this inventory.

Our aim in this article is to attempt to approach the issues here from the other direction. In the following, we use the notation introduced here to explore the consequences of various kinds of formal restrictions one could conceivably apply to models of this sort (inventories of features and associated rules of exponence). We do so in the first instance entirely in the abstract, with no connection to substantive features or empirical data. Our goal is to better understand some of the formal properties of feature logics, and to compare the ways in which various intuitively plausible assumptions do and do not restrict the combinatorics.

That is, rather than starting with some observed partition sets and attempting to infer the feature inventory (a task for which there are often multiple solutions), we investigate here ways in which general constraints on feature algebras do (or do not) restrict the hypothesis space. In other words, rather than arguing that something like (7) is a plausible feature inventory in some domain because it derives the observed facts, we will look instead for general reasons which may favour an inventory like (7) over other inventories on a priori grounds.

One reason to pursue this exercise is that, as our notation calls attention to, absent any prior assumptions about the content of features, the number of possible features that can be defined grows quickly. For a paradigm of n cells, there are 2^n − 1 non-empty features that may be defined. For a three-cell paradigm, the 7 definable features are these:

(14) f100
  f010
  f001
  f110
  f101
  f011
  f111

If features could be freely chosen to form inventories, then 128 distinct feature inventories could in principle be constructed from these features (including the empty set). For a four cell paradigm, there are correspondingly 15 features and 32,768 possible inventories to consider. As we have seen above, from each inventory, a number of distinct models can be constructed. That is, each inventory can be mapped to one or more sequences, thus yielding a variety of partition sets. If rule ordering is unconstrained, then from a single inventory with n features, there are n! distinct sequences that may be so constructed.9 The number of possible models (and thus grammars) thus quickly becomes astronomical, and we suggest it is therefore important to ask whether there may be some universal constraints that drastically restrict the classes of possible models to be considered. Thus, we will spend a fair part of the following discussing the combinatorics involved. We will approach this as follows: Using the understanding of features, paradigms, and models outlined above, we will set out to explore in quantitative terms various conditions that may be imposed, and show explicitly how they do and do not restrict the space of possibilities. Many of the numerical results are non-obvious, and we provide the code in on-line supplemental materials for this paper.
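The counts in this paragraph follow from elementary combinatorics and can be verified with a brief Python sketch (illustrative only; the function name all_features and the bit-string encoding are assumptions made here).

```python
from itertools import product
from math import factorial

def all_features(n):
    """All non-empty features over n cells, written as bit strings."""
    return ["".join(bits) for bits in product("10", repeat=n) if "1" in bits]

for n in (3, 4):
    k = len(all_features(n))  # 2^n - 1 features
    print(n, k, 2 ** k)       # inventories = subsets of the feature set
# prints: 3 7 128
#         4 15 32768

# Number of total orderings of an inventory with 3 or 4 features.
print(factorial(3), factorial(4))  # 6 24
```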

Before proceeding further, by way of a brief housekeeping remark, we note that some of the features in some sequences are redundant. The redundant features in (12) are indicated in blue. Because sequences are ordered, once each cell has been assigned an exponent, all further features will have no effect in characterizing the partition. That is, a feature is redundant in a sequence if it is bled by earlier rules of exponence, and thus in principle cannot be exponed. A sequence from which a redundant feature has been eliminated is indistinguishable from the sequence with that feature (compare the notion of inessential feature in Kracht 1997; Pullum & Tiede 2010). The sequence (12d), for example, is formally indistinguishable from the partial sequence in (10b), since the first two rules are sufficient to cover all of the cells.10 If no feature in a sequence is redundant, we call it redundancy-free.
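Redundancy in this sense is easy to detect mechanically. The Python sketch below is an illustration under the assumption that features are written as bit strings; the function names are chosen here. A feature counts as redundant when every cell it names has already been claimed by an earlier feature in the sequence.

```python
def cells(feature):
    """The set of cell positions named by a feature such as '011'."""
    return {i for i, bit in enumerate(feature) if bit == "1"}

def redundant_features(sequence):
    """Features that are fully bled by earlier features in the sequence."""
    covered = set()
    redundant = []
    for feature in sequence:
        if cells(feature) <= covered:
            redundant.append(feature)
        covered |= cells(feature)
    return redundant

def strip_redundant(sequence):
    """The same sequence with its redundant features removed."""
    bled = set(redundant_features(sequence))
    return [f for f in sequence if f not in bled]

print(redundant_features(["011", "111", "001"]))  # ['001'], sequence (12d)
print(redundant_features(["111", "001", "011"]))  # ['001', '011'], sequence (12e)
print(strip_redundant(["011", "111", "001"]))     # ['011', '111'], i.e. (10b)
```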

The careful reader may have noticed that in presenting the range of sequences generable from one inventory in (12) we gave only sequences that represent total orderings among the features of the inventory. However, in our exposition above, we also included sequences that contained only a subset of the features (as in (10)). It turns out that consideration of the total sequences is sufficient for calculating the range of partitions generated by an inventory, under the assumption of completeness, which we may define as follows:

(15) A sequence S is complete with respect to a paradigm P iff S generates a form (possibly zero) for every cell in P.

The partial sequences in (10b) and (10c) are complete, but the partial sequences in (16) are not (each leaves one cell without an exponent):

(16) a. ⟨f100, f010⟩: AB__
  b. ⟨f001, f011⟩: __AB

In general, as a simplification in what follows we will consider only non-redundant complete sequences, since this class is sufficient to exhaustively characterize the partition set of any inventory.11
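For a given inventory, the claim that redundancy-free complete sequences suffice can be checked by comparing them against all total orderings. The following Python sketch does this for inventory (7); it is an illustration only, with the bit-string encoding and all function names assumed here.

```python
from itertools import combinations, permutations

def cells(feature):
    return {i for i, bit in enumerate(feature) if bit == "1"}

def partition_of(sequence):
    n = len(sequence[0])
    owner = [next((f for f in sequence if f[c] == "1"), None) for c in range(n)]
    names = {}
    return "".join(
        "_" if f is None else names.setdefault(f, chr(ord("A") + len(names)))
        for f in owner
    )

def is_complete(sequence, n):
    """Definition (15): every cell receives a form."""
    return set().union(*(cells(f) for f in sequence)) == set(range(n))

def is_redundancy_free(sequence):
    covered = set()
    for feature in sequence:
        if cells(feature) <= covered:
            return False
        covered |= cells(feature)
    return True

def reduced_sequences(inventory):
    """All complete, redundancy-free sequences over subsets of the inventory."""
    n = len(inventory[0])
    for size in range(1, len(inventory) + 1):
        for subset in combinations(inventory, size):
            for seq in permutations(subset):
                if is_complete(seq, n) and is_redundancy_free(seq):
                    yield seq

inventory = ["001", "011", "111"]
reduced = {partition_of(s) for s in reduced_sequences(inventory)}
total = {partition_of(s) for s in permutations(inventory)}
print(sorted(reduced))   # ['AAA', 'AAB', 'ABB', 'ABC']
print(reduced == total)  # True
```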

2.4 Restricting the hypothesis space: the minimal valid inventory

Here, we define briefly the two conditions, and note a third assumption, that will be central in the investigation that follows. We suggest these are a priori plausible conditions to restrict the class of possible inventories, and we will work through their consequences in detail in sections 4 and 5 and appendix A below.

The first, basic condition is that an inventory be valid.

(17) An inventory I is valid for a paradigm P iff there exists a model M including I that generates the maximally differentiated partition of P.

The maximally differentiated partition of a paradigm (space) is the partition in which each cell is distinct from every other cell. In other words, for a three-cell paradigm, a valid inventory is one for which there is some set of rules that will derive ABC. Note that Validity is a property of inventories, not models (grammars). The rules of exponence in (8) demonstrate that (7) is a valid inventory, but we do not require that every grammar (model) generate ABC; syncretism by definition precludes there being such a requirement for every model.12 The model consisting of (7) + (9) is perfectly well formed (and examples apparently conforming to such ABB patterns are widely instantiated).

The second, and more interesting condition on inventories is Minimality:

(18) An inventory I constitutes a Minimal Valid Feature Inventory for some paradigm P iff
  a. I is valid for P, and
  b. there is no subset I′ of I s.t. I′ is also a valid inventory for P and I′ has fewer features than I
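Definitions (17) and (18) can be stated as executable checks. The sketch below is a simplified illustration: it assumes that rules of exponence refer only to features listed in the inventory (that is, it does not yet allow feature intersection), and the bit-string encoding and function names are assumptions made here.

```python
from itertools import combinations, permutations

def partition_of(sequence):
    n = len(sequence[0])
    owner = [next((f for f in sequence if f[c] == "1"), None) for c in range(n)]
    names = {}
    return "".join(
        "_" if f is None else names.setdefault(f, chr(ord("A") + len(names)))
        for f in owner
    )

def is_valid(inventory):
    """(17): some ordering of the features derives the maximally differentiated partition."""
    n = len(inventory[0])
    maximal = "".join(chr(ord("A") + i) for i in range(n))
    return any(partition_of(seq) == maximal for seq in permutations(inventory))

def is_minimal_valid(inventory):
    """(18): valid, and no proper subset of the inventory is itself valid."""
    if not is_valid(inventory):
        return False
    for size in range(1, len(inventory)):
        for subset in combinations(inventory, size):
            if is_valid(list(subset)):
                return False
    return True

print(is_minimal_valid(["100", "010", "001"]))         # True: inventory (5)
print(is_minimal_valid(["001", "011", "111"]))         # True: inventory (7)
print(is_minimal_valid(["100", "010", "001", "111"]))  # False: (5) is a valid subset
```

Checking total orderings is enough for validity here, because any features left over after all cells are covered can be appended to a sequence without changing the partition it generates.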

In Section 4 and Appendix A, we will work through these conditions for various sizes of paradigms, starting with 2-celled and then 3-celled paradigms. One finding in this paper is that the two simple conditions on inventories just noted, Validity and Minimality (that inventories use the minimal number of features needed to describe a paradigm space), have the curious effect that in certain paradigm spaces, notably those with three cells, certain patterns of syncretism become unstatable. In a sense to be made clear below, ABA patterns of a certain type are indescribable. More accurately, no minimal valid inventory yields a partition set that includes all three bifurcations of the three-celled space: {AAB,ABA,ABB}. If two are included, the third is not. Since we have treated order in a paradigm as arbitrary, all of the results we describe hold only up to permutation of the cells. This result is of interest, because it arises without the nesting/containment assumption that plays a central role in other treatments of *ABA generalizations (Bobaljik 2012; Caha 2009; Starke 2009). Another result is a curious pattern in the nature of the restrictiveness that these assumptions create.

Lastly, at least to start, we will assume following standard practice in morphology that intersection of features is available. If fa and fb are in an inventory, then fa ∩ fb ↦ x is a well-formed rule of exponence. (I.e., in more standard notation, if [F] ↦ A and [G] ↦ B are well-formed rules of exponence, then so is [F,G] ↦ C.) Most feature-based morphological analyses invoke this (for example, if [FEMININE] and [PLURAL] are features in the inventory, then there can be an unanalyzable exponent of [FEMININE,PLURAL], without needing a separate feature [FEMPL]). We assume that intersection is the only Boolean operation on features that is available (but see below for further discussion).

Adding intersection is not innocuous. Because of the way we have defined features and inventories, intersection interacts with Minimality in a non-trivial fashion. Intersection, like rule ordering, allows for exponents that do not directly correspond to features that are in the inventory. If an inventory consists only of f110 and f101, a rule can be stated referring to their intersection: f110 ∩ f101 = f100. While this generates an exponent that only expresses the first cell, it does so without the feature f100 being contained in the inventory. This will play an important role in the discussion of 3-cell paradigms. Of course, it is worth considering the consequences of minimality and validity without the additional assumption that intersection is available. We do so in section 5.4 below. Note that the core result holds either way, but the inclusion of intersection is a widespread assumption in morphology, so we consider that scenario first.13
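The effect of intersection on a two-feature inventory can be explored with the following Python sketch. It is an illustration under explicit assumptions: features are bit strings, a rule may refer to any non-empty intersection of listed features, such derived features are ordered freely alongside the listed ones, and all function names are chosen here.

```python
from itertools import permutations

def partition_of(sequence):
    n = len(sequence[0])
    owner = [next((f for f in sequence if f[c] == "1"), None) for c in range(n)]
    names = {}
    return "".join(
        "_" if f is None else names.setdefault(f, chr(ord("A") + len(names)))
        for f in owner
    )

def intersect(a, b):
    """Cell-wise intersection of two features."""
    return "".join("1" if x == y == "1" else "0" for x, y in zip(a, b))

def close_under_intersection(inventory):
    """The inventory plus every non-empty feature obtainable by intersection."""
    closed = set(inventory)
    changed = True
    while changed:
        changed = False
        for a in list(closed):
            for b in list(closed):
                c = intersect(a, b)
                if "1" in c and c not in closed:
                    closed.add(c)
                    changed = True
    return closed

def partition_set_with_intersection(inventory):
    closed = sorted(close_under_intersection(inventory))
    return {partition_of(seq) for seq in permutations(closed)}

print(intersect("110", "101"))                                   # 100
print(sorted(partition_set_with_intersection(["110", "101"])))  # ['AAB', 'ABA', 'ABC']
print(sorted(partition_set_with_intersection(["110", "011"])))  # ['AAB', 'ABB', 'ABC']
```

Under these assumptions each two-feature inventory derives ABC together with exactly two of the three bifurcations; which one is excluded depends on which pair of cells shares no feature.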

Before proceeding to the discussion of our main results, we will present a number of additional concepts and assumptions which we hope will allow for more familiarity with the general notation. In particular, we offer some remarks on how various current ideas can be rendered in our notation, allowing for commensurability among analyses or frameworks. The reader interested primarily in the consequences of the assumptions just made can skip ahead to section 4.

3 Additional considerations

Our formalism allows us to selectively add or subtract conditions in order to examine the consequences of any particular set of assumptions. In principle, we can translate sets of assumptions from other feature logics into our notation, and can thus accurately investigate the algebra of different combinations. The following subsections illustrate some well-discussed conditions in the field, showing how they can be expressed and evaluated in our terms.

3.1 Extrinsic Order and Pāṇinian Sequences

Above we have noted that sequences (i.e., rules of exponence) must in some cases be ordered. The sequences in (19) contain the same features, but the difference in order alone yields different partitions:

(19) a. ⟨f110, f011⟩: AAB
  b. ⟨f011, f110⟩: ABB

As is well known from early discussions of rule systems, rule ordering may be extrinsic (a stipulated language-particular order, as in (19)) or intrinsic, i.e., such that more specific rules automatically bleed more general rules. A specific formulation of the intrinsic Pāṇinian ordering principle or Elsewhere Condition is as in (20) (after Kiparsky 1973):

(20) If two (incompatible) rules R1, R2 may apply to a given structure, and the context for application of R1 is a (proper) subset of the context for that of R2, then R1 applies and R2 does not.

We translate the operative notion into our set up as in (21), which picks out the class of sequences for which any reordering that does not introduce redundancy has no effect on the partition set. That is, a Pāṇinian sequence is not necessarily a total order, but all order with any consequence is determined by (20).

(21) A (redundancy-free) sequence S is a Pāṇinian sequence if and only if any redundancy-free permutation of S yields the same partition as S.

The sequences in (19) do not satisfy (21). Since (19a) and (19b) are permutations of one another and yield different partitions, neither of them is Pāṇinian. Other than (12a), the sequences in (12) likewise cannot be Pāṇinian sequences, since they are not redundancy-free. But the redundancy-free sequence in (22a) is Pāṇinian because it and its only redundancy-free permutation in (22b) yield the same partition: ABC.

(22) a. ⟨f010, f011, f110⟩: ABC
  b. ⟨f010, f110, f011⟩: ABC
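Definition (21) lends itself to a direct check by enumeration. The Python sketch below is illustrative only; the bit-string encoding and the function names are assumptions made here, and sequences that are not redundancy-free are simply reported as non-Pāṇinian.

```python
from itertools import permutations

def cells(feature):
    return {i for i, bit in enumerate(feature) if bit == "1"}

def partition_of(sequence):
    n = len(sequence[0])
    owner = [next((f for f in sequence if f[c] == "1"), None) for c in range(n)]
    names = {}
    return "".join(
        "_" if f is None else names.setdefault(f, chr(ord("A") + len(names)))
        for f in owner
    )

def is_redundancy_free(sequence):
    covered = set()
    for feature in sequence:
        if cells(feature) <= covered:
            return False
        covered |= cells(feature)
    return True

def is_paninian(sequence):
    """(21): every redundancy-free reordering yields the same partition."""
    if not is_redundancy_free(sequence):
        return False
    target = partition_of(sequence)
    return all(
        partition_of(p) == target
        for p in permutations(sequence)
        if is_redundancy_free(p)
    )

print(is_paninian(["110", "011"]))         # False: (19a) vs (19b)
print(is_paninian(["010", "011", "110"]))  # True: (22a)
print(is_paninian(["001", "011", "111"]))  # True: (12a)
```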

This raises two points. First, Pāṇinian ordering is the kind of general condition one could entertain as a restriction on rule systems. As the comparison of (19) and (22) shows, imposing a condition that sequences must be Pāṇinian may reduce the partition set for some inventory I by excising all partitions that are derived only by non-Pāṇinian sequences. Rather than build this assumption in, we see our goal as investigating the effects of assumptions, since we have a notational apparatus that allows us to directly compare systems with and without such an assumption. As it happens (we will present this in more detail below), imposing Pāṇinian ordering will have a drastic effect on minimal feature inventories that are closed under intersection, essentially preventing analysis of syncretism in the three-cell cases.

Now consider again the observation that the sequences in (19) are not Pāṇinian, but those in (22) are. The features invoked in these sequences are not unrelated. The features f110 and f011 are common to all of these sequences, and the relation between them is that they have partial overlap, but neither is contained in the other. Generally, such a relationship between two features fa and fb is what makes a sequence non-Pāṇinian, unless there is another feature or other features fc1, …, fcn that cover the intersection of fa and fb and are contained within both fa and fb. One easy case is that there is only one other feature fc (i.e. n = 1), namely the intersection or conjunction of the two features fa and fb. This is what we see in (22): adding the feature that corresponds to the intersection of the two features in (19) renders the sequences Pāṇinian. Generally, if a redundancy-free sequence is closed under intersection, then it is Pāṇinian. This provides another reason for us to include intersection: allowing intersection makes it easier to compare extrinsically ordered and Pāṇinian analyses. It does, though, raise the question of whether any other Boolean operations on features should be countenanced.

3.2 Feature Algebra: Privativity, Defaults, Containment, and Intersection

While all Boolean operations are generally assumed to be available in semantics, most work in morphology assumes that the feature algebra is restricted and that e.g. the union operation is not available.14 As noted, we adopt the assumption that intersection, but no other algebraic operation, is part of morphology (we do, however, consider systems without this assumption in section 5.4). We briefly mention some alternatives in the following subsections.

3.2.1 Binarity and Dimensions

Consider the example of binary features. Our features are, by definition, privative, rather than binary, in the sense that these terms are understood in the morphological and phonological literature. Binary features, of the sort typically written [±F] are, in our terms, names for pairs of features: one feature that names a set of cells, and another feature that names the complement set. In our terms, feature binarity could be expressed by holding that if f1100 is a feature in some inventory, then f0011 is also a feature in that inventory, etc. Assuming binary features is tantamount to assuming privative features along with a negation operation that is restricted to atomic (non-derived) features.15 We accord no special status to pairs of features in this way: an inventory containing f0011 may or may not contain the complement as a second feature (see also Pullum & Tiede 2010). In at least some cases, including two- and three-cell paradigms, imposing binarity complicates the analysis (cf. Corbett 2010).

Feature binarity is connected to the notion of dimensions in paradigms, raised by a reviewer. The type of representation entertained here readily accommodates multi-dimensional syncretism, a prima facie challenge for theoretical approaches, such as Nanosyntax, which adopt a universal total (containment) ordering among features (see Caha & Pantcheva 2012 for ideas on how to extend the Nanosyntax model to accommodate this). We have thus far represented paradigms as one-dimensional lists, as in (23), although one often finds four-celled paradigms presented as a 2 × 2 matrix, encoded as two binary features, as in (24).

(23) <A,B,C,D>

(24)        +α   –α
      –β    A    B
      +β    C    D

Translation is straightforward. First, one has to choose an order for the four cells in (24). This is arbitrary, but for concreteness we use the order ABCD as indicated in (24). The feature α, shared by cells A and C in (24), is encoded relative to the list in (23) in our terms as f1010. Similarly, +β, shared by cells C and D, is encoded as f0011, etc. But dimensions of paradigms, underlying horizontal and vertical syncretisms, have no a priori special status: we can just as readily define a feature f1001 which picks out cells A and D, a diagonal syncretism in (24). For us, this flexibility is an advantage, since it allows us to take any existing partition set and probe what the optimal underlying feature inventory might be, given any combination of assumptions such as binarity, Pāṇinian order, Minimality etc. Rather than setting the features ahead of time, we can in this way discover whether features should be binary or not. In our terms, the two binary features that define the matrix in (24) are the inventory: {f1010, f0101, f1100, f0011}, but this is simply one of more than 32,000 inventories that could have been used in the analysis of a given four-cell paradigm.16
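The translation from the matrix in (24) to features over the list in (23) can be sketched in Python as follows. This is only an illustration: the helper names `feature` and `complement`, and the representation of a feature as a string of 0s and 1s over the cell order <A,B,C,D>, are assumptions of the sketch rather than part of the formal system.

```python
# Cell order chosen for the list in (23); the choice is arbitrary.
CELLS = ["A", "B", "C", "D"]

def feature(members):
    """Name the set of cells `members` as a bit string over CELLS."""
    return "".join("1" if c in members else "0" for c in CELLS)

def complement(f):
    """Bit string naming exactly the cells that f does not name."""
    return "".join("0" if bit == "1" else "1" for bit in f)

# The two dimensions of the matrix in (24).
alpha = feature({"A", "C"})   # '1010'
beta = feature({"C", "D"})    # '0011'

# A binary feature is treated here as a pair: a feature plus its complement.
matrix_inventory = {alpha, complement(alpha), beta, complement(beta)}

# A diagonal grouping is just another feature in this notation.
diagonal = feature({"A", "D"})  # '1001'

print(sorted(matrix_inventory))  # ['0011', '0101', '1010', '1100']
print(diagonal)                  # 1001
```

Nothing in the encoding distinguishes the row and column features from the diagonal one, which is the point made in the text.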

3.2.2 Defaults and containment

A default feature (or value) is one that is compatible in principle with all cells, i.e., f11…1, for a paradigm of any arbitrary size. Many approaches accord a special status to the default. For example, as a reviewer notes, theories that treat features as attribute:value pairs, or equivalents, such as category:feature/value etc., may allow for reference to the category as a whole as a default (Adger & Svenonius 2011 is a recent, explicit example of this, but the general approach has many antecedents). A three-celled paradigm could be described in these terms such that one element is the default, corresponding to the absence of a value for the attribute, as in (25), where the third line spells out the absence of a value (a category, but no feature, in Adger & Svenonius’s terms):

(25) f100 ↦ A
  f010 ↦ B
  f ↦ C

Such analyses are readily found in the literature. For example, analyses that treat the third person as the “absence” of person or the default person instantiate (25). But in our terms, (25) is simply a notational variant of (26). An underspecified, default exponent corresponds to an exponent that is compatible with any cell in the paradigm, and surfaces wherever it is not bled by an earlier rule.

(26) f100 ↦ A
  f010 ↦ B
  f111 ↦ C
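The role of the default in (26) can be illustrated with a small Python sketch of ordered rule application. The function name `realize` and the modelling choice (each rule fills every cell named by its feature that no earlier rule has filled) are assumptions made for this illustration.

```python
def realize(rules, n_cells):
    """Apply ordered rules of exponence, given as (feature, exponent) pairs.

    Each rule fills every cell named by its feature that no earlier rule
    has already filled. Unfilled cells are returned as None.
    """
    cells = [None] * n_cells
    for feat, exponent in rules:
        for i, bit in enumerate(feat):
            if bit == "1" and cells[i] is None:
                cells[i] = exponent
    return cells

# (26): two specific rules followed by the default f111.
print(realize([("100", "A"), ("010", "B"), ("111", "C")], 3))  # ['A', 'B', 'C']

# Ordering the default first bleeds both specific rules.
print(realize([("111", "C"), ("100", "A"), ("010", "B")], 3))  # ['C', 'C', 'C']

# Without any default, the third cell receives no exponent.
print(realize([("100", "A"), ("010", "B")], 3))                # ['A', 'B', None]
```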

Just as with feature binarity, we choose from the outset not to assign any privileged status to the default (or to inventories containing a default)—it is simply one feature among many to be considered. We consider full sets of inventories, including those that do and do not contain the default. Doing so allows us to compare the results of inventories that include the default with those that do not. For example, it could turn out that inventories that include the default as one of the features are more highly valued along some dimension than those that do not. But unlike Adger & Svenonius (2011), we do not build this assumption in from the start.

Related to defaults, there are also theories that build containment in as a prior assumption about feature inventories. Much of the *ABA literature relies on partial or total containment among classes of features in an inventory. The Nanosyntax framework codes this as an fseq, assumed to be universal and invariant across languages (Caha 2009). Other *ABA literature (Bobaljik 2012; Smith et al. 2016) assumes containment in the contexts where ABA is excluded, but without a total commitment to invariant fseqs. As described above, feature containment relations can readily be expressed in our notation, as in (7). The fseq assumption would then elevate that to a general condition: for any two features fa, fb in an inventory, either fa ⊆ fb or fb ⊆ fa. Once again, we do not impose a priori conditions of this sort, as our aim is to see whether these arise as plausible conditions from other considerations.
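The containment condition just described is easy to state as a check on an inventory. The following Python sketch is illustrative only; the function names `contained` and `is_totally_nested` are hypothetical, and features are again written as strings of 0s and 1s.

```python
from itertools import combinations

def contained(fa, fb):
    """True if every cell named by fa is also named by fb."""
    return all(b == "1" for a, b in zip(fa, fb) if a == "1")

def is_totally_nested(inventory):
    """True if any two features in the inventory stand in a containment relation."""
    return all(contained(fa, fb) or contained(fb, fa)
               for fa, fb in combinations(inventory, 2))

print(is_totally_nested(["100", "110", "111"]))  # True: strictly nested features
print(is_totally_nested(["110", "011"]))         # False: partial overlap only
```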

In the above paragraphs, we hope to have shown that any of a number of other conditions on inventories or sequences could be expressed in our system.17 Our primary strategy here is to limit building assumptions into our system, so that this will allow us, at least in principle, to consider the restrictiveness of various possible assumptions in the abstract, and to allow for direct formal comparison of classes of competing frameworks. We now begin the process of exploring the consequences of the assumptions we did suggest for paradigms of different sizes.

4 Two-Cell Paradigms

Consider first the case of paradigms with two cells. Analysis of a two-cell paradigm space is relatively trivial, but serves as a warm up for the more interesting cases, and offers an opportunity to become more familiar with the notation for presenting the analysis.

For the analysis of a two-celled paradigm space, there are three logically possible features: f10, f01, and f11. This corresponds to the general formula that for n cells there are 2^n – 1 possible features. From three features, eight distinct inventories of features may be defined, i.e., the power set of the features. Of these, we may discard the empty set: if there are no features, nothing can be described.

Of the seven remaining inventories, any inventory consisting of just a single feature will fail our criterion of Validity: The maximally differentiated partition of a two-celled paradigm space is AB, i.e., the two cells are distinct. Since our features are privative, a single feature is not sufficient to analyze the AB paradigm: If the single feature is f11 it isn’t possible to make the required distinction between the A and the B cell, and the only partition that can be generated is AA. And if the single feature was either f10 or f01, no analysis of the two-cell paradigm is possible at all. Only one cell could receive an exponent. Recall that we made the decision not to assign the ‘default’ f11 some special status but to include it as just one possible feature among many. Therefore if only a rule of exponence f10 ↦ A is specified, the second cell wouldn’t be filled at all. Therefore this analysis fails to be valid under (17). This shows that at least 2 features are required to analyze the AB paradigm.

That two features are sufficient is shown by looking at Table 1. This table displays the four inventories with two or three features. For each inventory, the set of possible rules of exponence is given (redundant features are in parentheses), and in the rightmost column, the corresponding partitions that can be generated. As the table shows, any selection of two features from the three possible features will allow an analysis of the AB-paradigm. Thus these three subsets represent possibilities for a restrictive Universal Grammar satisfying Minimality; the three-feature inventory (#4) is excluded by this criterion.

#   inventory        sequence             partition

1   f01, f11         f11                  AA
                     f11, (f01)           AA
                     f01, f11             AB
                     f01                  **

2   f10, f11         f11                  AA
                     f11, (f10)           AA
                     f10, f11             AB
                     f10                  **

3   f10, f01         f10, f01             AB
                     f01, f10             AB
                     f01                  **
                     f10                  **

4   f10, f01, f11    f10, f01, (f11)      AB
                     f01, f10, (f11)      AB
                     f10, f11, (f01)      AB
                     f11, (f01, f10)      AA
                     f01                  **
                     f10                  **

Table 1

Table of valid feature inventories and corresponding partition sets for two-cells. ** = incomplete sequence.
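The partition sets in Table 1 can be recomputed by brute force. The sketch below is a minimal Python illustration under stated assumptions: the function names are hypothetical, a rule fills every cell of its feature that no earlier rule has filled, and partitions are relabelled left to right, so that the first exponent encountered is always called A. Intersection closure is left out because, for two cells, the intersection of two distinct features is either empty or one of the two features.

```python
from itertools import combinations, permutations
from string import ascii_uppercase

FEATURES = ["10", "01", "11"]

def partition(sequence, n_cells=2):
    """Partition derived by an ordered sequence of features, or None if incomplete."""
    cells = [None] * n_cells
    for rank, feat in enumerate(sequence):
        for i, bit in enumerate(feat):
            if bit == "1" and cells[i] is None:
                cells[i] = rank
    if None in cells:
        return None  # incomplete sequence, '**' in Table 1
    names = {}
    for rank in cells:
        if rank not in names:
            names[rank] = ascii_uppercase[len(names)]
    return "".join(names[rank] for rank in cells)

def partition_set(inventory):
    """All partitions derivable from some ordered selection of the inventory."""
    found = set()
    for k in range(1, len(inventory) + 1):
        for seq in permutations(inventory, k):
            p = partition(seq)
            if p is not None:
                found.add(p)
    return found

for size in (2, 3):
    for inv in combinations(FEATURES, size):
        print(inv, sorted(partition_set(inv)))
```

On this sketch, the inventory {f10, f01} yields only AB, while every inventory containing f11 yields both AA and AB.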

The same information can be represented graphically as patterns of squares, here aligned vertically using colour to define features, exponents and partitions. In Table 2, we display the three valid systems that contain the minimal number of features, i.e., two features, in the two-cell case in this way. As we discuss immediately below, while there are three such distinct feature inventories, inventory 1 and inventory 2 predict the same sets of possible partitions. But inventory 3 predicts a smaller set of possible partitions, namely only the AB partition.18

# inventory sequences partitions

[rows 1–3: coloured-square graphics not reproduced]

count 3 2

Table 2

Graphical display of table of minimal valid two-cell inventories and derivation of the predicted partitions. Compared to Table 1, incomplete sequences and redundant features are omitted here, and sequences that are order-invariant are shown only once. Inventory 4 of Table 1 is excluded here by minimality.

Inventory #1 is valid, since there is a sequence of features from this inventory that generates the maximally differentiated partition AB. This sequence consists of two ordered rules of exponence (f01 ↦ B, and f11 ↦ A). As the table shows, the AA partition may also be generated from the same inventory: a sequence that provides a rule of exponence only for the feature f11 generates the fully syncretic paradigm AA. Continuing through the table, we see in this way that the first and the second possible universal inventory each allow two classes of languages corresponding to the partitions AA and AB. The third feature inventory, although Valid and Minimal, only predicts the AB partition as a possibility. On this analysis, if AA were to surface in any language it would need to be the result of accidental homophony.

The fourth inventory contains all three features and therefore allows 6 sequences with rules of exponence for 3 features, 6 sequences with 2 features, and 3 with single features, which we show in a condensed form in Table 1. However, as noted, this inventory fails the Minimality condition.

Recall from above that we defined the partition set of an inventory as the set of paradigms that can be derived from it. For the three minimal complete inventories of the two cell case, the partition sets can be read off the partitions column of Table 2.

Typological evidence ultimately can inform us which partitions are attested.19 If a typological survey shows that both AA and AB patterns exist, then the third inventory, though valid and minimal, is not the actual inventory made available by UG. We note in passing that it is the only minimal valid analysis that uses a binary feature, rather than the equivalent of a default and “marked” combination.

However, the typological evidence cannot alone decide between different inventories that both predict the same possible partitions like inventories #1 and #2 above.

Despite the relatively trivial nature of the exercise with the two-cell paradigm space, the preceding discussion demonstrates that assumptions have consequences, and the assumption that UG inventories be both minimal and valid has reduced the space of possible inventories from 7 (or 8 with the empty set) to 3. We have shown how typological evidence can be brought to bear on the choice. Finally, we note that the two minimal valid inventories that are capable of generating both AA and AB patterns are in fact related to one another by a permutation of the cells. For example, the inventory f01, f11 contains a default feature (naming both cells) and a specific feature naming the second cell in the list (“Y” in <X,Y>). This is equivalent to the inventory f10, f11 relative to a permutation of the list, that is, in which the specific feature names the first cell in the list <Y,X>. Since we have taken the order of cells in a list to be arbitrary, there is no way on our assumptions to distinguish among inventories that are permutations of one another in this way. To take a more concrete example, in order to describe a two-way number contrast, one could invoke two features: singular and plural, corresponding to inventory #3, or posit a single marked number value (f01) and leave the other unmarked (f11 = number) (or combine these to use two marked features and a default). Minimality prefers one of the first three inventories; if there is syncretism in some paradigms, then one of the first two is to be preferred. But considerations of Minimality and Validity alone do not resolve the venerable debate about which value of number is marked (Sauerland et al. 2005 and others): f01 corresponds to “plural” if the cells are ordered <singular,plural>, but to “singular” if the cells are ordered <plural,singular>, and vice versa for f10.

Our first result is that in a paradigm space that constitutes only a binary opposition, the only minimal valid analyses that also permit syncretism are the ones that take UG to have a single feature that names one member of the opposition, and which is contrasted with a default feature, compatible with both members. In this way, there would be an empirically-grounded argument to be made that if Minimality is assumed, then Binarity should be rejected as a general condition on feature inventories. In the manner just noted, the two assumptions make contrasting predictions about the state of the world. But we have no way on these considerations alone of saying which member of the opposition is ‘marked.’

5 Three-Cell Paradigms

Turning to the three-cell paradigm space, we begin to see the growth in the space of analytical possibilities, and we also see how various assumptions such as Minimality and intrinsic, i.e., Pāṇinian ordering restrict that space. For a three-cell paradigm space, there are 2^3 – 1 = 7 possible features, listed in (27):

(27) f100, f010, f001, f110, f101, f011, f111

If features could be freely chosen to form inventories, then 128 distinct feature inventories could in principle be constructed from these features (including the empty set), i.e., 2^(2^n – 1), a function with double exponential growth. Each inventory is in turn relatable to n! (total) sequences.

5.1 Restrictions

We now consider the degree to which the assumptions mentioned above restrict the space of possible grammars (analyses).

5.1.1 Validity

The first restriction we impose is Validity, as in (17). For example, the inventory f100, f010, f001 is valid, in that it describes a three-way contrast, while the inventory f100, f110, f010 is invalid – it is not complete, as it provides no means to describe the third cell. It turns out that 96 of the 128 possible inventories of active features are valid in this sense in the three cell case (see Table 3 below). With four cells, the ratio is 31,962 out of 32,768 (see Table A2 below). Validity thus restricts the number of feature sets, but the restriction is not particularly strong.
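The count of valid three-cell inventories can be checked by enumeration. The following Python sketch assumes that Validity is evaluated after closing an inventory under (non-empty) intersection, and that a rule fills every cell of its feature not filled by an earlier rule; the function names are hypothetical. Only sequences of exactly three rules need to be inspected, since a sequence deriving three distinct exponents has exactly three non-redundant rules.

```python
from itertools import combinations, permutations

N_CELLS = 3
# All non-empty sets of cells as bit strings: '001', '010', ..., '111'.
FEATURES = [format(m, f"0{N_CELLS}b") for m in range(1, 2 ** N_CELLS)]

def intersect(f, g):
    """Cell-wise conjunction of two features."""
    return "".join("1" if a == b == "1" else "0" for a, b in zip(f, g))

def close(inventory):
    """Close an inventory under non-empty pairwise intersection."""
    closed = set(inventory)
    changed = True
    while changed:
        changed = False
        for f, g in combinations(sorted(closed), 2):
            h = intersect(f, g)
            if "1" in h and h not in closed:
                closed.add(h)
                changed = True
    return closed

def derives_all_distinct(sequence):
    """True if the ordered rules give every cell its own exponent."""
    cells = [None] * N_CELLS
    for rank, feat in enumerate(sequence):
        for i, bit in enumerate(feat):
            if bit == "1" and cells[i] is None:
                cells[i] = rank
    return None not in cells and len(set(cells)) == N_CELLS

def is_valid(inventory):
    return any(derives_all_distinct(seq)
               for seq in permutations(sorted(close(inventory)), N_CELLS))

inventories = [inv for k in range(len(FEATURES) + 1)
               for inv in combinations(FEATURES, k)]
valid = [inv for inv in inventories if is_valid(inv)]

print(len(FEATURES), len(inventories), len(valid))  # 7 128 96
print(is_valid(("100", "010", "001")))              # True
print(is_valid(("100", "110", "010")))              # False: third cell never filled
```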


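partition set (column)    1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
2 features, order         .   .   .   .   .   1   1   1   .   .   .   .   .   .   .   .
3 features, order         1   2   2   2   .   3   3   3   .   .   .   1   3   3   3   3
4 features, order         .   1   1   1   .   3   3   3   .   .   .   3   2   2   2  14
5 features, order         .   .   .   .   .   1   1   1   .   .   .   3   .   .   .  15
6 features, order         .   .   .   .   .   .   .   .   .   .   .   1   .   .   .   6
7 features, order         .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   1
total with order          1   3   3   3   0   8   8   8   0   0   0   8   5   5   5  39
2 features, Pāṇini        3   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .
3 features, Pāṇini        4   4   4   4   .   .   .   .   .   .   .   1   3   3   3   3
4 features, Pāṇini        .   3   3   3   .   1   1   1   .   .   .   3   2   2   2  14
5 features, Pāṇini        .   .   .   .   .   1   1   1   .   .   .   3   .   .   .  15
6 features, Pāṇini        .   .   .   .   .   .   .   .   .   .   .   1   .   .   .   6
7 features, Pāṇini        .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   1
total Pāṇini              7   7   7   7   0   2   2   2   0   0   0   8   5   5   5  39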

Table 3

Table showing which 12 of the 16 logically possible three cell partition sets can be generated by feature inventories (applying intersective closure) and by how many inventories. Partition sets related by cell permutation are grouped together.

5.1.2 Minimal Feature Inventory

The more interesting (and less obviously empirically motivated) requirement is Minimality. As defined above, a minimal valid feature inventory is an inventory that contains the minimal number of features needed to describe the maximally differentiated partition. For the two-cell space, the minimality requirement does not restrict the possibilities in any interesting way (it excludes only one inventory out of the 4 valid ones), but for the three-cell space, the minimal number of features that is needed to describe the maximally differentiated partition is two, as we show presently. Validity plus Minimality together thus restrict the choice from among 128 different logically possible feature inventories to the following three:

(28) a. f110, f101
  b. f101, f011
  c. f110, f011

To see that (28) are the minimal valid inventories, consider first that they are indeed each valid: (29) gives the rules of exponence that generate the maximally differentiated partition from the inventory in (28c). For (28a) and (28b) analogous sequences can be specified.

(29) a. f110 ∩ f011 = f010 ↦ B
  b. f110 ↦ A
  c. f011 ↦ C

Now consider minimality: Obviously, no inventory with fewer than two features can be valid, hence we only need to show that the inventories in (28) are the only valid two-feature inventories. Assume that there was another valid inventory I with only two features. Because (28) lists all combinations of the features f110, f101, and f011, I would need to contain one of f100, f010 and f001, or f111. But it is easy to see that for any of these features, it is impossible to satisfy validity by only adding one further feature to I. Hence, (28) are the three minimal inventories for three cells.
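The argument can be mirrored in a short Python sketch. It rests on the observation that three distinct exponents require three rules, so a two-feature inventory can only be valid if the intersection of its features is a non-empty third feature. The names below are hypothetical, and the rule-application model (a rule fills every cell of its feature not filled earlier) is an assumption of the sketch.

```python
from itertools import combinations

FEATURES = [format(m, "03b") for m in range(1, 8)]  # '001' ... '111'

def intersect(f, g):
    """Cell-wise conjunction of two features."""
    return "".join("1" if a == b == "1" else "0" for a, b in zip(f, g))

def realize(sequence):
    """Rank of the rule that fills each cell (None if a cell stays empty)."""
    cells = [None] * 3
    for rank, feat in enumerate(sequence):
        for i, bit in enumerate(feat):
            if bit == "1" and cells[i] is None:
                cells[i] = rank
    return cells

# Two-feature inventories whose intersection is a new, non-empty feature.
candidates = []
for f, g in combinations(FEATURES, 2):
    h = intersect(f, g)
    if "1" in h and h not in (f, g):
        candidates.append((f, g, h))

for f, g, h in candidates:
    # Order the conjoined rule first, as in (29).
    print(f, g, "intersection", h, "->", realize([h, f, g]))
```

The loop finds exactly the three inventories in (28), and in each case the sequence with the conjoined rule first assigns a different exponent to every cell.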

5.1.3 Order and the Effect of Pāṇini

Note that in order to describe the ABC pattern, the rules of exponence must be (partially) ordered, such that the exponent of the conjoined features takes precedence over the rules in (29b)–(29c) (this holds for any of the three inventories in (28)). The property of Order was not relevant in the two-cell paradigm, but the Pāṇinian order condition has a strong effect with the valid three-cell inventories.

Because each of the inventories in (28) has two basic features that may be conjoined to define a third feature, the number of possible sequences for each inventory is 16, although many of these sequences will be redundant or incomplete. The order in (29), in addition to deriving the complete partition, is also Pāṇinian as defined above. To see this, consider first that if the order of (29b) and (29c) is changed as in (30), the resulting sequence still derives the complete partition ABC.

(30) a. f110 ∩ f011 = f010 ↦ B
  b. f011 ↦ C
  c. f110 ↦ A

Other orders of the rules in (29) render feature f010 redundant. For example, if the order of the first two rules in (29) is exchanged, as in (31), then the rule f110 ↦ A will assign the exponent A to the first two cells, bleeding rule (31b) (i.e., rendering rule (31b) redundant).

(31) a. f110 ↦ A
  b. f110 ∩ f011 = f010 ↦ B
  c. f011 ↦ C

As a general property (well understood from studies of Rule Ordering), ordering fa before fa ∩ fb will render the conjunction redundant, and is thus equivalent to not selecting (or having no rule referencing) the conjoined feature. This corresponds to an intrinsic order: if the conjoined rule is active, it must be ordered before its individual conjuncts.

If the conjoined rule is absent or not ordered first, it can be disregarded, and only the order between the two rules referring to the basic features matters. The resulting sequences are not Pāṇinian, but require extrinsic order. (32) yields AAC while (33) yields ACC.
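Redundancy of this kind can be detected mechanically. In the Python sketch below (the function name `redundant_rules` is hypothetical), a rule counts as redundant when every cell named by its feature has already been filled by an earlier rule.

```python
def redundant_rules(sequence):
    """Features in `sequence` whose rule cannot fill any new cell."""
    filled = set()
    redundant = []
    for feat in sequence:
        cells = {i for i, bit in enumerate(feat) if bit == "1"}
        if cells <= filled:
            redundant.append(feat)
        filled |= cells
    return redundant

print(redundant_rules(["010", "110", "011"]))  # []: conjoined rule first, as in (29)
print(redundant_rules(["110", "010", "011"]))  # ['010']: the order in (31)
```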

(32) a. f110 ↦ A
  b. f011 ↦ C
(33) a. f011 ↦ C
  b. f110 ↦ A

The following table shows, for one inventory, the six possible sequences (six distinct orders of three rules) and the three corresponding partitions that are derived. (As before, redundant elements in the sequences are in parentheses). The analogous table for the other two choices can be readily constructed. As an expository device, we use green text to indicate a feature that is derived as the intersection of the two basic features. As explained in section 2.4, the green features are not part of the feature inventory, but are a convenient abbreviation for rules of exponence that make reference to the intersection of two features in their structural description.

(34) inventory     sequences               partition

     f110, f011    f010, f110, f011        ABC
                   f010, f011, f110        ABC

                   f110, (f010), f011      AAC
                   f110, f011, (f010)      AAC

                   f011, (f010), f110      ACC
                   f011, f110, (f010)      ACC

                                           3

What (34) shows is the following: There are (only) three minimal valid feature inventories that can generate a maximally differentiated three-celled paradigm space. One such inventory is {f110, f011}. From that inventory, 6 (=3!) sequences may be formulated, where each sequence is a distinct, total ordering of rules of exponence for the two features and their intersection.20 While there are six rule orderings possible, only three distinct partitions are generated. The first two lines in (34) derive the same surface patterns (partitions), since the ordering of the last two rules is irrelevant.
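The six orderings in (34) can be enumerated directly. This Python sketch uses hypothetical function names and relabels each derived partition left to right, so the pattern written AAC in (34) appears here as AAB and ACC as ABB; the partitions themselves are the same.

```python
from collections import defaultdict
from itertools import permutations
from string import ascii_uppercase

def partition(sequence, n_cells=3):
    """Partition derived by an ordered sequence of features, or None if incomplete."""
    cells = [None] * n_cells
    for rank, feat in enumerate(sequence):
        for i, bit in enumerate(feat):
            if bit == "1" and cells[i] is None:
                cells[i] = rank
    if None in cells:
        return None
    names = {}
    for rank in cells:
        if rank not in names:
            names[rank] = ascii_uppercase[len(names)]
    return "".join(names[rank] for rank in cells)

# The two basic features of (28c) plus their intersection f010.
rules = ["010", "110", "011"]

by_partition = defaultdict(list)
for order in permutations(rules):
    by_partition[partition(order)].append(order)

for pattern, orders in sorted(by_partition.items()):
    print(pattern, orders)
```

Each of the three partitions is derived by exactly two of the six orderings.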

As the reader may verify, the other two minimal valid inventories (in (28)) have the same properties as (34). The three inventories amount to permutations in the order of the cells, but are otherwise identical in their formal properties. Each inventory generates a partition set that contains only two of the three logically possible bifurcations of the paradigm. Since we have not stipulated a meaningful order of the paradigm cells, the three are equivalent, up to linear order.

The information in (34) is represented graphically in (35):

(35) universal features sequences partition

     [coloured-square graphics not reproduced]

5.2 Result: Generalized *ABA

At this point, we note two properties we believe to be of theoretical interest. For a three-celled paradigm space, there are B3 = 5 distinct partitions. However, imposing the conditions of Validity and Minimality on the UG feature inventories restricts the expressive power of the system, such that each inventory generates only 3 of the 5 possible partitions. The three inventories that are permitted are moreover linear permutations of one another. We believe this is of interest since it appears to be true at least in some domains that the number of attested partitions is a small subset of the logically possible ones. The example we noted above was that in the 8 cell division of the person/number space, only 60-some-odd distinct partitions, out of B8 = 4,140 possibilities, are attested in Cysouw’s 250+ language sample. Being able to predict restrictions on the space of possibilities is thus of potential theoretical interest, if the restrictions indeed line up with the data. In the case at hand, the following restrictions obtain:

Of the five possible partitions of a three-cell space, four show some differentiation among the cells. However, each of the inventories in (28) generates only three of those partitions. As in the case of inventory #3 in the two-celled paradigms, we are now able to connect our formal results to potential empirical evidence. If there is, as we have hypothesized, a fact of the matter for some domain, such that UG contains only one of the inventories in (28), then this should show up as the following empirical generalization: across the relevant domain, only three of the four possible patterns of differentiated partition should be attested. In (34), we show that the inventory f110, f011 generates the partition set {ABC, AAB, ABB}; that inventory does not generate ABA. No sequence from that inventory will generate a pattern in which the first and last cell share an exponent, to the exclusion of the middle cell.

The same holds for the other two inventories, up to the linear order of the cells: each inventory will fail to generate exactly one of the possible partly syncretic partitions. Inventory f110, f101 in (28a) generates the partition set {ABC, AAB, ABA}, but it does not generate ABB. Similarly, the inventory f101, f011 in (28b) generates the partition set {ABC, ABB, ABA} but does not generate AAB. As discussed above, since the linear order of the cells is arbitrary, these inventories and partition sets are permutations of one another, and thus each can be reduced to the first via permutation of the cells. For example, syncretism of the accusative and dative, to the exclusion of the nominative (as in (3b)) would be described as an ABA pattern if the order of cases were ACC-NOM-DAT, but if we permute the order of cells, giving the list NOM-ACC-DAT as in (2), then the same pattern is described as an ABB pattern. In this way, there is, in what we develop here, a formal equivalence among partition sets that differ only as a function of linear permutations of the cells. We cannot, in principle, say that ABA is excluded absolutely (rather than AAB, for example, since what counts as ABA under one order counts as AAB under a linear permutation), but what we have found is instead a generalized version of *ABA: the three inventories in (28) all exclude precisely one pattern in which two cells are syncretic and one distinct. They either exclude ABA or are reducible to this by linear permutation alone.

This result is noteworthy in the current context, since it provides a means of characterizing the absence of ABA patterns without assuming featural containment. Existing accounts of *ABA invoking containment are all built on what, in our terms, is a non-minimal feature structure, with strict nesting of features, i.e. some version of: f100, f110, f111.
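The generalized *ABA result can be checked with a brute-force Python sketch. The function names are hypothetical; the sketch assumes closure under non-empty intersection and left-to-right relabelling of partitions. Sequences longer than three rules are not enumerated, because in a three-cell paradigm at most three rules can each fill a new cell.

```python
from itertools import combinations, permutations
from string import ascii_uppercase

def intersect(f, g):
    return "".join("1" if a == b == "1" else "0" for a, b in zip(f, g))

def close(inventory):
    """Close an inventory under non-empty pairwise intersection."""
    closed = set(inventory)
    changed = True
    while changed:
        changed = False
        for f, g in combinations(sorted(closed), 2):
            h = intersect(f, g)
            if "1" in h and h not in closed:
                closed.add(h)
                changed = True
    return closed

def partition(sequence, n_cells=3):
    cells = [None] * n_cells
    for rank, feat in enumerate(sequence):
        for i, bit in enumerate(feat):
            if bit == "1" and cells[i] is None:
                cells[i] = rank
    if None in cells:
        return None
    names = {}
    for rank in cells:
        if rank not in names:
            names[rank] = ascii_uppercase[len(names)]
    return "".join(names[rank] for rank in cells)

def partition_set(inventory, n_cells=3):
    closed = sorted(close(inventory))
    found = set()
    for k in range(1, n_cells + 1):
        for seq in permutations(closed, k):
            p = partition(seq, n_cells)
            if p is not None:
                found.add(p)
    return found

def permute(feat, order):
    """Rewrite a feature relative to a reordered list of cells."""
    return "".join(feat[i] for i in order)

BIFURCATIONS = {"AAB", "ABA", "ABB"}
MINIMAL = {"28a": ("110", "101"), "28b": ("101", "011"), "28c": ("110", "011")}

for label, inv in MINIMAL.items():
    print(label, sorted(partition_set(inv)), "missing:", BIFURCATIONS - partition_set(inv))

# Swapping the first two cells maps (28a) onto (28c).
swapped = tuple(permute(f, (1, 0, 2)) for f in MINIMAL["28a"])
print(swapped, sorted(partition_set(swapped)))
```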

In other words, what we have just shown has two parts. The easy part is a demonstration that it is possible to derive a *ABA generalization for some domain without invoking containment. We have just done so. The slightly harder part was the demonstration that the type of feature inventory that derives *ABA without containment is not only possible, but is in fact preferred (over containment), if UG makes use of Minimal Valid feature inventories. We postpone until the next section some speculative remarks on whether this result constitutes a plausible alternative scenario for the account of *ABA generalization examples in the literature.

Before that discussion, we note one further point about these inventories. No valid, minimal inventory for a 3-cell paradigm space generates the maximally undifferentiated partition AAA. Curiously, it is not a general property of our assumptions that such undifferentiated partitions are universally excluded in the minimally valid inventories, and we show below that it does not hold for four cells. We can say at this point that the undifferentiated partitions are excluded when the number of cells is from the sequence 2^n – 1 for n ≥ 2, i.e. 3, 7, 15, …. We note this, but leave it as an unexplored aspect of the system. Total syncretism appears to exist, of course, and we do not exclude it across the board. We return to this issue again in section 5.4, where we show that giving up the assumption that intersection is always available will preserve the generalized *ABA result considered here, but will admit AAA patterns. The upshot of that section will be that the (equivalent of the) 3 inventories considered to be minimal valid inventories with intersection become three among a larger class of minimally valid inventories (including the containment patterns). Some inventories from among the larger class permit AAA, but the general result holds: no member of that larger class admits all three bifurcations of the paradigm space: any minimal valid inventory whose partition set contains ABB and AAB will necessarily exclude ABA.

5.3 More on the 3-cell space: Non-Minimal sets

Thus far, we have examined only the three minimal valid inventories that generate a three-cell paradigm. To evaluate the effect of minimality, we now look also at non-minimal inventories. In the two-cell case, we were able to present a complete discussion of all the possible inventories and of the partition sets described by each inventory. There were only 8 possible inventories for the features definable over a two-cell paradigm, and 4 inventories were invalid. But for a three cell space, there are 128 inventories, and numerous sequences to consider.

Table 3 provides a summary of important aspects of the grammar of three-celled paradigms and the models that generate them. In the next paragraphs, we walk through this table in some detail, identifying various properties that are of potential interest. Among these, we note that imposing Pāṇinian ordering—limiting all models to intrinsic rule ordering—turns out to have rather drastic consequences. Possibly of more interest, we note that there are some partition sets that do not arise under any constellation of the assumptions considered here. Even without Minimality, for example, feature inventories turn out to be somewhat restrictive.

Table 3 is divided horizontally into two halves. Each half tabulates all the valid feature inventories, and counts inventories grouped by the number of features they contain (y-axis) × the partition sets that may be generated from them (x-axis). The two halves of the table differ as follows: In the top half, it is assumed that extrinsic order of rules of exponence is permitted, while in the bottom half, we add the additional assumption that only intrinsic (Pāṇinian) ordering is permitted. We discuss the differences below.

The columns in Table 3 represent possible partition sets of a three-cell paradigm space, using colour instead of letters, as in (35) above: the same colour in two cells indicates the same exponent (syncretism). There are B3 = 5 distinct partitions (the rightmost column) and 16 different subsets of partitions that contain the maximally differentiated partition (ABC = dark orange, light orange, light purple).
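The counts in this paragraph follow from standard combinatorics and can be reproduced with a few lines of Python. The sketch below computes Bell numbers with the Bell triangle and lists the partitions of a three-cell paradigm as letter patterns; the function names are illustrative.

```python
from string import ascii_uppercase

def bell(n):
    """n-th Bell number, computed with the Bell triangle."""
    row = [1]
    for _ in range(n):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]

def partitions(n):
    """All partitions of n ordered cells, written as patterns such as 'ABA'."""
    def extend(prefix, used):
        if len(prefix) == n:
            yield prefix
            return
        # A cell may reuse any letter seen so far or introduce the next one.
        for k in range(used + 1):
            yield from extend(prefix + ascii_uppercase[k], max(used, k + 1))
    return list(extend("", 0))

print(partitions(3))          # ['AAA', 'AAB', 'ABA', 'ABB', 'ABC']
print(bell(3), bell(8))       # 5 4140
# Partition sets that contain ABC: any subset of the other four partitions.
print(2 ** (bell(3) - 1))     # 16
```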

The header of each column represents a distinct partition set, and the number in a given column represents the number of formally distinct (valid) inventories that can in principle generate that set. The three minimal valid inventories that we have discussed above are in the top row of the top half of the table (columns 6–8). These are the only three valid, two-feature inventories. But the table provides a range of information about what happens if we do not include the minimality requirement.

In the leftmost column of the line “3 features, order”, one finds the number 1. Assuming extrinsic rule ordering is allowed, there is exactly one choice of an inventory with three features, from among the 7 possible features, which yields only an ABC partition. We have seen that already; it was the inventory in (5). If that inventory is chosen, from among the 128 possible inventories, then the only partition that can be generated is ABC.

On the same line, the number in the rightmost column is 3. There are (exactly) three distinct choices of feature inventories from each of which all five logically possible partitions can be derived. One such inventory is f110, f101, f111, i.e. it is derived from a valid two-feature inventory by adding the default f111. The other two inventories are also of this type, namely the two linear permutations of this inventory.21

This line also shows that there are 3-feature inventories that generate a partition set which excludes ABA. For example, the third column from the right notes that there are three inventories whose partition sets contain ABC, AAB, ABB, and AAA, but not ABA. One of the three inventories which generate this partition set is f001, f011, f111 as we saw above already (the containment inventory). A second possibility is f100, f110, f111 (a linear permutation of the previous one). Finally, the inventory f100, f001, f111 also generates this partition set, but without containment. Furthermore, all three inventories exclude ABA from their corresponding partition set regardless of whether extrinsic ordering is allowed or not.

Bear in mind that the numbers in this table do not count models or sequences, but count inventories. Other than those in the leftmost column, each valid inventory in the table may be contained in multiple models, thus yielding sets of generable partitions. For example, (34) (= (35)) is here coded by the number 1 in the top line, column 7. This is a two-feature inventory that generates the partition set at the top of column 7; moreover, this is the only choice of (two) features which generates that exact partition set (and requires extrinsic rule ordering to do so).

5.3.1 Results: Impossible Partition Sets

One point of interest is that there are four partition sets that are underivable no matter the size of the inventory: four columns total to zero (in fact the same four with or without a limitation to Pāṇinian ordering). As the fifth column shows, there is, for example, no valid inventory (minimal or otherwise) that has the partition set {ABC, AAA}. In other words, no combination of features will admit all and only the maximally and minimally differentiated partitions. Also excluded are patterns that allow ABC, AAA and exactly one syncretic grouping (columns 9–11).

This latter fact is particularly interesting, since the last of these (column 11) is what Bobaljik (2012) finds empirically for suppletion in adjective gradation: ABA and AAB are unattested, but the other patterns are allowed. Our result means that the suppletion pattern of gradation isn’t predicted by any variation of the morphological assumptions we consider here – i.e. whether Pāṇini, Minimality or other similar conditions are assumed. However, Bobaljik also proposes to separate the component accounts of *ABA from *AAB in adjectival gradation, arguing that only *ABA is excluded by the logic of features and syncretism, and proposes an additional, syntactic locality condition to exclude *AAB (see also Bobaljik & Wurmbrand 2013).

5.3.2 Pāṇini

Before leaving the domain of three-cell paradigm spaces, we will consider the effect of one additional restriction, namely the idea that there is no extrinsic ordering of rules, and only Pāṇinian ordering. Each of the three valid, minimal feature inventories makes use of two basic overlapping features, and derives a third by using the intersection of those two. We showed above that reordering the rules has the effect of deriving syncretic patterns, in effect, by rendering the intersective feature redundant. The order in (31a) is equivalent to a system that uses only the two basic features, but not their conjunction.

We may consider imposing Pāṇinian-order-only as a restriction on valid sequences, corresponding to the hypothesis that grammars make use of only intrinsic, but not extrinsic, ordering of rules. Comparing the top and bottom halves of Table 3 allows us to evaluate the effects of this assumption, for three-celled paradigms.

One result which we find interesting is that for 3-cell paradigms, imposing Pāṇinian ordering has no effect on the total number of valid inventories. (This turns out to be different for 4-celled paradigms). We simply note this here, without further comment.

However, comparing the first line of each half of the table shows that imposing Pāṇinian ordering in addition to Minimality is a severe restriction. This constellation of assumptions has the effect that only the maximally distinct partition is describable (the leftmost column in Table 3). All three valid minimal inventories will derive that order and no other. Technically, intrinsic ordering does not restrict the relative order of f110 and f011, but since the conjunction will identify the middle cell, the remaining ordering is free (the two are non-distinct).

Since syncretism is abundant in paradigms of all sizes, imposing a Pāṇinian ordering, along with the other assumptions considered above, seems, in its combination with intersection, pathologically over-restrictive. Somewhat different results obtain if we do not assume that intersection is freely available, so we turn to that now.

5.4 Deriving *ABA in Intersection Free Systems

As we mentioned in section 3.2 above, assuming that intersection of features is available to rules of exponence accords with standard practice in morphological theory. Systems where the feature set is closed under intersection, combined with the assumptions of minimality and validity, yield tight restrictions on paradigm sets, including one that seems to be of special interest in current morphology; we have therefore focussed so far on systems with intersection. In this section we discuss what happens if we drop this requirement. In particular, we show that intersection-free systems allow a different route to derive a generalized *ABA constraint. Table 4 shows an overview of the possibilities for deriving the 16 valid partition sets for the three cell case when feature intersection isn’t available.


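column                 1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
2 features, order      -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
3 features, order      1   2   2   2   -   3   3   3   -   -   -   -   3   3   3   -
4 features, order      -   1   1   1   -   3   3   3   -   -   -   3   4   4   4   7
5 features, order      -   -   -   -   -   1   1   1   -   -   -   3   1   1   1  12
6 features, order      -   -   -   -   -   -   -   -   -   -   -   1   -   -   -   6
7 features, order      -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   1
total with order       1   3   3   3   0   7   7   7   0   0   0   7   8   8   8  26
2 features, Pāṇini     -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
3 features, Pāṇini     4   4   4   4   -   -   -   -   -   -   -   -   3   3   3   -
4 features, Pāṇini     -   4   4   4   -   1   1   1   -   -   -   -   4   4   4   7
5 features, Pāṇini     -   -   -   -   -   2   2   2   -   -   -   -   1   1   1  12
6 features, Pāṇini     -   -   -   -   -   -   -   -   -   -   -   1   -   -   -   6
7 features, Pāṇini     -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   1
total Pāṇini           4   8   8   8   0   3   3   3   0   0   0   1   8   8   8  26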

Table 4

Table showing which 12 of the 16 logically possible three cell partition sets can be generated by feature inventories (not applying intersective closure) and by how many inventories. Partition sets related by cell permutation are grouped together.

One consequence of assuming that intersection is not available for morphological feature algebras is that a valid inventory must contain at least three features.22 This can be seen as follows: the minimally valid inventories with two features when intersection is available contain two features such as f110 and f011 which each contain two cells. But if intersection is not available, such a system cannot describe the maximally differentiated paradigm and thus is not valid. In this case, {f110, f011, f010} is available only as a three feature inventory – the feature f010 corresponding to the intersection of the other two features must be included explicitly.
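The argument can be made concrete with a small sketch. It rests on a simplifying assumption about the models defined in earlier sections: a model is an ordered list of rules, one per feature, and each cell takes the exponent of the first rule whose feature contains it. Extrinsic ordering is modelled by trying every ordering of every selection of features. The function names are ours.

```python
from itertools import permutations

def partition_of(rules):
    """Each cell takes the first rule whose feature contains it.
    Returns e.g. 'ABB', or None if some cell receives no exponent."""
    names, out = {}, []
    for cell in range(3):
        winner = next((f for f in rules if f[cell] == "1"), None)
        if winner is None:
            return None
        if winner not in names:
            names[winner] = "ABC"[len(names)]
        out.append(names[winner])
    return "".join(out)

def partition_set(inventory):
    """Partitions generable from some ordered selection of the features,
    without intersection."""
    result = set()
    for k in range(1, len(inventory) + 1):
        for rules in permutations(inventory, k):
            p = partition_of(rules)
            if p is not None:
                result.add(p)
    return result

def is_valid(inventory):
    """Valid = able to describe the maximally differentiated paradigm ABC."""
    return "ABC" in partition_set(inventory)

print(is_valid(("110", "011")))                      # False
print(sorted(partition_set(("110", "011", "010"))))  # ['AAB', 'ABB', 'ABC']
```

With only two rules there can be at most two distinct exponents, so no two-feature inventory reaches ABC unless the intersection is supplied as a feature in its own right.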

Generally, any partition set that can be generated in a morphology that allows intersection can also be generated in a non-intersective one by explicitly adding the features intersection would derive. In the three-cell case, this relationship holds also in the reverse direction, as the comparison of Table 4 with Table 3 shows: any partition set that can be generated non-intersectively can also be generated in a system with intersective closure.23

Returning to the *ABA constraint, and related considerations of restrictiveness, there are results that we believe should be of interest here. First, we note that admitting or banning intersection has no effect on the overall impossibility of the four partition sets identified in section 5.3.1 as ungenerable. But rejecting intersection does make a difference in the definition of minimal valid inventories. In (the top half of) Table 3, where intersection is admitted, there are exactly three minimal valid inventories, which are linear permutations of one another, all of which derive a partition set with exactly three members. When intersection is not admitted, the number of minimal valid inventories increases to 25. Even so, the system is restrictive: without the Pāṇinian restriction, only 10 of 16 conceivable partition sets are generable.24 None of the minimal valid inventories generates the unrestricted partition set (the rightmost column in Table 4) and none generates the partition set that excludes only AAA (column 12). Note that the latter pattern is derivable from non-minimal feature inventories, as indicated in the table. Hence it is Minimality that is playing a key role in excluding that partition set.

In other words, we include the following among our results. Regardless of whether intersection is admitted, and regardless of whether Pāṇinian ordering is enforced, the assumption of Minimal Validity as a condition on feature inventories ensures the following:

(36) No minimal valid feature inventory for a 3-cell paradigm space includes all three bifurcations of the paradigm in its partition set.

These results amount to a generalization of the *ABA generalization up to linear permutation. A special case of (36) is the rightmost column of Tables 3 and 4: no minimal valid feature inventory generates an unrestricted partition set. The assumption of minimality always entails a restriction. Another special case is the implication that if any two bifurcations of the three-celled space are in the partition set of a (minimal valid) inventory, then the third is not. Up to linear permutations of the cell orders, this is the *ABA generalization: if AAB and ABB are admissible paradigms, then ABA is not.25
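The counts cited in this section and the claim in (36) can be reproduced for the extrinsic-order case by exhaustive enumeration. The sketch below is a simplified reading of the formal system (a cell takes the exponent of the first applicable rule; intersective closure adds every non-empty intersection of features) and does not model Pāṇinian ordering; all function names are ours.

```python
from itertools import combinations, permutations

FEATURES = [format(n, "03b") for n in range(1, 8)]  # f001 ... f111

def close(inventory):
    """Add all non-empty intersections of features (intersective closure)."""
    feats = set(inventory)
    while True:
        new = {"".join("1" if a == b == "1" else "0" for a, b in zip(f, g))
               for f in feats for g in feats} - feats - {"000"}
        if not new:
            return feats
        feats |= new

def partition_set(inventory, intersection):
    feats = close(inventory) if intersection else set(inventory)
    found = set()
    for k in range(1, len(feats) + 1):
        for rules in permutations(sorted(feats), k):
            names, out = {}, []
            for cell in range(3):
                winner = next((f for f in rules if f[cell] == "1"), None)
                if winner is None:
                    break  # a cell without an exponent: not a paradigm
                if winner not in names:
                    names[winner] = "ABC"[len(names)]
                out.append(names[winner])
            else:
                found.add("".join(out))
    return frozenset(found)

def minimal_valid(intersection):
    """Valid inventories of the smallest size at which any inventory is valid."""
    for size in range(1, 8):
        table = {inv: partition_set(inv, intersection)
                 for inv in combinations(FEATURES, size)}
        table = {inv: ps for inv, ps in table.items() if "ABC" in ps}
        if table:
            return table

BIFURCATIONS = {"AAB", "ABA", "ABB"}
for intersection in (True, False):
    table = minimal_valid(intersection)
    print(intersection, len(table), len(set(table.values())),
          any(BIFURCATIONS <= ps for ps in table.values()))
# True 3 3 False
# False 25 10 False
```

Each output line gives the number of minimal valid inventories, the number of distinct partition sets they generate, and whether any of them contains all three bifurcations; the last value is False in both settings, in line with (36).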

6 *ABA – empirical considerations

Coming up out of the heady sea of numbers for air, we are now at a point to step back and ask whether the results of our investigation of the formal combinatorics of features have any bearing on the actual *ABA generalizations discussed in the literature. Our tentative conclusion is that some domains where a *ABA generalization is observed do not seem to conform to the profile of the minimal valid inventory (with intersection), while for others, the situation is less clear, and the minimal valid inventory, with overlapping features, rather than containment, seems to us to be a direction worth pursuing.

We opened this article with reference to the *ABA generalization in adjectival gradation, investigated extensively in Bobaljik (2012). We see no reason from the discussion here to think that it would be profitable to reanalyze that as arising from a minimal valid 2-feature inventory. Doing so would invoke two privative features, one shared by the positive and comparative grade (but not the superlative), and another shared by the comparative and superlative, but not the positive. There is, however, fairly extensive evidence independent of patterns of suppletion for a containment relation in adjectival gradation: the superlative transparently contains the comparative in many languages.26 Some examples are given here (from Bobaljik 2012:31):

(37)     POS CMPR SPRL  
 
  a. Persian: kam kam-tar kam-tar-in ‘little’
  b. Cimbrian: šüa šüan-ar šüan-ar-ste ‘pretty’
  c. Czech: mlad mlad-ší nej-mlad-ší ‘young’
  d. Hungarian: nagy nagy-obb leg-nagy-obb ‘big’
  e. Latvian: zil-ais zil-âk-ais vis-zil-âk-ais ‘blue’
  f. Ubykh: nüs◦ə ç’a-nüs◦ə a-ç’a-nüs◦ə ‘pretty’

In addition, it is not at all obvious that it makes sense to consider adjectival degrees as grammatical features, in the way that, for example, classificatory elements such as gender are.

On the other hand, there are other domains in which *ABA generalizations have been observed, where there is less independent reason to think that the constituent elements are arranged in a containment relation.

One such domain, perhaps, is person. Vanden Wyngaerd (2016) sees a *ABA generalization in (plural) independent pronouns. Building on prior cross-linguistic investigations (Cysouw 2003; Baerman et al. 2005), he observes that there are languages where first and second (plural) pronouns are syncretic, contrasting with the third person (such as Slave, in (38), from Cysouw 2003:124), and there are languages where second and third (plural) are syncretic, contrasting with the first person (as in the Nez Perce ‘unmarked’ pronouns in (39), Cysouw 2003), but virtually no good examples of syncretism of first and third person, contrasted with second.27

(38)      SG      PL

  1               naxį
  2               naxį
  3       Ɂedį    Ɂegedį

(39)      SG      PL

  1       ’íin    núun
  2       ’íim    ’imé
  3       ’ipí    ’imé

Vanden Wyngaerd (2016) argues for a containment relation among the features that define person, as in the following:28

(40) a. 1st: [[[PERSON] PARTICIPANT] AUTHOR]
  b. 2nd: [[PERSON] PARTICIPANT]
  c. 3rd: [PERSON]

In our terms, this is (a linear permutation of) the inventory in (7): f100, f110, f111 and its properties are well understood. However, there are few, if any, languages in which such a decomposition of pronouns is transparently manifest in surface forms. As we have seen above, this inventory is valid, but non-minimal. A minimal valid inventory would be one that composes the three persons out of two privative features: f110 corresponding to the feature ‘participant’, and f011, which is in essence the privative feature ‘non-author’. On this alternative analysis, first and third person pronouns cannot be syncretic, excluding the second person, since they share no feature. Hence *ABA. Ackema & Neeleman (2013:905) offer an analysis in essentially these terms, motivated in large part by the patterns of syncretism. As noted above, Ackema and Neeleman’s approach to features treats them as functions that operate on a set of discourse referents, but the key point is the proposal that first and second person share a feature, as do second and third person, but first and third do not.29
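The prediction of the two-feature alternative can be illustrated directly. The sketch below assumes the simplified rule-application model used in this paper's combinatorics (each person takes the exponent of the first rule whose feature covers it, under any extrinsic order) and makes the intersection of the two features available to rules; the dictionary `FEATURES` and its labels are a hypothetical encoding for illustration.

```python
from itertools import permutations

PERSONS = ("1st", "2nd", "3rd")

# Hypothetical encoding: a feature is the set of persons it covers.
FEATURES = {
    "participant": {"1st", "2nd"},  # f110
    "non-author": {"2nd", "3rd"},   # f011
}
# The intersection of the two features (f010) picks out second person only.
FEATURES["participant & non-author"] = (
    FEATURES["participant"] & FEATURES["non-author"]
)

def pattern(rules):
    """Syncretism pattern over (1st, 2nd, 3rd), or None if a person is unexpressed."""
    names, out = {}, []
    for person in PERSONS:
        winner = next((r for r in rules if person in FEATURES[r]), None)
        if winner is None:
            return None
        if winner not in names:
            names[winner] = "ABC"[len(names)]
        out.append(names[winner])
    return "".join(out)

patterns = {pattern(r) for k in (1, 2, 3) for r in permutations(FEATURES, k)} - {None}
print(sorted(patterns))  # ['AAB', 'ABB', 'ABC']
```

AAB corresponds to the Slave-type syncretism of first and second person, ABB to the Nez Perce type, and ABA (first and third person against second) is never produced, since no feature covers first and third person together.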

In work in progress (see Sauerland & Bobaljik 2013) we are exploring the typology of syncretism in person feature inventories more broadly, drawing on the extensive data in Cysouw (2003), to determine what feature inventory assigns a high likelihood to a pattern like the observed partition sets, not just in plural pronouns, but in the full range of person marking systems, including clusivity distinctions. We may wager that if we are right to suspect a minimal valid inventory at work in the patterns of syncretism in the free-standing pronouns, then we will see that emerge as well in the larger study.

Before closing, we note as well that *ABA generalizations have also been noted in verbal inflection (Wiese 2008; Starke 2009), case (Caha 2009; Smith et al. 2016), and number (Smith et al. 2016). Of these, case is another domain in which there is little independent morphological evidence for containment relations, at least among ‘core’ cases.30

Pavel Caha (personal communication and to appear) calls our attention to at least one sub-part of the case hierarchy which appears to reflect the kind of feature structure we would expect on the approach taken here. Blansitt (1988) surveys the marking of the following four functions across the world’s languages: direct object, dative (recipient), allative (goal of motion), and location. Blansitt notes a generalization, exceptionless in transitive clauses, whereby no two functions are marked identically unless all intervening functions in the order just given are also marked identically. In other words, a *AB(B)A generalization. One way to approach this, following Caha (2009) (but see also Caha & Pantcheva 2012), would be to assume that there is a monotonic containment relationship among the features (we consider the last three for ease of exposition):

(41) a. f111 = dative
  b. f011 = allative
  c. f001 = locative

An alternative, following the approach laid out here, would be the minimal valid inventory in (42):

(42) a. f110 = dative
  b. f011 = locative

From this inventory, the allative can be described as the intersection of the other two cases. As Caha notes, Blansitt offers at least one language that seems to transparently reflect (42) rather than (41). Tigrinya prepositions include ne which marks dative (and some objects, presumably an instance of differential object marking, which quite commonly uses the dative, Bossong 1985) and locative ab. The allative is marked by the conjunction of the two: nab < ne ab. This is also broadly consistent with the results of Radkevich (2010) who found no evidence of a simple, monotonic transparent relationship among local cases as (41) might predict (although her survey also finds cases of portmanteaus and internally complex case morphology that are equally hard to reconcile with (42)).

7 Conclusion

In this paper, we introduced a notation for approaching feature logic from an algebraic perspective, abstracting away both from any empirical consideration and from any assignment of particular meanings to the features. Features are merely names for addresses (cells or groups of cells) in a list. In this way, we provided a calculus by which one can derive the paradigm set corresponding to any inventory of features, under varying sets of assumptions. This has two benefits. In the first place, we can investigate the formal properties of adding or subtracting individual assumptions, translating competing approaches into a common notation and working through the consequences at a formal level. The size of the partition set derivable from any inventory serves as a measure of restrictiveness—combinations of assumptions that decrease the number of partition sets are more restrictive.

We have shown a number of results that are, we hope, of potential interest regarding three-celled paradigms. One of these is that certain partition sets are indescribable—no inventory of features yields exactly these partition sets without further assumptions. This group includes the set that has only the maximally differentiated and undifferentiated partitions: AAA, ABC (column 5 in Tables 3 and 4), as well as the three that allow only one of the three bifurcations in addition.

Another result arises from the assumption that feature sets must be minimal. With that assumption, a variety of generalized *ABA-like constraints are derived, among which the actual *ABA generalizations appear to be a special case. From this basic result, further restrictions are obtained by adding in the assumption that intersection is permitted (reducing the space of possibilities from 25 inventories and 10 paradigm sets to 3 inventories deriving 3 paradigm sets).

The effect of imposing Pāṇinian ordering as a condition on grammars (models) was also considered. With feature intersection, it proved overly drastic, excluding syncretism from the minimal valid inventories, but without feature intersection, the Pāṇinian restriction was weaker, excluding 3 of 10 paradigm sets admitted by intersection-free minimal valid inventories that incorporate the possibility of extrinsic order.

One specific result of interest to the study of *ABA generalizations concerns the containment relationship among features, which is standardly invoked in accounts of *ABA generalizations in the literature. Containment turns out not to be the only type of inventory that can explain the generalization; in fact, under the assumption that intersection is permitted, it is not even one of the minimal valid inventories.

In work in progress, we investigate additional extensions of the considerations presented here. In Appendix A, we begin the process of looking at larger paradigm spaces. As paradigms grow, the considerations become more intricate, but there may still be ways in which the minimal valid inventory stands as a contender for imposing restrictions that map to observed typological generalizations. In Sauerland & Bobaljik (2013), we note that for the four-cell paradigm space corresponding to the first person (inclusive vs. exclusive × singular vs. plural), 9 of the 15 possibilities are indeed attested (Cysouw 2003). The four cell space can be described as two intersecting binary features, but that inventory is not minimal. Rather, using intersection, a more minimal inventory is the one containing the three features in (43) (and thus allowing the intersection of the first two in the rules of exponence):

(43) a. f0101
  b. f0011
  c. f1111
  d. (f0101 ∩ f0011 = f0001)

While eschewing binary features, this yields a partition set that contains 9 of the B4 = 15 logically possible partitions of the four cell space (this corresponds to the first line of the third block in Table A3). If we map the lists in the partition set to a binary table as in (24), we may observe that this partition set contains partitions corresponding to horizontal and vertical syncretisms, but no diagonal syncretisms. In Sauerland & Bobaljik (2013), we reached the conclusion on independent grounds that this was the optimal analysis of the first person paradigm space, i.e., the inventory that yields the best fit to the observed distribution of paradigms as documented in Cysouw (2003), while minimizing the incidence of accidental homophony. We turn to more discussion of larger paradigm spaces in Appendix A, below.

Without probing deeper, we hope to have shown that the derivation of *ABA generalizations entertained here may indeed get off the ground in some domains, leaving for future work the fuller empirical investigation of this approach.

Finally, returning to the question we raised at the outset, we may step back even further and ask why UG might have the types of constraints it does. We are obviously far from an answer, but can add a few, very tentative remarks here.

To this point, we have assumed that it is reasonable to think that UG feature inventories respect a condition of Minimality, and have shown how this assumption restricts the hypothesis space to be considered in determining the actual feature inventory corresponding to paradigms of a given size. Minimality has a somewhat different flavour than some of the other restrictive assumptions we have entertained. In principle, one could think of this from a different perspective. Rather than imposing a condition of Minimality on inventories, one could imagine instead that the features are whatever they are, but that UG shows maximal use of the features it has. For a domain with two features, UG generates in principle a three-celled space: each feature on its own, plus their intersection. This builds in the assumption of minimality – and thus means that all true three-celled paradigms are those projected from the two-feature inventories, yielding the *ABA prediction (up to linear permutation).

This alternative (maximal use of minimal resources) implies that there should be no four- or five-celled paradigms. If there are two features (in a given domain) then the maximal paradigm in that domain will have three cells. If there are three features, then the paradigms generated will have 7 cells. The appearance of a four-celled paradigm in some domain then necessarily involves syncretism.
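The cell counts follow from treating each cell as a non-empty combination of features, i.e. a feature on its own or an intersection of several. A brief sketch, with placeholder feature names F, G, H that are ours:

```python
from itertools import combinations

def projected_cells(features):
    """Every non-empty combination of features: single features plus intersections."""
    return [c for k in range(1, len(features) + 1)
            for c in combinations(features, k)]

print(len(projected_cells(["F", "G"])))       # 3
print(len(projected_cells(["F", "G", "H"])))  # 7
```

On this way of counting, n features project 2^n − 1 cells, so paradigms with four, five or six distinct cells have no source other than syncretism within a seven-celled space.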

This concludes the discussion of 2 and 3-cell paradigm spaces, and the connection with the *ABA patterns. As an appendix, we turn to a rather less in-depth investigation of the effects of the assumptions here on larger paradigm spaces, notably 4-cell paradigms.

Additional Files

The additional files for this article can be found as follows:

Appendix A

Beyond three cells. DOI: https://doi.org/10.5334/gjgl.345.s1

Appendix B

The 47 Four Cell Pāṇinian Partition Sets (PPSs). DOI: https://doi.org/10.5334/gjgl.345.s1