1 Introduction

Grammatical gender, or the sorting of nouns into groups, is a salient feature of many languages. Grammatical gender systems have long raised interesting questions about how we learn and represent generalisations and exceptions to those generalisations. On the one hand, in grammatical gender languages, all nouns are classified into a gender category, and gender concord with other elements in a noun phrase is obligatory, making gender a highly salient feature for learners to acquire and speakers to perceive and produce. On the other hand, assignment of a particular noun to a gender category is not necessarily predictable and seems to involve a mix of semantic and morphophonological cues. Kramer (2020: 46–47) argues that “[e]ach grammatical gender system has at least a subset of nouns whose gender is assigned based on animacy, humanness, and/or social gender for humans”, and thus that “grammatical gender systems always have a semantic core of nouns whose gender is semantically predictable”. Bantu noun classes (NCs) present a particularly rich domain in which to investigate whether such ‘core’ semantic features do indeed play a role in how native speakers produce and perceive their language. Bantu languages can have as many as 20 distinct NCs, many of which seem to have at least some basis in either inherent or evaluative semantics. Bantu NCs are conventionally named numerically and organized into singular/plural pairs, where for most nouns the singular is in the odd-numbered class and the plural in the neighbouring even-numbered class. A noun’s class membership is marked via a prefix on the noun (e.g., class 1 um-ntu 1-person ‘person’; class 2 aba-ntu 2-person ‘people’1).

Much has been written about the semantic basis, or lack thereof, of the NC divisions in Bantu languages (see e.g., Maho 1999; Demuth 2000; Katamba 2006; Morrison 2018). Attempts to characterize the classes semantically generally acknowledge that some classes have an identifiable semantic core – for example, class 1/2, which mostly contains nouns referring to humans – while others seem to be miscellaneous in nature. Table 1 presents a schema of the Bantu NCs and their proposed associated semantics (Msaka 2019: 43). We focus here on the Bantu language isiXhosa. The NC prefixes in Table 1 are those of isiXhosa specifically, and the table depicts only the classes currently found in the language.

Table 1: Bantu NCs and proposed associated semantics (Msaka 2019). NC prefixes are for isiXhosa specifically.

Class NC prefix (singular/plural) Common semantics
1/2 um-/aba- Humans
1a/2a u-/oo- Kinship terms
3/4 um-/imi- Trees, plants, inanimates
5/6 i(li)-/ama- Miscellaneous
7/8 isi-/izi- Miscellaneous
9/10 i(n)-/i(z)i(n)- Animals
11/10 u(lu)-/i(z)i(n)- Abstract, long thin things
14 ubu- Abstract/mass nouns
15 uku- Infinitives

While noun class systems appear in languages that collectively have hundreds of millions of speakers, there are few empirical investigations of noun class representation and processing (Berghoff & Bylund 2025). Accordingly, there is minimal experimental evidence regarding the semantic basis, or lack thereof, of the noun class divisions. However, some recent research has begun to fill in this gap. Most notably, in a pair of papers, Kanampiu, Martin & Culbertson (2025a, b) show (1) that the distributional statistics of Kîîtharaka NC morphophonology (prefixes and gender agreement) and noun-stem semantics point to only classes 1/2 and 3/4 being productively associated with an inherent semantic feature (human and tree, respectively2), and (2) in a wug-test type experiment, ‘human’ and ‘fruit’ were the only inherent semantic features to evoke consistent, expected, NC prefixes and gender agreement morphophonology.3 By contrast, in both their corpus analysis and production experiment, Kanampiu and colleagues find that morphophonological gender cues are highly productive and reliable. These results diverge from recent findings by Lawyer et al. (2024), who investigated whether speakers of Kinyarwanda make use of the NC prefix, the semantics of the nominal stem, or both, in a triadic comparison experiment. They found that participants were slightly more likely to group two nouns based on shared noun-stem semantics rather than shared prefix when the two cues conflicted (e.g., for the triad umu-kindo (‘palm tree’, NC3) — in-kura (‘rhinoceros’, NC9) — umu-hari (‘fox, wild animal’, NC3), participants judged inkura and umuhari, which share animal semantics, to be more similar 7% more often than umukindo and umuhari, which share the NC3 prefix).
In earlier work also using a triadic semantic comparison task, Jonas (2018) found that isiXhosa speakers were at chance level in grouping two nouns sharing an NC prefix when presented with only pictures (no linguistic labels) in triads such as um-nqwazi (hat, NC3), isi-kere (scissors, NC7), isi-hlangu (shoe, NC7), and only 5% more likely to group two nouns sharing a class prefix when presented with pictures and written names together. Kanampiu et al. (2025a: 35) tentatively suggest that modern Bantu languages may “exploit these types of cues differently”, but it may be more likely that it is different tasks, rather than languages, that explain the variation.

We complement these recent investigations employing corpus, production, and sorting task methodologies with an auditory lexical decision task, in which we aim to gain insight into whether native isiXhosa speakers treat particular NCs as semantically specified by investigating how they respond to isiXhosa pseudowords in which an NC prefix is paired with a noun stem belonging to another class. These pseudowords, which we term “semantic violation items”,4 are created using two different NC prefixes, which contrast in the semantic coherence of their associated class. NC4 nouns are canonically inanimate, with relatively few exceptions, potentially allowing for a [-animate] generalisation to be formulated about nouns in this class. In contrast, while NC10 is sometimes reconstructed as the “animal” class, its present-day contents are semantically widely varied (Taraldsen et al. 2018), precluding a unifying semantic description.

For each NC prefix (NC4 and NC10), we contrast responses to the putative semantic violation items with responses to a syntactic violation condition, which consists of an NC prefix placed on a verb stem and therefore involves a clear type mismatch. By investigating whether the semantic violation items are judged differently to the syntactic violation items, we aim to shed light on speakers’ representations of NC information.

The paradigm we employ, with its contrast between two violation types, has been used in several previous studies (see Stockall & Manouilidou 2014; Manouilidou et al. 2016; Neophytou et al. 2018; and Cayado et al. 2024 on verbal affixation in English, Greek, Slovenian and Tagalog). Given this previous research, we predict that speakers should robustly reject the syntactic violation items for both prefixes. If participants treat the NC4 prefix as [-animate], this should lead to high rejection rates for semantic violation items that combine this prefix with a [+animate] stem, since there is a simple rule to apply. No such analysis is predicted to be possible for NC10 prefix items, as the wide semantic variety in NC10 should mean that there is no semantic generalisation about this class available for learners to acquire. To evaluate whether an NC10 prefix + noun stem is a licit word or not, speakers should have to consider each combination and only reject the novel word if they recall that the noun stem ought to take a different NC prefix, leading to lower rejection rates. Thus, the critical research question is whether participants treat the putative semantic violations formed with the two NC prefixes differently.

2 Method

2.1 Participants

The data from 90 native isiXhosa speakers (mean age 20.6 years, SD 2.1 years; range 18–27 years; 64 female, 24 male, 2 non-binary) were included in the experiment. The majority of them reported growing up in one of the two provinces in which isiXhosa is most widely spoken in South Africa (Eastern Cape n = 45; Western Cape n = 39; Gauteng n = 2; Mpumalanga n = 2; KwaZulu-Natal n = 2), and most currently lived in the Western Cape (n = 83; Eastern Cape n = 5; Gauteng n = 2). Participants rated themselves as highly proficient in isiXhosa (mean 8.7/10, SD 1.19; 0 = “none”, 10 = “perfect”), and 77 indicated that they had studied isiXhosa as a subject at school or university. Ethics approval was granted by the Research Ethics Committee: Social, Behavioural and Education Research at the first author’s institution (project number 26175). All participants gave informed consent, and they received monetary compensation for their participation.

2.2 Materials

Two NC prefixes were used to create the experimental stimuli: the imi- prefix of NC4, and the ii(n)- prefix of NC10. NC4 and NC10 are plural NCs, the NC prefixes of which are uniquely used for NC marking (and not also, for example, as subject concords, as is the case with some other NC prefixes).5 Furthermore, NC4 and NC10 differ in terms of semantic coherence. NC3/4 entities are canonically inanimate (e.g., natural entities such as trees, rivers, and body parts), with relatively few exceptions. In contrast, NC9/10 is often referred to as the “animal class” (Taraldsen et al. 2018), as it includes many highly frequent animal nouns, alongside a number of frequent human nouns (e.g., i-ndoda NC9-man ‘man’; i-ntombi NC9-girl ‘girl’). However, in synchronic isiXhosa, the contents of this class are eclectic, as it also includes many inanimate nouns of various types. Notably, NC10 is also the class in which most borrowed plural nouns occur.

The experimental stimuli comprised both words and non-words. The words were existing nouns in NC4 and NC10. The non-words belonged to two conditions (see Table 2 for examples). Putative semantic violation items were created by placing the NC4 and NC10 prefixes on NC1a/2a and NC5/6 noun stems, respectively. NC1a/2a contains primarily animate nouns (e.g., kinship terms). NC5/6 contains primarily inanimate nouns (e.g., food, plants, landscape and weather terms). Critically, all the NC1a/2a nouns and all but two of the NC5/6 nouns used in the experiment were animate and inanimate, respectively, in order to induce a mismatch between the possible preferences of the prefixes and the semantics of the stems. It should be noted that pairings of the NC4 prefix with human-denoting stems from NC1a/2a should be particularly bad because NC3/4 contains very few human nouns, with those that do occur in this class appearing to be stigmatized (e.g., um-gewu NC3-criminal ‘criminal’; um-gulukudu NC3-gangster ‘gangster’; Carstens, 2024).

Table 2: Illustration of conditions.

Grammatical Semantic violation Syntactic violation
Prefix: imi- imi-bhobho ‘4-pipe’ *imi-malume ‘4-uncle’ *imi-sela ‘4-drink’
Prefix: ii(n)- ii-nkomo ‘10-cow’ *iin-sango ‘10-gate’ *iin-hlala ‘10-sit’

We compared responses to the putative semantic violation items with syntactic violation items, which were created by placing the NC4 and NC10 prefixes on verbal stems. Thus, all the violation items were equally bad at the token level (each prefix + stem had a morpheme co-occurrence probability of 0), with the differences between conditions being located at the abstract type level (grammatical category mismatch vs. putative semantic feature mismatch).

The stimuli also included grammatical filler items (n = 52) belonging to NC7, NC8, NC11, and NC14, to ensure an equal number of grammatical and violation trials.

Stems were selected based on their frequency in a ~4-million-word isiXhosa corpus (Berghoff 2023), and experimental items were matched on syllable length across conditions. An initial set of 30 items in each condition was selected and recorded in Audacity (version 3.1.3; Audacity Team 2021) by a native isiXhosa speaker. All items were reviewed to ensure that non-words did not contain hesitations or other audible indications of their ungrammaticality. The items were then normed by 10 additional native isiXhosa speakers, who listened to each word and rated, on a scale of 1–7, whether it was a real isiXhosa word (1 = “disagree strongly”, 7 = “agree strongly”). Following norming, 26 items were retained in each condition, for a total of 156 experimental items. All grammatical items received a mean rating of over 5, and all violation items received a mean rating of under 3.
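The norming-based retention criterion can be sketched as follows (a minimal illustration in Python with hypothetical ratings; the actual norming data and item names are in the OSF materials):

```python
# Sketch of the norming retention rule (hypothetical ratings, not the real data):
# grammatical items must average above 5 on the 1-7 scale, violation items below 3.

def retain(item_type, ratings):
    """Return True if an item passes the norming criterion for its type."""
    mean = sum(ratings) / len(ratings)
    if item_type == "grammatical":
        return mean > 5
    # semantic or syntactic violation items must be clearly rejected by raters
    return mean < 3

# A grammatical item rated by 10 norming participants (mean 6.1) is retained.
assert retain("grammatical", [6, 7, 5, 6, 7, 6, 5, 7, 6, 6]) is True
# A violation item that raters did not reject clearly enough (mean 3.3) is dropped.
assert retain("violation", [3, 4, 2, 5, 3, 4, 3, 2, 4, 3]) is False
```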

2.3 Procedure

The experiment was administered online via Gorilla (Anwyl-Irvine et al. 2020). An auditory (as opposed to a visual) lexical decision task was chosen based on potential variation among isiXhosa speakers in exposure to written isiXhosa. This decision was informed by results from a visual lexical decision study conducted in a similar context (Setswana in Botswana; Ciaccio et al. 2020), which suggest that variation in participants’ exposure to the written language may compromise data quality. In South Africa, while most children receive mother-tongue education for the first three years of schooling, isiXhosa is rarely used as an official medium of instruction beyond this point. From the age of 10, most isiXhosa speakers receive their schooling in English (Plüddemann 2015), and they may continue to use English for much of their reading and writing in adulthood. Thus, although our background data indicate that the participants were highly proficient in the language, with the majority of them having studied the language at school, the use of an auditory task mitigates concerns about modality influencing the results.

The task was divided into a practice round, a screening round, and the main experiment. Within each of these three phases, items were presented in random order, and participants were instructed to listen to each word and use the arrow keys to indicate whether what they heard was/was not a word in isiXhosa. At the beginning of a trial, a fixation cross (size: 120 pixels) was presented in the centre of the screen for 500ms. Subsequently, a volume symbol appeared in the centre of the screen and the audio began playing. At the offset of the audio stimulus, labels for the left and right arrow keys appeared on the screen in 50pt font (EWE = “yes”, HAYI = “no”; as shown in Figure 1). The mapping of “EWE”/“HAYI” to the left and right arrow keys was counterbalanced across participants. To ensure that participants heard the entire stimulus before giving their answer, they could only respond at audio offset. The screen timed out 2,000ms after audio offset, at which point the experiment automatically proceeded to the next trial. A progress bar was shown at the top of the screen.

Figure 1: Illustration of trial. Fixation cross shown for 500ms; volume symbol displayed while audio is playing; labels for left and right arrow keys displayed at audio offset (EWE = “yes”; HAYI = “no”). Timeout 2,000ms after audio offset.

The practice round consisted of five items. Participants then completed a ten-item screening task consisting of five words and five non-words. The screening task aimed to test general knowledge of isiXhosa morphosyntax; thus, words and non-words included a variety of licit and illicit affix–stem combinations. Non-words did not include the violation types targeted in the main task. During the practice and screening tasks, participants received feedback if they took too long to give their response: the text Ixesha liphelile (“Time’s up”) appeared in the centre of the screen, under the volume symbol, in 45pt font. Only participants who scored at least 7/10 on this screening task proceeded to the main task. The main experiment was divided into two blocks with a self-timed break in the middle. No feedback on answers was provided during the main experiment. Participants subsequently completed a short demographic questionnaire. The entire experiment took approximately 25 minutes to complete.

2.4 Analysis

The data were analysed in R (version 4.3.3; R Core Team 2024) using the tidyverse (version 2.0; Wickham et al. 2019), lme4 (version 4.1.1; Bates et al. 2015), lmerTest (version 3.1.3; Kuznetsova et al. 2017) and emmeans (version 1.10; Lenth 2016) packages. While both rejection rates and response time data were collected, we analyse only the rejection rate data because participants could respond only at audio offset, meaning they had the full duration of the stimulus to process the prefix violations and make their lexical decision. Moreover, prefixes were not spliced onto stems, resulting in variable timing of prefix offset across items and conditions. These design features make response time comparisons across conditions uninformative.

3 Results

The data from two participants who reported an age of 17, five participants who timed out on all trials, and three participants who scored below 80% on the grammatical filler items were removed from further analyses, leaving 90 participants in the final sample.

Grammatical filler items had a rejection rate of 3.8% (SE 0.003, 95% CI 3.2–4.3%) – that is, they were correctly accepted 96.2% of the time. Table 3 presents the mean rejection rate per condition for the experimental items.

Table 3: Mean rejection rate per condition with standard errors (SEs) and 95% confidence intervals (CIs).

Condition NC4 prefix (SE; 95% CI) NC10 prefix (SE; 95% CI)
Grammatical 3.3% (0.004; 2.5–4%) 3.8% (0.004; 3–4.6%)
Semantic violation 87.4% (0.007; 86–88.8%) 81.8% (0.008; 80.1–83.4%)
Syntactic violation 90.2% (0.006; 89–91.4%) 91.8% (0.006; 90.6–92.9%)
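The standard errors and confidence intervals in Table 3 are consistent with simple binomial formulas over trial-level responses; the sketch below (in Python for illustration) uses a trial count of 90 participants × 26 items per cell, which is our reconstruction from the design rather than a figure stated with the table:

```python
import math

def rate_se_ci(p, n, z=1.96):
    """Binomial standard error and Wald 95% CI for a proportion p over n trials."""
    se = math.sqrt(p * (1 - p) / n)
    return se, (p - z * se, p + z * se)

# NC10 grammatical condition: 3.8% rejection over ~90 participants x 26 items
se, (lo, hi) = rate_se_ci(0.038, 90 * 26)
print(round(se, 3), round(lo, 3), round(hi, 3))  # ≈ 0.004, 0.030, 0.046
```

These values match the NC10 grammatical row of Table 3 (SE 0.004; CI 3–4.6%) to rounding.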

All responses from the 90 participants whose data were retained were included in the rejection rate analysis. In this analysis, we include only the two violation conditions, as our main focus is on whether the difference in rejection rates across violation conditions – if any – differed across the two noun class prefixes (NC4 and NC10). We employed a logistic mixed effects regression model with Violation Condition (Semantic Violation, Syntactic Violation; sum coded as –0.5 and 0.5) and NC Prefix (4, 10; sum coded as –0.5 and 0.5) as fixed effects, as well as their interaction. The maximal random effects structure that would converge included random intercepts for participants and items and by-participants random slopes for Violation Condition and NC Prefix.6 Table 4 presents the model results.

Table 4: Rejection rate model results.

Term Estimate 95% CI Std. Error Z p
Intercept 2.65 2.38; 2.94 0.14 18.33 <.001
Violation condition (Semantic vs Syntactic violation) 0.81 0.40; 1.23 0.21 3.83 <.001
NC prefix (NC4 vs NC10) –0.06 –0.48; 0.35 0.21 –0.30 .76
Violation condition × NC prefix 0.90 0.11; 1.69 0.40 2.24 .02
Random effects
Variance S.D. Corr.
Participant (Intercept) 0.89 0.94
Item (Intercept) 0.87 0.93
Violation condition | Participant 0.15 0.39 0.48
NC prefix | Participant 0.26 0.51 –0.08
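The contrast coding described in the model specification can be made concrete with a small sketch (in Python for illustration only; the analysis itself was run in R with lme4). With sum coding at –0.5/0.5, each main effect is averaged over the levels of the other factor, and the interaction term estimates the difference-of-differences on the log-odds scale:

```python
# Sum coding used in the rejection-rate model (level labels follow the text):
CODES = {
    "Semantic violation": -0.5,
    "Syntactic violation": 0.5,
    "NC4": -0.5,
    "NC10": 0.5,
}

def predictors(condition, prefix):
    """Numeric predictors for one trial: two sum-coded factors plus their product."""
    c, p = CODES[condition], CODES[prefix]
    return {"condition": c, "prefix": p, "interaction": c * p}

print(predictors("Semantic violation", "NC10"))
# {'condition': -0.5, 'prefix': 0.5, 'interaction': -0.25}
```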

Rejection rates were significantly higher in the syntactic violation condition compared to the semantic violation condition, while they did not differ across the two noun classes. Additionally, there was a significant interaction between Condition and Noun Class prefix. Follow-up pairwise comparisons conducted using the emmeans package (Lenth 2016) showed that rejection rates did not differ significantly across the two conditions for NC4 (β = –0.36, SE = 0.29, z = –1.23, p = .22), while for NC10, rejection rates for semantic violation items were significantly lower than for syntactic violation items (β = –1.26, SE = 0.29, z = –4.3, p < .001). Estimated marginal means are plotted in Figure 2.

Figure 2: Estimated marginal means, rejection rates.
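The pairwise comparison statistics reported above can be recovered from the estimates and standard errors with a Wald z-test; the following sketch (in Python, using the rounded values from the text) reproduces the reported z and p values to rounding:

```python
import math

def wald_z_test(beta, se):
    """Wald z statistic and two-sided normal p-value for a single contrast."""
    z = beta / se
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p

z, p = wald_z_test(-1.26, 0.29)   # NC10: semantic vs syntactic violation
print(round(z, 2), p < 0.001)     # ≈ -4.34, True

z, p = wald_z_test(-0.36, 0.29)   # NC4: semantic vs syntactic violation
print(round(z, 2), round(p, 2))   # ≈ -1.24, 0.21
```

The small discrepancies with the reported values (z = –4.3, p = .22) reflect rounding of the published estimates.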

4 Discussion

In this study, we used an auditory lexical decision task to investigate the processing of novel NC prefix + stem combinations in isiXhosa, created using two NC prefixes (NC4 and NC10) that differ in the semantic coherence of their associated NCs. We contrasted responses in a putative semantic violation condition, where the NC prefix was placed on a noun belonging to another NC, to responses in a syntactic violation condition, where the NC prefix was placed on a verb stem. Our analysis revealed a significant interaction between NC and violation type. Follow-up pairwise comparisons showed that for the NC4 items, rejection rates did not differ across the semantic and syntactic violation conditions, with both types of item being rejected about 90% of the time, while for the NC10 items, rejection rates were significantly lower for the semantic than the syntactic violation condition (about 80% vs 90%). This finding suggests that NC10 + noun stem pseudowords were less likely to be perceived as violations of the prefix’s selectional restrictions than NC10 + verb stem pseudowords.

These results are consistent with the hypothesis that some isiXhosa noun classes (and Bantu noun classes more generally) are associated with semantic generalisations while others are not. The class 4 prefix imi- is rejected by native speakers just as robustly when attached to an animate noun stem as when attached to a verb stem, suggesting isiXhosa speakers represent imi- as requiring stems that are [-animate, +N]. By contrast, no such strong semantic generalisation about the distribution of the class 10 prefix ii(n)- seems to be available. Instead, speakers must rely on their familiarity with each of the noun stems they encounter after an NC10 prefix to judge whether it is an acceptable combination.

Note that several noun stems can occur in more than one NC (e.g., -ntu appears in class 1/2 as um-ntu ‘person’ and aba-ntu ‘people’ and also in class 14 as ulu-ntu ‘humanity, community’). Additionally, NC prefixes can also perform grammatical functions such as nominalisation (e.g., um-hamb-i ‘traveller’ from -hamb- ‘go’; Mletshe, 2017). As such, uncertainty about whether a particular NC prefix + noun stem is grammatical cannot be resolved just by knowing which NC the noun stem usually occurs in. Given that NC9/10 is the default class for borrowings, we would expect that novel NC10 items are particularly difficult to evaluate – and indeed we find a lower rejection rate for the NC10 semantic violation items.

This contrast between the two NC prefixes is notably consistent with the results of Kanampiu et al. (2025a, b), investigating Kîîtharaka, who likewise find that only a few NCs have corpus distributional statistics and evoke experimental patterns consistent with speakers having learned robust, deterministic semantic generalisations about them. This is particularly encouraging, as we were not aware of Kanampiu and colleagues’ work when designing our study, Kîîtharaka and isiXhosa are not closely related, and the research methods employed vary considerably between their project and ours.

The single-word auditory lexical decision task we present here has a significant advantage over the tasks employed by Kanampiu et al. (2025a) and Lawyer et al. (2024): the materials are quite simple to generate, and the experiment is simple to run, allowing for extensions to other NC prefixes in isiXhosa, any other Bantu language, or indeed to languages with other types of gender system, to test the robustness of the pattern. For example, semantic violation items could be constructed for NC2(a), which, like NC4, would be expected to have high semantic coherence, and NC8, which would be expected to have lower semantic coherence, and thus pattern with the NC10 items in our experiment.7 It would also be possible to vary the semantic coherence of the sets of noun stems employed. Here, we chose sets of noun stems that were either +animate or -animate to create the strongest possible mismatches with the putative semantic features of the NC prefixes. However, the narrower semantic domains identified in the Kanampiu et al. (2025a, b) work, such as ‘human’, ‘tree’, and ‘fruit’, could also be investigated. The experiment we present here can therefore be seen as a template or blueprint for future comparative within- and across-language projects.

This paper expands the small but growing body of quantitative research on grammatical gender processing in NC languages, with results that align encouragingly with new corpus and language production work in other Bantu languages and lay the groundwork for future comparative cross-linguistic experimental work investigating NC and gender systems, ultimately shedding new light on how and when conceptual semantic generalisations are grammaticalized.

Appendix

Figure A1: Forest plot of participant-level random effects. Each dot represents an individual participant’s estimated random effect, with horizontal lines indicating 95% confidence intervals. The vertical line at 1 marks the overall mean. Points to the left of 1 (red) indicate participants with below-average estimates, and points to the right (blue) indicate above-average estimates. The first two panels show the random slopes for Violation Condition and Noun Class, which display little deviation around the mean. The third panel shows the random intercept for Participant, reflecting participants’ overall rejection rates.

Table A1: Output of model including self-rated isiXhosa proficiency as a covariate.

Term Estimate 95% CI Std. Error Z p
Intercept 2.73 2.44; 3.02 0.15 18.35 <.001
Proficiency 0.13 –0.08; 0.34 0.11 1.20 .23
Violation condition: (Semantic vs Syntactic violation) 0.82 0.38; 1.25 0.22 3.71 <.001
NC prefix (NC4 vs NC10) –0.02 –0.45; 0.42 0.22 –0.08 .93
Proficiency × Violation condition –0.11 –0.27; 0.05 0.08 –1.30 .19
Proficiency × NC prefix –0.06 –0.24; 0.12 0.09 –0.64 .52
Violation condition × NC prefix 0.93 0.10; 1.75 0.42 2.21 .03
Proficiency × Violation condition × NC prefix 0.19 –0.08; 0.47 0.14 1.39 .16
Random effects
Variance S.D. Corr.
Participant (Intercept) 0.87 0.93
Item (Intercept) 0.94 0.97
Violation condition | Participant 0.13 0.36 0.46
NC prefix | Participant 0.24 0.49 –0.13

Abbreviations

SM: subject marker

Data availability

The stimuli, data, and code are available at https://osf.io/gjz6e.

Ethics and consent

Ethics approval was granted by the Research Ethics Committee: Social, Behavioural and Education Research at the first author’s institution (project number 26175). All participants gave informed consent, and they received monetary compensation for their participation.

Funding information

This work was supported by the South African National Research Foundation under grant 138180 and the United Kingdom Economic and Social Research Council under grant ES/V000012/1.

Acknowledgements

The authors would like to thank Unathi Ngumbela for assistance with stimulus development, as well as the editor and reviewers for their helpful feedback on the manuscript.

Competing interests

The authors have no competing interests to declare.

Notes

  1. In glosses, numerals indicate noun class (number + gender). [^]
  2. The productiveness of the putative feature [human] was only reliable if pejorative human-denoting nouns such as kî-ana (child + class 7 prefix = ‘ugly child’) were factored out of the analysis. Additional evaluative semantic features such as augmentative, pejorative, and diminutive were also assessed as potentially productive. [^]
  3. The evaluative features augmentative, pejorative, and diminutive were also used reliably in this way. [^]
  4. This condition could equally be called ‘putative semantic violation items’, since it is precisely whether isiXhosa speakers interpret NC prefixes as having semantic features that is under investigation. [^]
  5. We could not use NCs whose prefixes are also used as subject concords (prefixes placed on the verb that show number/gender agreement with the subject of the sentence), as this would introduce ambiguity in the syntactic violation items, where the NC prefix is placed on a verb stem. The subject concords for NC4 and NC10 are i- and zi-, respectively, as exemplified in the sentences below:
      (i) Imi-gewu i-sela ama-nzi.
          4-criminal 4sm-drink 6-water
          ‘Criminals drink water.’
      (ii) Ii-ngonyama zi-sela ama-nzi.
          10-lion 10sm-drink 6-water
          ‘Lions drink water.’
    [^]
  6. See the Appendix for plots of individual variation in participant responses and a statistical analysis including isiXhosa proficiency in the model, addressing a reviewer concern about proficiency as a possible factor in the results. [^]
  7. While the izi(n)- prefix is also used with some nouns belonging to NC10, this is only the case for monosyllabic noun stems. Stems of two syllables and longer always take ii(n)-. Thus, occurrences of izi- with multisyllabic stems should be understood as belonging to NC8. [^]

References

Anwyl-Irvine, Alexander L. & Massonnié, Jessica & Flitton, Adam & Kirkham, Natasha & Evershed, Jo K. 2020. Gorilla in our midst: An online behavioral experiment builder. Behavior Research Methods 52(1). 388–407. DOI:  http://doi.org/10.3758/s13428-019-01237-x

Audacity Team. 2021. Audacity: Free audio editor and recorder. Retrieved from https://audacityteam.org/

Bates, Douglas & Mächler, Martin & Bolker, Ben & Walker, Steve. 2015. Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67(1). 1–48. DOI:  http://doi.org/10.18637/jss.v067.i01

Berghoff, Robyn. 2023. Deriving lexical statistics for psycholinguistic research on isiXhosa. Journal of the Digital Humanities Association of South Africa 4(01). 1–7. DOI:  http://doi.org/10.55492/dhasa.v4i01.4442

Berghoff, Robyn & Bylund, Emanuel. 2025. Diversity in research on the psychology of language: A large-scale examination of sampling bias. Cognition 256. 1–17. DOI:  http://doi.org/10.1016/j.cognition.2024.106043

Carstens, Vicki. 2024. The grammar of gender: Insights from Bantu asymmetries of AGR with conjoined subjects. Unpublished manuscript, University of Connecticut.

Cayado, Dave Kenneth Tayao & Wray, Samantha & Chacón, Dustin Alfonso & Lai, Marco Chia-Ho & Matar, Suhail & Stockall, Linnaea. 2024. MEG evidence for left temporal and orbitofrontal involvement in breaking down inflected words and putting the pieces back together. Cortex 181. 101–118. DOI:  http://doi.org/10.1016/j.cortex.2024.08.010

Ciaccio, Laura A. & Kgolo, Naledi & Clahsen, Harald. 2020. Morphological decomposition in Bantu: A masked priming study on Setswana prefixation. Language, Cognition and Neuroscience 35(10). 1257–1271. DOI:  http://doi.org/10.1080/23273798.2020.1722847

Demuth, Katherine. 2000. Bantu noun class systems: Loanword and acquisition evidence of semantic productivity. In Senft, Gunter (ed.), Systems of nominal classification, 270–292. Cambridge: Cambridge University Press.

Jonas, Khanyiso. 2018. Perceived object similarity in isiXhosa: Assessing the role of noun classes. Stellenbosch: Stellenbosch University MA thesis.

Kanampiu, Peter Njue & Martin, Alexander & Culbertson, Jennifer. 2025a. Semantic and morphophonological productivity in the Kîîtharaka gender system: A quantitative study. Glossa: A Journal of General Linguistics 10(1). 1–33. DOI:  http://doi.org/10.16995/glossa.11755

Kanampiu, Peter Njue & Martin, Alexander & Culbertson, Jennifer. 2025b. Experimental evidence for semantic and morphophonological productivity in Kîîtharaka noun classes. Glossa Psycholinguistics 4(1). 1–45. DOI:  http://doi.org/10.5070/G6011.20527

Katamba, Francis. 2006. Bantu nominal morphology. In Nurse, Derek & Philippson, Gérard (eds.), The Bantu languages, 103–120. London: Routledge.

Kramer, Ruth. 2020. Grammatical gender: A close look at gender assignment across languages. Annual Review of Linguistics 6(1). 45–66. DOI:  http://doi.org/10.1146/annurev-linguistics-011718-012450

Kuznetsova, Alexandra & Brockhoff, Per B. & Christensen, Rune H. B. 2017. lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software 82(13). 1–26. DOI:  http://doi.org/10.18637/jss.v082.i13

Lawyer, Laurel A. & O’Gara, Fate & Ngoboka, Jean-Paul & van Boxtel, Willem S. & Jerro, Kyle. 2024. Meaning or morphology: Individual differences in the categorization of Kinyarwanda nouns. Glossa Psycholinguistics 3(1). 22: 1–27. DOI:  http://doi.org/10.5070/G6011226

Lenth, Russell. 2016. Least-Squares Means: The R Package lsmeans. Journal of Statistical Software 69(1). 1–33. DOI:  http://doi.org/10.18637/jss.v069.i01

Maho, Jouni F. 1999. A comparative study of Bantu noun classes. Gothenburg: Acta Universitatis Gothoburgensis.

Manouilidou, Christina & Dolenc, Barbara & Marvin, Tatjana & Pirtošek, Zvezdan. 2016. Processing complex pseudo-words in mild cognitive impairment: The interaction of preserved morphological rule knowledge with compromised cognitive ability. Clinical Linguistics & Phonetics 30(1). 49–67. DOI:  http://doi.org/10.3109/02699206.2015.1102970

Mletshe, Loyiso. 2017. Deverbal nominals derived from intransitive state verbs in isiXhosa: A generative lexicon approach. South African Journal of African Languages 37(1). 29–39. DOI:  http://doi.org/10.1080/02572117.2017.1316924

Morrison, Michelle E. 2018. Beyond derivation: Creative use of noun class prefixation for both semantic and reference tracking purposes. Journal of Pragmatics 123(1). 38–56. DOI:  http://doi.org/10.1016/j.pragma.2017.10.009

Msaka, Peter. 2019. Nominal classification in Bantu revisited: The perspective from Chichewa. Stellenbosch: Stellenbosch University dissertation.

Neophytou, Kyriaki & Manouilidou, Christina & Stockall, Linnaea & Marantz, Alec. 2018. Syntactic and semantic restrictions on morphological recomposition: MEG evidence from Greek. Brain and Language 183(4). 11–20. DOI:  http://doi.org/10.1016/j.bandl.2018.05.003

Plüddemann, Peter. 2015. Unlocking the grid: Language-in-education policy realisation in post-apartheid South Africa. Language and Education 29(3). 186–199. DOI:  http://doi.org/10.1080/09500782.2014.994523

R Core Team. 2024. R: A language and environment for statistical computing. Retrieved from https://www.R-project.org/

Stockall, Linnaea & Manouilidou, Christina. 2014. Teasing apart syntactic category vs. argument structure information in deverbal word formation: A comparative psycholinguistic study. Italian Journal of Linguistics 26(2). 71–98.

Taraldsen, Knut Tarald & Taraldsen Medová, Lucie & Langa, David. 2018. Class prefixes as specifiers in Southern Bantu. Natural Language & Linguistic Theory 36. 1339–1394. DOI:  http://doi.org/10.1007/s11049-017-9394-8

Wickham, Hadley & Averick, Mara & Bryan, Jennifer & Chang, Winston & McGowan, Lucy D’Agostino. & François, Romain & Grolemund, Garrett & Hayes, Alex & Henry, Lionel & Hester, Jim & Kuhn, Max & Lin Pedersen, Thomas & Miller, Evan & Bache, Stephan Milton & Müller, Kirill & Ooms, Jeroen & Robinson, David & Seidel, Dana Paige & Spinu, Vitalie & Takahashi, Kohske & Vaughan, Davis & Wilke, Claus & Woo, Kara & Yutani, Hiroaki. 2019. Welcome to the tidyverse. Journal of Open Source Software 4(43). 1–6. DOI:  http://doi.org/10.21105/joss.01686