One of the chief tasks that language users face when interpreting (di)transitive utterances is determining who did what to whom. To tackle this task, they often rely on four main morphosyntactic strategies (Malchukov et al. 2010; Lamers & de Hoop 2005; Lamers & De Swart 2012), viz. constituent order, nominal marking, verbal agreement, and prepositional marking. Apart from these morphosyntactic strategies, language comprehenders can usually also rely on semantic-pragmatic biases. That is, subjects and objects in transitive clauses have been shown to correlate cross-linguistically with opposing features like animacy or givenness: while agents/subjects are much more likely to be animate and discourse-accessible, themes/objects tend to be inanimate and more often discourse-new (Bornkessel-Schlesewsky & Schlesewsky 2009; Czypionka et al. 2017). Mahowald et al. (2022) confirm that English speakers are able to correctly identify subjects and objects in transitive events in about 90 percent of cases based on word meaning only, rendering any morphosyntactic marking of this contrast essentially superfluous.
If semantic-pragmatic biases usually suffice anyway, the fact that there are additional (and even multiple) morphosyntactic strategies performing the same function may then seem to go against a common assumption in linguistics, namely that grammar is organised in such a way that it facilitates efficient usage (e.g. Gibson et al. 2019; Hahn et al. 2020; 2021). Still, it may be the case that such redundancy is only present at the systemic level, but that in any given utterance, there is only one strategy at play, meaning a system does not exhibit syntagmatic redundancy (loosely following the definition proposed in Leufkens 2020: 83–84). There are indications that this is indeed frequently the case for the identification of argument roles in transitive clauses. For example, morphosyntactic strategies are often only applied in contexts where semantic-pragmatic information fails to disambiguate. This holds for phenomena such as differential object marking, where prepositional marking of the object argument is preferentially used in ambiguous, atypical or unpredicted contexts (e.g. when the object is animate).
On the other hand, languages have been shown to abound with redundant marking of relations within individual utterances even in cases where ambiguity is either clearly resolved by other strategies already, or where there is no clear advantage for language production (Van de Velde 2014; Levshina 2020; 2021; Tal et al. 2021; Tal & Arnon 2022). This can be explained by two main benefits of redundancy, viz. robustness against information loss, and learnability. The former entails that redundant marking prevents information from being easily lost in the noisy language channel (Van de Velde 2014; Winter 2014; Levshina 2020; 2021). Learnability means that the availability of multiple cues on how to interpret an utterance facilitates learning (Tal & Arnon 2022).
We then have two different perspectives to approach our own data from. On the former view, which we refer to as the ‘efficiency account’, morphosyntactic marking on a syntagmatic level should be applied as efficiently (and thus as sparingly) as possible. Following the latter view, which we will here call the ‘robustness account’ for short, syntagmatic redundancy should be prevalent, without constraints. The goal of this study is to test both accounts from a comparative and diachronic perspective. More specifically, we investigate the use of multiple strategies to distinguish agents and recipients in transfer-events, e.g. with verbs of giving as in (1), by comparing Present Day English to Present Day Dutch, and by tracking the historical development of English.
(1) They gave us cake.
We choose this problem, viz. this particular case of argument disambiguation, as it lets us investigate formal redundancy without taking into account semantic-pragmatic biases: both agents and recipients are prototypically animate, sentient and volitional (e.g. Newman 1998; Naess 2007; Haspelmath 2015). In contrast to the distinction between agents and themes or between themes and recipients (Sedlak 1975: 125; Kittilä 2006: 292; Malchukov et al. 2010: 10; and most recently Mahowald et al. 2022), disambiguating these roles and determining who gave (what) to whom is therefore more crucially based on morphosyntactic strategies. We choose Dutch and English as they are known to employ the same morphosyntactic strategies but to differing extents (with e.g. constituent order being almost entirely fixed in Present Day English, but more flexible in Dutch), which we predict also affects the extent and specific distribution of redundant marking. Historical English is included as a further, diachronic point of comparison, since the strategies have changed substantially over time (with e.g. constituent order having rigidified).
The main contribution of this paper is that it sheds further light on a crucial question in language research, viz. how redundant language really is, on the basis of synchronic, comparative and diachronic data. On the one hand, the paper gives insights into the precise use of redundancy in two related languages, English and Dutch, as well as into fluctuations in the degree of redundancy in historical English. Furthermore, it provides a detailed look at the particular strategies involved in argument marking in these languages and stages, which is of relevance to research into both languages. On the other hand, the study addresses questions relevant to a wide range of linguistic discussions: our investigation tests claims such as Sadock’s (2012: 225), who posits that “[r]edundancy is in fact a fundamental feature of the design of language”, against empirical data, assessing whether redundancy is as widespread as often assumed, and serves an important purpose (i.e. robustness), or whether there is more evidence for counter-proposals, suggesting that redundancy is inefficient and should thus be avoided. Overall, the present study adds to the growing body of research in usage-based linguistics as well as complex adaptive systems approaches, where language representation in speaker minds and the population level as a whole is assumed to be directly shaped and influenced by language use on the individual level – in our case, the tolerance for or aversion to potential ambiguity in individual instances and contexts of language use is presumed to be a driving factor in determining the degree of redundant strategy use in the entire system.
The paper is structured as follows. Section (2) provides the theoretical background. It first introduces redundancy and its (dis-)advantages in more detail (2.1), before turning to argument disambiguation, specifically agent-recipient disambiguation in (historical) English and Dutch (2.2). This gives the backdrop and main agenda for the corpus studies on transfer-events presented in the sections following: Section (3) presents the comparative study, investigating the use of the four morphosyntactic strategies used in Present Day English and Present Day Dutch. Section (4) then describes the historical study, which tracks the use of the four strategies in the history of English. Section (5) discusses the implications of our findings in light of the competition between efficiency and robustness. Finally, Section (6) concludes the paper.
2 Redundancy in agent-recipient disambiguation
2.1 Systemic and syntagmatic redundancy between efficiency and robustness
As already mentioned in Section (1) above, one basic distinction to be made in regard to redundancy is that between systemic and syntagmatic redundancy. The former is a typical trait of degeneracy, a phenomenon typically found in complex adaptive systems, and thus also in language (Van de Velde 2014; Winter 2014; Monaghan 2017). Degenerate systems are characterized by “the ability of elements that are structurally different to perform the same function or yield the same output” (Edelman & Gally 2001: 13763). Importantly, degenerate systems exhibit many-to-many relationships rather than redundancy in the strictest sense, as any strategy typically fulfils more than one function in the system, and any function is fulfilled by more than one strategy. For example, there are two distinct ways to form the past tense in most Germanic languages, by means of ablaut, e.g. sing ~ sang, or suffixation, e.g. kick ~ kicked, with the respective outputs not differing in function. At the same time, ablaut is not restricted to marking past tense, but can also be used for other relations.
The second main type of redundancy is syntagmatic redundancy, which refers to redundancy as “the repetition of information” (Leufkens 2020: 81). This involves using different strategies simultaneously, in the same utterance or element, to perform the same function. This type of redundancy is not necessarily present in a degenerate system. In other words, a given language may feature several strategies to express a specific relation, and thus constitute a redundant/degenerate system, even if the respective strategies are rarely or never combined in the same utterance. For instance, strategies of forming a past tense are hardly ever combined in the same word in Dutch, with double forms such as begin ~ begonde, lit. ‘begin ~ began-ed’ being a clear exception (De Smet 2021: 83).
As for the potential costs or benefits of any kind of redundancy, whether systemic or syntagmatic, there are arguments to assume that it should be globally dis-preferred. A large body of research demonstrates that evolutionary pressures such as efficiency in communication (largely used as synonymous with production economy here) shape language, and may be responsible for many linguistic universals (e.g. Christiansen & Chater 2008; Bybee 2010; Culbertson & Kirby 2016; Gibson et al. 2019). A prime case of the seeming impact of efficiency, elaborated on in e.g. Fedzechkina et al. (2012; 2017), is that of trade-offs between strategies of argument disambiguation like case marking or constituent order. There appears to be a tendency to avoid systemic redundancy in argument disambiguation in that the presence of one strategy in a language system presumably often correlates with the absence of another. Such inverse relations are cross-linguistically common: for example, English or Mandarin Chinese have fixed constituent order, but virtually no case marking, whereas Latin, which featured an elaborate case marking system, was highly flexible in ordering (e.g. Sinnemäki 2008; 2010; 2014). The emergence of such trade-off scenarios can also be observed in the diachrony of many languages; historical evidence suggests that the loss of one strategy is often compensated for by another, or vice versa, that the strengthening of one strategy goes hand in hand with a weakening of others (Siewierska 1998; Koplenig et al. 2017). The history of English is an often-cited illustration of such a development – earlier English used to be largely reliant on inflections and had comparatively free ordering, whereas the opposite holds today (e.g. Baugh & Cable 1993). Artificial language experiments such as those reported on in Fedzechkina et al. (2012; 2017) support the assumption of strategy trade-offs as quasi-universals fueled by a preference for efficient and thus presumably non-redundant communication.
Still, there clearly exist exceptions to such tendencies, like Modern Icelandic, which has preserved a relatively intact case system, but also has comparatively fixed constituent order (Maling & Zaenen 1990). Furthermore, and perhaps more crucially, these trade-offs are rarely absolute, or even pervasive in the first place. Specifically, the recent typological studies presented in Levshina (2020; 2021) cast doubt on straightforward conceptualisations of trade-offs, and indicate that correlations are typically more complex, including not necessarily being bi-directional. Indeed, it seems that “[t]he only thing disfavoured by […] languages is the absence of any cues” (Levshina 2020: 73), but multiple (and thus redundant) strategy use is prevalent on a systemic level (also Hengeveld & Leufkens 2018). Proponents of a positive view on systemic redundancy and degeneracy in diachrony may furthermore point out that different strategies often remain present in languages for an extended time rather than being completely replaced (cf. Van de Velde 2014). For instance, English still distinguishes between subject and object form for most of its personal pronouns, in spite of its constituent order becoming largely fixed.
Instead of straightforward replacement, perhaps it is then more accurate to say that strategies may come to be drawn upon to different degrees, and that strategy trade-offs are probabilistic rather than categorical. In other words, a more nuanced version of this argument is that systemic redundancy is expected as the norm, but that syntagmatic redundancy should still be avoided, or at least used only when directly purposeful. This is in line with previous research into the advantages of systemic redundancy (or more accurately, degeneracy), viz., system robustness and evolvability (Van de Velde 2014; Winter 2014). System robustness means that if for some reason, a specific strategy cannot be used, the system does not break down. For instance, a loanword such as Dutch liken ‘to mark something as liked on social media’ is not directly amenable to any of the Dutch ablaut classes, due to the English [aɪ]-diphthong not being present in (Standard) Dutch. Still, Dutch speakers can form a past tense from it, by employing the suffix -te. Meanwhile, evolvability means that a language is more adaptable to changes in its environment. A sudden influx of L2-speakers may e.g. be problematic for a language that solely employs an elaborate case system to mark argument roles, as it may be difficult for the new speakers to acquire the system rapidly. By contrast, a language that can employ both a case system and constituent order can more readily accommodate the new speakers.
Even if systemic redundancy may thus have clear advantages and is expected to be predominant, syntagmatic redundancy, by contrast, can be argued to come with disadvantages “as it violates the principles of economy and transparency” (Leufkens 2020: 82; also e.g. Van Everbroeck 2003: 3). That is, redundant marking is often taken to be dis-preferred due to adding complexity and needing greater effort on the side of the language producer (cf. Frank & Jaeger 2008; Sinnemäki 2009; Lupyan & Dale 2010; Kurumada & Jaeger 2015; Leufkens 2015). Furthermore, redundant morphosyntactic marking has been said to also increase comprehension issues (Tanner & Bulkes 2015: 1755; also Montgomery 2000; VanPatten 2004; Caballero & Kapatsinski 2015). A general avoidance of redundancy in syntagmatic strategy use is supported by phenomena such as differential object marking in various languages, where prepositional marking is only drawn on when other strategies yield unclear cues (e.g. Tal et al. 2022; Levshina 2021), or instances of ‘word order freezing’, where fixed constituent order is supposedly only employed in cases of ambiguous case marking (e.g. Flack 2007; Mahowald 2011; Bouma & Hendriks 2012).
To sum up, the first main hypothesis to be assessed in this paper is that while we expect systemic redundancy as the default for all languages and language stages investigated, we anticipate redundancy in the same utterance to be either absent or very limited. For our contrasting hypothesis, we pursue the viewpoint that both systemic and syntagmatic redundancy are beneficial to language. We therefore expect syntagmatic redundancy to be commonplace. Two crucial advantages of syntagmatic redundancy are robustness against information loss, and learnability. Robustness against information loss entails that if a signal is defective, which is often the case in the noisy channels of natural language, there is a greater chance that the information contained in the signal nonetheless reaches the receiver intact (Fedzechkina et al. 2012: 17897; Levshina 2021: 3). For example, if background noise perturbs the acoustic signal in a way that a comprehender is unable to distinguish a subject and object pronoun form, another strategy like verbal agreement could be drawn on to disambiguate, thus ensuring robust transmission. Note that such noise is not just present in spoken language, but in any channel of natural language – or in fact, in any channel transmitting information in the physical world. For written language, sources of noise include typos, bad printing quality or imperfect eyesight, while for sign language, they may include sore muscles and visual clutter.
In addition to robustness, syntagmatic redundancy has also been shown to be advantageous in the context of learnability: redundant marking appears to increase the learnability of individual features and entire languages, as well as the speed and accuracy of learning (Tal et al. 2021; 2022; Tal & Arnon 2022; also Kempe & Brooks 2001; Bahrick et al. 2004; Taraban 2004; Yoshida & Smith 2005; Sloutsky & Robinson 2013; Monaghan 2017). Recent evidence furthermore indicates that redundant cues are particularly frequently used and helpful with non-native speakers of languages (e.g. Gibson et al. 2019; Tal & Arnon 2022). This directly correlates with the assumption of redundancy as a facilitator in contexts where understandability is reduced – both properties of the environment (such as a noisy background) and properties of the comprehender (e.g. mastery of the language, sense of hearing) may prevent or hinder information transfer. Additionally, there is evidence that, contrary to the claims mentioned above, syntagmatic redundancy does not in fact increase comprehension difficulties, but instead mostly decreases them. The reason would be that syntagmatic redundancy makes utterances more salient, distinctive, and hence more processable for the comprehender (e.g. Nichols 2009; Gibson et al. 2013b; 2019).
There are essentially two routes through which the advantages of robustness and learnability could induce syntagmatic redundancy in language use, or the advantage of efficiency could lead to a lack of syntagmatic redundancy, mutatis mutandis. In the first, which we call the ‘ad-hoc’ route, language producers would favour producing utterances that are redundant to a high degree, and would therefore employ additional disambiguation strategies in cases where that redundancy is lacking. This route would hence require the use of disambiguation strategies to be optional for the language producer. In the second route, which we call the ‘systemic’ route, the advantages would put an evolutionary pressure on the language system to develop in such a way that a high degree of redundancy is always guaranteed in actual language use. For instance, the case system would collapse in such a way that the other strategies are already set to take over (van Trijp 2013). This second route would not involve any optionality in the use of disambiguation strategies on the part of the language producer, as when to use which strategies would be determined by the system. While the present study is not directly focused on comparing both routes, we will deal with this matter when comparing English to Dutch, as the Dutch system allows the language producer more freedom in whether to employ specific strategies (see below).
In sum, the present study largely takes systemic redundancy as a given, and mainly focuses on the question of whether syntagmatic redundancy is prevailing in language, and whether different languages show varying degrees of syntagmatic redundancy. Following a decidedly usage-based, complex adaptive systems approach to language (e.g. Steels 2000; Beckner et al. 2009; Bybee 2010; Diessel 2019; Schmid 2020), we furthermore assume that syntagmatic redundancy may be driven by properties relating to individual language users and pressures acting on individual language users, which shape the properties of the linguistic system as a whole. Before we present our study and its results in the latter part of the paper, the following section briefly provides some more detailed background for the test case at hand, viz. agent-recipient disambiguation in Present Day English and Dutch, as well as historical English.
2.2 Strategies for agent-recipient disambiguation in English, Dutch, and the history of English
Our case study to assess the plausibility of the efficiency versus the syntagmatic redundancy account is morphosyntactic redundancy in participant role marking in Present Day Dutch and English ditransitive clauses, as well as in historical English. More precisely, we investigate the interaction between strategies used to distinguish agents and recipients in transfer-events, e.g. with verbs of giving as in (2), in Present Day English, Present Day Dutch, and in Middle English.
(2) They give some cake to the student.
In order to perform the function of argument role disambiguation, users of these languages have the following four morphosyntactic strategies at their disposal, evidencing systemic redundancy:
(i) constituent order, i.e. the agent and recipient are put in a strict order,
(ii) nominal marking, i.e. case marking, or distinct subject vs object pronoun forms,
(iii) verbal agreement, i.e. the verb agrees with the agent, but not with the recipient,
(iv) prepositional marking, i.e. the recipient is introduced by a preposition.
In Present Day English, constituent order is near-categorically fixed to SVO, and position can therefore almost always be drawn on for disambiguation. For instance, in (3), only constituent order is at play, with the first argument being identified as the agent, and the post-verbal element as the recipient due to ordering principles. In this utterance, none of the other three strategies distinguishes between agent and recipient at all. By contrast, the sentence in (4) features two strategies; in addition to constituent order, nominal marking, or rather, pronominal marking, signals the distinction. These two examples also showcase the extent to which nominal marking can disambiguate in Present Day English: as illustrated in (3), English today does not have a productive case marking system, in that nominal arguments are never distinguished by morphological form in any context. However, formal contrasts are found in parts of the pronominal system, in that all personal pronouns other than 3rd person singular neuter it and 2nd person singular/plural you differ in form depending on whether they are used in subject or object function. Accordingly, in (4) the agent she is not only identifiable based on position in the clause, but can also be distinguished from the recipient because it appears in subject form.
(3) The lecturer gave the students cake.
(4) She gave the students cake.
Verbal inflection or agreement also plays a role in some instances. In (5), the form of the verb disambiguates agent and recipient, as it needs a 3rd person singular NP-argument to agree with. Since only she but not the students fulfils this requirement, the subject can be unambiguously determined. By contrast, no cue from inflection can be gathered from the verbs in (3)–(4), as either of the arguments agrees with the past tense form gave. In addition to illustrating how verb inflection disambiguates, example (5) furthermore reflects three-fold redundant strategy use. While (3) only features one strategy, two strategies distinguish agent from recipient in (4), and three do so in (5).
(5) She gives the students cake.
Finally, recipients in Present Day English may be introduced by the preposition to, as in (6), where all four strategies are at play. The distribution of prepositional versus nominal recipients and the factors guiding this variation, a phenomenon commonly referred to as the dative alternation, has received much attention in the literature. Studies such as Bresnan et al. (2007), among many others, demonstrate that the presence of the preposition is often lexically conditioned, but also affected by semantic and information-structure related features of the objects, such as animacy, givenness or length.
(6) She gives cake to the students.
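The step from one to four simultaneous strategies in examples (3)–(6) can be made concrete with a small counting sketch. It is purely illustrative and not the annotation procedure used in the corpus studies: the class and function names are hypothetical, constituent order is simply assumed to disambiguate in every unmarked clause, and verbal agreement is reduced to the 3rd person singular contrast described above (modals and forms of be are ignored).

```python
from dataclasses import dataclass
from typing import List, Optional

# Personal pronoun forms that differ between subject and object use in
# Present Day English (all personal pronouns except 'it' and 'you').
CASE_MARKED_PRONOUNS = {"i", "me", "he", "him", "she", "her",
                        "we", "us", "they", "them"}


@dataclass
class Argument:
    form: str             # surface form, e.g. "she" or "the students"
    third_singular: bool  # True if the argument is 3rd person singular


@dataclass
class Clause:
    agent: Argument
    recipient: Argument
    # True for forms like "gives", False for forms like "give",
    # None if the verb form shows no agreement contrast (e.g. "gave").
    verb_is_3sg_form: Optional[bool]
    recipient_has_to: bool  # recipient introduced by the preposition 'to'


def active_strategies(clause: Clause) -> List[str]:
    """List the strategies that distinguish agent from recipient (simplified)."""
    # Assumption: order always disambiguates in unmarked English clauses.
    strategies = ["constituent order"]
    if any(arg.form.lower() in CASE_MARKED_PRONOUNS
           for arg in (clause.agent, clause.recipient)):
        strategies.append("nominal marking")
    # Agreement helps only if the verb form shows a contrast and the two
    # arguments differ in whether they are 3rd person singular.
    if (clause.verb_is_3sg_form is not None
            and clause.agent.third_singular != clause.recipient.third_singular):
        strategies.append("verbal agreement")
    if clause.recipient_has_to:
        strategies.append("prepositional marking")
    return strategies


lecturer = Argument("the lecturer", third_singular=True)
she = Argument("she", third_singular=True)
students = Argument("the students", third_singular=False)

examples = {
    "(3)": Clause(lecturer, students, verb_is_3sg_form=None, recipient_has_to=False),
    "(4)": Clause(she, students, verb_is_3sg_form=None, recipient_has_to=False),
    "(5)": Clause(she, students, verb_is_3sg_form=True, recipient_has_to=False),
    "(6)": Clause(she, students, verb_is_3sg_form=True, recipient_has_to=True),
}

for label, clause in examples.items():
    found = active_strategies(clause)
    print(label, len(found), found)
```

Under these simplifying assumptions, the count rises by one with each example, from constituent order alone in (3) to all four strategies in (6).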
Importantly, prepositional marking is the only disambiguation strategy where the English language system truly allows the language users some optionality in whether or not to apply it. If the agent referent is 3rd person singular, the verb has to be marked for person in the present tense, otherwise the sentence is rendered ungrammatical (at least in Standard English). Likewise, if the recipient is expressed as a pronoun, using its subject form is not acceptable, and the language user does not get to actually make a decision on whether to use nominal marking. Arguably, language users do have some flexibility in using different (roughly equivalent) tenses, or choosing whether to use a pronoun or a full nominal form, and thus to employ the disambiguation strategies of verbal agreement and nominal marking, respectively. Furthermore, language users could choose to produce sentences that would be considered ‘ungrammatical’ by the requirements of the standard language (cf. e.g. the absence of 3rd person singular -s in African American Vernacular English; Wolfram & Thomas 2002). Still, in general, the English language producer has only a very limited say in the degree of syntagmatic redundancy present in their utterances, and the main way for robustness or efficiency to affect the degree of syntagmatic redundancy in English would be the ‘systemic’ route.
In sum, while constituent order is always at play in unmarked Present Day English clauses, the other three strategies, viz. nominal marking, verbal agreement and prepositional marking, do not need to be, and only the use of the last strategy is optional to the language producer in the strictest sense. By using this case study, we can then first of all test whether syntagmatic redundancy is common in English today. Under the efficiency view, we expect that instances that employ four strategies such as (6) are exceedingly rare, and only few strategies are employed simultaneously. By contrast, under the robustness account, we expect instances such as (5)–(6), which feature multiple strategies, to be the default. Second, the case study allows us to investigate whether syntagmatic redundancy is more frequent in certain contexts which may benefit its use. Specifically, in the present study, we test the impact of syntactic complexity, measured in terms of sentence length, on redundant marking. We argue that such environments are exactly where robustness against information loss, the prime advantage of syntagmatic redundancy, is crucial, as they present challenges to language users similar to noisy channels or learner(-directed) discourse. That is, there is ample evidence that greater syntactic complexity of a clause or constituent results in greater cognitive processing effort, and can influence linguistic choices (Rohdenburg 1996; see also Levshina 2018; Pijpops et al. 2018 and the references therein for an overview). We here choose to focus on sentence length in number of words as a proxy for syntactic complexity, following e.g. Szmrecsanyi & Kortmann (2012: 24), who discuss “length of selected linguistic units” as indicating ‘absolute-local’ complexity, based on the assumption that “more is more complex” (cf. also De Sutter 2009; Ortega 2012; Bloem et al. 2017 on sentence length as a determinant of syntactic complexity, as well as McWhorter 2001; Dahl 2004; or Miestamo 2008 on absolute complexity as one measure to describe the complexity of entire language systems). What is important is that, although there are many different ways of approaching syntactic complexity from both a theoretical and methodological perspective, and sentence length may be considered an insufficient (and rather crude) measurement, Szmrecsanyi (2004: 1037), in his empirical assessment of various operationalisations of the notion of complexity, finds that “measuring sentence length […] do[es] an excellent job in approximating a node count […], the structural measure of syntactic complexity which is probably the most ‘real’ one cognitively”. We take this result as a starting point for our investigation, arguing further that greater sentence length as such (independently of whether length correlates with complexity or not) should place a burden on speakers and listeners alike, in terms of increased effort in production and interpretation when dealing with a long sentence. In such contexts, we expect redundancy in argument disambiguation strategies to aid (and simplify) processing. This effect should also be present in written language: while writers are sometimes able to spend a considerable amount of time on a particular sentence, and readers may likewise be able to read a sentence very slowly, and go back over a single sentence a number of times, it should arguably still be in the interest of either to proceed in as efficient and time-saving a way as possible. Multiple, redundant cues that help facilitate the processing of a long, complex sentence may accordingly be favoured in both spoken and written mode.
As will be discussed in more detail below, we do not consider redundant marking to necessarily lead to greater sentence length, since neither case marking, constituent order, nor agreement has any effect on the number of words in a sentence; these strategies only affect the order or internal make-up of the constituents involved. The only strategy whose use adds a word to the sentence is prepositional marking – to avoid circularity in this measure a priori, we therefore exclude the agent, theme and recipient constituents from our counts.
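The exclusion of the argument constituents from the length measure can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure applied to the corpus data: tokens are obtained by whitespace splitting, the function name context_length is hypothetical, the argument constituents are given as hand-specified token spans, and the two sentences are constructed for the purpose of the example.

```python
from typing import List, Sequence, Tuple


def context_length(tokens: Sequence[str],
                   argument_spans: Sequence[Tuple[int, int]]) -> int:
    """Count the tokens outside the agent, theme and recipient constituents.

    Each span is a (start, end) pair of token indices, with end exclusive.
    """
    excluded = set()
    for start, end in argument_spans:
        excluded.update(range(start, end))
    return sum(1 for i in range(len(tokens)) if i not in excluded)


# Constructed example with a bare (nominal) recipient.
bare: List[str] = "Yesterday she gave the students cake after the lecture".split()
bare_spans = [(1, 2),   # agent: she
              (3, 5),   # recipient: the students
              (5, 6)]   # theme: cake

# The same event with a prepositional recipient; 'to' belongs to the
# recipient constituent and is therefore excluded along with it.
prepositional: List[str] = "Yesterday she gave cake to the students after the lecture".split()
prepositional_spans = [(1, 2),   # agent: she
                       (3, 4),   # theme: cake
                       (4, 7)]   # recipient: to the students

print(context_length(bare, bare_spans))
print(context_length(prepositional, prepositional_spans))
```

Both variants yield the same count, since the preposition is part of an excluded constituent; in this sketch, the choice of prepositional marking therefore cannot by itself inflate the length measure.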
An interesting comparison to English in regard to redundant marking and our two hypotheses is Present Day Dutch. On the one hand, Dutch is closely related to English, and features the same four strategies to disambiguate agents from recipients. On the other hand, it is crucially different in that constituent order is less rigid than in English, and does not guarantee disambiguation for all instances. At the same time, its strategies of nominal marking and verbal agreement are more elaborate. For one, the second person singular pronoun can formally distinguish between subject and object form in Dutch (jij ~ jou ‘you’), while it cannot do so in English. For another, where English only has three forms of finite verbs (e.g. give, gives and gave for give), Dutch has five forms of finite verbs (e.g. geef, geeft, geven, gaf and gaven for geven ‘give’). We can then hypothesise that the lesser use of Dutch constituent order for disambiguation compared to English is compensated for at the syntagmatic level by its stronger potential for nominal marking and verbal agreement. While we expect Dutch to be less reliant on constituent order than English, we expect nominal marking and verbal agreement to disambiguate between agent and recipient in more instances than in English.
This is not a foregone conclusion, as Dutch differs from English in that it allows its language producers more freedom in whether or not to apply a certain strategy. While Dutch offers the possibility to distinguish between subject and object form for the second person singular pronoun through the full forms of the pronouns, language users may also employ the reduced pronoun je ‘you’ or the polite form u ‘you’, which do not distinguish between subject and object form. Moreover, in contrast to English, Dutch features no distinction between subject and object forms of the reduced pronoun for the feminine third person singular and the third person plural, i.e. ze, which can mean ‘she’, ‘her’, ‘they’ or ‘them’. Still, Dutch language users can, again, distinguish through the full forms zij ‘she/they’, haar ‘her’ and hun ‘them’. As such, it is quite possible that Dutch language use exhibits less disambiguation through nominal marking than English, rather than more.
For a further example of optional disambiguation in Dutch, consider the fully ambiguous sentence (7): both mijn baas ‘my boss’ and je ‘you’ can be agent or recipient. By contrast, in (8), the sentence is disambiguated through constituent order. The same effect can be achieved by nominal marking, as is done in (9). While in (7), the reduced pronoun je ‘you’ can act as both subject and object, the full pronouns jij and jou here differentiate between the two. Alternatively, language producers may use verbal agreement, as illustrated in example (10). The finite verb form kan ‘can’ in (7) agrees both with second and third person singular, but kun ‘can’ in (10) is unambiguously second person. Finally, the sentence in (11) showcases that the preposition aan may also be employed to mark the recipient.
- ‘My boss can’t just give you a telling-off tomorrow.’ or ‘You can’t just give my boss a telling-off tomorrow.’
- ‘You can’t just give my boss a telling-off tomorrow.’
- ‘You can’t just give my boss a telling-off tomorrow.’
- ‘You can’t just give my boss a telling-off tomorrow.’
- ‘You can’t just give my boss a telling-off tomorrow.’
Still, while this freedom in applying strategies is greater than in English, it is not omnipresent. For instance, constituent order has to be used in subordinate clauses such as (12); nominal marking has to be used when the agent or the recipient is first person, as in (13); verbal agreement has to be used when the agent is singular and the recipient is plural, as in (14); and prepositional marking cannot be used when the meaning of the utterance entails a change of state in the recipient (Broekhuis et al. 2013: 520).
- ‘I see that you are giving my boss a telling-off.’
- ‘I’m giving you a telling-off.’
- ‘My boss is giving them a telling-off.’
To conclude, Dutch language producers generally have more freedom in tuning the redundancy of their utterances than their English counterparts, at least when it comes to agent-recipient disambiguation. In other words, syntagmatic redundancy in Dutch would be effected to a larger degree via the ‘ad-hoc’ route than in English. We hence expect the degree of syntagmatic redundancy to be more varied in Dutch sentences than in English ones, as Dutch language users have more leeway to adjust the degree of redundancy to the requirements of the specific situation. Concretely, we expect English sentences to generally exhibit the same, consistent degree of redundancy, whereas this would not be the case for Dutch sentences – or at least less so.
Besides the typological (synchronic) comparison between Present Day English and Present Day Dutch, a further interesting point of comparison is presented by the diachrony of ditransitives in English, as their history features a large amount of change over time concerning all disambiguation strategies outlined. First, constituent order of all arguments and verbs was considerably more flexible in earlier English (e.g. the relevant contributions in Nevalainen & Traugott 2012). This is still evident in the 14th century example in (15), where the recipient precedes the verb, and the agent is given in clause-final position. Since then, clause constituent order has come to be increasingly fixed, eventually leading to categorical SVO as seen today (Trips 2002; Hawkins 2012). For ditransitives specifically, this means that although there is still variation in the order of objects, the recipient now typically comes after the verb, and is thus readily distinguishable from the pre-verbal agent argument.
- spirit of wit.
- spirit of wit
- ‘The spirit of wit teaches you that.’
- (1390; CMEDVERN,247.333)
Regarding inflection, it is well known that from Middle English onwards, the system of nominal and verbal inflection was greatly reduced (e.g. Allen 1995). In Old English, NP-arguments could still be distinguished based on case marking: for example, agents typically appeared in the nominative, while recipients were frequently associated with dative case, as in (16) (De Cuypere 2015a). However, from Middle English onwards, nominal case gradually disappeared and was eventually lost, and remnants of morphological form distinctions only persist in the pronominal system today. Similarly, the system of verbal inflection used to be more complex in earlier English, featuring a larger number of formal contrasts according to person and number in all tenses.
- ‘He gave his faithful the spiritual grace.’
- (ÆCHom_II,43:319.44.7210; De Cuypere 2015a: 231)
Finally, Middle English also saw the increasing use of prepositions to mark semantic relations, such as to introducing prototypical recipients of transfer-events (McFadden 2002; De Cuypere 2015b; Zehentner 2019). The fact that all these changes coincide in timeframe has meant that they are often treated as causally related and are often viewed as part of English undergoing a typological move from a more synthetic to a more analytic language (e.g. Baugh & Cable 1993: 60; also Hawkins 2012; Szmrecsanyi 2012). As already mentioned above, this development is also frequently taken as a prime case of a trade-off between disambiguation strategies. Corresponding to supposed typological correlations in absence or presence of individual strategies, the diachronic trajectory of English (with a decrease in one strategy correlating with an increase in other strategies) would seem to support the assumption of languages striving for efficiency rather than redundancy. However, as laid out already, such straightforward trade-off scenarios have also been questioned, since no sweeping replacement can be observed in historical data, but evidence instead points towards more complex, probabilistic changes (see e.g. Pintzuk 2002 for English; also Levshina 2020; 2021).
We include Middle English in our study as representing a system in flux, which makes it akin to both of the present day languages English and Dutch. While early Middle English strategy use is variable to a large degree (thus more resembling Dutch), late Middle English is already becoming more similar to Present Day English. Using data from all three languages/stages therefore allows us not only to compare redundancy across languages, but also provides us with insights on the diachronic development of redundancy in morphosyntactic marking. Under the efficiency account, we expect a strict trade-off between the strategies, whereby as constituent order becomes more popular, the use of other strategies markedly declines. By contrast, under the robustness account, we anticipate that a high degree of redundancy, e.g. triple marking, is guaranteed at all stages despite a system-internal reshuffling taking place, and perhaps only slightly declines in the later stages as SVO has become obligatory anyway and therefore takes away any risk of ambiguity.
3 Redundancy in Present Day English and Dutch
In this section, we first assess the two opposing accounts introduced above from a synchronic, comparative perspective, using data from Present Day English and Present Day Dutch. Our English data are taken from a publicly available dataset of ditransitive instances compiled by Röthlisberger (2018) from the International Corpus of English (ICE; Greenbaum & Nelson 1996) and the corpus of Global Web-based English (GloWbE; Davies 2013), which contains both spoken and written data. We have made use of both spoken and written data because information transfer is prone to noise in speech and in writing alike. In fact, ensuring robustness of information transfer is arguably even more important in written than in spoken language, because the language producer is typically not present during reading, and hence, no repairs are possible. In other words, the reasoning presented in Section (2) applies both to written and spoken language. To enhance comparability across time, the Present Day English data was restricted to British English (i.e. to attestations in the ICE-GB and GloWbE-GB).
We know from previous work on the dative alternation in both English and Dutch that the choice of verb has an impact on at least one of the four disambiguation strategies, viz. prepositional marking (Bresnan et al. 2007; Colleman 2009). Since we want to exclude this as a potential confound, we limited the data to the English verb give and the Dutch verb geven ‘give’. This verb was chosen because it is one of the most frequent verbs expressing transfer, and it is certainly the best understood, with several studies focusing exclusively on this verb (e.g. Bresnan & Hay 2008; Bernaisch et al. 2014). Moreover, we excluded passives, as well as all tokens which did not have three explicit arguments (meaning an explicit agent, recipient, and theme argument).
This resulted in a total of 395 tokens in the English dataset, which were subjected to further analysis as follows. We categorised the retrieved clauses for the strategies instantiated in them by annotating the dataset with four binary measures that indicate whether or not each strategy disambiguates the agent from the recipient. As explained above, constituent order always disambiguates in this dataset – topicalisation of the recipient is not attested, but constituent order would arguably still provide reliable cues even in such cases (cf. The students the lecturer gave cake). As for preposition marking, the tokens were simply coded according to the presence or absence of the preposition to marking the recipient. For the two measures relating to inflectional disambiguation power, viz. the cue reliability of morphological form of the NP-object or the NP within the PP-recipient, and the disambiguation power of subject-verb agreement, we proceeded along the following lines:
For nominal marking, our classification builds on observed differences in morphological form as discussed above, drawing on the fact that formal contrasts between agents and recipients are only seen with some pronouns. Specifically, all combinations of NP-agents and NP-recipients are counted as ambiguous in terms of nominal morphology, and so are combinations of NPs plus pronouns that do not differ in subject and object form, such as you, it, someone, etc. as well as combinations of these pronouns. Combinations including at least one pronominal argument that differs between subject and object form, or that is reflexive or reciprocal, e.g. yourself and each other, are classified as disambiguated by nominal morphology.
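The classification rule for nominal marking can be sketched in a few lines of Python. This is only an illustration of the rule as described in this paragraph, not the annotation procedure itself, which was carried out manually. The pronoun lists and the function name `nominal_marking_disambiguates` are assumptions introduced here and are not exhaustive.

```python
# Pronoun forms that differ between subject and object use
# (illustrative, non-exhaustive list assumed for this sketch).
CASE_MARKED = {"i", "me", "he", "him", "she", "her", "we", "us", "they", "them"}

# Reflexive and reciprocal forms, which the text also treats as disambiguating.
REFLEXIVE_OR_RECIPROCAL = {
    "myself", "yourself", "himself", "herself", "itself",
    "ourselves", "yourselves", "themselves", "each other", "one another",
}


def nominal_marking_disambiguates(agent: str, recipient: str) -> bool:
    """Return True if at least one argument has a role-revealing form."""
    informative = CASE_MARKED | REFLEXIVE_OR_RECIPROCAL
    return any(arg.strip().lower() in informative for arg in (agent, recipient))
```

Full NPs and invariant pronouns such as you, it or someone match neither list, so any combination consisting only of such arguments comes out as ambiguous.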
As regards verbal agreement, ambiguous here refers to instances where number and/or person marking on the verb does not aid disambiguation between agent and recipient, either because no indicative marking is in fact there, or because agent and recipient have the same number and person. For instance, both He gave me a book and He gives the student cake would be classified as ambiguous for verbal agreement, due to both arguments being potential options for subjecthood based on verb form alone. By contrast, a sentence like He gives the students cake would be coded as disambiguated by verbal agreement since the subject is identifiable from the verb form.
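The three example sentences above can be run through a small Python sketch of this criterion. It assumes a simplified paradigm limited to the three finite forms of give mentioned earlier and ignores auxiliaries. The names `NP`, `agrees` and `verbal_agreement_disambiguates` are hypothetical and serve illustration only.

```python
from typing import NamedTuple


class NP(NamedTuple):
    person: int  # 1, 2 or 3
    number: str  # "sg" or "pl"


def agrees(verb_form: str, np: NP) -> bool:
    """Simplified check: could this NP be the subject of the given form of 'give'?"""
    third_singular = np.person == 3 and np.number == "sg"
    if verb_form == "gives":
        return third_singular
    if verb_form == "give":
        return not third_singular
    if verb_form == "gave":
        return True  # the past tense carries no person/number marking
    raise ValueError(f"unexpected verb form: {verb_form}")


def verbal_agreement_disambiguates(verb_form: str, agent: NP, recipient: NP) -> bool:
    """Ambiguous if the verb form is compatible with both arguments as subject."""
    return not (agrees(verb_form, agent) and agrees(verb_form, recipient))
```

With these definitions, He gave me a book and He gives the student cake come out as ambiguous, whereas He gives the students cake comes out as disambiguated.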
In order to check the reliability of this annotation, 100 instances were randomly selected and annotated by both authors. Both annotations were of course fully identical for constituent order, as it always disambiguates, but also for prepositional marking and nominal marking. There were 6 instances marked differently for verbal agreement, resulting in a Cohen’s kappa of 0.778, which indicates substantial agreement (Cohen 1960; Landis & Koch 1977: 165). As such, we considered the annotation to be reliable. In a final step, we then determined the number of strategies reflected in each instance by simply counting the responses on the strategy variables, viz. by counting how many strategies are used simultaneously in one instance.
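The final counting step is simple enough to state as code. The following Python sketch assumes that each instance is stored as a mapping from strategy names to binary values. The key names and the toy instances are hypothetical and are not drawn from the dataset.

```python
from collections import Counter

STRATEGIES = (
    "constituent_order",
    "nominal_marking",
    "verbal_agreement",
    "prepositional_marking",
)


def degree_of_marking(instance: dict) -> int:
    """Count how many of the four binary strategy variables are True."""
    return sum(bool(instance[s]) for s in STRATEGIES)


def degree_distribution(instances: list) -> dict:
    """Share of instances per degree (0 = zero marking ... 4 = quadruple marking)."""
    counts = Counter(degree_of_marking(i) for i in instances)
    total = len(instances)
    return {degree: counts.get(degree, 0) / total for degree in range(5)}


# Hypothetical toy annotations for illustration only.
toy = [
    {"constituent_order": True, "nominal_marking": True,
     "verbal_agreement": False, "prepositional_marking": False},
    {"constituent_order": True, "nominal_marking": False,
     "verbal_agreement": False, "prepositional_marking": False},
]
```

An instance with a degree of 2 or higher counts as redundantly marked in the sense used here.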
In addition to these factors, we also added a variable called sentence length. This variable is calculated by taking the number of words of the entire sentence and subtracting the number of words of the agent, theme and recipient, and taking its natural logarithm (cf. Pijpops et al. 2018). The number of words of the three arguments is subtracted because we already know that it affects the use of nominal marking and prepositional marking. As for nominal marking, this strategy can only be used if the agent or the recipient is pronominal, that is, if it is short. As for prepositional marking, research on the dative alternation clearly shows that the variant without preposition is preferred if the theme is long and the recipient is short, and the variant with preposition is preferred with short themes and long recipients (cf. e.g. Bresnan et al. 2007; Szmrecsanyi et al. 2017; among others). A measure that is simply based on the number of words of the entire sentence, including the three arguments, hence runs the risk of circularity. By subtracting the number of words of the three arguments, we obtain an operationalisation of sentence length that admittedly less directly measures the concept at issue, but importantly does not suffer from this problem. Finally, a logarithmic transformation is used because the length of constituents appears to be processed in a logarithmic way by the human brain (Pallier et al. 2011: 2524).
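As a worked illustration of this operationalisation, the measure can be written as a short Python function. Whitespace tokenisation is an assumption of this sketch, as is the treatment of sentences with no words outside the three arguments, a case the description above does not address. The function names are hypothetical.

```python
import math


def count_words(text: str) -> int:
    """Number of whitespace-separated tokens (tokenisation assumed for this sketch)."""
    return len(text.split())


def sentence_length(sentence: str, agent: str, theme: str, recipient: str) -> float:
    """Natural logarithm of the number of words outside the three arguments."""
    remainder = count_words(sentence) - sum(
        count_words(arg) for arg in (agent, theme, recipient)
    )
    if remainder <= 0:
        # How such cases were handled is not stated; raising is an assumption.
        raise ValueError("no words left outside the three arguments")
    return math.log(remainder)
```

For a sentence such as He gives the students cake every day, three words remain once the agent (He), the recipient (the students) and the theme (cake) are subtracted, so the value is ln 3.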
For the Present Day Dutch dataset, all instances of geven ‘give’ were extracted from the Belgian subtitle component of the Sonar Corpus of Written Dutch (Oostdijk et al. 2013a), again excluding passives and instances without an explicit agent, recipient, or theme. The subtitle component was chosen because, while its material is strictly speaking still written language, it is also a close reflection of spoken language. In addition, the syntactic parses of the Alpino parser are of good quality for the subtitle component, compared to, e.g., the chat material (van Noord 2006; Oostdijk et al. 2013b). These parses were used to identify the agent, theme and recipient arguments in a first step, and to then add the values for our four disambiguation strategies. The subtitle component only contains material from Belgium, and material for which the country of origin is unknown. In order to control for country, it was therefore decided to limit the data to Belgium. A known issue of the Sonar corpus is that it contains a number of duplicate sentences (Pijpops 2019: 53). To remedy this, all sentences that were exact duplicates of other sentences in the dataset were removed. This yielded 5,810 instances of geven ‘give’.
Next, 500 instances were randomly selected to be subjected to manual checking. We set the bar at 500, because the Present-day English dataset contained 396 instances and the Middle English dataset 524 (see below). To check whether this was sufficient to obtain a representative sample, we split the English and Dutch datasets into two equal, randomly selected halves, and redid the main analyses, viz. the analyses of the degree of syntagmatic redundancy and strategy use (see Figures 1 and 2 below) on each half. This did not change the results qualitatively and had no effect on the theoretical interpretation of the results. During manual checking, the values for the four disambiguation strategies, as well as the delineation of the three arguments were corrected where necessary. In this way, the variable sentence length could be accurately calculated. A few instances still had to be excluded because they did not contain an explicit agent or recipient; a corresponding number of other sentences were then randomly selected and added to the dataset to bring the number back to 500.
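The sampling and split-half steps can be approximated as follows. This Python sketch is not the procedure as actually run: it draws instances in random order until the target number of valid ones is reached, which has the same outcome as sampling first and topping up afterwards. The function names, the `is_valid` predicate and the fixed seed are assumptions made for illustration.

```python
import random


def sample_with_top_up(instances, n, is_valid, seed=0):
    """Collect n valid instances by walking through a random ordering."""
    rng = random.Random(seed)  # fixed seed so that the draw is repeatable
    order = list(instances)
    rng.shuffle(order)
    kept = []
    for inst in order:
        if len(kept) == n:
            break
        if is_valid(inst):  # e.g. explicit agent and recipient present
            kept.append(inst)
    return kept


def split_halves(instances, seed=0):
    """Split a dataset into two random halves for a robustness re-analysis."""
    rng = random.Random(seed)
    order = list(instances)
    rng.shuffle(order)
    mid = len(order) // 2
    return order[:mid], order[mid:]
```

Re-running the main analyses on each half then shows whether the results depend on which instances happened to be sampled.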
The data was annotated for the four morphosyntactic strategies as follows. As for constituent order, the agent can be trusted to always precede the recipient in Dutch, except in two cases. The first case is when geven ‘give’ is part of a relative clause and the agent or recipient is a relative pronoun, as in (17). The second case is when geven ‘give’ is part of a V2-clause, and the agent or recipient takes up position in front of the finite verb, as in (18). Instances such as (17)-(18) are accordingly marked as ambiguous for constituent order, while all other instances, such as (19)-(20), are marked as disambiguated by constituent order.
- Marc Uytterhoeven
- Marc Uytterhoeven
- ‘How was the uninhibited teenager called that Marc Uytterhoeven gave a voice in Morgen Maandag?’ or ‘How was the uninhibited teenager called that gave a voice to Marc Uytterhoeven in Morgen Maandag?’
- (Sonar-id: WR-P-E-G-0000004164.p.538.s.1)
- ‘Their daughter has to give Fiona the green light.’ or ‘Fiona has to give their daughter the green light.’
- (Sonar-id: WR-P-E-G-0000000530.p.9.s.1)
- ‘If you just tell everyone that they’re right, you never have to argue.’
- (Sonar-id: WR-P-E-G-0000000257.p.658.s.1)
- ‘It used to be the case that everyone decorated their fries joint in their own way.’
- (Sonar-id: WR-P-E-G-0000007723.p.354.s.1)
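The annotation rule for Dutch constituent order amounts to two exceptions, which can be written as a small decision function. In this Python sketch the four boolean inputs are assumed to have been determined beforehand, for instance from the parses and the manual check. The parameter and function names are hypothetical.

```python
def constituent_order_disambiguates(
    in_relative_clause: bool,
    argument_is_relative_pronoun: bool,
    in_v2_clause: bool,
    argument_before_finite_verb: bool,
) -> bool:
    """Apply the two exceptions; every other instance counts as disambiguated.

    'argument' refers to the agent or the recipient in both cases.
    """
    relative_exception = in_relative_clause and argument_is_relative_pronoun
    v2_exception = in_v2_clause and argument_before_finite_verb
    return not (relative_exception or v2_exception)
```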
As for nominal marking, the same reasoning as for English applies. If the agent or the recipient is a pronoun that is marked for subject or object form, the instance is considered disambiguated by nominal marking. All other instances are considered ambiguous for nominal marking. One special case should be noted, though. Dutch distinguishes between reduced and full forms of its personal pronouns, unlike English. The reduced forms of some pronouns, viz. je and ze, do not differ between subject and object, while their corresponding full forms do, viz. jij ~ jou ‘you’, zij ~ haar ‘she ~ her’, and zij ~ hen/hun ‘they ~ them’. However, when the recipient is a pronoun and it is placed in front of the finite verb in a V2-clause, as in (21), the full form is obligatory – barring some exceptions, which do not come into play here (Haeseryn et al. 1997: 253). This is not the case for agents, which can appear in reduced form, as in (22). As a result, if the reduced form of one of these pronouns is used in a V2-clause in front of the finite verb, as in (22), the comprehender can be sure that it is the agent. Such instances are then marked as disambiguated by nominal marking, while instances like (23) are coded as ambiguous.
- ‘Moorthaemers should also give you a pat on the back.’
- ‘You should also give Moorthaemers a pat on the back.’
- ‘You should also give Moorthaemers a pat on the back.’ or ‘Moorthaemers should also give you a pat on the back.’
- (Sonar-id: WR-P-E-G-0000000694.p.360.s.1)
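The Dutch rule for nominal marking, including the special case for reduced pronouns in first position, can be sketched as follows. The pronoun inventories below are illustrative assumptions and not a complete description of the Dutch pronominal system. The function name and its parameters are hypothetical.

```python
# Pronoun forms that differ between subject and object use
# (illustrative, non-exhaustive list assumed for this sketch).
DUTCH_CASE_MARKED = {
    "ik", "mij", "me", "jij", "jou", "hij", "hem",
    "zij", "haar", "wij", "we", "ons", "hen", "hun",
}

# Reduced forms that are identical in subject and object use, as named in the text.
DUTCH_REDUCED_AMBIGUOUS = {"je", "ze"}


def dutch_nominal_marking_disambiguates(agent, recipient, v2_clause=False, preverbal=None):
    """`preverbal` is the form placed before the finite verb in a V2 clause, if any."""
    forms = {agent.lower(), recipient.lower()}
    if forms & DUTCH_CASE_MARKED:
        return True
    # A reduced form in front of the finite verb of a V2 clause can only be the agent.
    if v2_clause and preverbal is not None and preverbal.lower() in DUTCH_REDUCED_AMBIGUOUS:
        return True
    return False
```

A clause with a full form such as jou, or with reduced je in front of the finite verb, is thus coded as disambiguated, while a clause in which je follows the finite verb next to a full NP is coded as ambiguous.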
The annotation for verbal agreement is mostly straightforward: if the verb agrees both with the agent and the recipient, the instance is marked as ambiguous for verbal agreement. Otherwise, it is marked as disambiguated by verbal agreement. Again, however, there is one thing to note. The conjugation of the non-polite second person singular form of the verb differs depending on whether the subject is placed in front of or behind the finite verb. If it is placed behind the verb, the verb is conjugated without a -t ending (Haeseryn et al. 1997: 82). This is taken into account in that (24) is considered ambiguous for verbal agreement, since the verb agrees with both je ‘you’ and hem ‘him’. Meanwhile, (25) is considered to be disambiguated by verbal agreement. If hem ‘him’ were the agent in (25), the verb would appear as geeft ‘gives’.
- ‘You’re not giving him a single chance.’
- ‘You’re not giving him a single chance.’
- (Sonar-id: WR-P-E-G-0000001320.p.142.s.1)
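The role of inversion in the agreement check can be illustrated with a simplified Python sketch. It covers only the present tense forms geef, geeft and geven, leaves the polite pronoun u aside, and represents each argument as a tuple of person, number and position relative to the finite verb. All of these are assumptions made for illustration, as are the function names.

```python
def dutch_agrees(verb_form, person, number, follows_verb):
    """Would this argument, taken as subject, be compatible with the finite form?"""
    if verb_form == "geven":
        return number == "pl"
    if number == "pl":
        return False
    if verb_form == "geef":
        # 1sg always; non-polite 2sg only when the subject follows the finite verb
        return person == 1 or (person == 2 and follows_verb)
    if verb_form == "geeft":
        # 3sg always; non-polite 2sg only when the subject precedes the finite verb
        return person == 3 or (person == 2 and not follows_verb)
    raise ValueError(f"unexpected verb form: {verb_form}")


def dutch_verbal_agreement_disambiguates(verb_form, agent, recipient):
    """Ambiguous if the verb form agrees with both arguments, disambiguated otherwise."""
    return not (dutch_agrees(verb_form, *agent) and dutch_agrees(verb_form, *recipient))
```

With je in front of geeft and a third person singular recipient, both arguments are compatible with the verb form, so the instance is ambiguous. With je behind geef, only je is compatible, so the instance is disambiguated.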
Prepositional marking is also annotated completely parallel to English: if the preposition aan ‘to’ is used, as in (26), the instance is considered disambiguated by prepositional marking. Again, in order to check the reliability of the annotation, 100 instances were randomly selected and annotated by a native Dutch-speaking colleague of the first author according to the guidelines set out above. The resulting annotations were again identical for constituent order and prepositional marking, with only one instance differing for nominal marking and another for verbal agreement. This yielded Cohen’s kappas of 0.973 and 0.979, respectively, which indicate almost perfect agreement (Cohen 1960; Landis & Koch 1977: 165).
- ‘How do you make it so you get both a son and a daughter?’
- (Sonar-id: WR-P-E-G-0000005091.p.727.s.1)
The following section reports the results of our comparative study of Present Day English and Dutch. All analysis and visualisation was carried out in R (R Core Team 2017), by means of the packages ‘ggplot2’ (Wickham 2016), ‘dplyr’ (Wickham & Francois 2015), ‘RColorBrewer’ (Neuwirth 2014), and ‘DescTools’ (Signorell et al. 2017).
Starting with English, the left-hand side of Figure 1 shows that, unsurprisingly, zero marking is not present at all: with fixed SVO throughout, at least one strategy, viz. constituent order, is given in all instances. Quadruple strategy use, with all four means of disambiguation employed at the same time, is similarly very infrequent (2 tokens, accounting for only 0.5% of the total). The largest proportion of instances in Present Day English (almost 60%) shows double marking, with only about a quarter of all tokens (23.5%) instantiating single marking, meaning that no strategies other than constituent order are used, and 17% exhibiting triple marking. A broader comparison of single (non-redundant) versus multiple (redundant) marking suggests that the latter is clearly the default, with a rough distribution of 25/75. Dutch presents a similar but more varied picture, as seen in the right-hand part of Figure 1. As in English, double marking is most common in Dutch, followed by single marking and triple marking. However, double marking is not as dominant in Dutch as in English, at only 42.8%. Instead, most other degrees of redundancy are more prevalent than in English, with zero marking reaching 6.6%, single marking 31.0%, triple marking 16.8% and quadruple marking 2.8%.
Figure 2 provides more detail on the use of each strategy for agent-recipient disambiguation: for English, constituent order disambiguates agent and recipient in 100% of the cases, as expected. Nominal marking (subject versus object case form with most pronouns) also provides a relatively reliable cue by unambiguously identifying argument role in over 60% of instances. By contrast, verbal agreement is useful in less than 1/5 of tokens (ca. 18%), and prepositional marking is lowest in use, with a mere 14% of ditransitive clauses involving a PP-recipient. This number is considerably smaller than figures seen in other investigations of the English dative alternation, where the prepositional pattern typically accounts for about a third of all ditransitives (cf. e.g. Gerwin 2014; Röthlisberger 2018). This discrepancy is presumably due to the restrictions imposed on the dataset for the present study, including only one variety and one verb, among other things, as outlined above. Still, it is safe to assume that constituent order and nominal marking qualify as the main strategies for agent-recipient disambiguation in English today.
By comparison, Dutch emerges, as expected, as much less reliant on constituent order, which disambiguates in only 33% of the instances versus 100% in English (χ2 = 420.95, p < 0.0001, Cramér’s V = 0.69, OR = ∞, 95% confidence interval OR: [210.56, ∞]). Meanwhile, all other strategies are used more extensively in Dutch than in English. This is most markedly the case for verbal agreement, which disambiguates in 50% of the instances in Dutch versus only 18% in English (χ2 = 100.59, p < 0.0001, Cramér’s V = 0.34, OR = 0.22, 95% confidence interval OR: [0.16, 0.30]). Furthermore, the strategy of nominal marking is doing most of the heavy lifting in Dutch, as it disambiguates no less than 77% of the instances, more than any other strategy, and also more than it does in English (χ2 = 20.25, p < 0.0001, Cramér’s V = 0.15, OR = 0.52, 95% confidence interval OR: [0.38, 0.70]). Finally, prepositional marking comes in last in both English and Dutch, and although it appears to be used more often in Dutch, this difference is non-significant (χ2 = 3.06, p = 0.08, Cramér’s V = 0.06, OR = 0.72, 95% confidence interval OR: [0.49, 1.06]).2 These findings confirm our first hypothesis regarding the differences between English and Dutch: Dutch relies on constituent order to a much lesser degree than English, while it draws more frequently on its strategies of nominal marking and verbal agreement.
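For readers who wish to see how such statistics are derived from a 2×2 table of counts, the following Python sketch computes a Pearson chi-square value, its p-value, Cramér’s V and the odds ratio. It is not the analysis as run for this study: whether a continuity correction was applied is not stated here, and the sketch omits one. It also leaves out the confidence interval of the odds ratio. The function name and the row and column layout (rows: English, Dutch; columns: disambiguated, not disambiguated) are assumptions.

```python
import math


def two_by_two_stats(a, b, c, d):
    """Chi-square (uncorrected), p-value, Cramér's V and odds ratio for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with df = 1
    cramers_v = math.sqrt(chi2 / n)  # for a 2x2 table, V = sqrt(chi2 / n)
    odds_ratio = math.inf if b * c == 0 else (a * d) / (b * c)
    return chi2, p_value, cramers_v, odds_ratio
```

Under this layout, an odds ratio below 1 means that the strategy disambiguates less often in English than in Dutch, and an empty cell for English non-disambiguated instances yields an infinite odds ratio.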
Next, we check whether the use of a strategy becomes less likely when more of the other strategies already disambiguate. We build 7 logistic regression models, viz. one predicting the use of each of the four strategies in both languages, except for constituent order in English, as it is always in use. As a predictor, we simply counted how many of the other strategies were in use. Based on both the efficiency and the robustness accounts, we expect this predictor to have a negative effect on the use of the strategy at issue. In the case of the efficiency account, using a fourth strategy when any of the other three strategies already disambiguates would be considered inefficient, and even more so when two or even all three disambiguate. In the case of the robustness account, using a fourth strategy would still be considered useful, but also less pressing the more of the other strategies already disambiguate. The specifications of the regression models can be found in Table 1 below. Interestingly, we find that all estimates of the predictor are indeed negative in the English regression models, just like the limits of their confidence intervals, and their p-values are all below 0.05. This is not the case for the Dutch models, where the confidence intervals of the predictor’s estimates all include zero, and their p-values are all above 0.05.
|Language|Dependent variable|Estimate|Confidence interval of the estimate|P-value|
|---|---|---|---|---|
|Present-day English|Nominal marking|–0.46|[–0.87, –0.05]|0.0274|
|Present-day Dutch|Constituent order|0.14|[–0.11, 0.39]|0.2890|
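The set-up of these models can be illustrated with a self-contained Python sketch: the predictor counts how many of the other strategies disambiguate, and a logistic regression with a single predictor is fitted by Newton-Raphson. This is an illustration under stated assumptions rather than the modelling code behind Table 1. The function names are hypothetical, and the sketch returns only point estimates, without confidence intervals or p-values.

```python
import math


def others_in_use(instance, target, strategies):
    """Predictor: number of strategies other than `target` that disambiguate."""
    return sum(bool(instance[s]) for s in strategies if s != target)


def fit_logistic(xs, ys, iterations=25):
    """Fit P(y = 1) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            # gradient of the log-likelihood
            g0 += y - p
            g1 += (y - p) * x
            # entries of the negative Hessian
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1
```

A negative slope `b1` corresponds to the expectation formulated above: the more of the other strategies already disambiguate, the less likely the strategy at issue is to be used. The sketch assumes data that are not perfectly separable, as the estimates would otherwise not converge.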
Finally, Figure 3 gives the results for the impact of sentence length, i.e. the natural logarithm of the number of words without counting the arguments, on redundant strategy use. As indicated by the regression line on the left-hand side of the plot, and as confirmed by a Spearman’s correlation test, there is no significant relationship between the variables in Present Day English (ρ = –0.031, p = 0.5358, 95% confidence interval for ρ: [–0.1295, 0.0676]). That is, longer sentences do not lead to higher strategy use. A significant positive correlation does, however, seem to hold for Dutch – albeit a weak one (ρ = 0.09, p = 0.043, 95% confidence interval for ρ: [0.0030, 0.1770]). As seen in the right-hand part of Figure 3, sentence length is associated with more strategies employed simultaneously in this language (even if this effect is relatively marginal).
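Spearman’s ρ, as used here, is the Pearson correlation between the ranks of the two variables, with tied values receiving the mean of their rank positions. The following Python sketch spells this out. It does not compute p-values or confidence intervals, and the function names are assumptions introduced for illustration.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the current run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho as the Pearson correlation of tie-averaged ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

Tie handling matters in the present case because the number of strategies used takes only five distinct values, so many instances share a rank.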
In sum, redundant marking, and more specifically, double marking, is clearly prevalent in both English and Dutch. At the same time, the languages differ in interesting ways, both in the amount of single marking observed, as well as in the use of individual strategies. Before commenting on the implications of these findings for our hypotheses, the next section presents our results on historical English.
4 Redundancy through time
For the diachronic English part of the study, we used a dataset of active ditransitives with three explicit arguments extracted from a subpart of the Penn-Helsinki Parsed Corpus of Middle English (PPCME2; Kroch et al. 2000). This corpus, which includes texts produced between 1150 and 1500 and has a size of roughly 1 million words, is subdivided into four main periods of about 70 years each (M1–M4). As with the present day data, we restrict this set to instances involving the verb give, which yields a final total of 524 tokens. Our annotation for strategy use proceeds much along the same lines as presented above for Present Day English, with some adjustments to reflect changes in the linguistic system.
First, we again only took instances featuring SVO order to be disambiguated by constituent order. We choose this way of annotation despite acknowledging that this order was less fixed at this stage, especially in the earlier Middle English texts. We do so for the following reasons: previous research has shown that any tendencies that can be observed in Old English, e.g. differences between main and subordinate clauses in terms of verb placement, are much weaker in Middle English (e.g. Kroch & Taylor 2000; Los 2009; Taylor & Pintzuk 2012; among many others). At the same time, by late Middle English, an overwhelming majority of instances seems to feature SVO (also Zehentner 2019: 174–176 on constituent order in ditransitives). We hence argue that elements found in pre-verbal slots would likely have been associated with subjects (agents), while elements in post-verbal slots could relatively reliably be identified as objects. Accordingly, both of the sentences in (27) and (28) are counted as non-disambiguated by order in our analysis, while (29) is disambiguated as it shows SVO.
- to þe
- to the
- ‘King Arthur gave great gifts to the messengers.’
- his mercy.
- his mercy
- ‘And therefore god gives him plainly his mercy.’
- ‘This master gives commandments to the child.’
Example (29) – but also (27) – then furthermore illustrates disambiguation by prepositional marking of the recipient ‘the child’ and ‘the messengers’, respectively (again determined simply by checking for the presence or absence of the preposition to). The sentence in (28), by contrast, features a double object construction, meaning the recipient referent cannot be identified through prepositional marking. As for nominal marking on the arguments, we again take combinations of at least one pronoun other than you or it with another NP or pronominal argument to be disambiguated. A case in point is example (28), where agent and recipient can be distinguished by means of the pronoun form hym ‘him’. Note that even though Middle English still featured more distinctions between case forms of pronouns (rather than a subject vs object form distinction), these are not entirely relevant to the present question, and had furthermore also begun to decrease in reliability at this stage. One additional criterion, and a difference in annotation to the Present Day English data, concerns remnants of case marking on nouns. Although the use of nominal case inflection for disambiguation had already declined considerably by early Middle English, -e was sometimes used as a dative (and thus recipient) singular marker in this period, before also being lost (Allen 1995: 158–213). Drawing on evidence from the Middle English Dictionary (MED Online), we then code all recipient arguments with a final -e that is not part of their base form as disambiguated by nominal marking. For example, the dictionary form of the recipient argument in (29), viz. the childe, is child (MED Online, s.v. child, n.), suggesting that the form bearing a final vowel may be a dative, and the instance is thus disambiguated by means of inflection. By contrast, with nouns such as ME dame ‘lady’, the vowel is present in the base form. Combinations including such a noun, as well as combinations including all non-case-marked or bare arguments, are classified as ambiguous.3
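The criterion for dative -e can be stated as a short check on the attested form and its dictionary base form. In this Python sketch the base form is assumed to have been looked up by hand, since the sketch does not query any dictionary. The function name is hypothetical.

```python
def dative_e_marking(attested_form: str, base_form: str) -> bool:
    """True if the attested recipient noun ends in -e while its base form does not."""
    attested, base = attested_form.lower(), base_form.lower()
    return attested.endswith("e") and not base.endswith("e")
```

Applied to the two cases discussed above, childe with base form child counts as disambiguated, whereas dame with base form dame does not.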
Last, for our coding of verbal agreement, we again assume that all instances where agent and recipient have the same number and person, as e.g. in (28) and (29), are ambiguous in terms of agreement. With instances where there is a mismatch in number/person between the arguments, we consult the MED to check whether the roles may be disambiguated by verb inflection. In (27), for example, we find that the past tense form ME ʒaf ‘gave’ is used for both singular and plural (as well as all persons; cf. MED Online, s.v. yeven), and can therefore not be used as a cue for role identification. Again, 100 instances were randomly selected and annotated by both authors. No instances were differently marked for prepositional marking, and only 5 instances for constituent order, 1 for nominal marking, and 3 for verbal agreement, resulting in Cohen’s kappas of 0.841, 0.964 and 0.826, all indicating near perfect agreement (Cohen 1960; Landis & Koch 1977: 165).
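Cohen’s kappa corrects the observed proportion of agreement between two annotators for the agreement expected by chance, given how often each annotator uses each label. A minimal Python sketch follows. The function name and the toy labels in the usage note are hypothetical and do not reproduce the annotations of this study.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels to the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    labels = set(rater_a) | set(rater_b)
    # chance agreement: product of the two raters' label proportions, summed over labels
    expected = sum(
        (list(rater_a).count(label) / n) * (list(rater_b).count(label) / n)
        for label in labels
    )
    return (observed - expected) / (1 - expected)
```

For instance, with labels [1, 1, 0, 0] and [1, 0, 0, 0], observed agreement is 0.75 and chance agreement 0.5, which gives a kappa of 0.5. The measure is undefined when both raters use one and the same label throughout, because chance agreement then equals 1.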
The distribution of the number of strategies used simultaneously in one utterance in Middle English is shown in Figure 4 (left): double marking is most frequent at 54%, compared to triple marking in about 23% of cases. Single marking similarly accounts for about a fifth of cases (approximately 20.6%), and zero as well as quadruple marking, although attested, are rare (less than 1%, and approx. 1.5%, respectively). Combined, this means that redundant marking strongly predominates over non-redundant marking – roughly 8 out of 10 instances instantiate at least two strategies. If we compare these figures to Present Day English as reported on above, we find that redundant marking was slightly more prevalent in Middle English overall than today. While double marking is the most common case in both datasets, single marking has nevertheless (marginally) increased in use between Middle English and Present Day English, and triple marking has somewhat decreased. The right pane of Figure 4 then shows the use of the individual strategies within Middle English, parallel to the results presented in Section (3.2). Looking at nominal marking, we find that it disambiguates in over 80% of ditransitive clauses in Middle English; this is significantly higher than in Present Day English (χ2 = 40.40, df = 1, p < 0.0001, Cramér’s V = 0.21, OR = 0.38, confidence interval OR: [0.28, 0.52]).
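The kind of comparison reported for nominal marking can be illustrated with a 2×2 contingency table. The sketch below is hedged in several ways: the counts are invented, the chi-square statistic is computed without continuity correction, and the confidence interval for the odds ratio is a Wald interval. The text does not state which variants were used, so these choices are assumptions.

```python
import math


def two_by_two_stats(a, b, c, d, z=1.96):
    """Chi-square (df = 1), Cramér's V and odds ratio for the table [[a, b], [c, d]].

    Rows could be, e.g., Present Day English vs Middle English, and columns
    'disambiguated' vs 'not disambiguated' by a strategy (hypothetical layout).
    All four cells must be non-zero.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For df = 1, the upper-tail chi-square probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    cramers_v = math.sqrt(chi2 / n)  # for a 2x2 table, min(rows, cols) - 1 = 1
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (
        math.exp(math.log(odds_ratio) - z * se_log_or),
        math.exp(math.log(odds_ratio) + z * se_log_or),
    )
    return {"chi2": chi2, "p": p_value, "V": cramers_v, "OR": odds_ratio, "OR_ci": ci}


# Invented counts, for illustration only.
stats = two_by_two_stats(10, 20, 20, 10)
print(stats)
```

An odds ratio below 1 in such a table means that the outcome in the first column is less likely in the first row than in the second.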
However, this general comparison masks a number of striking period-internal changes. A closer look at the distribution in the subperiods of the corpus (left-hand side of Figure 5) indicates considerable change over time: in the earliest texts (M1, ca. 1150–1250), about 50% of instances show single marking, while double marking is found in about 40% of cases. Triple marking follows at a considerable distance with less than 10%, whereas zero and quadruple marking again occur only very rarely. In late Middle English (M4, ca. 1420–1500), by contrast, double marking is highly predominant at almost 75%, and triple marking accounts for about 15%. This means that single marking has dropped substantially within Middle English, only to rise again towards Present Day English (less than 10% in late Middle English to over 20% today). This comparison indicates a development from a strong prevalence of single marking to extensive redundancy, only for redundancy to be reduced again to some degree.
Turning to the right pane of Figure 5, we find that the use of nominal marking for disambiguation remains stably high throughout the periods of Middle English. This is expected, as dative marking on the nouns has already largely disappeared by Middle English, as discussed above. Meanwhile, constituent order exhibits a pronounced and consistent rise throughout the four periods, mirrored by an equally consistent drop of verbal agreement. During this period of transition, prepositional marking seems to step up to guarantee redundancy, only to step down again once constituent order is firmly established as a reliable and dominant strategy for agent-recipient disambiguation.
Next, we again check whether the use of a strategy becomes less likely when more of the other strategies already disambiguate. The specifications of the regression models, analogous to the ones composed in Section (3.2), can be found in Table 2. We find significant negative effects for all strategies, except constituent order.
| Language | Dependent variable | Estimate | Confidence interval of the estimate | P-value |
| Middle English | Constituent order | 0.07 | [–0.28, 0.43] | 0.6770 |
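The question behind these models, namely whether a strategy is used less often when more of the other strategies already disambiguate, can be explored descriptively before any regression is fitted. The sketch below is such a descriptive counterpart only: it does not reproduce the regression models of Table 2, and the field names and toy data are hypothetical.

```python
from collections import defaultdict

# Hypothetical field names for the four strategies coded per utterance.
STRATEGIES = (
    "constituent_order",
    "nominal_marking",
    "verbal_agreement",
    "prepositional_marking",
)


def usage_by_other_strategies(utterances, target):
    """Proportion of utterances using `target`, grouped by how many of the
    other three strategies disambiguate in the same utterance."""
    totals = defaultdict(int)
    used = defaultdict(int)
    for utt in utterances:
        others = sum(bool(utt[s]) for s in STRATEGIES if s != target)
        totals[others] += 1
        used[others] += bool(utt[target])
    return {k: used[k] / totals[k] for k in sorted(totals)}


# Toy data, invented for illustration.
toy = [
    {"constituent_order": 1, "nominal_marking": 0, "verbal_agreement": 0, "prepositional_marking": 1},
    {"constituent_order": 1, "nominal_marking": 0, "verbal_agreement": 0, "prepositional_marking": 1},
    {"constituent_order": 1, "nominal_marking": 1, "verbal_agreement": 0, "prepositional_marking": 0},
    {"constituent_order": 1, "nominal_marking": 1, "verbal_agreement": 1, "prepositional_marking": 0},
]
print(usage_by_other_strategies(toy, "prepositional_marking"))
```

A negative effect of the kind reported above would show up here as proportions that fall as the number of other disambiguating strategies rises.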
Last, we assess the impact of sentence length on redundant strategy use. As can be seen in Figure 6, this correlation is non-significant, just as in Present Day English (ρ = –0.0363, p = 0.4073, 95% confidence interval for ρ: [–0.1216, 0.0495]). Importantly, though, this is again subject to change over time within the period: in the earliest sub-period (M1; 1150–1250), the relation is significant, and positive, with the likelihood of more strategies being used increasing the longer a sentence is (ρ = 0.2923, p = 0.0013, 95% confidence interval for ρ: [0.1178, 0.4493]). This effect loses significance only in the later subperiods. In the following section, we synthesise the results of our study and interpret them in light of the two opposing hypotheses presented in Section (2).
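The rank correlation between sentence length and the number of strategies used can be computed as in the following sketch. It implements Spearman's ρ as the Pearson correlation of average ranks, which handles the many ties that arise when one variable only ranges from 0 to 4. The confidence interval uses a Fisher z-transformation with standard error 1/√(n − 3); this is an assumption on our part, as the text does not state how the reported intervals were obtained. The data are invented.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values receiving the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; both inputs must have non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def fisher_ci(rho, n, z=1.96):
    """Approximate confidence interval for a correlation (assumed method)."""
    centre = math.atanh(rho)
    half_width = z / math.sqrt(n - 3)
    return math.tanh(centre - half_width), math.tanh(centre + half_width)


# Invented data: sentence length in words and number of strategies used.
lengths = [8, 12, 15, 21, 30, 34]
n_strategies = [1, 2, 2, 2, 3, 2]
rho = spearman_rho(lengths, n_strategies)
print(rho, fisher_ci(rho, len(lengths)))
```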
The main results of the analyses presented in the previous sections are (a) that redundancy overall appears to be the default in the languages and stages investigated here, and (b) that double marking is highly common throughout. As for the comparative perspective, we have found that the degree of redundancy is fairly consistent throughout English language use, with the large majority of sentences exhibiting double strategy use, comparatively few exhibiting single or triple strategy use, and hardly any exhibiting quadruple use. Meanwhile, the English strategies are structured in such a way that they complement one another in language use: when one is absent, the others are more likely to be present. Finally, the degree of redundancy does not correlate with sentence length in English. Dutch contrasts with English on all three points. First, Dutch language use exhibits more varied degrees of redundancy, with double use still being the majority, but the other degrees of redundancy clearly being more prevalent than in English. Second, the absence of a strategy in Dutch is not significantly correlated with the presence of the others, while that is the case in English. Third, sentence length does correlate significantly with the number of strategies used in Dutch, if only barely, but not in English.
These findings could be tentatively interpreted as follows. In English, syntagmatic redundancy is primarily motivated through the ‘systemic’ route. That is, English grammar is shaped in such a way that the four disambiguation strategies complement one another, so as to ensure a consistent and reasonable degree of redundancy throughout language use. As such, the system does not allow individual language users much leeway to tune this degree of redundancy. Conversely, syntagmatic redundancy in Dutch is primarily induced through the ‘ad-hoc’ route. As Dutch language users do have more freedom to adapt the degree of redundancy in their utterances to any ad-hoc considerations such as, arguably, the length of the sentence, the degree of redundancy in Dutch language use is more varied, and the use of one strategy does not straightforwardly correlate with the absence of others.
Still, caution is in order when pursuing this interpretation. First, in the case of the non-correlations of the absence and presence of strategies in Dutch, and the non-correlation of sentence length and the degree of redundancy in English, absence of evidence is of course not evidence of absence. Second, the use of a strategy is likely to be co-determined by a number of other factors. This is evidently the case for prepositional marking, as can be gleaned from even a cursory reading of the many papers on the dative alternation in both English and Dutch (e.g. Bresnan et al. 2007; Colleman 2009; Geleyn 2017; Röthlisberger et al. 2017). Tracking down all possible co-determinants for each strategy in Dutch and English would likely require in-depth studies for each of the strategies, which is outside the scope of the present paper. Still, we highly recommend that this be done in future research. Third, the correlation between sentence length and the degree of redundancy is quite weak in Dutch and only barely significant, so it would be imprudent to conclude too much from it.
Turning to the diachronic picture, we find a consistent rise in the use of constituent order for disambiguation and a concomitant decline of verbal agreement, while nominal marking remains fairly stable. During this transition, prepositional marking steps in to drive up redundancy, only to step down again in the last period. This results in a sudden upsurge in redundancy from the first to the second period, with a high proportion of triple and even quadruple marking, followed by a stabilisation in the final periods, with double marking becoming dominant. The picture that we find here seems more in line with the robustness account, since rather than a sharp trade-off between strategies, we see that the degree of redundancy actually surges during a transitional stage, although it then returns to moderate levels.
Still, the findings suggest that redundant marking in earliest Middle English was less frequent, and may have been driven by ad-hoc considerations such as processing needs in a similar way to Dutch. By the later stages, however, Middle English sees double marking become more consistent throughout its language use, much like Present Day English. We propose that this indicates a shift from syntagmatic redundancy being primarily formed through the ‘ad-hoc’ route to its being formed through the ‘systemic’ route. In sum, the history of English, like the comparative perspective, thus nicely illustrates both the commonness and maintenance of systemic redundancy, and how the degree and triggers of syntagmatic redundancy in a system may vary over time, and across languages.
What do the results in general mean for our key hypotheses then? Recall that we set out to test whether the data support (a) a strict efficiency account, on which syntagmatic redundancy should be rare, or (b) a robustness account, on which syntagmatic redundancy should be more frequent than non-redundancy. On the latter view, we furthermore anticipated high degrees of redundancy in complex environments, as it aids robust transmission and learnability. We find that our study yields mixed results on these questions, though leaning more towards the robustness account. Redundant marking is confirmed to be widespread in both English and Dutch, which supports the robustness view, in line with suggestions in e.g. Levshina (2020; 2021) or Hengeveld & Leufkens (2018). Furthermore, the developments in Middle English did not show a strict trade-off; instead, redundancy actually increased as verbal agreement declined and constituent order took over. This surge in redundancy was largely due to prepositional marking, the only truly optional disambiguation strategy in English. This may be an indication that redundancy is indeed advantageous to language users. Still, except for this transition period, using more than two strategies simultaneously generally seems to be dispreferred, which suggests that there is a limit to redundant marking, and that efficiency plays a role in containing the extent of redundancy. That is, while the efficiency account as posited above may be largely refuted based on our data, it should not be disregarded entirely either. In other words, we argue that redundant marking itself reflects a trade-off between effort, efficiency, robustness, and learnability.
The somewhat inconclusive nature of our results may have various explanations. First, we have here presumed a rather strict distinction between ‘robustness’ and ‘efficiency’, plotting predictions based on these definitions against each other. Specifically, this paper has taken efficiency to mean no syntagmatic redundancy at all. However, it could also be argued that employing redundant marking only where advantageous (viz., where it increases processing ease), but not across the board, as seen in Dutch, is in fact most efficient. This is reminiscent of previous research into argument marking (e.g. Pijpops et al. 2018; Tal et al. 2022) and disambiguation strategies in general, see e.g. Fedzechkina et al. (2017: 419), who state that “[n]atural languages tend to use case marking efficiently—that is, they typically condition case marking on semantic properties of the referent, such as animacy, and employ overt case marking when these semantic properties are more likely to bias the listener away from the intended interpretation” (cf. Fedzechkina 2012; 2016; Gibson et al. 2013a; Kurumada & Jaeger 2015; Fedzechkina & Jaeger 2020; Levshina 2020; 2021). Similarly, note that the definition of efficiency applied in this paper is largely based on efficiency as economy/reduced effort on the part of the speaker. However, redundant marking as evidenced in our data may also be viewed as more efficient in that it balances effort for both producers and comprehenders (cf. Levshina 2020: 73). Employing different definitions of efficiency (and potentially also robustness) may evidently affect the set-up of our study, or the conclusions drawn from the current data. A further important issue that merits further research is the question of exactly when which strategies are optional for the language producer or determined by the language system.
Nevertheless, we argue that the present comparison is already insightful, in that it compares languages and language stages with differing degrees of such optionality to each other, allowing us to draw some conclusions on the degree of redundancy overall.
This study has set out to investigate redundant morphosyntactic marking in light of the competing motivations of efficiency versus robustness. Specifically, we have tested two opposing hypotheses: redundant marking should either be rare as it decreases efficiency, or should be common as it increases the robustness of message transmission in noisy environments and learnability. In our test case, we have focused on redundancy in strategies for disambiguation between agents and recipients in ditransitive clauses across space (Present Day English versus Dutch) and time (Present Day English vs Middle English). We have found that redundancy is overall pervasive. While this favours the robustness account, we have also observed that redundancy operates within limits: double marking is highly frequent, but triple or even quadruple marking is comparatively rare. This suggests a trade-off between efficiency and robustness for redundant marking. We have furthermore seen that Present Day English differs from both Dutch and earlier English in significant ways, both concerning the use of the individual strategies and the distribution of non-redundant vs redundant marking. Finally, we have also addressed the question of whether redundant marking is preferred in complex contexts (in our case operationalised as longer sentences). Here again, our results differ for the languages investigated; greater sentence length seems to increase the likelihood of redundant marking in Dutch, but not in English, where no such effect is found. Future research could investigate how to predict the presence or absence of a Dutch disambiguation strategy when its use is optional, as mentioned above. In addition, it may be useful to attempt to quantify the disambiguation power of lexical semantics or contextual factors.
By investigating redundancy and its potential benefits from both a typological and diachronic viewpoint, our study follows up on recent explorations into the prevalence and functions of redundant marking such as Tal & Arnon (2022). Moreover, it highlights the need for further careful investigations into this issue based on naturally occurring corpus data.
- Our definition of syntagmatic redundancy – while in essence based on the definition by Leufkens (2020), following Trudgill (2011: 22) – is considerably broader. We consider any instance of multiple morphosyntactic strategy use as redundancy, independently of the precise nature of these strategies, while Leufkens’s view is more narrowly focussed on only strategies which involve added material (2020: 81–82). For example, the sentence in (i) is tagged as redundant, because plural is marked by a suffix both on the noun taalwetenschapper ‘linguist’ and on the verb voeren ‘to lead’.
  (i) De drie taalwetenschapper-s voer-den gisteren een diep gesprek.
      DEF three linguist-PL carry-PST.3PL yesterday INDEF deep conversation
      ‘The three linguists had a deep conversation yesterday.’
      (taken from Leufkens 2020: 82)
- It is in principle possible that these results are merely due to different configurations of the person and number of the arguments in the English versus the Dutch data. That is, perhaps sentences where the agent is, for instance, a first person singular and the recipient is a third person plural are more frequent in the Dutch data, while sentences where both the agent and the recipient are third person singular occur more often in the English data. This could cause the differences in use of verbal agreement and nominal marking that we see in Figure 2. To check for this, we created a list with all unique person and number configurations in the data, such as 1sg-3pl, 3sg-3sg, etc. Next, for each configuration, we took all instances from the language that had the lowest number of occurrences for that configuration, and a random sample of the same size from the language that had the highest number of occurrences of the configuration. In that way, we created a reduced dataset that had the exact same distribution of configurations for English and Dutch. The analyses were then rerun on this dataset, and we found that the results did not differ qualitatively.
- This coding strategy is not fool-proof by any means, and likely overestimates the disambiguation power of final -e (cf. also Allen 1995: 213). Still, we consider it an acceptable approximation for our purposes.
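The balancing procedure described in the note on person and number configurations can be sketched as follows. This is an illustrative reconstruction, not the script deposited with the data: the function name, the `config` field and the fixed random seed are assumptions. Configurations attested in only one language are dropped, since the smaller group then has zero instances.

```python
import random
from collections import defaultdict


def balance_configurations(english, dutch, seed=0):
    """Downsample both datasets so that each person/number configuration
    (e.g. '1sg-3pl') occurs equally often in English and in Dutch."""
    rng = random.Random(seed)  # fixed seed for reproducibility (assumption)

    def group(rows):
        groups = defaultdict(list)
        for row in rows:
            groups[row["config"]].append(row)
        return groups

    en_groups, nl_groups = group(english), group(dutch)
    balanced_en, balanced_nl = [], []
    for config in sorted(set(en_groups) | set(nl_groups)):
        en_rows = en_groups.get(config, [])
        nl_rows = nl_groups.get(config, [])
        k = min(len(en_rows), len(nl_rows))
        # Keep all instances of the smaller group; sample k from the larger one.
        balanced_en.extend(en_rows if len(en_rows) == k else rng.sample(en_rows, k))
        balanced_nl.extend(nl_rows if len(nl_rows) == k else rng.sample(nl_rows, k))
    return balanced_en, balanced_nl


# Toy data, invented for illustration.
english_toy = [{"config": "3sg-3sg"}] * 5 + [{"config": "1sg-3pl"}] * 1
dutch_toy = [{"config": "3sg-3sg"}] * 2 + [{"config": "1sg-3pl"}] * 4 + [{"config": "2sg-3sg"}] * 3
bal_en, bal_nl = balance_configurations(english_toy, dutch_toy)
print(len(bal_en), len(bal_nl))
```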
Data availability/Supplementary files
All data and scripts are publicly available on OSF via https://doi.org/10.17605/OSF.IO/9AWF7.
ACC = accusative, DAT = dative, ME = Middle English, NOM = nominative, PL = plural.
We cordially thank Melanie Röthlisberger for generously allowing us to make use of a dataset she composed (Röthlisberger 2018). We are also grateful to Anthe Sevenants for acting as a second annotator for part of the Dutch dataset.
The authors have no competing interests to declare.
Dirk Pijpops collected and annotated the Present Day Dutch data, while the Present Day English and Middle English data was mainly dealt with by Eva Zehentner. Dirk Pijpops carried out the statistical analyses and visualised the results. Both authors conceptualised and wrote the paper. Overall, both authors contributed equally to the paper, and the order of authors was determined alphabetically.
Allen, Cynthia. 1995. Case marking and reanalysis: Grammatical relations from Old to Early Modern English. Oxford: OUP.
Bahrick, Lorraine & Lickliter, Robert & Flom, Ross. 2004. Intersensory redundancy guides the development of selective attention, perception, and cognition in infancy. Current Directions in Psychological Science 13. 99–102. DOI: http://doi.org/10.1111/j.0963-7214.2004.00283.x
Baugh, Albert & Cable, Thomas. 1993. A history of the English language. London: Routledge. DOI: http://doi.org/10.4324/9780203994634
Beckner, Clay & Blythe, Richard & Bybee, Joan & Christiansen, Morten & Croft, William & Ellis, Nick & Holland, John & Ke, Jinyun & Larsen-Freeman, Diane & Schoenemann, Tom. 2009. Language is a Complex Adaptive System: Position paper. Language Learning 59(1). 1–26. DOI: http://doi.org/10.1111/j.1467-9922.2009.00533.x
Bloem, Jelke & Versloot, Arjen & Weerman, Fred. 2017. Verbal cluster order and processing complexity. Language Sciences 60. 94–119. DOI: http://doi.org/10.1016/j.langsci.2016.10.009
Bornkessel-Schlesewsky, Ina & Schlesewsky, Matthias. 2009. The role of prominence information in the real-time comprehension of transitive constructions: A cross-linguistic approach. Language and Linguistics Compass 3. 19–58. DOI: http://doi.org/10.1111/j.1749-818X.2008.00099.x
Bouma, Gerlof & Hendriks, Petra. 2012. Partial word order freezing in Dutch. Journal of Logic, Language and Information 21. 53–73. DOI: http://doi.org/10.1007/s10849-011-9145-x
Bresnan, Joan & Cueni, Anna & Nikitina, Tatiana & Baayen, Harald. 2007. Predicting the dative alternation. In Gerlof Bouma, Irene Kraemer & Joost Zwarts (eds.), Cognitive foundations of interpretation, 69–94. Amsterdam: Royal Netherlands Academy of Science. https://web.stanford.edu/~bresnan/qs-submit.pdf.
Bresnan, Joan & Hay, Jennifer. 2008. Gradient grammar: An effect of animacy on the syntax of give in New Zealand and American English. Lingua 118(2). 245–259. DOI: http://doi.org/10.1016/j.lingua.2007.02.007
Broekhuis, Hans & Corver, Norbert & Vos, Riet. 2013. Syntax of Dutch. Verbs and verb phrases. Volume 1. Amsterdam: AUP. DOI: http://doi.org/10.1515/9789048517558
Bybee, Joan. 2010. Language, usage and cognition. Cambridge: CUP. DOI: http://doi.org/10.1017/CBO9780511750526
Caballero, Gabriela & Kapatsinski, Vsevolod. 2015. Perceptual functionality of morphological redundancy in Choguita Rarámuri (Tarahumara). Language, Cognition and Neuroscience 30(9). 1134–1143. DOI: http://doi.org/10.1080/23273798.2014.940983
Christiansen, Morten & Chater, Nick. 2008. Language as shaped by the brain. Behavioral and Brain Sciences 31. 489–509. DOI: http://doi.org/10.1017/S0140525X08004998
Cohen, Jacob. 1960. A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement 20(1). 37–46. DOI: http://doi.org/10.1177/001316446002000104
Colleman, Timothy. 2009. Verb disposition in argument structure alternations: A corpus study of the dative alternation in Dutch. Language Sciences 31(5). 593–611. DOI: http://doi.org/10.1016/j.langsci.2008.01.001
Culbertson, Jennifer & Kirby, Simon. 2016. Simplicity and specificity in language: Domain-general biases have domain-specific effects. Frontiers in Psychology 6. DOI: http://doi.org/10.3389/fpsyg.2015.01964
Czypionka, A. & Spalek, K. & Wartenburger, I. & Krifka, M. 2017. On the interplay of object animacy and verb type during sentence comprehension in German. Linguistics 66(5). 1383–1433. DOI: http://doi.org/10.1515/ling-2017-0031
Dahl, Östen. 2004. The growth and maintenance of linguistic complexity. Amsterdam: Benjamins. DOI: http://doi.org/10.1075/slcs.71
Davies, Mark. 2013. Corpus of Global Web-Based English. https://www.english-corpora.org/glowbe/.
De Cuypere, Ludovic. 2015a. A multivariate analysis of the Old English ACC+DAT double object alternation. Corpus Linguistics and Linguistic Theory 11(2). 225–254. DOI: http://doi.org/10.1515/cllt-2014-0011
De Cuypere, Ludovic. 2015b. The Old English to-dative construction. English Language and Linguistics 19(1). 1–26. DOI: http://doi.org/10.1017/S1360674314000276
De Smet, Isabeau. 2021. De sterke werkwoorden in het Nederlands. Een diachroon, kwantitatief onderzoek. Leuven: University of Leuven.
Diessel, Holger. 2019. The grammar network. How linguistic structure is shaped by language use. Cambridge: CUP. DOI: http://doi.org/10.1017/9781108671040
Edelman, Gerald & Gally, Joseph. 2001. Degeneracy and complexity in biological systems. PNAS 98(24). 13763–13768. DOI: http://doi.org/10.1073/pnas.231499798
Fedzechkina, Maryia & Jaeger, T. Florian. 2020. Production efficiency can cause grammatical change: Learners deviate from the input to better balance efficiency against robust message transmission. Cognition 196. 104115. DOI: http://doi.org/10.1016/j.cognition.2019.104115
Fedzechkina, Maryia & Jaeger, T. Florian & Newport, Elissa L. 2012. Language learners restructure their input to facilitate efficient communication. Proceedings of the National Academy of Sciences 109(44). 17897–17902. DOI: http://doi.org/10.1073/pnas.1215776109
Fedzechkina, Maryia & Newport, Elissa L. & Jaeger, T. Florian. 2017. Balancing effort and information transmission during language acquisition: Evidence from word order and case marking. Cognitive Science 41(2). 416–446. DOI: http://doi.org/10.1111/cogs.12346
Flack, Kathryn. 2007. Ambiguity avoidance as contrast preservation: Case and word order freezing in Japanese. Occasional Papers in Linguistics 32. 57–89.
Frank, Austin & Jaeger, T. Florian. 2008. Speaking rationally: Uniform information density as an optimal strategy for language production. Proceedings of the Annual Meeting of the Cognitive Science Society 30(30). 939–944.
Geleyn, Tim. 2017. Syntactic variation and diachrony. The case of the Dutch dative alternation. Corpus Linguistics and Linguistic Theory 13(1). 65–96. DOI: http://doi.org/10.1515/cllt-2015-0062
Gerwin, Johanna. 2014. Ditransitives in British English dialects. Berlin: De Gruyter Mouton. DOI: http://doi.org/10.1515/9783110352320
Gibson, Edward & Futrell, Richard & Piantadosi, Steven T. & Dautriche, Isabelle & Mahowald, Kyle & Bergen, Leon & Levy, Roger. 2019. How efficiency shapes human language. Trends in Cognitive Sciences 23(5). 389–407. DOI: http://doi.org/10.1016/j.tics.2019.02.003
Gibson, Edward & Bergen, Leon & Piantadosi, Steven T. 2013a. Rational integration of noisy evidence and prior semantic expectations in sentence interpretation. Proceedings of the National Academy of Sciences 110(20). 8051–8056. DOI: http://doi.org/10.1073/pnas.1216438110
Gibson, Edward & Piantadosi, Steven T. & Brink, Kimberly & Bergen, Leon & Lim, Eunice & Saxe, Rebecca. 2013b. A noisy-channel account of crosslinguistic word-order variation. Psychological Science 24(7). 1079–1088. DOI: http://doi.org/10.1177/0956797612463705
Greenbaum, Sidney & Nelson, Gerald. 1996. The International Corpus of English (ICE) Project. World Englishes 15(1). 3–15. DOI: http://doi.org/10.1111/j.1467-971X.1996.tb00088.x
Haeseryn, Walter & Romijn, Kirsten & Geerts, Guido & de Rooij, Jaap & van den Toorn, Maarten. 1997. Algemene Nederlandse Spraakkunst. Groningen: Nijhoff.
Hahn, Michael & Degen, Judith & Futrell, Richard. 2021. Modeling word and morpheme order in natural language as an efficient trade-off of memory and surprisal. Psychological Review 128(4). 726–756. DOI: http://doi.org/10.1037/rev0000269
Hahn, Michael & Jurafsky, Dan & Futrell, Richard. 2020. Universals of word order reflect optimization of grammars for efficient communication. Proceedings of the National Academy of Sciences 117(5). 2347–2353. DOI: http://doi.org/10.1073/pnas.1910923117
Haspelmath, Martin. 2015. Ditransitive constructions. Annual Review of Linguistics 1. 19–41. DOI: http://doi.org/10.1146/annurev-linguist-030514-125204
Hawkins, John. 2012. The drift of English towards invariable word order from a typological and Germanic perspective. In Terttu Nevalainen & Elizabeth Closs Traugott (eds.), The Oxford handbook of the history of English, 622–632. Oxford: OUP. DOI: http://doi.org/10.1093/oxfordhb/9780199922765.013.0053
Hengeveld, Kees & Leufkens, Sterre. 2018. Transparent and non-transparent languages. Folia Linguistica 52(1). 139–175. DOI: http://doi.org/10.1515/flin-2018-0003
Kempe, Vera & Brooks, Patricia. 2001. The role of diminutives in the acquisition of Russian gender: Can elements of child-directed speech aid in learning morphology? Language Learning 51. 221–256. DOI: http://doi.org/10.1111/1467-9922.00154
Kittilä, Seppo. 2006. The anomaly of the verb ‘give’ explained by its high (formal and semantic) transitivity. Linguistics 44(3). 569–612. DOI: http://doi.org/10.1515/LING.2006.019
Koplenig, Alexander & Meyer, Peter & Wolfer, Sascha & Müller-Spitzer, Carolin. 2017. The statistical trade-off between word order and word structure: Large-scale evidence for the principle of least effort. PLOS ONE 12(3). e0173614. DOI: http://doi.org/10.1371/journal.pone.0173614
Kroch, Anthony & Taylor, Ann. 2000. Verb-object order in Early Middle English. In Susan Pintzuk, George Tsoulas & Anthony Warner (eds.), Diachronic syntax: Models and mechanisms, 132–187. Oxford: OUP.
Kroch, Anthony & Taylor, Ann & Santorini, Beatrice. 2000. The Penn-Helsinki Parsed Corpus of Middle English (PPCME2). Department of Linguistics, University of Pennsylvania. CD-ROM, second edition, release 4. http://www.ling.upenn.edu/ppche/ppche-release-2016/PPCME2-RELEASE-4.
Kurumada, Chigusa & Jaeger, T. Florian. 2015. Communicative efficiency in language production: Optional case-marking in Japanese. Journal of Memory and Language 83. 152–178. DOI: http://doi.org/10.1016/j.jml.2015.03.003
Lamers, Monique & de Hoop, Helen. 2005. Animacy information in human sentence processing. In Henning Christiansen, Peter Skadhauge & Jørgen Villadsen (eds.), Constraint solving and language processing, 158–171. Berlin: Springer. DOI: http://doi.org/10.1007/11424574_10
Lamers, Monique & de Swart, Peter (eds.). 2012. Case, word order and prominence. Dordrecht: Springer. DOI: http://doi.org/10.1007/978-94-007-1463-2
Landis, John Richard & Koch, Gary Grove. 1977. The measurement of observer agreement for categorical data. Biometrics 33(1). 159–174. DOI: http://doi.org/10.2307/2529310
Leufkens, Sterre. 2015. Transparency in language: A typological study. Utrecht: LOT.
Leufkens, Sterre. 2020. A functionalist typology of redundancy. Revista da Abralin 19(3). 79–103. DOI: http://doi.org/10.25189/rabralin.v19i3.1722
Levshina, Natalia. 2018. Anybody (at) home? Communicative efficiency knocking on the Construction Grammar door. Yearbook of the German Cognitive Linguistics Association 6(1). 71–90. DOI: http://doi.org/10.1515/gcla-2018-0004
Levshina, Natalia. 2020. Efficient trade-offs as explanations in functional linguistics: some problems and an alternative proposal. Revista da Abralin 19(3). 50–78. DOI: http://doi.org/10.25189/rabralin.v19i3.1728
Levshina, Natalia. 2021. Cross-linguistic trade-offs and causal relationships between cues to grammatical subject and object, and the problem of efficiency-related explanations. Frontiers in Psychology 12. 2791. DOI: http://doi.org/10.3389/fpsyg.2021.648200
Los, Bettelou. 2009. The consequences of the loss of verb-second in English: Information structure and syntax in interaction. English Language and Linguistics 13(1). 97–125. DOI: http://doi.org/10.1017/S1360674308002876
Lupyan, Gary & Dale, Rick. 2010. Language structure is partly determined by social structure. PLoS One 5(1). e8559. DOI: http://doi.org/10.1371/journal.pone.0008559
Mahowald, Kyle. 2011. An LFG approach to word order freezing. In Miriam Butt & Tracy Holloway King (eds.), Proceedings of the LFG11 Conference, 381–398. CSLI Publications. http://csli-publications.stanford.edu/.
Mahowald, Kyle & Diachek, Evgeniia & Gibson, Edward & Fedorenko, Evelina & Futrell, Richard. 2022. Grammatical cues are largely, but not completely, redundant with word meanings in natural language (preprint). PsyArXiv.
Malchukov, Andrej & Haspelmath, Martin & Comrie, Bernard. 2010. Ditransitive constructions: A typological overview. In Andrej Malchukov, Martin Haspelmath & Bernard Comrie (eds.), Studies in ditransitive constructions, 1–64. Berlin: De Gruyter Mouton. DOI: http://doi.org/10.1515/9783110220377.1
Maling, Joan & Zaenen, Annie (eds.). 1990. Modern Icelandic syntax. Leiden: Brill. DOI: http://doi.org/10.1163/9789004373235
McFadden, Thomas. 2002. The rise of the to-dative in Middle English. In David Lightfoot (ed.), Syntactic effects of morphological change, 107–123. Oxford: OUP. DOI: http://doi.org/10.1093/acprof:oso/9780199250691.003.0006
McWhorter, John. 2001. The world’s simplest grammars are creole grammars. Linguistic Typology 5(2/3). 125–166. DOI: http://doi.org/10.1515/lity.2001.001
MED Online = University of Michigan Regents. 2013. The electronic Middle English dictionary. http://quod.lib.umich.edu/m/med/.
Miestamo, Matti. 2008. Grammatical complexity in a cross-linguistic perspective. In Matti Miestamo, Kaius Sinnemäki & Fred Karlsson (eds.), Language complexity: Typology, contact, change, 23–42. Amsterdam: Benjamins. DOI: http://doi.org/10.1075/slcs.94.04mie
Monaghan, Padraic. 2017. Canalization of language structure from environmental constraints: A computational model of word learning from multiple cues. Topics in Cognitive Science 9(1). 21–34. DOI: http://doi.org/10.1111/tops.12239
Montgomery, James. 2000. Verbal working memory and sentence comprehension in children with specific language impairment. Journal of Speech, Language, and Hearing Research 43(2). 293–308. DOI: http://doi.org/10.1044/jslhr.4302.293
Naess, Ashild. 2007. Prototypical transitivity. Amsterdam: Benjamins. DOI: http://doi.org/10.1075/tsl.72
Neuwirth, Ernst. 2014. RColorBrewer: ColorBrewer palettes. https://cran.r-project.org/web/packages/RColorBrewer/index.html.
Nevalainen, Terttu & Traugott, Elizabeth (eds.). 2012. The Oxford handbook of the history of English. Oxford: OUP. DOI: http://doi.org/10.1093/oxfordhb/9780199922765.001.0001
Newman, John. 1998. Recipients and ‘give’ constructions. In Willy van Langendonck & William Van Belle (eds.), The dative, Vol. 2: Theoretical and contrastive studies, 1–28. Amsterdam: Benjamins. DOI: http://doi.org/10.1075/cagral.3.03new
Nichols, Johanna. 2009. Linguistic complexity: a comprehensive definition and survey. In Geoffrey Sampson, David Gil & Peter Trudgill (eds.), Language complexity as an evolving variable, 110–125. Oxford: OUP.
Oostdijk, Nelleke & Reynaert, Martin & Hoste, Véronique & Schuurman, Ineke. 2013a. The construction of a 500-million-word reference corpus of contemporary written Dutch. In Peter Spyns & Jan Odijk (eds.), Essential speech and language technology for Dutch: Theory and applications of Natural Language Processing, 219–247. Heidelberg: Springer. DOI: http://doi.org/10.1007/978-3-642-30910-6_13
Oostdijk, Nelleke & Reynaert, Martin & Hoste, Véronique & Schuurman, Ineke. 2013b. SoNaR User Documentation.
Ortega, Lourdes. 2012. Interlanguage complexity: A construct in search of theoretical renewal. In Bernd Kortmann & Benedikt Szmrecsanyi (eds.), Linguistic complexity: Second language acquisition, indigenization, contact, 127–155. Berlin: De Gruyter. DOI: http://doi.org/10.1515/9783110229226.127
Pallier, Christophe & Devauchelle, Anne-Dominique & Dehaene, Stanislas. 2011. Cortical representation of the constituent structure of sentences. Proceedings of the National Academy of Sciences of the United States of America 108(6). 2522–2527. DOI: http://doi.org/10.1073/pnas.1018711108
Pijpops, Dirk. 2019. How, why and where does argument structure vary? A usage-based investigation into the Dutch transitive-prepositional alternation. Dissertation, University of Leuven.
Pijpops, Dirk & Speelman, Dirk & Grondelaers, Stefan & Van de Velde, Freek. 2018. Comparing explanations for the Complexity Principle. Evidence from argument realization. Language and Cognition 10(3). 514–543. DOI: http://doi.org/10.1017/langcog.2018.13
Pintzuk, Susan. 2002. Morphological case and word order in Old English. Language Sciences 24. 381–395. DOI: http://doi.org/10.1016/S0388-0001(01)00039-0
R Core Team. 2017. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. www.R-project.org/.
Rohdenburg, Günter. 1996. Cognitive complexity and increased grammatical explicitness in English. Cognitive Linguistics 7. 149–182. DOI: http://doi.org/10.1515/cogl.1996.7.2.149
Röthlisberger, Melanie. 2018. The dative dataset of World Englishes. KU Leuven. DOI: http://doi.org/10.5281/zenodo.2553357
Röthlisberger, Melanie & Grafmiller, Jason & Szmrecsanyi, Benedikt. 2017. Cognitive indigenization effects in the English dative alternation. Cognitive Linguistics 28(4). 673–710. DOI: http://doi.org/10.1515/cog-2016-0051
Sadock, Jerrold. 2012. The modular architecture of grammar. Cambridge: CUP. DOI: http://doi.org/10.1017/CBO9780511997587
Schmid, Hans-Jörg. 2020. The dynamics of the linguistic system. Usage, conventionalization, and entrenchment. Oxford: OUP. DOI: http://doi.org/10.1093/oso/9780198814771.001.0001
Sedlak, Philip. 1975. Direct/indirect object word order: A cross-linguistic analysis. Working Papers on Language Universals. University of Southern California.
Siewierska, Anna. 1998. Variation in major constituent order; a global and a European perspective. In Anna Siewierska (ed.), Constituent Order in the Languages of Europe, 475–552. De Gruyter Mouton. DOI: http://doi.org/10.1515/9783110812206.475
Signorell, Andri et mult. al. 2017. DescTools: Tools for Descriptive Statistics. R package version 0.99.23. https://cran.r-project.org/package=DescT
Sinnemäki, Kaius. 2008. Complexity trade-offs in core argument marking. In Matti Miestamo, Kaius Sinnemäki & Fred Karlsson (eds.), Language complexity: Typology, contact, change, 67–88. Amsterdam: Benjamins. DOI: http://doi.org/10.1075/slcs.94.06sin
Sinnemäki, Kaius. 2009. Complexity in core argument marking and population size. In Geoffrey Sampson, David Gil & Peter Trudgill (eds.), Language complexity as an evolving variable, 126–140. Oxford: OUP.
Sinnemäki, Kaius. 2010. Word order in zero-marking languages. Studies in Language 34(4). 869–912. DOI: http://doi.org/10.1075/sl.34.4.04sin
Sinnemäki, Kaius. 2014. Complexity trade-offs: A case study. In Frederick Newmeyer & Laurel Preston (eds.), Measuring grammatical complexity, 179–201. Oxford: OUP. DOI: http://doi.org/10.1093/acprof:oso/9780199685301.003.0009
Sloutsky, Vladimir & Robinson, Christopher. 2013. Redundancy matters: Flexible learning of multiple contingencies in infants. Cognition 126(2). 156–164. DOI: http://doi.org/10.1016/j.cognition.2012.09.016
Steels, Luc. 2000. Language as a complex adaptive system. In Marc Schoenauer, Kalyanmoy Deb, Günter Rudolph, Xin Yao, Evelyne Lutton, Juan Julian Merelo & Hans-Paul Schwefel (eds.), Proceedings of PPSN VI: Lecture notes in Computer Science, 17–26. Berlin: Springer. DOI: http://doi.org/10.1007/3-540-45356-3_2
De Sutter, Gert. 2009. Towards a multivariate model of grammar: The case of word order variation in Dutch clause final verb clusters. In Andreas Dufter, Jürg Fleischer & Guido Seiler (eds.), Describing and modeling variation in grammar, 225–254. Berlin: De Gruyter Mouton. DOI: http://doi.org/10.1515/9783110216097.3.225
Szmrecsanyi, Benedikt. 2004. On operationalizing syntactic complexity. In Gérard Purnelle, Cédrick Fairon & Anne Dister (eds.), Proceedings of the 7th International Conference on Textual Data Statistical Analysis. Vol. 2, 1032–1039. Louvain-la-Neuve: Presses universitaires de Louvain.
Szmrecsanyi, Benedikt. 2012. Analyticity and syntheticity in the history of English. In Terttu Nevalainen & Elizabeth Traugott (eds.), The Oxford handbook of the history of English, 654–665. Oxford: OUP. DOI: http://doi.org/10.1093/oxfordhb/9780199922765.013.0056
Szmrecsanyi, Benedikt & Grafmiller, Jason & Bresnan, Joan & Rosenbach, Anette & Tagliamonte, Sali & Todd, Simon. 2017. Spoken syntax in a comparative perspective: The dative and genitive alternation in varieties of English. Glossa 2(1). 1–17. DOI: http://doi.org/10.5334/gjgl.310
Szmrecsanyi, Benedikt & Kortmann, Bernd. 2012. Introduction: Linguistic complexity: Second Language Acquisition, indigenization, contact. In Bernd Kortmann & Benedikt Szmrecsanyi (eds.), Linguistic complexity: Second language acquisition, indigenization, contact, 6–34. Berlin: De Gruyter. DOI: http://doi.org/10.1515/9783110229226.6
Tal, Shira & Arnon, Inbal. 2022. Redundancy can benefit learning: Evidence from word order and case marking. Cognition 224. 105055. DOI: http://doi.org/10.1016/j.cognition.2022.105055
Tal, Shira & Grossman, Eitan & Rohde, Hannah & Arnon, Inbal. 2021. Speakers use more redundant referents with language learners: Evidence for communicatively-efficient referential choice (preprint). PsyArXiv. DOI: http://doi.org/10.31234/osf.io/cw2be
Tal, Shira & Smith, Kenny & Culbertson, Jennifer & Grossman, Eitan & Arnon, Inbal. 2022. The impact of information structure on the emergence of differential object marking: an experimental study. Cognitive Science 46(3). e13119. DOI: http://doi.org/10.1111/cogs.13119
Tanner, Darren & Bulkes, Nyssa Z. 2015. Cues, quantification, and agreement in language comprehension. Psychonomic Bulletin & Review 22(6). 1753–1763. DOI: http://doi.org/10.3758/s13423-015-0850-3
Taraban, Roman. 2004. Drawing learners’ attention to syntactic context aids gender-like category induction. Journal of Memory and Language 51(2). 202–216. DOI: http://doi.org/10.1016/j.jml.2004.03.005
Taylor, Ann & Pintzuk, Susan. 2012. Rethinking the OV/VO alternation in Old English: The effect of complexity, grammatical weight, and information status. In Terttu Nevalainen & Elizabeth Closs Traugott (eds.), The Oxford handbook of the history of English. Oxford: OUP. DOI: http://doi.org/10.1093/oxfordhb/9780199922765.013.0068
Trips, Carola. 2002. From OV to VO in Early Middle English. Amsterdam: Benjamins. DOI: http://doi.org/10.1075/la.60
Trudgill, Peter. 2011. Sociolinguistic typology: Social determinants of linguistic complexity. Oxford: OUP.
Van de Velde, Freek. 2014. Degeneracy: the maintenance of constructional networks. In Ronny Boogaart, Timothy Colleman & Gijsbert Rutten (eds.), Extending the scope of Construction Grammar, vol. 1, 141–179. Berlin: Mouton de Gruyter. DOI: http://doi.org/10.1515/9783110366273.141
Van Everbroeck, Ezra. 2003. Language type frequency and learnability from a connectionist perspective. Linguistic Typology 7(1). 1–50. DOI: http://doi.org/10.1515/lity.2003.011
van Noord, Gertjan. 2006. At last parsing is now operational. In Piet Mertens, Cédric Fairon, Anne Dister & Patrick Watrin (eds.), TALN 2006. Verbum Ex Machina. Actes de la 13e conference sur le traitement automatique des langues naturelles, 20–42. Louvain-la-Neuve: Cental. https://aclanthology.org/2006.jeptalnrecital-invite.2.
VanPatten, Bill (ed.). 2004. Processing instruction: Theory, research, and commentary. Mahwah, NJ: Erlbaum. DOI: http://doi.org/10.4324/9781410610195
van Trijp, Remi. 2013. Linguistic selection criteria for explaining language change: a case study on syncretism in German definite articles. Language Dynamics and Change 3(1). 105–132. DOI: http://doi.org/10.1163/22105832-13030106
Wickham, Hadley. 2016. ggplot2: Elegant graphics for data analysis. New York: Springer. https://cran.r-project.org/web/packages/ggplot2/index.html. DOI: http://doi.org/10.1007/978-3-319-24277-4_9
Wickham, Hadley & Francois, Romain. 2015. dplyr: A Grammar of Data Manipulation. https://cran.r-project.org/web/packages/dplyr/index.html.
Winter, Bodo. 2014. Spoken language achieves robustness and evolvability by exploiting degeneracy and neutrality: Prospects & Overviews. BioEssays 36(10). 960–967. DOI: http://doi.org/10.1002/bies.201400028
Wolfram, Walt & Thomas, Erik. 2002. The development of African American English. London: Wiley. DOI: http://doi.org/10.1002/9780470690178
Yoshida, Hanako & Smith, Linda. 2005. Linguistic cues enhance the learning of perceptual cues. Psychological Science 16(2). 90–95. DOI: http://doi.org/10.1111/j.0956-7976.2005.00787.x
Zehentner, Eva. 2019. Competition in language change: The rise of the English dative alternation. Berlin: De Gruyter Mouton. DOI: http://doi.org/10.1515/9783110633856