1. Introduction

A central question in language acquisition research is which aspects of a second language (L2) are more challenging to acquire than others owing to crosslinguistic influence from the dominant language, among other factors. Whether L2 learners can acquire L2 properties that are absent from their first language (L1) is crucial to understanding the locus of difficulty (e.g., Liceras 1985; 1996; Tsimpli & Roussou 1991; Schwartz & Sprouse 1996; Hawkins & Chan 1997; Montrul & Slabakova 2003; White 2003; Liceras & Zobl & Goodluck 2008; Lardiere 2008; 2009; Slabakova 2009; Putnam & Sánchez 2013). Furthermore, research has shown that the learning environment and the quantity of input play a role in the difficulties that L2 learners often show (e.g., Swain 1985; VanPatten 1987; Pérez-Leroux & Glass 1999; Hawkins 2008).

We add to previous work by investigating the domain of accusative clitics among Chinese/Spanish bilinguals, an area of research so far underexplored (e.g., Hu 2007; Clements 2009; Cuza & Pérez-Leroux & Sánchez 2013; Jiao 2021). In contrast with Spanish, Chinese1 does not have a clitic system of the Spanish type encoding definiteness, grammatical gender, number, and case; these categories are not encoded morphologically in Chinese.2 Accusative clitic pronouns in Spanish are challenging for L2 learners to master despite explicit instruction (e.g., Liceras 1985; VanPatten 1987; Bruhn de Garavito & Montrul 1996; Duffield & White 1999; Sánchez & Al-Kasey 1999; Montrul 2010). This is especially true of L2 learners whose L1 lacks a clitic system, as is the case for English and Chinese. Researchers have found ungrammatical object drop or overextension errors as well as a lack of gender specification on the clitic in offline tasks (e.g., Sánchez 1999; Sánchez & Al-Kasey 1999; Zyzik 2008; Arche & Domínguez 2011; Mayer & Sánchez 2017). These studies suggest a lack of sensitivity, in both production and interpretation, to the semantic and syntactic features constraining the acquisition of accusative clitics in Spanish.

Regarding Chinese/Spanish bilinguals, Cuza et al. (2013) found an advantage in target production, acceptability, and interpretation of object clitics among simultaneous bilinguals from Peru compared to sequential bilinguals. However, it remains unclear where the advantage of early acquisition is reflected when naturalistic learners are compared to classroom learners with late and limited exposure to Spanish. In other words, do Chinese-speaking classroom learners differ from those exposed to full immersion in their production of object clitics?

Although there is ample research on other aspects of Spanish grammar, including gender and number agreement (e.g., Dowens & Guo & Guo & Barber & Carreiras 2011), Differential Object Marking (e.g., Cuza & Jiao & López-Otero 2018), tense and aspect (e.g., Sun & Díaz & Taulé 2019), and copula verb use (e.g., Cheng & Lu & Giannakouros 2008), the acquisition of object clitics by Chinese/Spanish bilinguals remains underexplored, with some exceptions (e.g., Hu 2007; Clements 2009; Cuza et al. 2013; Jiao 2021). Specifically, we investigate the extent to which input factors and age of onset of acquisition play a role in the acquisition of Spanish object clitics among Chinese/Spanish bilinguals. Research shows a strong correlation between age of onset of L2 acquisition and the level of native-like attainment that adult L2 learners can achieve, due to cognitive and socio-cultural factors related to age (e.g., Johnson & Newport 1989; Jia & Aaronson 2003).

We compare L2 learners in an instructional setting in China with simultaneous and sequential Chinese/Spanish bilinguals living in Peru via an elicited narrative task. Unlike the Chinese/Spanish bilinguals in contact with Spanish in a naturalistic setting from an early age in Lima, Peru, the L2 learners were exposed to reduced input in a classroom setting which might lead to higher levels of crosslinguistic influence. Specifically, we examine three issues in the production of Spanish accusative clitics: clitic use, gender specification, and the formation of clitic clusters.3

The paper is organized as follows. Section 2 discusses object expression in both Spanish and Chinese. Section 3 presents previous work on the acquisition of Spanish by Chinese-speaking learners, followed by the research questions and hypotheses. The study and the results are presented in Section 4, followed by the discussion and the conclusions in Section 5.

2. Spanish accusative clitics

2.1 Object expression in Spanish

In most non-contact varieties of Spanish, definite and indefinite specific objects cannot be dropped (e.g., Campos 1986; Sánchez 1999; Clements 2006; Schwenter 2006). Pronominal clitics are widely used as anaphoric elements coindexed with a referent in the discourse. Specifically, third-person accusative clitics (lo/la ‘it/him/her’, los/las ‘them’) share the corresponding gender, number, and related semantic features of their antecedents as well as their referential semantic properties like definiteness and specificity, as in (1) below:

    (1) Q: ¿Roberto,  compraste   los      zapatos?
           Roberto,   buy-2s.pst  art.mpl  shoe-mpl
           ‘Roberto, have you bought the shoes?’
        A: Sí,  los     compré.
           Yes  cl3mpl  buy-1s.pst
           ‘Yes, I bought them.’

Clitic use is also related to semantic properties of the antecedent, as a null object is only allowed for indefinite nonspecific antecedents (e.g., Campos 1986; Sánchez & Al-Kasey 1999; Clements 2006). This is represented in the question-answer pairs in (2):

    (2) Q: ¿Roberto,  compraste   arroz?
           Roberto,   buy-2s.pst  rice-ms
           ‘Roberto, have you bought rice?’
        A: Sí,   Ø  compré.
           Yes,     buy-1s.pst
           ‘Yes, I bought it.’

In (2), the direct object is a bare mass noun arroz ‘rice’ that is indefinite and nonspecific. Therefore, a null object is allowed. According to Campos (1986), Spanish null objects are variables bound to a null topic. Although null objects are not regarded as a general characteristic of Spanish, they are allowed to different degrees across the monolingual varieties and the varieties in contact with other languages (e.g., Moreno-Fernández 2019). Relevant to the present study, Moreno-Fernández showed that although null objects are productive in the Andean regions where Spanish is in contact with indigenous languages, their acceptance is low in monolingual varieties, such as the Limeño dialect of Peru which is the focus of this study. Another characteristic of Andean varieties of Spanish is the simplification of case, gender, and even number features in the use of clitics. This is shown in (3a) and (3b) respectively:

    (3) a. Juan  le_i/   lo_i/    la_i    conoce    a    mi    mamá_i.
           Juan  cl-3s/  cl-3ms/  cl-3fs  knows-3s  dom  poss  mother
           ‘Juan knows my mother.’
           (Pérez 1997:36 in Mayer 2017:201)
        b. Algunas  cosas      le    entiendo       y    algunas  no.
           some     thing-fpl  cl3s  understand-1s  and  others   not
           ‘Some things I understand and others I do not.’
           (Mayer 2017:155)

Regarding the variety of Spanish examined in this study, it is known that Andean Spanish in contact with indigenous languages uses the dative clitic le with animate direct objects. Some varieties also exhibit the use of lo with no gender distinction (‘loísmo’). The different distribution of the anaphoric features in this variety was brought to Lima by immigrants from other Andean Spanish-speaking areas (Rivarola 1990; Cerrón-Palomino 1995). However, the traditional norm of Limeño Spanish shows the distinction of gender in the accusative clitics, using both lo and la, with very limited instances of leísmo (4%) and loísmo (1%) (Klee & Caravedo 2015). The system of accusative clitics in Limeño Spanish thus can be seen as the standard norm that marks both case and gender, challenged by the increasing tendency towards syncretism, showing simplification for gender specification or case marking.

Finally, Spanish accusative clitics can co-occur with dative clitics or the clitic se forming clitic clusters. When a 3rd person dative clitic (le or les) clusters with an accusative clitic, it loses the case and number markings and shares the form with the clitic pronoun se (Perlmutter 1971). The clustering of an accusative and a 3rd person dative clitic and the clustering of an accusative clitic and the clitic se are shown in (4a) and (4b) respectively:

    (4) a. ACC + DAT
           – ¿Compró     Roberto  los      zapatos   para  Rosa?
             buy-3s.pst  Roberto  art.mpl  shoe-mpl  for   Rosa
             ‘Did Roberto buy the shoes for Rosa?’
           – Sí,  se     los      compró.
             yes  dat3s  acc3mpl  buy-3s.pst
             ‘Yes, he bought them for her.’
        b. ACC + REFL
           – ¿Se    llevó        Rosa  los      zapatos?
             refl3  wear-3s.pst  Rosa  art.mpl  shoe-mpl
           – Sí,  se     los      llevó.
             yes  refl3  acc3mpl  wear-3s.pst
             ‘Yes, she wore them.’

As in the case of a sole clitic, clitic clusters can also precede a finite verb in a proclitic position (e.g., se lo dije ‘I told him/her’) or follow a non-finite verb in an enclitic position (e.g., quiero decírselo, ‘I want to tell him/her’). Importantly, there are constraints on the combination and linearization within the cluster. Some surface constraints have been proposed by Dinnsen (1972) and Perlmutter (1971) in terms of case and person, respectively. These templates, illustrated in (5) and (6), capture all the possible combinations of clitics:

    (5) Dinnsen’s Thematic case hierarchy for clitic clusters
        Reflexive > Benefactive > Dative > Accusative
    (6) Perlmutter’s template for Spanish clitic clusters
        Se    II    I    III (dat)    III (acc)4

This linear organization applies to both proclitic and enclitic positions of the clitic clusters. Mayer (2017) summarized the linearization of Spanish clitic clusters in (7) below:

    (7) V[-fin]  se II I III  (AUX)  V[+fin]

According to the templates above, either the reflexive or the dative se precedes accusative clitics. The reflexive se precedes all other types of clitics, and the dative clitic always precedes the accusative one. Adapting Grimshaw’s (1982) approach to French clitics, Mayer (2017:47–48) argues that the reflexive se in the first position can bind both the external argument (the subject) and one of the internal arguments (the direct or indirect object). Second and first-person clitics can bind either an internal argument or the external argument, followed by the third-person clitic that only has the potential of binding one of the internal arguments. In cases where there are two third-person clitics in a cluster, Spanish follows a Person-Case Constraint (e.g., Bonet 1991; Adger & Harbour 2007) that prohibits the co-occurrence of both forms. In this case, the accusative clitics are put in the position for third-person clitics, which causes the dative to be changed to se.
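The linearization and the le/les-to-se replacement described above can be made concrete with a minimal Python sketch. It assumes a simplified reading of Perlmutter's template in (6) and covers only the slots named there; the function names, slot labels, and dictionary layout are hypothetical and serve only as illustration, not as an implementation of any of the cited analyses.

```python
# Illustrative sketch only: function and slot names are hypothetical.

# Third-person accusative clitics indexed by (gender, number).
ACCUSATIVE = {
    ("m", "sg"): "lo",
    ("f", "sg"): "la",
    ("m", "pl"): "los",
    ("f", "pl"): "las",
}

# Slot order taken from Perlmutter's template in (6).
SLOT_ORDER = ("se", "II", "I", "III_dat", "III_acc")


def accusative_clitic(gender, number):
    """Return the accusative clitic that agrees with the antecedent."""
    return ACCUSATIVE[(gender, number)]


def build_cluster(clitics):
    """Linearize a {slot: form} mapping according to SLOT_ORDER."""
    unknown = set(clitics) - set(SLOT_ORDER)
    if unknown:
        raise ValueError(f"unknown slots: {sorted(unknown)}")
    slots = dict(clitics)
    # A third-person dative (le/les) clustering with an accusative surfaces as se.
    if "III_dat" in slots and "III_acc" in slots:
        slots["III_dat"] = "se"
    return " ".join(slots[slot] for slot in SLOT_ORDER if slot in slots)


# (4a): dative le + accusative los (antecedent: los zapatos)
print(build_cluster({"III_dat": "le", "III_acc": accusative_clitic("m", "pl")}))
# First-person clitic + accusative lo
print(build_cluster({"I": "me", "III_acc": accusative_clitic("m", "sg")}))
```

The first call yields the cluster in (4a), se los; the second yields me lo. The sketch deliberately ignores proclitic versus enclitic placement, which depends on the finiteness of the verb as summarized in (7).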

As pointed out by Heap (2005), this linearization of clitics follows an order from the clitic with the least feature specification (i.e., se that only specifies the person feature) to the one with more complex feature specification (i.e., lo(s)/la(s) that specify person, gender, number, and case). This linearization of clitics would present difficulty for Chinese-speaking learners because of the different word orders and feature specification requirements of the pronouns, as we discuss in the following section.

2.2 Object expression in Chinese

The Chinese pronominal system is simpler than that of Spanish as it does not have a pronominal clitic system and it does not exhibit case. For third-person pronouns, there is one phonological representation (ta) for masculine (他), feminine (她), and inanimate (它) forms. Regarding the nominal system, Chinese does not grammaticalize definiteness or specificity, nor does it have grammatical gender. A plural suffix -men (们) is only manifested on pronouns or definite animate nominals (Aoun & Li 2003), as shown in (8):

    (8) a. ta  hui   dai   keren-men/ta-men     qu  canting.
           he  will  take  guest-pl/him/her-pl  go  restaurant
           ‘He will take the guests/them to the restaurant.’
        b. ta  hui   dai   zhexie  shu(*-men/*ta-men)  qu  xuexiao.
           he  will  take  these   book(*-pl/*it-pl)   go  school
           ‘He will take these books to school.’
        c. you   ren(*-men)    laile.
           have  person(*-pl)  come-perf
           ‘There are some persons coming.’

As shown in the contrast between (8a) and (8b), -men is ungrammatical when the referent is inanimate for either a noun phrase (i.e., ‘books’) or a pronoun. Moreover, when the animate noun phrase (i.e., ‘guests’) is marked by -men, it is interpreted as definite. An indefinite interpretation would not be possible. This incompatibility between the suffix and an indefinite interpretation of the NP is clearer in the existential sentence in (8c). Hence, plural marking is restricted.
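The restriction on -men illustrated in (8) can be stated as a small predicate. The Python sketch below is a deliberate simplification: it assumes that animacy, definiteness, and pronominal status are the only relevant factors, and the function name and boolean parameters are hypothetical.

```python
def men_allowed(animate, definite, pronoun=False):
    """Return True if the plural suffix -men is predicted to be possible.

    Simplified from (8): -men requires an animate referent and appears
    only on pronouns or on nominals with a definite interpretation.
    """
    return bool(animate and (pronoun or definite))


# (8a) keren-men 'the guests': animate, definite nominal
print(men_allowed(animate=True, definite=True))
# (8b) *shu-men 'books': inanimate nominal
print(men_allowed(animate=False, definite=True))
# (8c) *ren-men in an existential sentence: animate but indefinite
print(men_allowed(animate=True, definite=False))
```

The three calls mirror the grammatical and ungrammatical forms in (8a), (8b), and (8c), respectively.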

Chinese is classified as a topic-drop language that productively exhibits null objects regardless of the phi-features or definiteness or specificity of the antecedent (Chomsky 1981; Huang 1984):

    (9) Q: Ni   mai  naxie  shu_j/  cha_k  le ma?
           you  buy  those  books/  tea    -perf-q
           ‘Did you buy those books/tea?’
        A: e_i  mai le    e_j/k
                buy-perf
           ‘I bought them/(some).’

In (9), the direct object is omitted regardless of whether the referent is definite (‘those books’) or indefinite (‘tea’) since the information can be recovered from the discourse. As in Spanish, the null object is closely related to the discourse topic. Huang (1984) argues that Chinese null objects are generated by movement and are bound to a null topic at the left periphery of the clause, which is related to the discourse topic. This position has been challenged by cases in which a null object violates island constraints, indicating that Chinese null objects are not generated by movement. Li (2008; 2014) proposed that Chinese null objects belong to a new empty category named TEC (‘truly empty category’). A TEC is base-generated in its position, and it has only case and categorial features, not phi-features or referential features. According to Li, Chinese verbs can license a TEC because they do not obligatorily show the complete argument structure. In her account, the close relationship between a null object and the discourse topic is due to the contextual or pragmatic prominence of the latter.

In sum, the differences between Spanish and Chinese in object expression might lead to some difficulties that a Chinese-speaking learner would have to overcome in their acquisition of third-person accusative clitics in Spanish. First, Chinese is known to be an object-drop language, unlike languages like Spanish or English. In Chinese, a direct object can often be omitted when it can be recovered from the discourse (Huang 1984). However, null objects in Spanish are highly restricted and are only possible with indefinite antecedents (Campos 1986). Second, Chinese does not exhibit pronominal clitics of the Spanish kind, and gender and number markings in Chinese are not specified in the same way as in Spanish. Moreover, Chinese generally follows an SVO word order like Spanish, but disallows a clitic cluster structure as in (4), in which the object precedes the verb. Hence, Chinese-speaking learners must develop a system of Spanish pronominal clitics with all the features as a strategy of object expression and must acquire the patterns of word order in case there is more than one clitic in a string.

3. The acquisition of Spanish clitics by Chinese/Spanish bilinguals

Previous work among Spanish/English bilinguals and Spanish in contact with other languages has shown difficulties with target production and acceptability of object clitics and the target specification of gender features (e.g., Klee 1989; Sánchez & Al-Kasey 1999; Zyzik 2008; Arche & Domínguez 2011; Rossi et al. 2014; Mayer & Sánchez 2017; López-Otero & Cuza & Jiao 2021). Specifically, whereas clitic placement or word order has been shown to pose less difficulty (Liceras 1985; Duffield & White 1999; Montrul 2010; Rossi et al. 2017), gender is more difficult to acquire for learners with an L1 that does not mark gender morphologically (Rossi et al. 2014).

Regarding Spanish/Chinese bilinguals specifically, Hu (2007) examined younger and older Chinese-speaking immigrants to Ecuador through oral interviews. The author found that the Spanish spoken by the older generation (late adult learners) was characterized by a simplified inflection and pronominal system as well as a simplified verbal conjugation paradigm. The author argued that the level of integration into the host community and schooling play an important role, supporting Clements’ (2009) proposal. Clements (2009) compared the grammar of two long-term Chinese immigrants via spontaneous speech collected via interviews. The two subjects immigrated to Spain during the 1980s and learned Spanish naturalistically yet differed in their level of integration into the society. Although the participant that used Spanish in more contexts showed more morphological features and higher fluency than the other participant, both showed a lack of clitics or non-standard clitics in their pronominal system. For example, the participant with more Spanish usage exhibited a new set of non-nominative pronouns from possessive pronouns (e.g., él no sabe mío [él no me conoce] ‘He doesn’t know me’). These data led Clements to argue that the nature of these adult learners’ Spanish grammar was similar to the Spanish spoken by the Chinese-speaking population in Cuba in the 19th century (pp. 102–103). Clements concluded that the findings on these two participants were consistent with Klein and Perdue’s (1997) claim that naturalistic L2 learning follows a developmental path starting with a simplification of features.

More recently, Cuza et al. (2013) tested three groups of Chinese/Spanish bilinguals in Lima, Peru, and compared them with native speakers of Limeño Spanish. The bilingual groups included 12 simultaneous bilinguals born and raised in Peru, 13 sequential bilinguals, and 13 long-term immigrants. The participants completed an elicited production task (EPT), a sentence completion task (SCT), a truth value judgment task (TVJT), and an acceptability judgment task (AJT). The TVJT tested the knowledge of the non-referential property of null objects in Spanish whereas the other tasks tested the participants’ acceptability and production of accusative clitics in clitic left-dislocated (CLLD) structures. The results showed that the simultaneous bilinguals paralleled the monolingual speakers of Spanish in all the tasks except with overt clitics in indefinite nonspecific contexts. The sequential bilinguals showed acquisition of the clitic system but preferred using an overt DP or a null object. The adult immigrants showed more variability than the other two bilingual groups in both production and acceptability. Furthermore, the authors found more within-group variability among the sequential bilinguals and the adult immigrants. The authors argue for transfer effects from the productive null object strategy in Chinese L1 as well as age of onset of acquisition effects. This is consistent with Montrul’s (2010) study which found age effects in the acquisition of object clitic placement among English-speaking L2 learners of Spanish and heritage speakers. Montrul found that the heritage speakers outperformed the L2 learners due to early exposure to Spanish and extended input.

3.1 Research questions and hypotheses

Taking into consideration previous work, we examine the extent to which classroom L2 learners of Spanish in China have knowledge of clitic use and distribution compared to simultaneous and sequential Chinese/Spanish bilinguals from Lima, Peru. The patterns of clitic use in spontaneous production may shed some light on the bilinguals’ clitic system in relation to the extent of feature specification on the clitic (e.g., case and gender features).5 Furthermore, the production of clitic clusters may show evidence on whether bilingual speakers have a system of clitics similar to the native norm that allows for the co-appearance of both accusative and dative clitics. The research questions and hypotheses guiding the study are as follows:

  1. RQ1: To what extent do Chinese/Spanish bilinguals show target-like production of accusative clitics in Spanish?

    Hypothesis 1: Early or extensive exposure to Spanish facilitates the reliance on clitics as a form of object realization (Cuza et al. 2013). Based on this hypothesis, we predict that the participants with later and limited exposure to Spanish will show more difficulty with the target production of an accusative clitic as a strategy of object expression.

  2. RQ2: To what extent do Chinese/Spanish bilinguals show sensitivity to gender agreement features in the clitic?

    Hypothesis 2: Early exposure will facilitate gender specification in object clitics. The sequential bilinguals and L2 learners will show divergences from the target specification of gender features, unlike the simultaneous bilinguals, who will not show gender simplification.

  3. RQ3: To what extent do Chinese/Spanish bilinguals produce syntactically more complex clitic structures such as clitic clusters, and recognize the corresponding gender features of the clitic?

    Hypothesis 3: The L2 learners and the sequential bilinguals will show low production of clitic clusters compared with the simultaneous bilinguals due to structural complexity (Jakubowicz & Strik 2008; Cuza 2013).

As noted previously, because the story-retelling task may not provide a sufficient number of tokens to contrast singular and plural clitics, we did not test number agreement between the clitic and the antecedent. Another reason is that although Chinese does not have a full-fledged number-marking system, it partially overlaps with Spanish number marking, which makes it less taxing compared to gender marking. In what follows, we discuss the study and the results.

4. The study

4.1 The participants

Data was elicited from 42 Chinese/Spanish bilinguals divided into three groups: 11 simultaneous bilinguals, 10 sequential bilinguals who arrived in Peru during childhood, and 21 classroom L2 learners from China. All the participants completed a language background questionnaire and an adapted version of the DELE (Diploma de Español como Lengua Extranjera) (Duffield & White 1999; Bruhn de Garavito 2002; Montrul & Slabakova 2003; Cuza et al. 2013). Following Montrul & Slabakova (2003), scores of 40–50 points were taken to indicate advanced proficiency, scores of 30–39 intermediate proficiency, and scores of 0–29 low proficiency.
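The proficiency bands can be expressed as a simple lookup. The Python sketch below is illustrative only: the function name is hypothetical, and treating 30 and 40 as lower cutoffs (so that non-integer values such as group means can also be classified) is an assumption, since the bands above are stated for whole-point scores.

```python
def dele_band(score):
    """Map an adapted-DELE score (0-50) to a proficiency band."""
    if not 0 <= score <= 50:
        raise ValueError("score must be between 0 and 50")
    if score >= 40:
        return "advanced"
    if score >= 30:
        return "intermediate"
    return "low"


# Group means reported in the participant descriptions of this section.
group_means = {
    "simultaneous bilinguals": 46.8,
    "sequential bilinguals": 44.6,
    "L2 learners": 38.7,
}
for group, mean in group_means.items():
    print(group, dele_band(mean))
```

Applied to the group means, the sketch places both bilingual groups in the advanced band and the L2 learners in the intermediate band, in line with the descriptions of each group.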

The simultaneous bilinguals were second-generation immigrants born and raised in Peru (8 males and 3 females; age range, 18–47; M = 25, SD = 8.39). Most of them (82%) had attended or were attending college at the time of the study; the rest had completed high school. All of them had been educated in Peru, and 64% had received instruction in Chinese at Chinese language schools in Lima. The participants were highly proficient in Spanish (M = 46.8, SD = 1.89). Regarding their patterns of language use, all of them reported speaking only Spanish, or mostly Spanish, at school. At home, all but one either used both languages or spoke more Spanish; the remaining participant reported speaking only Chinese. At work, only one participant used only Chinese, although this participant reported feeling more comfortable in Spanish. Nine of the simultaneous bilinguals used only Spanish, or more Spanish, in social situations, and all but one of them reported feeling more comfortable in Spanish.

The sequential bilingual group consisted of 10 participants (6 males and 4 females) who arrived in Peru between the ages of 6 and 13 (M = 10) and were 20–31 years old at the time of testing (M = 24, SD = 3.99). Their length of residence (LOR) in Peru ranged from 7 to 22 years (M = 14, SD = 5.29). Most of them (90%) had completed a college education in Peru. Their DELE scores showed that they had an overall high proficiency in Spanish (M = 44.6, SD = 6.12), and all but one of them had received formal schooling in Chinese in China or at a language school in Lima. In contrast with the simultaneous bilinguals, seven of the sequential bilinguals reported speaking only Chinese, or more Chinese, at home; seven reported using only Spanish, or more Spanish, at work, including three who reported speaking only Spanish at work. In social situations, one participant used only Spanish, and the remaining participants were distributed almost evenly between the use of both languages (4/10) and the use of more Spanish (5/10).

The L2 learners were college students majoring in Spanish at a large university located in Northern China (4 males and 17 females, age range, 19–21, M = 20, SD = 0.78). At the time of testing, the L2 learners were finishing their second year of university studies and none of them reported having lived in a Spanish-speaking country. Chinese was spoken in their communities, and it was the language used in their social circles. Their proficiency in Spanish was intermediate (M = 38.7, SD = 3.17). All the L2 learners had been exposed to Spanish in the classroom, and the variety learned was Peninsular Spanish, which is the variety most often taught in China (Rovira 2010). They had learned a standard clitic system in which the accusative clitics are specified for gender and case. Regarding their use of Spanish, these L2 learners had limited Spanish use at school, at work, or in other social contexts. A summary of the linguistic background information of the participants is provided in Table 1.

Table 1

Summary of the participants’ linguistic background.

| | Simultaneous bilinguals (n = 11) | Sequential bilinguals (n = 10) | L2 learners (n = 21) |
|---|---|---|---|
| Age at testing | M = 25 (SD = 8.39; range, 18–47) | M = 24 (SD = 3.99; range, 20–31) | M = 20 (SD = 0.78; range, 19–21) |
| Origin | 11 Peru | 1 Hong Kong, 3 Taiwan, 6 Mainland China6 | 21 Mainland China |
| Highest education level attained: | | | |
| – College level | 9/11 | 9/10 | 21/21 |
| – High school level | 2/11 | 1/10 | |
| Mean DELE score | 46.8 | 44.6 | 38.9 |

4.2 The task and data coding

Data collection took place via an elicited narration of the fairytale Little Red Riding Hood. Narrative elicitation is a standard method of investigating grammatical knowledge or language growth (Sebastian & Slobin 1994; Cuza 2010; Pérez-Leroux & Castilla & Brunnera 2012; Rojas & Iglesias 2013). This technique has also been successfully implemented by Montrul (2010) to examine object expression among heritage speakers and L2 learners of Spanish.

The participants were provided with a wordless storybook, and they were asked to retell the story in Spanish at their own pace. The narratives were recorded and transcribed for further analysis. Following previous work with L2 learners of Spanish (Cuza 2010), the story was divided into six situational frames which depicted the main plot of the story. To count the instances of clitic use, we first calculated, for each participant and situational frame, the number of instances in which a clitic could have been used. This was considered the number of potentially anaphoric direct object realizations. The calculation excluded non-anaphoric direct objects. Furthermore, we excluded from the total of realizations of potentially anaphoric direct objects the DPs that were far from the antecedent in the narrative. All of the instances of non-canonical clitic use (i.e., gender mismatches or use of dative le instead of accusative lo or la) and all the instances of clitic clusters were also counted. The data coding is illustrated in the excerpts below:

    (10) a. El      labo (lobo)  vino         a   la      casa      de  la      abuela
            art-ms  wolf-ms      come-pst.3s  to  art-fs  house-fs  of  art-fs  grandma-fs
            y    la      abuela      está        guardando   en  la      cama …
            and  art-fs  grandma-fs  be-pres.3s  keep-gerun  in  art-fs  bed-fs
            El      labo (lobo)  mata          a    la      abuela      y    la      devoró.
            art-ms  wolf-ms      kill-pres.3s  dom  art-fs  grandma-fs  and  cl.3fs  devour-pst.3s
            ‘The wolf came into the grandmother’s house while she was in bed. The wolf kills the grandmother and devoured her.’
            (Participant L11)
         b. …Y   Caperucita                  se  puso        a   caminar   y
            And  the Little Red Riding Hood  cl  put-pst.3s  to  walk-inf  and
            … quitó        unas  flores      para  regar (regalar)  a   su   abuela.
            remove-pst.3s  some  flower-fpl  to    give-inf         to  her  grandma-fs
            ‘and the Little Red Riding Hood started to walk and picked some flowers for her grandma.’
            (Participant L12)

As shown in (10a), there were two instances of potentially anaphoric direct objects in the excerpt from participant L11: the DP la abuela ‘the grandma’ and the clitic la, given that the referent had been introduced at the beginning of the situational frame. Hence, these two instances were coded as an instance of a DP object and an instance of a clitic, respectively. Similarly, in (10b), the verb regalar ‘to give as a gift’ in the excerpt from participant L12 requires a direct object. In this context, a clitic is expected since the antecedent flores ‘flowers’ appears immediately before it in the main clause. Therefore, this case was coded as a realization of a null direct object.

To reduce a possible bias in data coding on the likelihood of clitic use, the coding was reviewed after an interval of at least two weeks. The discrepancy was trivial: it only consisted of 3 additional cases of DP objects as realizations of potentially anaphoric objects in the data of 3 participants from the group of L2 learners. The instances of each type of direct object (overt clitic, null object, and DP object) were then counted for each participant as the number of potentially anaphoric direct object realizations. The coding was then verified by a native speaker of Spanish, who also helped resolve discrepancies in the coding. The length of the narratives across all the participants varied between 66 words and 425 words. However, the average lengths of the narratives for the groups were close: 222 words for the simultaneous bilinguals, 185 words for the sequential bilinguals, and 195 words for the L2 learners.

4.3. Results

4.3.1. Number of potentially anaphoric direct object realizations and clitic realizations

Due to the nature of the data, the participants varied in the length of their narratives, and in their likelihood of using anaphoric direct objects. A direct comparison of the actual numbers of clitics produced by each group may not reflect the actual tendency for clitic use. Therefore, we first calculated the number of potentially anaphoric direct object realizations for each participant. As we discussed earlier, the number of potentially anaphoric direct object realizations only included potentially anaphoric objects that had an antecedent in the preceding discourse. Based on this, the proportions of clitics, DP objects and null objects were calculated for each participant and were compared across the groups. Some examples in the range of options for object realization are given in (10).

The number of potentially anaphoric object realizations differed across the groups.7 The simultaneous bilinguals produced more instances of potentially anaphoric objects (71 instances in total, 6.45 instances per participant) compared to the sequential bilinguals (35 instances in total, 3.5 instances per participant) and the L2 learners (60 instances in total, 2.86 instances per participant) (Table 2).

Table 2

Number of potentially anaphoric direct object realizations by group.

| Number of potentially anaphoric direct object realizations | Simultaneous bilinguals (n = 11) | Sequential bilinguals (n = 10) | L2 learners (n = 21) |
|---|---|---|---|
| Total | 71 | 35 | 60 |
| Average per participant | 6.45 | 3.5 | 2.86 |

To investigate whether there were differences across the groups, we conducted a Poisson regression. Group was entered as a factor with three levels (simultaneous bilinguals, sequential bilinguals, and L2 learners), and the DELE proficiency scores were added as a covariate to control for proficiency. The response variable was the number of potentially anaphoric direct object realizations. We used a Poisson regression because the response variable was a count, which is more appropriately modeled with a Poisson distribution than with a normal distribution. The alpha level was set at 0.05.

The results showed an effect of group (F (2, 38) = 13, p < .001) and of proficiency (F (1, 38) = 5.36, p < .05). This suggests that the groups differed in their likelihood of producing anaphoric objects, and that this likelihood was affected by the participants’ proficiency scores. Since a significant effect was found for group, we ran post hoc tests to determine which groups differed from one another. The post hoc analysis for group was specified using LS-Means and adjusted for multiple comparisons using the Tukey method. Results showed a significant difference between the L2 learners and the simultaneous bilinguals (β = –1.16, SE = .23, t = –5.00, p < .0001), and between the sequential and the simultaneous bilinguals (β = –.74, SE = .22, t = –3.37, p < .01). Indeed, as can be seen in Table 2, the average number of potentially anaphoric direct objects produced per participant was higher for the simultaneous bilinguals than for the other two groups. The results also showed no significant difference between the sequential bilinguals and the L2 learners (p = .16); the two groups behaved similarly regarding object realization. Regarding the effect of proficiency, the results suggest that the participants with higher proficiency tended to use more anaphoric elements in their narratives.
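As a rough check on the direction of the group effect, the contrasts of a Poisson model with group as its only predictor can be computed directly from the totals in Table 2, because in such a model the fitted rate for each group equals its mean count per participant. The sketch below is an unadjusted illustration only: it omits the DELE covariate and the Tukey adjustment, so its values are not expected to match the reported coefficients, and the function names are illustrative rather than part of the original analysis.

```python
import math

# (total potentially anaphoric objects, number of participants), from Table 2.
counts = {
    "simultaneous": (71, 11),
    "sequential": (35, 10),
    "l2": (60, 21),
}

def group_rate(total, n):
    """Fitted Poisson rate in a group-only model: mean count per participant."""
    return total / n

def log_rate_ratio(group, reference="simultaneous"):
    """Unadjusted log rate ratio of `group` relative to `reference`."""
    return math.log(group_rate(*counts[group]) / group_rate(*counts[reference]))

for name in ("sequential", "l2"):
    beta = log_rate_ratio(name)
    print(f"{name}: log rate ratio = {beta:.2f}, rate ratio = {math.exp(beta):.2f}")
```

A negative log rate ratio means that the group produced fewer potentially anaphoric objects per participant than the simultaneous bilinguals, which is the same direction as the adjusted estimates in the text.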

Having compared the number of potentially anaphoric object realizations per group, we then focused on the bilinguals’ actual proportion of clitic realizations, in contrast to other strategies available in Spanish including DPs and null objects. Unlike previous work with elicited production (Cuza et al. 2013), we found that the participants barely produced null objects in their narratives. The simultaneous bilinguals produced zero null objects whereas the sequential bilinguals and L2 learners produced 3 and 4 null objects in total, respectively. Hence, we only considered clitics and overt DPs in further analysis. Table 3 shows the total number of clitics, DPs, and null objects in the data per group within the potential realization contexts.

Table 3

Number of object realizations (DPs, null objects, clitics) by group.

| Group | Clitic | Null objects | Overt DP |
|---|---|---|---|
| Simultaneous bilinguals | 69 (96%) | 0 | 2 (4%) |
| Sequential bilinguals | 25 (71%) | 3 (6%) | 7 (23%) |
| L2 learners | 38 (63%) | 4 (7%) | 18 (30%) |

To analyze the production of clitics and overt DPs, we considered the fact that the number of object realizations was not constant across different groups and subjects. A standard logistic regression assumes that there is a constant number of trials for each subject, and the goal is to model the number of “successes” or “failures” out of the number of trials (i.e., the two possibilities of producing either a clitic or a DP in each potential clitic locus). However, because the number of trials (i.e., “number of potentially anaphoric direct object realizations”) was different for each subject, we needed to use a logistic regression that conditions success and failure on the number of trials. Thus, a percentage of clitic/DP production over the total number of realizations of potentially anaphoric objects was calculated per participant for normalization. Since the data exhibited the structure of a repeated measures design, we specified a repeated measures logistic regression. The response variable was the normalized number of clitics/DPs produced. Group was a factor with three levels (simultaneous bilinguals, sequential bilinguals, and L2 learners). Type was specified as another factor with two levels (clitic/DP), and the interaction between type and group as another factor. Finally, the proficiency scores were added as a covariate.
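To make concrete what conditioning on the number of trials means, the following sketch writes out the binomial log-likelihood for participants with different numbers of potentially anaphoric contexts. It assumes a single shared clitic probability with no predictors, which is far simpler than the repeated measures model described above, and the participant counts are hypothetical values chosen for illustration, not data from the study.

```python
import math

def binomial_loglik(p, data):
    """Log-likelihood of clitic probability p for (clitics, trials) pairs."""
    total = 0.0
    for k, n in data:
        total += (
            math.log(math.comb(n, k))
            + k * math.log(p)
            + (n - k) * math.log(1 - p)
        )
    return total

# Hypothetical participants: (clitics produced, potentially anaphoric contexts).
participants = [(3, 4), (1, 2), (5, 8)]

# Maximum likelihood estimate: pool successes over trials, so participants
# with more contexts contribute more information.
p_hat = sum(k for k, _ in participants) / sum(n for _, n in participants)

# Averaging per-participant percentages weights every participant equally.
p_mean = sum(k / n for k, n in participants) / len(participants)

print(f"pooled estimate: {p_hat:.3f}, mean of percentages: {p_mean:.3f}")
print(f"log-odds of a clitic (pooled): {math.log(p_hat / (1 - p_hat)):.3f}")
```

The two estimates coincide only when every participant has the same number of contexts, which is why a model with unequal trials has to keep track of each participant's denominator.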

The results revealed a significant interaction between group and type (F (2, 22) = 12.30, p < .001). Therefore, we used a series of post hoc analyses adjusted for multiple comparisons to investigate the nature of the interaction. For clitic production, we found no significant difference between the L2 learners and the sequential bilinguals (p = .38), but we did find a significant difference between the L2 learners and the simultaneous bilinguals (β = –3.14, SE = .83, t = –3.77, p < .01), with the simultaneous bilinguals producing more clitics. We also found a significant difference between the sequential and simultaneous bilinguals (β = –2.71, SE = .85, t = –3.19, p < .01), with the simultaneous bilinguals producing more clitics. Correspondingly, for DP production, we found no significant difference between the L2 learners and the sequential bilinguals (p = .36). There was a significant difference between the L2 learners and the simultaneous bilinguals (β = 2.44, SE = .83, t = 2.92, p < .01), with the simultaneous bilinguals producing fewer DPs. A significant difference was also found between the two bilingual groups (β = 1.94, SE = .88, t = 2.20, p < .05), with the simultaneous bilinguals producing fewer DPs (Figure 1).

Figure 1

Proportion of clitics and DPs realized by the simultaneous bilingual (SIM), sequential bilingual (SEC), and L2 learner (L2) groups.

To confirm the group results, we conducted an individual analysis in which the participants were grouped into five categories according to their percentages of clitics produced: full production (100%), upper range (71–99%), middle range (31–70%), low range (1–30%), and zero production (0%). The results largely confirmed the group analysis: all the simultaneous bilinguals clustered in the full production and upper ranges, showing a reliance on the morphological form of clitics to express direct objects. The percentages of the sequential bilinguals and the L2 learners in the range of full production were comparable (30% of the sequential bilinguals vs. 29% of the L2 learners). For both of these groups, the middle range contained more participants than any other range (60% of the sequential bilinguals and 38% of the L2 learners); however, all the sequential bilinguals in this range produced clitics on more than 60% of the occasions, whereas the L2 learners showed a wider distribution (25%–67%). Another difference between the two groups was that more L2 learners (19%) than sequential bilinguals (10%) did not produce any clitics at all. The individual analysis is shown in Table 4.

Table 4

Individual analysis: percentages of clitic production across the groups.

| Group | Range | % clitic production | # of participants |
|---|---|---|---|
| Simultaneous bilinguals | full production | 100% | 9/11 (82%) |
| | upper range | 71–99% | 2/11 (18%) |
| | middle range | 31–70% | 0 |
| | low range | 1–30% | 0 |
| | zero production | 0% | 0 |
| Sequential bilinguals | full production | 100% | 3/10 (30%) |
| | upper range | 71–99% | 0 |
| | middle range | 31–70% | 6/10 (60%) |
| | low range | 1–30% | 0 |
| | zero production | 0% | 1/10 (10%) |
| L2 learners | full production | 100% | 6/21 (29%) |
| | upper range | 71–99% | 2/21 (10%) |
| | middle range | 31–70% | 8/21 (38%) |
| | low range | 1–30% | 1/21 (5%) |
| | zero production | 0% | 4/21 (19%) |
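The five-way classification used in the individual analysis can be expressed as a small function. This is a sketch under one assumption that the text does not spell out: percentages falling between the stated integer bounds (for example 70.5%) are assigned to the higher range. The function name and argument names are illustrative.

```python
def clitic_range(clitics, contexts):
    """Assign a participant to one of the five clitic production ranges."""
    if contexts <= 0:
        raise ValueError("participant has no potentially anaphoric object contexts")
    if clitics == contexts:
        return "full production"
    if clitics == 0:
        return "zero production"
    pct = 100 * clitics / contexts
    # Assumption: non-integer percentages above 70 or above 30 go to the higher range.
    if pct > 70:
        return "upper range"
    if pct > 30:
        return "middle range"
    return "low range"

# Hypothetical participants: (clitics produced, potentially anaphoric contexts).
for clitics, contexts in [(4, 4), (3, 4), (2, 3), (1, 4), (0, 5)]:
    print(clitics, contexts, clitic_range(clitics, contexts))
```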

4.3.2. Non-canonical use of clitics

The second issue we examined was the non-canonical use of clitics among the three groups of Chinese/Spanish bilinguals. There were only two types of non-canonical clitics: using the dative clitic form le instead of the accusative clitics and using the masculine form lo instead of the feminine form la. This is consistent with the dialectal situation in Lima and previous work by Mayer & Sánchez (2017) on Spanish in contact with indigenous languages. However, the group of L2 learners in China also showed this pattern and the simultaneous bilinguals showed a striking convergence with the prestigious norm in Lima that marks gender. The numbers of instances of non-canonical clitics produced by the participants are shown in Table 5.

Table 5

Instances of non-canonical clitics produced by group.

| Group | Le for lo or la | Lo for la |
|---|---|---|
| Simultaneous bilinguals | 1/69 (1.5%) | 2/69 (3%) |
| Sequential bilinguals | 3/25 (12%) | 5/25 (15%) |
| L2 learners | 7/38 (18%) | 5/38 (13%) |

The percentage of non-canonical clitic use was very low for the simultaneous bilinguals compared to the sequential bilinguals and the L2 learners. Given the low numbers of non-canonical clitic use, we analyzed the group differences qualitatively via an individual analysis. The L2 learners were the least accurate group, showing a marked use of the dative form le over the masculine form lo; this preference for le was not observed among the sequential bilinguals (Table 6). The denominators in Table 6 correspond to the participants who produced at least one clitic (cf. the zero production rows in Table 4).

Table 6

Number of participants with non-canonical clitic use by group.

| Group | # of participants with non-canonical clitics | Type of use: le for lo/la | Type of use: lo for la |
|---|---|---|---|
| Simultaneous bilinguals | 27% (3/11) | 9% (1/11) | 18% (2/11) |
| Sequential bilinguals | 78% (7/9) | 44% (4/9) | 44% (4/9) |
| L2 learners | 53% (9/17) | 35% (6/17) | 18% (3/17) |

There were three simultaneous bilinguals who used non-canonical clitics. Each of them only exhibited one instance of non-canonical use, and all the non-canonical clitics appeared in complex structures, such as in clitic clusters (e.g., se lo comió ‘the wolf ate her up’ (‘her’ meaning the grandma), Participant S3, and se lo llevó ‘she took the basket with her’, Participant S11) or in proclitic position in a reconstruction structure (e.g., le quiere seguir ‘the wolf wants to follow her’, Participant S9).

The sequential bilinguals showed both patterns of non-canonical use, but with no clear preference. One participant (1/7) produced both le and lo. Among the four sequential bilinguals that used le, two (2/4) used le with psych verbs (e.g., le engañó ‘the wolf tricked her’ Participant C5, le amenazó ‘the wolf threatened her’ Participant C7). These verbs belong to the Class II psych verbs (Belletti & Rizzi 1988; Parodi-Lewin 1991) that may take an indirect object (doubled by the dative clitic le) when the predicate is understood as stative. Because the participants used le with these psych verbs in an eventive reading, this usage was deemed non-canonical. Moreover, all the non-canonical uses of lo produced by the sequential bilinguals were in proclitic position in structures that allow clitic climbing.

The L2 learners showed the same pattern of clitic use, but none of them exhibited both types of non-canonical clitics. They showed a stronger preference for le than the sequential bilinguals. One participant also used le with the verb asustar ‘to scare’ with an eventive interpretation (e.g., el lobo se levantó y le asustó mucho a Caperucita ‘the wolf got up and scared the Little Red Riding Hood a lot’). The verb asustar is also a Class II psych verb that receives different readings with an accusative or a dative argument (e.g., stative reading in A Juan le asustan las tormentas ‘Juan is scared of storms’ vs. eventive reading in Juan lo asustó ‘Juan scared him’). In this case, the form le was deemed a non-canonical use of an accusative clitic due to the eventive reading.

All the other instances of le produced by the L2 learners were with a normal transitive verb. It is worth noting that, in all the instances across the groups, le was only used to refer to humans, namely the girl, the grandma, or the hunter. The wolf was never referred to with le. There were fewer instances of lo compared to le when the referent was animate. Moreover, these results showed that the classroom L2 learners with later and limited exposure to Spanish appeared to prefer the form with no gender specification (i.e., le), unlike the other groups.

4.3.3. Production of accusative clitics in complex structures

Regarding the production of accusative clitics in contexts where a dative or reflexive se is also present, results showed that 72% (8/11) of the simultaneous bilinguals produced a total of 11 clusters of an accusative clitic and a dative (e.g., se lo entregó ‘she gave it to her’, Participant S7), reflexive (e.g., se la llevó ‘she took it with her’ by Participant S10), or aspectual se (e.g., se la comió ‘he ate her up’ by Participants S1, S2, S3 and S4). Regarding the verbs, there were two ditransitive verbs used by the simultaneous bilinguals: entregar ‘to give’ and llevar ‘to take’. The only reflexive verb used was llevarse ‘to take something with oneself’. Aspectual se was exclusively associated with comerse ‘to eat up’. Despite low numbers of instances in each category, all three types of clitic clusters were found among the simultaneous bilinguals. For the sequential bilinguals, the proportion of participants who used clitic clusters was 30% (3/10), with a total of 5 clusters produced exclusively with aspectual se associated with the verb comer. The L2 learners did not produce any clitic clusters of any type. Among the participants who produced clitic clusters, seven out of eight simultaneous bilinguals and one out of three sequential bilinguals produced the se-ACC.-verb order, and the accusative clitic used by the sequential bilingual was the non-canonical use of lo (se lo (la) comió ‘(the wolf) ate her up’). Two out of the three instances of non-canonical uses of clitics for the simultaneous bilinguals were in clitic clusters. One simultaneous bilingual and each of the three sequential bilinguals produced the verb-se-ACC. order (llevársela ‘take her away’). Additionally, a combination of se with an accusative le was not found in the data. Table 7 shows the number and type of clitic clusters produced by the simultaneous and the sequential bilinguals.

Table 7

Number and type of clitic clusters produced by each group.

| Group | Number of participants with clitic cluster production | Clitic clusters produced: verb-se-ACC. | Clitic clusters produced: se-ACC.-verb |
|---|---|---|---|
| Simultaneous bilinguals | 8/11 (72%) | 1/11 (10%) | 10/11 (90%) |
| Sequential bilinguals | 3/10 (30%) | 4/5 (80%) | 1/5 (20%) |

To summarize, we examined three aspects of accusative clitic use: the proportion of object clitics produced, the accuracy of clitic use, and the production of clitic clusters. The simultaneous bilinguals showed clear advantages compared to the other groups. They produced most canonical clitics in the permitting contexts, and they were also able to combine accusative clitics with dative or reflexive se, showing command of the syntactic functions of the clitics. The sequential bilinguals showed production of clitics in most of the possible instances, and they also produced clitic clusters but to a lesser extent. Most of the sequential bilinguals showed non-canonical clitic use which was more likely to occur in clitic clusters. The L2 learners performed as the sequential bilinguals did with clitic production and showed non-canonical clitic use. However, they did not produce any clitic clusters.

5. Discussion and conclusions

We examined the production of Spanish accusative clitics in narrative data from Chinese/Spanish bilinguals in Peru and classroom learners in China. Our three hypotheses focused on three main issues: the availability of a system of accusative clitics, the complexity of feature specifications of such a system, and the syntactic complexity of the system. These grammatical structures are not instantiated in Chinese, leading to potential crosslinguistic influence. Therefore, this study focused on the effect of early exposure and naturalistic input in overcoming acquisition difficulties.

Hypothesis 1 predicted that the learners with later and limited exposure to Spanish would show more difficulty with the target production of accusative clitics as a strategy of object expression. This hypothesis was partially supported in our data. The results showed that the simultaneous bilinguals outperformed the other groups regarding the number of clitics produced. However, adult L2 learners showed similarities to the sequential bilinguals who were exposed to Spanish before the age of 13. The contrast between the simultaneous bilinguals and the sequential bilinguals clearly showed the robust effect of early exposure, supporting previous findings (Cuza et al. 2013). The data also support previous work on the relatively successful acquisition of clitic placement (Duffield & White 1999; Montrul 2010; Rossi et al. 2017). Regarding the realization of direct objects, the sequential bilinguals and the L2 learners did not show many instances of null objects but rather more reliance on full DPs. Considering the difference between Chinese and Spanish regarding object drop, the results suggest that they might be aware of the non-topic-drop nature of Spanish before mastering the use of clitics as a strategy of object realization. If using an anaphoric strategy suggests more discourse coherence, the narrative data from the simultaneous bilinguals suggest an advantage in this respect, showing more potential instances for the production of anaphoric objects and the frequent use of clitics in these situations. This was not seen among the sequential bilinguals and the L2 learners. The lower degree of coherence exhibited by these groups was also manifested in their preference for DP subjects over accented pronouns in subject position, as shown in (10) above. Future work would benefit from examining the data from the perspective of discourse. 
As pointed out by a reviewer, the participants might be influenced by their knowledge of Chinese since the sequential bilinguals and the L2 learners preferred a full DP strategy for object realization rather than using anaphoric clitics, especially the L2 learners, as both null objects and full DPs are grammatical options in Chinese. Given the ungrammatical status of null clitics in Spanish, it is logical that the sequential bilinguals and the L2 learners resort to full DPs as a “safe” strategy when they have not fully mastered clitic use. This suggests that the automaticity of using Spanish anaphoric mechanisms is related to early and naturalistic exposure.

The results also showed that the lack of naturalistic exposure seems to be compensated to some extent for the L2 learners, who learned Spanish mainly through classroom instruction. Regarding the average number of possible realizations of objects (as in Table 2), and the total number of clitics produced (as in Table 3), the L2 learners showed similar performance to the sequential bilinguals who acquired Spanish in a naturalistic setting. This suggests similarity between the L2 learners and the sequential bilinguals in relation to their sensitivity to object clitic use, and that there is the possibility for classroom instruction to compensate for late exposure and non-naturalistic exposure. Nevertheless, the L2 learners produced more DPs compared to the sequential bilinguals, as shown in Table 3 and in the individual analysis. This suggests an avoidance strategy by the classroom learners.

Hypothesis 2 predicted that the bilinguals with later or limited exposure to Spanish would have a simpler clitic system regarding gender features. Although this hypothesis was not supported by the statistical analysis due to relatively low numbers of instances of non-canonical clitics, an individual analysis showed similarity between sequential bilinguals and L2 learners regarding non-canonical clitic use. This was manifested in the similar patterns of non-canonical clitic use (i.e., using le instead of lo/la, and using lo instead of la), and the higher number of non-canonical forms compared to the simultaneous bilinguals. This supports previous studies showing difficulties with gender specification in Spanish clitics (e.g., Klee 1989; Arche & Dominguez 2011; Rossi et al. 2014). Interestingly, a higher percentage of the sequential bilinguals (78%) showed non-canonical forms, compared to the L2 learners (53%). This suggests a positive effect of classroom instruction in mitigating late and limited exposure to the language. One thing to note is that more classroom learners preferred the form le with no gender specification and the sequential bilinguals showed both lo and le forms. This suggests that the inventory of clitic forms for the sequential bilinguals is richer than that of the L2 learners. One may argue that the non-canonical clitics of the sequential bilinguals could also stem from variable input, since the contact varieties of Spanish in Lima show gender simplification. However, if the immigrants in Lima were influenced by exposure to contact varieties, the simultaneous bilinguals would have also shown simplification of gender marking in their clitic system, which was not the case. Therefore, it might be that while the L2 learners showed an avoidance of gender marking, the sequential bilinguals exhibited underspecification of gender features. 
Due to the relatively low number of instances, future research is needed to determine the exact reason for group differences.

In general, the lack of gender specification by the sequential bilinguals and the L2 learners is consistent with previous research with learners whose L1 is in contact with Spanish (Mayer & Sánchez 2017). As pointed out by a reviewer, gender simplification might stem from difficulties with grammatical gender and not resemble the case of gender simplification in Andean Spanish. Previous research has revealed that grammatical gender errors persist even for advanced L2 speakers (Rossi et al. 2014). Thus, the results from the sequential bilinguals and the L2 learners suggest that difficulties with gender marking in clitic use persist in both instructional settings and language contact environments. On the other hand, the data also suggest that earlier exposure to Spanish favors the development of the gender marking system, as evidenced in the different patterns of non-canonical clitic use across the groups.

Hypothesis 3 predicted that the bilinguals with later or limited exposure to Spanish would have difficulty with the production of clitic clusters. This hypothesis was supported by the data. The results suggest that producing accusative clitics together with se seems to be the most demanding task. The results showed difficulty among the L2 learners with more complex structures, supporting previous work (e.g., Montrul 2010 for clitic doubling; Sánchez & Al-Kasey 1999 and Cuza et al. 2013 for CLLD). Specifically, the results showed an advantage of earlier exposure. This was seen from the number of instances and the variability in clitic clusters in terms of word order and verbs associated. Regarding the number of instances of clitic clusters, only the simultaneous and the sequential bilinguals showed production of clitic clusters. Fewer sequential bilinguals produced clitic clusters (3/10) in comparison to the simultaneous bilinguals (8/11), which suggests a role for early exposure in the production of complex clitic structures.

Regarding the variability in clitic clusters, the simultaneous bilinguals produced clitic clusters for dative se, reflexive se and aspectual se with both the word orders se-ACC.-verb and verb-se-ACC., showing a command of the complex structure involving two clitics. In contrast, only one of the three sequential bilinguals (C3) showed a se-ACC.-verb word order. Furthermore, two of the three instances of non-canonical clitics produced by the simultaneous bilinguals were found in clitic clusters. This suggests the difficulty of coupling the syntax and the morphological representation of the gender features at the same time in such structures.

Note that the word order in clitic clusters is different from Chinese, a language that shows SVO order in simple sentences. Word order in clitic clusters in Spanish is largely determined by the complexity of feature specification (Heap 2005). The difficulty shown by the sequential bilinguals and the L2 learners might be due to the underspecification of the case or gender features encoded in the Spanish clitics. Additionally, structures involving two 3rd person pronouns (e.g., I gave it to her) are scarce in Chinese. This is partially due to the pronominal system, in which the 3rd person pronouns (i.e., ta ‘she, he, her, him, it’) are not distinguished by case, animacy, or gender at the phonological level, and partially due to the topic-drop property of the language (Li & Thompson 1976; Huang 1982). Hence, producing complex structures such as clitic clusters might be more demanding for native speakers of Chinese. Given the characteristics of Chinese, postverbal clusters may pose fewer difficulties for the participants as they are less complex. The results of the present study are consistent with previous studies regarding the effect of structural complexity (e.g., Jakubowicz & Strik 2008; Cuza 2013).8

As discussed above, the acquisition of Spanish among Chinese-speaking learners seems to be affected by crosslinguistic influence leading to convergence and simplification (e.g., Clements 2009; Cuza et al. 2013). L2 learners appear to fluctuate in the bilingual continuum depending on their specific learning conditions, input quantity and quality, and their age of onset of bilingualism. Sequential bilinguals often show difficulties with clitic use and distribution even after years of exposure to the language in an immersion context. Crosslinguistic influence might be even more pronounced for classroom L2 learners in China due to less exposure and use of Spanish outside the classroom. This was observed in the results of this study, as the L2 learners relied more on DPs in object realization, and they avoided using more demanding structures like clitic clusters.

In conclusion, the results from this study suggest similarity between sequential Chinese/Spanish bilinguals in a naturalistic setting and classroom L2 learners regarding clitic use. Classroom instruction seems to compensate for lack of naturalistic exposure and age of acquisition effects among the L2 learners. The L2 learners had lower proficiency scores compared to the sequential bilinguals, and they had never been exposed to Spanish in an immersion context. However, they behaved similarly to the sequential bilinguals in Lima. This suggests a positive role for instruction in developing sensitivity to clitic use and gender features. However, given the relatively short instruction time (2 years), the effect of classroom instruction was not observed in the development of target-like gender features or in the target production of clitic clusters. In contrast, earlier age of onset showed an effect on the acquisition of the gender feature of the clitics and the use of clitic clusters. From the perspective of coherence of discourse, the narratives of both the sequential bilinguals and the L2 learners are qualitatively different from those of the simultaneous bilinguals in that they included fewer anaphoric expressions in direct object positions. This was specifically the case among the L2 learners who avoided clitic clusters where the referents of the two clitics had to be anchored in the previous discourse context. Some of the sequential bilinguals appeared to be able to express the two objects with clitics in a cluster, but they differed from the simultaneous bilinguals regarding the complexity of the clitic system. Future research would benefit from a direct comparison between adult L2 learners from naturalistic versus classroom settings with similar levels of L2 proficiency to examine the effect of the learning environment.


1 = First Person

2 = Second Person

3 = Third Person

acc = Accusative

art = Article

cl = Clitic

dat = Dative

dom = Differential Object Marking

f = Feminine

m = Masculine

perf = Perfective Particle

pl = Plural

poss = Possessive

pst = Past Tense

q = Question Particle

refl = Reflexive

s = Singular


  1. We use the term Chinese in general to refer to both Mandarin and Cantonese. The structure under examination behaves the same in both languages.
  2. Chinese does not exhibit grammatical number in general. However, the nominal suffix ‘-men’ can be added to plural definite animate nouns.
  3. Number specification on the clitics was not analyzed given the lack of enough items in semi-spontaneous production to control for singular versus plural forms.
  4. The Roman numerals correspond to the person feature of the clitics.
  5. The educated norm of Limeño Spanish instantiates gender marking on the clitic. Furthermore, the majority of the bilinguals living in Lima had college education or completed high school. Hence, a distinction of lo and la is expected following previous studies (Klee & Caravedo 2005; Mayer & Sánchez 2017).
  6. The varieties of Chinese spoken by the participants from Hong Kong, Taiwan, and Mainland China do not differ in the structures under examination. They do not exhibit Spanish-type accusative clitics or gender specification.
  7. The number of potentially anaphoric direct object realizations was the sum of clitics, null objects, and DPs.
  8. However, as pointed out by a reviewer, the avoidance of producing clitic clusters might be due to the semi-spontaneous nature of the task and cannot be interpreted as the absence of the structure in their grammar. Further research with designed experimental tests is needed to explore this issue.

