1 Introduction

Does Modern Standard Arabic (MSA) have obligatory control predicates? MSA has several embedded clause constructions, some of which resemble control in English (and other languages). However, these constructions exhibit some notable differences. Chief among them is the fact that the embedded verb carries agreement features that can indicate the agreement properties of its understood subject.

We distinguish in this paper between obligatory control (OC) and no control (NC). OC generally refers to a situation whereby an unexpressed subject of a complement clause is obligatorily identified with (or controlled by) a matrix argument (subject or object). In some languages this relation also applies to a reversed situation, where the unexpressed subject is in the matrix clause, and its obligatory controller is the subject of the complement clause. Both cases, referred to in the literature as “forward control” and “backward control”, respectively, are considered here as OC. NC, on the other hand, refers to a situation whereby the reference of an unexpressed subject is not dependent on the reference of another argument.

The first goal of this paper is to investigate whether all verbs in these MSA constructions are NC predicates, which allow for both coreference and disjoint reference, or whether there are OC predicates which enforce coreference between the subject of the embedded clause and a matrix argument. In order to determine whether OC predicates exist in the language we conducted a thorough corpus-based search of such constructions. This empirical investigation was informed by previous insights regarding the distinction between OC and NC predicates in MSA and in Modern Greek and by more general typological predictions. Surprisingly, we found no evidence of OC predicates in MSA; our findings contradict accepted generalizations (and predictions) proposed by state-of-the-art theories of control.

The second goal of the paper is to propose a formal analysis of the control-like MSA constructions. Under the assumption that there is no OC in MSA, a straightforward account is to propose one structure for all cases, namely, a no-control structure. There is, however, one pattern which challenges this account: the backward pattern, where the matrix clause lacks an overt subject and an overt subject is found in the embedded clause. We find that an agreement alternation exhibited by this pattern correlates with the OC/NC distinction. From a theoretical perspective, the existence of backward control in MSA throws light on the relation between control, restructuring and raising, in this language as well as across languages.

The structure of the paper is as follows. We begin Section 2 by briefly reviewing some basic properties of MSA that are relevant to the current study and proceed to discuss in more depth ʔan clauses, which are the ones that resemble control constructions. In Section 3 we review previous proposals that aim to distinguish between OC and NC predicates. In Section 4 we present corpus findings suggesting that predicates which are typically OC predicates in other languages are NC predicates in MSA. Section 5 begins by discussing different aspects related to a formal analysis of MSA ʔan clauses and continues with crucial corpus findings which reveal the syntactic reflexes of the OC/NC distinction in the backward pattern. We subsequently discuss the implications of this construction for the theoretical debate regarding control phenomena, examining how it sheds light on different analyses of control (PRO-based and Movement), obviation and restructuring. Finally, our analysis is presented in Section 6, where we identify predicates that license backward control as restructuring predicates and explain how this characterization, when incorporated into an analysis inspired by the movement analysis of control, can explain the complex MSA data.

2 Background

Modern Standard Arabic (MSA) is the shared language of the Arab world, but it is a written language, which is spoken only in formal scripted contexts and learned as a second language in school. The first language of MSA speakers is a regional dialect, which is spoken but rarely written. This makes linguistic analysis of MSA challenging: it is hard (though not impossible) to obtain “native” speaker judgments; and corpus-based approaches, our main methodology herein, can be hampered by the influences of authors’ native dialects on their MSA production. Still, given the large amount of data available for MSA, and the fact that many speakers are highly competent in the language, corpus analysis augmented by near-native judgments provides a solid framework for in-depth investigation.

2.1 Word order and agreement in Modern Standard Arabic

MSA is a pro-drop language whose unmarked word order is verb–subject–object (VSO), yet subject–verb–object (SVO) order is also available. While the two word orders are possible, each is associated with a different agreement pattern. Post-verbal subjects trigger partial agreement (PA) on the verb, which only involves gender and person, while the number feature of VSO verbs is invariably singular (1a). Conversely, pre-verbal subjects trigger full agreement (FA) on the verb (1b).1

    1. (1)
    1. a.
    1. katabat
    2. wrote.3SF
    1. tʕ-tʕaalibaat-u
    2. the-students.PF-NOM
    1. maqaal-an.
    2. article-ACC
    1.  
    1. b.
    1. ʔatʕ-tʕaalibaat-u
    2. the-students.PF-NOM
    1. katabna
    2. wrote.3PF
    1. maqaal-an.
    2. article-ACC
    1. ‘The female students wrote an article.’

When pronominal subjects are dropped the verb exhibits full agreement.

    1. (2)
    1. katabat
    2. wrote.3SF
    1. maqaal-an.
    2. article-ACC
    1. ‘She wrote an article.’ (Not: ‘They wrote an article.’)

The FA/PA distinction, which is determined by the position of the subject relative to the verb, is only discernable with plural human subjects (as in (1)). Plural inanimate subjects always trigger singular–feminine agreement (3).

    1. (3)
    1. ʔal-kutub-u
    2. the-books-NOM
    1. l-qadiimat-u
    2. the-old.SF-NOM
    1. suriqat.
    2. were.stolen.3SF
    1. ‘The old books were stolen.’
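
The agreement patterns described in this section can be restated procedurally. The following Python sketch is only an illustration of the generalizations in (1)–(3); the class `Subject`, the function `verb_agreement` and the position labels are hypothetical names introduced here, and the sketch deliberately ignores the dual and any other complications not discussed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    person: int              # 1, 2 or 3
    number: str              # "S" (singular) or "P" (plural); the dual is ignored here
    gender: str              # "M" or "F"
    inanimate: bool = False  # True for subjects such as 'the books' in (3)


def verb_agreement(subject: Subject, position: str) -> str:
    """Return an agreement tag such as '3SF' for the verb.

    position is one of 'postverbal' (VSO), 'preverbal' (SVO) or
    'dropped' (pro-drop); these labels are assumptions of this sketch.
    """
    if position not in ("postverbal", "preverbal", "dropped"):
        raise ValueError(f"unknown subject position: {position}")
    # Plural inanimate subjects always trigger singular-feminine agreement (3).
    if subject.number == "P" and subject.inanimate:
        return f"{subject.person}SF"
    # Post-verbal subjects trigger partial agreement: number is singular (1a).
    if position == "postverbal":
        return f"{subject.person}S{subject.gender}"
    # Pre-verbal and dropped subjects trigger full agreement (1b), (2).
    return f"{subject.person}{subject.number}{subject.gender}"


students = Subject(person=3, number="P", gender="F")
print(verb_agreement(students, "postverbal"))  # 3SF, as in (1a)
print(verb_agreement(students, "preverbal"))   # 3PF, as in (1b)
```

Under these assumptions, the FA/PA contrast surfaces only for plural, non-inanimate subjects, which is the restriction noted above.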

2.2 Complement clauses

MSA has two types of embedded clauses, introduced by two principal particles: ʔan and ʔanna. Consider (4a) and (4b).2

    1. (4)
    1. a.
    1. ħaawala
    2. tried.3SM
    1. muħammad-un
    2. Muhammad-NOM(M)
    1. [ʔan
    2. AN
    1. yaktuba
    2. write.3SM.SBJ
    1. maqaal-an].
    2. article-ACC
    1. ‘Muhammad tried to write an article.’
    1.  
    1. b.
    1. ballaɣa-nii
    2. informed.3SM-to me
    1. muħammad-un
    2. Muhammad-NOM(M)
    1. [ʔanna
    2. that
    1. l-baaħiθ-a
    2. the-researcher-ACC
    1. sa-yaktubu
    2. will-write.3SM.IND
    1. maqaal-an].
    2. article-ACC
    1. ‘Muhammad informed me that the researcher would write an article.’

The two types of embedded clauses differ in three main respects.3 First, different types of predicates typically select for each clause type. ʔan clauses are embedded as complements of predicates that express some attitude towards the event described in the embedded clause. These clauses are the focus of this paper. ʔanna clauses, on the other hand, are embedded as complements of cognition/perception predicates (e.g., raʔaa ‘see’) or utterance predicates (e.g., sʕarraħa ‘declare’), and describe factual events. Second, the two types of clauses demonstrate different word orders; while in ʔan clauses the verb appears clause-initially, in ʔanna clauses the subject appears clause-initially and bears accusative Case.4 Third, in ʔan clauses the verb appears in the subjunctive mood, while in ʔanna clauses the verb appears in the indicative mood (perfect or imperfect).5

2.3 ʔan clauses

ʔan clauses are selected as complement clauses by a particular set of verbs.6 Alternatively, the same verbs can take as complements NPs headed by the verbal-noun (masʕdar) counterpart of the subjunctive verb (e.g., (4a) and (5)). For this reason, when ʔan introduces clauses that can be replaced by verbal-noun phrases, it is called ʔan ʔal-masʕdariyya ‘the verbal noun ʔan’. We will focus in this paper on verbal ʔan clauses.

    1. (5)
    1. ħaawala
    2. tried.3SM
    1. muħammad-un
    2. Muhammad-NOM(M)
    1. kitaabat-a
    2. writing-ACC
    1. l-maqaal-i.
    2. article-GEN
    1. ‘Muhammad tried writing an article.’

Typically, ʔan clauses appear with no overt subject, yet their unexpressed subject is construed as an argument of the matrix verb. These cases are similar to familiar control constructions in English (and other languages), yet unlike English, the agreement marking on the subjunctive verb reveals the agreement properties of the intended subject. For example, in (6a) the subjunctive verb yaktuba ‘write’ agrees with the matrix subject, Muhammad, which is its understood subject. On the other hand, in (6b), the subjunctive verb taktuba ‘write’ agrees with Hind, the feminine complement of the matrix verb ʔaqnaʕa ‘convince’, similarly to an object control construction.

    1. (6)
    1. a.
    1. ħaawala
    2. tried.3SM
    1. muħammad-un
    2. Muhammad-NOM(M)
    1. [ʔan
    2. AN
    1. yaktuba
    2. write.3SM.SBJ
    1. maqaal-an].
    2. article-ACC
    1. ‘Muhammad tried to write an article.’
    1.  
    1. b.
    1. ʔaqnaʕa
    2. convinced.3SM
    1. muħammad-un
    2. Muhammad-NOM(M)
    1. hind-an
    2. Hind-ACC(F)
    1. [ʔan
    2. AN
    1. taktuba
    2. write.3SF.SBJ
    1. maqaal-an].
    2. article-ACC
    1. ‘Muhammad convinced Hind to write an article.’

There are strict adjacency conditions with respect to the linear position of ʔan and the subjunctive verb. The only element which can intervene is negation ((7b) & (7c)). Note that in sentence (7c) the ʔan and negation are contracted into one orthographic word.

    1. (7)
    1. a.
    1. *ħaawala
    2.   tried.3SM
    1. muħammad-un
    2. Muhammad-NOM(M)
    1. [ʔan
    2. AN
    1. l-yawm-a
    2. today-ACC
    1. yaktuba
    2. write.3SM.SBJ
    1. maqaal-an].
    2. article-ACC
    1.  
    1. b.
    1.   ħaawala
    2.   tried.3SM
    1. muħammad-un
    2. Muhammad-NOM(M)
    1. [ʔan
    2. AN
    1. laa
    2. not
    1. yaktuba
    2. write.3SM.SBJ
    1. maqaal-an].
    2. article-ACC
    1.  
    1. c.
    1.   ħaawala
    2.   tried.3SM
    1. muħammad-un
    2. Muhammad-NOM(M)
    1. [ʔal-laa
    2. AN-not
    1. yaktuba
    2. write.3SM.SBJ
    1. maqaal-an].
    2. article-ACC
    1. ‘Muhammad tried not to write an article.’

Finally, when ʔan clauses are coordinated, ʔan can either take scope over the coordination (8a) or precede each subjunctive verb separately (8b).

    1. (8)
    1. a.
    1. ħaawala
    2. tried.3SM
    1. muħammad-un
    2. Muhammad-NOM(M)
    1. [ʔan
    2. AN
    1. yaktuba
    2. write.3SM.SBJ
    1. maqaal-an
    2. article-ACC
    1. wa-yasiira].
    2. and-go.3SM.SBJ
    1.  
    1. b.
    1. ħaawala
    2. tried.3SM
    1. muħammad-un
    2. Muhammad-NOM(M)
    1. [ʔan
    2. AN
    1. yaktuba
    2. write.3SM.SBJ
    1. maqaal-an
    2. article-ACC
    1. wa-ʔan
    2. and-AN
    1. yasiira].
    2. go.3SM.SBJ
    1. ‘Muhammad tried to write an article and go.’

2.3.1 Agreement

In the aforementioned examples the understood subject of the embedded subjunctive verb is construed as either the matrix subject (6a) or the matrix object (6b), and consequently they receive a “control” interpretation, parallel to the interpretation we find in English control clauses containing an infinitive form. Nevertheless, the understood subject of ʔan clauses does not necessarily corefer with a matrix argument, and consequently, the subjunctive verb may not exhibit agreement with the matrix subject (and, in turn, with the matrix verb). Consider (9).

    1. (9)
    1. ħaawala
    2. tried.3SM
    1. muħammad-un
    2. Muhammad-NOM(M)
    1. [ʔan
    2. AN
    1. taktuba
    2. write.3SF.SBJ
    1. maqaal-an].
    2. article-ACC
    1. ‘Muhammad tried that she would write an article.’

The matrix verb bears third-person-singular-masculine (3SM) agreement, while the embedded subjunctive verb bears third-person-singular-feminine (3SF) agreement. Therefore, coreference between the subject of the embedded clause and the matrix subject is impossible. The understood subject of the complement clause is not mentioned in the sentence, yet it is salient in the discourse.

Moreover, in cases in which the embedded verb and the matrix verb bear the same agreement features, coreference is not obligatory. Thus, (6a), repeated here as (10a), and (10b) are actually ambiguous; the understood subject can be the matrix subject or someone else.

    1. (10)
    1. a.
    1. ħaawala
    2. tried.3SM
    1. muħammad-unᵢ
    2. Muhammad-NOM(M)
    1. [ʔan
    2. AN
    1. yaktubaᵢ/ⱼ
    2. write.3SM.SBJ
    1. maqaal-an].
    2. article-ACC
    1. ‘Muhammadᵢ tried that heᵢ/ⱼ would write an article.’
    1.  
    1. b.
    1. ħaawalaᵢ
    2. tried.3SM
    1. [ʔan
    2. AN
    1. yaktubaᵢ/ⱼ
    2. write.3SM.SBJ
    1. maqaal-an].
    2. article-ACC
    1. ‘Heᵢ tried that heᵢ/ⱼ would write an article.’

2.3.2 Embedded subject

MSA ʔan clauses diverge even further from English infinitival control clauses in that they can contain overt subjects. Consider, for example, sentence (11), which is similar to (9), yet its embedded subject is expressed.

    1. (11)
    1. ħaawala
    2. tried.3SM
    1. muħammad-un
    2. Muhammad-NOM(M)
    1. [ʔan
    2. AN
    1. taktuba
    2. write.3SF.SBJ
    1. hind-un
    2. Hind-NOM(F)
    1. maqaal-an].
    2. article-ACC
    1. ‘Muhammad tried that Hind would write an article.’

With two overt subjects with distinct agreement properties, there is obviously no coreference between the embedded subject and the matrix subject.

An additional configuration, which we will refer to as a backward pattern, is one where only the embedded subject is overt (12). In this case, similarly to (10a), its forward pattern counterpart, when the subjunctive verb and the matrix verb agree the sentence is ambiguous (12a); coreference of the two subjects is possible, but not obligatory. In (12b), where there is no agreement between the two predicates, there must be two distinct referents.

    1. (12)
    1. a.
    1. ħaawalaᵢ/ⱼ
    2. tried.3SM
    1. [ʔan
    2. AN
    1. yaktuba
    2. write.3SM.SBJ
    1. muħammad-unᵢ
    2. Muhammad-NOM(M)
    1. maqaal-an].
    2. article-ACC
    1. ‘Muhammadᵢ tried that heᵢ would write an article.’
    2. ‘Heⱼ tried that Muhammadᵢ would write an article.’
    1.  
    1. b.
    1. ħaawalatⱼ
    2. tried.3SF
    1. [ʔan
    2. AN
    1. yaktuba
    2. write.3SM.SBJ
    1. muħammad-unᵢ
    2. Muhammad-NOM(M)
    1. maqaal-an].
    2. article-ACC
    1. ‘She tried that Muhammad would write an article.’

It is, however, impossible to have identical R-expressions as coreferring subjects in both positions (13).7

    1. (13)
    1. *ħaawala
    2.   tried.3SM
    1. muħammad-unᵢ
    2. Muhammad-NOM(M)
    1. [ʔan
    2. AN
    1. yaktuba
    2. write.3SM.SBJ
    1. muħammad-unᵢ
    2. Muhammad-NOM(M)
    1. maqaal-an].
    2. article-ACC
    1. Intended: ‘Muhammadᵢ tried that Muhammadᵢ would write an article.’

One additional construction in which ʔan clauses appear is the impersonal construction illustrated in (14). The 3SM agreement on the matrix predicate is default agreement.

    1. (14)
    1. yaʒibu
    2. have-to.3SM
    1. [ʔan
    2. AN
    1. taktuba
    2. write.3SF.SBJ
    1. hind-un
    2. Hind-NOM(F)
    1. maqaal-an].
    2. article-ACC
    1. ‘It is necessary that Hind would write an article.’

In sum, subjunctive ʔan clauses in MSA differ from infinitival control clauses in English and other languages in four principal respects: (i) MSA ʔan clauses contain a finite subjunctive verb form; (ii) the subjunctive bears agreement features; (iii) the subject of the ʔan clause does not necessarily corefer with an argument of the matrix predicate; (iv) the ʔan clause can involve an overt embedded subject. Note that (iii) and (iv) are independent of each other; there can be an embedded subject in the ʔan clause or not, and this subject can corefer with the matrix subject or not. Nevertheless, there cannot be two identical coreferring R-expression subjects.
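
The interaction between agreement, overt subjects and reference summarized here can be sketched as a small decision procedure. The Python function below is a hypothetical illustration, not part of the analysis: the name `readings` and its parameters are introduced only for this example, agreement is compared as a whole tag (so the partial agreement of post-verbal plural subjects is not modelled), and overt subjects are assumed to be R-expressions such as proper names.

```python
from typing import Optional, Set


def readings(matrix_agr: str,
             embedded_agr: str,
             matrix_subject: Optional[str] = None,
             embedded_subject: Optional[str] = None) -> Set[str]:
    """Return the reference relations available between the two subjects.

    Agreement tags are strings such as '3SM'. Overt subjects are given
    as names; None stands for an unexpressed subject.
    """
    # Mismatching agreement excludes coreference, as in (9), (11), (12b).
    if matrix_agr != embedded_agr:
        return {"disjoint"}
    # Two overt R-expression subjects cannot corefer, even if identical (13).
    if matrix_subject is not None and embedded_subject is not None:
        return {"disjoint"}
    # Otherwise the sentence is ambiguous, as in (10a), (10b), (12a).
    return {"coreference", "disjoint"}


print(readings("3SM", "3SF", matrix_subject="Muhammad"))    # pattern of (9)
print(readings("3SM", "3SM", matrix_subject="Muhammad"))    # pattern of (10a)
print(readings("3SM", "3SM", embedded_subject="Muhammad"))  # pattern of (12a)
```

On this sketch, matching agreement is a necessary but not a sufficient condition for coreference, which is the property that sets ħaawala ‘try’ apart from an obligatory control predicate.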

3 Distinguishing between obligatory control and no-control predicates

In the previous section we presented the different patterns in which the verb ħaawala ‘try’ can appear, and noted the similarities and differences between the MSA construction and the English infinitival control construction. Most significantly, we saw that unlike control predicates, the verb does not impose any restrictions on the reference of the subject of its complement clause. This raises the question whether this is characteristic of all ʔan-clause-taking predicates or whether some predicates are more restrictive in that they enforce coreference between a matrix argument and an embedded subject.

We begin by reviewing studies that address the distinction between OC and NC predicates, specifically in MSA and Modern Greek, and more generally, from a typological perspective. Building on these studies we form predictions regarding the types of predicates that potentially require coreference. Subsequently, in Section 4, we present empirical evidence, collected by corpus searches, which puts these predictions to the test.

3.1 Obligatory control and no control in MSA

Examples similar to the introductory examples in (6)–(12) are found in reference grammars of MSA (Cantarino 1976; Badawi et al. 2004; Ryding 2005). Yet in none of these sources do the authors explicitly distinguish between OC and NC predicates. Nevertheless, this question is addressed from a functionalist perspective by Persson (2002) and from a generative linguistics perspective in a study by Habib (2009).

In a corpus-based study of sentential complements in MSA, Persson (2002) investigates the correlation between the semantics of the selecting predicate and the type of reference relation that holds between the subject of the ʔan clause and a matrix argument (subject or object). She distinguishes between ʔan clauses which she describes as clauses in which the embedded subject is deleted under coreference, and ʔan clauses with an overt embedded subject that does not share its reference with a matrix argument. The selecting predicates are semantically classified as manipulative, cognitive, or utterance predicates. Modality predicates are excluded from this study, since Persson assumes that they obligatorily require the complement clause subject to be coreferent with the matrix subject.8

Table 1 presents the distribution of the ʔan clauses in Persson’s corpus by predicate type and reference relation (control and no control). Manipulative predicates which select ʔan clauses prefer to appear with clauses whose unexpressed subject is coreferential with a matrix clause argument (90%) but may also appear with ʔan clauses in which the subject is independent. On the other hand, cognitive predicates (mainly desiderative, but also commentative and fearing predicates) show a strong tendency to occur with non-coreference ʔan clauses. Utterance predicates rarely appear with ʔan clauses, selecting ʔanna clause complements instead (cf. (4b)).

Table 1

Distribution of semantic types & ʔan-clause types.

                                        Control ʔan   No control ʔan   Total
Manipulative (force, allow, signal,…)           120               14     134
Cognitive (see, hear, forget,…)                   2               30      32
Utterance (say, claim,…)                          0                1       1
Total                                           122               45     167
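
The proportions cited in the discussion of Table 1 can be recomputed from the raw counts. The following Python snippet is a simple arithmetic check; the counts are copied from Table 1, while the variable and function names are hypothetical and introduced only for this illustration.

```python
# Counts from Table 1: (control ʔan clauses, no-control ʔan clauses)
table1 = {
    "Manipulative": (120, 14),
    "Cognitive": (2, 30),
    "Utterance": (0, 1),
}


def control_share(control: int, no_control: int) -> float:
    """Proportion of control ʔan clauses among all ʔan clauses of a type."""
    return control / (control + no_control)


# Column totals, which should match the 'Total' row of Table 1.
total_control = sum(c for c, _ in table1.values())
total_no_control = sum(n for _, n in table1.values())

for name, (control, no_control) in table1.items():
    share = 100 * control_share(control, no_control)
    print(f"{name}: {share:.1f}% control")

print(total_control, total_no_control, total_control + total_no_control)
```

The share for manipulative predicates is 120/134, which rounds to the 90% reported above.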

A different hypothesis with regard to control restrictions in MSA is made by Habib (2009), who claims that there is no obligatory control with ʔan clauses. As examples she gives sentences which are similar to (10a) (ambiguous with matrix subject), (12a) (ambiguous with embedded subject), and (11) (different matrix and embedded subjects). Consequently, she assumes that there are no OC predicates in MSA.

3.2 Control and no control in Modern Greek

Modern Greek (MG) is a language that shares a number of syntactic properties with MSA. Like MSA, MG is a pro-drop language. Moreover, although the unmarked word order in MG is SVO, it also allows for post-verbal subjects. The subjunctive na-clauses found in MG are remarkably similar to MSA ʔan clauses. There are two different types of clausal complements in MG: oti clauses (15a) and na clauses (15b).

    1. (15)
    1. Modern Greek
    1.  
    1. a.
    1. O
    2. the
    1. Yannis
    2. Yannis.NOM.S
    1. pistevi
    2. believes.3S
    1. [oti
    2. that
    1. to
    2. the
    1. spiti
    2. house.NOM.S
    1. ine/itan
    2. is/was.3S
    1. oreo].
    2. beautiful
    1. ‘Yannis believes that the house is/was beautiful.’
    1.  
    1. b.
    1. O
    2. the
    1. Kostas
    2. Kostas
    1. matheni
    2. learn.3S
    1. [na
    2. PRT
    1. odhiji].
    2. drive.3S
    1. ‘Kostas is learning (how) to drive.’

The distinctions between the two types of complement clauses are reminiscent of those between ʔan and ʔanna clauses in MSA. The mood of oti-complements is always indicative, and their tense is variable. Na-complements, on the other hand, have subjunctive mood and invariable present tense. Furthermore, oti can be separated from the verb of its clause by different elements, whereas na must be adjacent to the verb it introduces, with only a negative element possibly intervening.9

Roussou (2009) distinguishes between the verb types which select for each complement clause. Following is the set of predicate types which take na complements. Of those, the first three types can only take na-complements, whereas the rest can also appear with other types of complement clauses (e.g., oti clauses).

  1. Modals (must, can, may,…)
  2. Aspectuals (start, stop,…)
  3. Volitionals (want, desire,…)
  4. Perception (see, hear,…)
  5. Mental perception (remember, forget,…)
  6. Psych verbs (be pleased, be sorry,…)
  7. Epistemic predicates (believe, think,…)
  8. Verbs of saying (say, order,…)
  9. Verbs of knowing (know, learn,…)

Similarly to MSA, MG (and other Balkan languages) uses na-clauses in contexts where Romance and Germanic languages use the infinitive. Unlike the infinitive, the subjunctive predicate in a na-clause is a finite form, which fully agrees (in person and number) with its understood subject. Some predicates, such as matheno ‘learn’, select controlled subjunctives (C-subjunctives, or OC), which require the understood subject to be coreferential with the matrix subject (16a), while others such as thelo ‘want’ select free subjunctives (F-subjunctives, or NC) which are not controlled, and allow for both a coreferential interpretation and a non-coreferential one (16b).10

    1. (16)
    1. Modern Greek
    1.  
    1. a.
    1. O
    2. the
    1. Kostas
    2. Kostas
    1. matheni
    2. learn.3S
    1. [na
    2. PRT
    1. odhiji].
    2. drive.3S
    1. ‘Kostas is learning (how) to drive.’
    1.  
    1. b.
    1. O
    2. the
    1. Kostas
    2. Kostas
    1. theli
    2. want.3S
    1. [na
    2. PRT
    1. odhiji].
    2. drive.3S
    1. ‘Kostas wants (him) to drive.’

Roussou (2009) claims that of the predicate types listed above, modals and aspectuals enforce obligatory coreference of the subject of their na-complements. As for the volitionals, Roussou notes that the category is rather vague, and is interpreted differently by different researchers. She extends the category to encompass all “future-referring” predicates. Within this group, she finds that verbs like dare, be willing, and intend, as well as verbs of permission such as allow, encourage, and prevent, enforce coreference with a matrix argument. Roussou notes that the type of so-called volitionals that require coreference have an implicit root (dynamic) modal reading, which is associated with ability or permission (or lack thereof). Other volitionals such as want, try and manage strongly favor coreference, but also allow disjoint reference.

Roussou (2009: 1828) proposes with regard to Modern Greek that “there seems to be a continuum, which has aspectuals and then modals on the one end and volitionals (and epistemics) on the other. In between, we may find predicates like tolmo ‘dare’, prospatho ‘try’, which may be closer to one or the other end. This in-between zone can be further subject to individual speakers’ preferences, thus allowing for the existence of a ‘grey zone’, as far as control in MG is concerned”. Roussou’s proposed continuum is shown in Figure 1.

Figure 1 

The control continuum (Roussou 2009).

In comparing the class of control predicates in MG with that of English, Roussou notes that the presence of agreement marking on the embedded verb in MG, but not in English, accounts for the smaller number of control predicates in Greek.

3.3 Obligatory control and no-control predicates

The distinction between OC and NC is discussed by Landau in a series of papers (Landau 2000, 2004, et seq.). Landau distinguishes between OC, NC, and non-obligatory control (NOC). OC and NC occur in complement clauses, while NOC occurs in subject and adjunct clauses. Landau argues that OC and NOC clauses host a PRO subject, and NC clauses host a pro/DP subject. PRO in OC is interpreted as a bound variable, which is co-indexed with a co-dependent of the matrix clause. PRO in NOC is logophoric or topic-bound. He proposes that the key to distinguishing between OC and NC is found in two dimensions according to which complement clauses can be classified: semantic tense [T] and overt morphological agreement [Agr].

The tense specification of complement clauses depends on whether or not their tense is anaphoric to the tense of the matrix clause. Thus, when the complement clause is semantically tensed the matrix and embedded events can be temporally mismatched (17a), but when the complement clause is untensed they must match (17b).

(17) a.   Last night, Tom planned to help us today. → complement is [+T]
  b. *Last night, Tom managed to help us today.   → complement is [–T]

Based on this characterization, Landau categorizes the types of predicates which select tensed or untensed complement clauses.

(18) Predicates which select untensed [–T] complements
  a. Implicatives (dare, manage, remember,…)
  b. Aspectuals (start, stop,…)
  c. Modals (have, need, may,…)
  d. Evaluative adjectives (rude, silly,…)

(19) Predicates which select tensed [+T] complements
  a. Factives (glad, sad, like,…)
  b. Propositional (believe, think, claim,…)
  c. Desideratives (want, plan, prefer, hope,…)
  d. Interrogatives (wonder, ask, find out,…)

In a more recent paper, Landau (2015) argues that the [+/–T] classification stems from a deeper distinction between attitude and non-attitude contexts. The same classification remains, yet in his new theory, the predicates in the [+T] category are characterized instead as attitudinal, while those in [–T] as non-attitudinal. The temporal properties, he claims, are by-products of (non-)attitude contexts. Attitude domains are evaluated according to the epistemic state of the participant in the reported situation, and not to the actual world. Thus, the temporal mismatch in (17a) with the desiderative predicate plan is accounted for by the fact that the predicate is attitudinal: Tom can have an attitude about an event in a different time period. Conversely, the non-attitudinal implicative manage in (17b) is simultaneous with the embedded event and is therefore incompatible with temporally mismatched modifiers.11

The combination of the agreement [Agr] parameter and the semantic category of the predicate (regardless of whether it is stated in terms of tense [T] or attitude) produces four different options, which interact with control. According to Landau’s (2015) OC-NC Generalization, [+Agr] blocks control in attitude (formerly [+T]) complements, but not in non-attitude (formerly [–T]) complements (Table 2).

Table 2

The OC-NC Generalization (Landau 2015).

        +T/attitude   –T/non-attitude
+Agr    NC            OC
–Agr    OC            OC
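
The four cells of the OC-NC Generalization reduce to a single condition: control is blocked only when the complement both carries overt agreement and is an attitude complement. The Python sketch below restates the table as a function; the name `predicted_control` and its boolean parameters are hypothetical conveniences, not notation from Landau (2015).

```python
def predicted_control(agr: bool, attitude: bool) -> str:
    """Restate the OC-NC Generalization as a lookup.

    agr      -- True if the complement shows overt agreement ([+Agr])
    attitude -- True for attitude (formerly [+T]) complements
    """
    # [+Agr] blocks control in attitude complements only.
    return "NC" if (agr and attitude) else "OC"


for agr in (True, False):
    for attitude in (True, False):
        label = f"{'+' if agr else '-'}Agr, {'attitude' if attitude else 'non-attitude'}"
        print(label, "->", predicted_control(agr, attitude))
```

Since subjunctive verbs in ʔan clauses carry overt agreement, the relevant row for MSA is [+Agr], where the generalization predicts NC with attitude predicates and OC with non-attitude predicates.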

One implication of this generalization is that in languages such as MSA, where the complement clause exhibits overt morphological agreement, non-attitude predicates will enforce obligatory control. Indeed, as Landau (2013: 106) predicts: “There cannot be a language where modal, aspectual and implicative verbs or evaluative adjectives allow an uncontrolled complement subject”. We show in Section 4.1 that this prediction is not borne out by MSA corpus data.

3.4 Interim summary

The picture that emerges from the studies presented so far is that the distinction between OC and NC predicates is directly linked to their semantic properties. Persson (2002), Roussou (2009) and Landau (2000) all identify modality predicates as typically OC predicates. Roussou (2009) describes a continuum ranging from aspectuals and modals on the one side, to volitionals and epistemics on the other. In between she identifies a “grey zone”, which is subject to individual speakers’ preferences. Landau (2000), on the other hand, proposes a categorical bifurcation between two types of predicates, based on the semantic (in)dependence of the tense of their complement clause, as well as the manifestation of morphological agreement on its predicate. Based on these studies, we would predict that if there were OC predicates in MSA they would belong to the modals and aspectuals, and possibly the implicatives. These predictions notwithstanding, we should note that Habib (2009) denies the existence of obligatory (subject) control in MSA.

4 A corpus study of control in MSA

In order to determine whether OC predicates exist in MSA we conducted a corpus study, which focused on a number of predicates discussed by Landau (2000) and Roussou (2009). The corpus that we used to investigate the use of ʔan clauses in contemporary MSA is the 115-million token sample of arTenTen, a web-crawled Arabic corpus specifically created in order to serve as a useful tool for linguistic research (Arts et al. 2014). This sample has been tokenized, lemmatized and morphologically analyzed and disambiguated with MADA (Habash & Rambow 2005; Habash et al. 2009) and installed in the Sketch Engine (Kilgarriff et al. 2004). As mentioned above (Section 2), MSA is not a natively spoken language, and its users are all native speakers of a wide variety of dialects. Nevertheless, it has emerged as a lingua franca of the Arab world, and the corpus we use represents a well-defined language, especially when restricted to the journalistic register and domain, which is our focus in this work.

The morphological tagging of the corpus provides a way of defining queries which target particular person, number and gender features, as well as Case and mood. Consequently, we were able to retrieve instances where the matrix predicate and the embedded predicate match in their gender and person agreement, as well as those where there is a mismatch. Furthermore, we could control for the existence or lack of a possible subject (i.e., agreeing nominative noun) following the predicates. Nevertheless, the search results are not exhaustive. There are numerous instances of erroneous morphological tags, which contributed to false positive results as well as false negatives. Moreover, we decided to favor precision over recall, and limited the distance between the predicates. Consequently, instances with longer NP subjects or intervening adverbials were not retrieved. These limitations notwithstanding, in what follows we provide examples of coreference and disjoint reference for a representative set of predicates. Due to the non-exhaustive nature of the searches we do not present quantitative data with regard to the distribution of coreference and disjoint reference. We do, however, note whether we found dozens of similar examples or whether there were only several examples of a particular pattern.
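
The general logic of such a search can be illustrated with a toy example. The Python sketch below does not reproduce the actual queries, the Sketch Engine query language, or the MADA tag set; the token format (dictionaries with `lemma`, `pos`, `per`, `gen` and `mood` keys), the function `find_an_pairs` and the `max_gap` parameter are all assumptions made for this illustration. It also ignores the possibility of negation intervening between ʔan and the subjunctive verb.

```python
from typing import Dict, Iterator, List, Tuple

Token = Dict[str, str]  # hypothetical token format, not the real corpus schema


def find_an_pairs(tokens: List[Token],
                  matrix_lemma: str,
                  max_gap: int = 3) -> Iterator[Tuple[int, int, bool]]:
    """Yield (matrix index, embedded index, agreement match) triples.

    A hit is a matrix verb with the given lemma, followed by ʔan after
    at most max_gap intervening tokens, followed directly by a
    subjunctive verb. The match flag compares person and gender only.
    """
    for i, tok in enumerate(tokens):
        if tok.get("pos") != "verb" or tok.get("lemma") != matrix_lemma:
            continue
        # Limiting the window favors precision over recall.
        for k in range(i + 1, min(i + max_gap + 2, len(tokens) - 1)):
            if tokens[k].get("lemma") != "ʔan":
                continue
            emb = tokens[k + 1]
            if emb.get("pos") == "verb" and emb.get("mood") == "sbj":
                match = (tok.get("per"), tok.get("gen")) == (emb.get("per"), emb.get("gen"))
                yield i, k + 1, match
            break  # only the first ʔan after the matrix verb is considered


# Toy encoding of example (9): matrix 3SM, embedded 3SF.
sentence = [
    {"lemma": "ħaawala", "pos": "verb", "per": "3", "gen": "M"},
    {"lemma": "muħammad", "pos": "noun"},
    {"lemma": "ʔan", "pos": "part"},
    {"lemma": "kataba", "pos": "verb", "per": "3", "gen": "F", "mood": "sbj"},
    {"lemma": "maqaal", "pos": "noun"},
]
print(list(find_an_pairs(sentence, "ħaawala")))  # [(0, 3, False)]
```

A narrow window of this kind misses instances with longer NP subjects or intervening adverbials, which is the trade-off between precision and recall described above.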

4.1 Corpus findings

We conducted corpus searches using representatives from Roussou’s (2009) continuum and from Landau’s (2000) semantic categories. Listed in increasing order of their likelihood to enforce coreference, according to Roussou, the verbs are: the volitional verb ʔaraada ‘want’, the implicatives ħaawala ‘try’ and ʒaruʔa ‘dare’, the modal tamakkana ‘be able’, and the aspectual kaada ‘almost’. The corpus search revealed evidence for both coreference and disjoint reference with all these verbs, as well as with two manipulative predicates, ʔaqnaʕa ‘convince’ and samaħa ‘allow’. In what follows we present corpus-based examples of coreference and disjoint reference with each of the aforementioned predicates.

4.1.1 Volitionals

We start at the right end of Roussou’s continuum. Volitionals are predicted by Roussou (2009) and by Landau (2000) to be NC predicates. Consider the volitional ʔaraada ‘want’ in (20).

    1. (20)
    1. a.
    1. ʔaraada
    2. wanted.3SM
    1. [ʔan
    2. AN
    1. yaʕmala
    2. do.3SM.SBJ
    1. diraasat-an].
    2. study-ACC
    1. ‘He wanted to conduct a study.’
    1.  
    1. b.
    1. ʔaraada
    2. wanted.3SM
    1. [ʔan
    2. AN
    1. yakuuna
    2. be.3SM.SBJ
    1. r-radd-u
    2. the-reaction-NOM(M)
    1. watʕaniyy-an].
    2. national-ACC
    1. ‘He wanted the reaction to be national.’

In (20a) the subject of the embedded predicate corefers with the subject of the matrix predicate; the same person is both the “wanter” and the “conductor” of the study. In (20b), on the other hand, the embedded clause involves an overt subject, r-radd-u ‘the reaction’, whose reference is distinct from that of the matrix subject. Our corpus searches revealed dozens of examples of disjoint reference with the predicate ʔaraada ‘want’.

4.1.2 Implicatives

Moving left on Roussou’s continuum, we found dozens of examples of disjoint reference with the predicate ħaawala ‘try’, indicating that it is indeed an NC predicate. While the matrix and embedded verbs share a subject in (21a), in (21b) the matrix verb bears 3PM agreement whereas the embedded verb bears 3SF agreement and has an overt subject. Clearly, the two subjects do not share a reference.

    1. (21)
    1. a.
    1. ħaawala
    2. tried.3SM
    1. r-raʒul-u
    2. the-man-NOM
    1. [ʔan
    2. AN
    1. yatakallama
    2. speak.3SM.SBJ
    1. maʕa-na].
    2. with-us
    1. ‘The man tried to speak with us.’
    1.  
    1. b.
    1. liðaalika
    2. So
    1. ħaawaluu
    2. tried.3PM
    1. [ʔan
    2. AN
    1. tanhadʕa
    2. assume.3SF.SBJ
    1. l-ʒamaahiir-u
    2. the-public-NOM
    1. bi-masʔuuliyyat-i-ha].
    2. in-responsibility-GEN-her
    1. ‘So they tried to have the public assume its responsibility.’

The implicative ‘dare’ is closer to the left end of Roussou’s (2009) continuum and is classified in Landau’s (2000) categorization as selecting for untensed complement clauses. Thus, the prediction is that it will enforce coreference, or in other words, be an OC predicate. However, as (22b) shows, this is not the case. MSA ʒaruʔa ‘dare’ allows disjointness; the verb ‘be’ in (22b) has its own overt subject, ‘her opinion’, and does not match in agreement with the matrix predicate, ‘dare’. Admittedly, the disjoint reference example presented here is the only one we were able to find with this predicate. Note, however, that ʒaruʔa ‘dare’ in itself is an infrequent verb (12.93 per million instances), with substantially fewer attestations of it followed by an ʔan clause (1.36 per million).

    1. (22)
    1. a.
    1. laa
    2. not
    1. yaʒruʔu
    2. dare.3SM
    1. raʒul-un
    2. man-NOM
    1. [ʔan
    2. AN
    1. yaquula
    2. say.3SM.SBJ
    1. l-ħaqiiqat-a
    2. the-truth-ACC
    1. fi
    2. in
    1. l-zawaaʒ-i].
    2. the-marriage-GEN
    1. ‘No man dares to say the truth in the marriage.’
    1.  
    1. b.
    1. lan
    2. never
    1. taʒruʔa
    2. dare.3SF
    1. [ʔan
    2. AN
    1. yakuuna
    2. be.3SM.SBJ
    1. raʔy-u-haa
    2. opinion-NOM-her(M)
    1. ɣayr-a
    2. not-ACC
    1. musaanid-in
    2. supportive-GEN
    1. li-lmaɣrib-i].
    2. to-Morocco-GEN
    1. ‘She will never dare that her opinion would be non-supportive of Morocco.’
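To put the normalized frequencies quoted for ʒaruʔa ‘dare’ in perspective, they can be converted into approximate raw counts. The sketch below assumes that the per-million rates are normalized against the full 115-million-token sample, which the text does not state explicitly; the resulting counts are therefore rough estimates and not figures reported in the study. The function names are illustrative.

```python
CORPUS_TOKENS = 115_000_000  # size of the arTenTen sample named in the text


def per_million(hits: int, corpus_tokens: int = CORPUS_TOKENS) -> float:
    """Normalize a raw hit count to occurrences per million tokens."""
    return hits * 1_000_000 / corpus_tokens


def approx_hits(rate_per_million: float, corpus_tokens: int = CORPUS_TOKENS) -> int:
    """Invert the normalization; an estimate only (assumed denominator)."""
    return round(rate_per_million * corpus_tokens / 1_000_000)


# Rates quoted in the text for ʒaruʔa 'dare'.
print(approx_hits(12.93))  # all occurrences of the verb
print(approx_hits(1.36))   # occurrences followed by an ʔan clause
```

Under this assumption the verb is followed by an ʔan clause only on the order of 150 times in the whole sample, which gives a sense of how little data the single disjoint-reference example is drawn from.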

4.1.3 Manipulatives

Tri-valent manipulatives do not appear in Roussou’s continuum, yet Persson (2002) identifies them as the ones which generally impose a coreference restriction. Obtaining exhaustive results with predicates from this class was even more complex than obtaining them with “subject-control” predicates. However, here too we find evidence of both types of reference relations, with several instances of disjoint reference. (23b) is a disjoint reference example of the predicate ʔaqnaʕa ‘convince’, and (24b) is a similar example with the predicate samaħa ‘allow’.

    1. (23)
    1. a.
    1. wa-fi
    2. and-in
    1. l-masaaʔ-i
    2. the-evening-GEN
    1. kaanat
    2. was.3SF
    1. malaak
    2. Malak(F)
    1. qad
    2. already
    1. ʔaqnaʕat
    2. convinced.3SF
    1. waalid-a-haa
    2. father-ACC-her
    1. [ʔan
    2. AN
    1. yaʔmura
    2. order.3SM.SBJ
    1. saaʔiq-a-hu
    2. driver-ACC-his
    1. l-xaasʕsʕ-a
    2. the-private-ACC
    1. bi-ʔiisʕaal-i
    2. in-delivering
    1. buuʒaa
    2. Buja
    1. ʔila
    2. to
    1. qaryat-i-hi].
    2. village-GEN-his
    1. ‘And in the evening, Malak had already convinced her father to order his private driver to deliver Buja to his village.’
    1.  
    1. b.
    1. ʔaqnaʕnaa-hum
    2. convinced.1P-them
    1. [ʔan
    2. AN
    1. yuʕayyina
    2. appoint.3SM.SBJ
    1. huwa
    2. he.NOM
    1. l-ħukuumat-a].
    2. the-government-ACC
    1. ‘We convinced them that he would appoint the government.’
    1. (24)
    1. a.
    1. iðaa
    2. if
    1. lam
    2. not
    1. nasmaħu
    2. allow.1P
    1. li-l-ʔameriikaan-i
    2. to-the-Americans-GEN
    1. [ʔan
    2. AN
    1. yamurruu
    2. pass.3PM.SBJ
    1. min
    2. from
    1. ʔaraadʕii
    2. territory
    1. t-turkiyya]…
    2. the-Turkish
    1. ‘If we don’t allow the Americans to pass from Turkish territory…’
    1.  
    1. b.
    1. fa-mawqiʕ-u-hu
    2. and-status-NOM-his(M)
    1. l-ʔiʒtimaaʕiyy-u
    2. the-social-NOM
    1. laa
    2. not
    1. yasmaħu
    2. allow.3SM
    1. lahu
    2. to.him
    1. [ʔan
    2. AN
    1. yakuuna
    2. be.3SM.SBJ
    1. bnu-hu
    2. son.NOM-his
    1. fii
    2. in
    1. haaða
    2. this
    1. l-makaan-i].
    2. the-place-GEN
    1. ‘And his social status does not allow him that his son will be in this place.’

4.1.4 Modals

Modals like ‘can’ are close to the left (OC) end of Roussou’s (2009) continuum and are classified as selecting for untensed complement clauses by Landau (2000). The prediction is therefore that they would enforce coreference. This prediction, however, does not hold. We found some instances of the predicate tamakkana ‘be able’ in which the embedded subject does not corefer with the matrix subject. One such case is (25b), where the matrix subject is a pro-dropped first-person-plural subject while the embedded subject is the singular–masculine noun phrase tʕifl-un mustansax-un ‘cloned baby’.

    1. (25)
    1. a.
    1. wa-tamakkana
    2. and-was.able.3SM
    1. ʔabuu
    2. Abu
    1. bilaal
    2. Billal
    1. [ʔan
    2. AN
    1. yanðʕura
    2. see.3SM.SBJ
    1. min
    2. from
    1. fatħat-in
    2. opening-GEN
    1. dʕayyiqat-in
    2. narrow-GEN
    1. ʒidaan].
    2. very
    1. ‘And Abu Billal was able to see from a very narrow opening.’
    1.  
    1. b.
    1. ʔiða
    2. if
    1. kaθθafna
    2. intensify.1P
    1. ʒuhuud-a-na
    2. efforts-ACC-our
    1. fa-sa-natamakkana
    2. then-will-be.able.1P
    1. min
    2. from
    1. [ʔan
    2. AN
    1. yakuuna
    2. be.3SM.SBJ
    1. laday-na
    2. with-us
    1. tʕifl-un
    2. baby.3SM-NOM
    1. mustansax-un
    2. cloned.3SM-NOM
    1. xilaala
    2. within
    1. ʕaam
    2. year
    1. aw
    2. or
    1. ʕaamayni].
    2. two.years
    1. ‘If we intensify our efforts we will be able to have a cloned baby within a year or two years.’

4.1.5 Aspectuals

Aspectuals are another class of predicates that are predicted by Roussou (2009) and Landau (2000) to enforce coreference. Most aspectual verbs in MSA do not take an ʔan-clause complement. There are, however, two approximative aspectuals which do: ʔawʃaka ‘about to’ and kaada ‘almost’.12 (26a) is a coreference corpus example of kaada ‘almost’. Importantly, (26b) is one of several disjoint-reference corpus examples of kaada ‘almost’, which indicates that it is an NC predicate, contrary to the predictions of Roussou and Landau. Our corpus searches did not reveal any instances of disjoint reference with ʔawʃaka ‘about to’, so it might be the case that ʔawʃaka ‘about to’ is a raising predicate (see discussion in Section 5.4.2). Note, however, that the use of ʔawʃaka ‘about to’ is very infrequent (6.23 per million instances in general and 2.17 per million followed by an ʔan clause), and is less frequent than the use of kaada ‘almost’ (105.09 per million instances in general and 15.14 per million followed by an ʔan clause). Thus, it could be the case that ʔawʃaka ‘about to’ does allow disjoint reference but this use is extremely rare.

    1. (26)
    1. a.
    1. kaadat
    2. almost.3SF
    1. [ʔan
    2. AN
    1. tasqutʕa
    2. fall.3SF.SBJ
    1. ʕalaa
    2. on
    1. l-ʔardʕ-i].
    2. the-ground-GEN
    1. ‘She almost fell on the ground.’
    1.  
    1. b.
    1. hadamuu
    2. destroyed.3PM
    1. sanawaat
    2. years
    1. min
    2. of
    1. l-ʒihaad-i
    2. the-Jihad
    1. ħatta
    2. until
    1. kaaduu
    2. almost.3PM
    1. [ʔan
    2. AN
    1. tataħawwala
    2. turn.3SF.SBJ
    1. haðihi
    2. this
    1. t-taʒribat-u
    2. the-experiment-NOM
    1. ʔila
    2. to
    1. miʕwal-in
    2. tool-GEN
    1. haddaam-in].
    2. destruction-GEN
    1. ‘They destroyed years of the Jihad until they almost had this experiment turn into a tool of destruction.’

All but one of the ʔan-clause selecting predicates that were investigated in our corpus study turned out to be NC predicates, since instances of disjoint reference with them were attested. Importantly, we found disjoint reference examples of modals and aspectuals, which were predicted to enforce coreference.

Interestingly, in many disjoint reference examples (e.g., (22b), (24b), (25b)) a pronominal clitic appears on the embedded subject (a possessive clitic) or on the embedded verb or preposition (an object clitic) and refers back to the matrix argument. For example, in (25b) the clitic in the preposition phrase laday-na ‘with-us’ refers to the unexpressed plural first person matrix subject. This coreference creates cohesion between the two events denoted by the two clauses. Nevertheless, it is not obligatorily present (cf. (20b), (21b), (23b), and (2b)).

An additional component of Landau’s (2000) proposal is the relationship between the OC/NC distinction and semantic tense. Landau predicts that given [+Agr], as is the case with MSA ʔan-clauses, [+T] implies NC, and [–T] implies OC. The diagnostic which teases tensed complements apart from untensed ones, namely modification with temporally mismatched adverbs, is not applicable to corpus searches. Consequently, we consulted a highly competent speaker of MSA, who provided grammaticality judgments.13 We constructed test sentences by adding temporally mismatched adverbials to corpus examples. Our consultant’s judgments concurred with the [+/–T] classification, as he did not accept temporal mismatches with predicates which are identified as selecting untensed complements. Nevertheless, this was not found to correlate with the OC/NC distinction. Consider, for example, (22b), repeated here as (27a) and its modified counterpart (27b).

    1. (27)
    1. a.
    1. lan
    2. never
    1. taʒruʔa
    2. dare.3SF
    1. [ʔan
    2. AN
    1. yakuuna
    2. be.3SM.SBJ
    1. raʔy-u-haa
    2. opinion-NOM-her(M)
    1. ɣayr-a
    2. not-ACC
    1. musaanid-in
    2. supportive-GEN
    1. li-lmaɣrib-i].
    2. to-Morocco-GEN
    1. ‘She will never dare that her opinion would be non-supportive of Morocco.’
    1.  
    1. b.
    1. *maa
    2.   not
    1. ʒaraʔat
    2. dared.3SF
    1. ʔamsi
    2. yesterday
    1. [ʔan
    2. AN
    1. yakuuna
    2. be.3SM.SBJ
    1. raʔy-u-haa
    2. opinion-NOM-her(M)
    1. ɣayr-a
    2. not-ACC
    1. musaanid-in
    2. supportive-GEN
    1. li-lmaɣrib-i
    2. to-Morocco-GEN
    1. ɣadan].
    2. tomorrow
    1. Intended: ‘She didn’t dare yesterday that her opinion would be non-supportive of Morocco tomorrow.’

Although the implicative ʒaruʔa ‘dare’ allows for disjoint reference, temporal anaphoricity obtains. Thus, contrary to Landau’s (2000) generalization, the lack of independent tense (alongside the existence of overt morphological agreement) does not necessarily imply obligatory control.
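The argument can be restated as a small check. The sketch below encodes only the part of Landau’s prediction summarized above, namely the [+Agr] cases; complements without agreement are not discussed in this section and are deliberately left undefined. The function names and the record for ʒaruʔa ‘dare’ are illustrative, with the feature values read off the discussion of (22b) and (27b).

```python
from typing import Dict, Optional


def landau_prediction(agr: bool, tense: bool) -> Optional[str]:
    """Predicted control type for a complement clause, [+Agr] cases only."""
    if not agr:
        return None  # [-Agr] complements are not covered by this summary
    return "NC" if tense else "OC"


def contradicts_prediction(obs: Dict[str, bool]) -> bool:
    """A predicate is a counterexample if OC is predicted for it and
    disjoint reference is nevertheless attested."""
    predicted = landau_prediction(obs["agr"], obs["tense"])
    return predicted == "OC" and obs["disjoint_attested"]


# ʒaruʔa 'dare': agreeing ʔan-clause, untensed per (27b), disjoint per (22b).
dare = {"agr": True, "tense": False, "disjoint_attested": True}

print(landau_prediction(dare["agr"], dare["tense"]))
print(contradicts_prediction(dare))
```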

4.2 Summary

The goal of this section was to investigate whether there exist ʔan-clause-taking predicates which impose OC. Building on insights from previous research on the MSA phenomenon, on a related construction in Modern Greek, and on cross-linguistic distinctions between OC and NC, we identified a class of candidate predicates. Corpus searches of the usage patterns of a representative set of predicates revealed instances of disjoint reference for all the predicates investigated, except one: ʔawʃaka ‘about to’. While it is indeed possible that this verb is in fact an OC predicate, possibly the only one in MSA, we conjecture that it is either a raising predicate or that due to its relatively low frequency the lack of instances of disjointness is coincidental. Consequently, in the following sections, where we discuss and present a formal analysis of this construction, we ignore this lacuna.

5 Discussion

5.1 Obligatory control, no control, and obviation

For languages which do have OC predicates alongside NC predicates it is natural to assume that each is associated with a distinct syntactic structure. Indeed, Landau (2004) proposes an analysis of the two types of constructions in Modern Greek, as part of his investigation of the phenomenon of finite control and its theoretical implications. These constructions, according to his analysis, differ in the types of complement clause each predicate selects. Controlled subjunctives (C-subjunctives) are clauses with PRO subjects, while free subjunctives (F-subjunctives) are clauses with pro subjects. The structural difference between the complement clauses correlates with the tensed/untensed distinction proposed by Landau; with overt agreement in the complement clause, complement clauses with PRO subjects have anaphoric (or “empty”) tense, while the tense of those with pro subjects is dependent on or constrained by the matrix tense, but not necessarily identical to it.

Roussou (2009) proposes a different account. Unlike Landau, she does not attribute the difference between the two constructions to the structure of the complement clause. On the contrary, she assumes that the na-clause is identical across the two constructions. In her account the na particle is a nominal locative element, which introduces a variable and creates an open predicate. The difference between OC and NC is in the combination of the embedding predicate and its complement. OC predicates trigger clause-union, which creates, by composition, a single-event interpretation of the events denoted by the matrix and embedded predicates (which also accounts for the tense identity). In this case, the variable introduced by na can only be bound by a matrix argument. Conversely, the NC construction does not involve clause-union and the variable has free reference.

Regardless of whether languages have OC and NC predicates, like MG, or whether they only have NC predicates, as we are claiming is the case with MSA, one issue which requires further examination is the case of coreference with an NC predicate. Consider for example the MG sentence in (16b), repeated here as (28), with its coreference interpretation. What is the syntactic structure of such sentences?

    1. (28)
    1. Modern Greek
    1. O
    2. the
    1. Kostas
    2. Kostas
    1. theli
    2. want.3S
    1. na
    2. PRT
    1. odhiji.
    2. drive.3S
    1. ‘Kostas wants to drive.’

There are (at least) three different types of answers to this issue. Habib (2009), in her analysis of MSA as having only NC predicates, considers coreference as a special case of NC, one where the freely referring unexpressed embedded subject happens to share its reference with the matrix subject. This entails that there are no syntactic reflexes to the fact that coreference does occur. Sentences such as (28) have one syntactic structure with different interpretations, dependent on semantico-pragmatic constraints.

A different approach is taken by Terzi (1992), who compares NC in MG with obviation phenomena in Romance and Slavic languages. Obviative languages impose a constraint against coreference between a matrix subject and an unexpressed subject of a subordinate clause. Thus, for example, the unexpressed embedded subject in Spanish (29) can refer to any third-person singular referent except the referent of the matrix subject.

    1. (29)
    1. Spanish
    1. Juan_i
    2. Juan
    1. quiere
    2. wants
    1. que
    2. that
    1. venga_*i/j.
    2. come.3S.SBJ
    1. ‘Juan wants that he will come.’

Terzi (1992) proposes that despite surface appearances MG does impose subject obviation, but that its presence is masked by the co-existence of an alternative structure. NC is restricted to disjoint reference, that is, the unexpressed embedded subject cannot corefer with the matrix subject. The coreference interpretation is licensed by a control construction. Thus, the combination of NC predicates with their complement clause is licensed by two distinct constructions – control and no control with obviation – depending on the reference pattern.

The proposal that MG exhibits obviation is not trivial. Obviation is associated with languages in which there is “subjunctive-infinitive rivalry”, or, in other words, where the infinitive competes with the subjunctive (Farkas 1992). Thus, in Spanish, for example, alongside the subjunctive complement clause illustrated in (29) there are infinitival clauses, which exhibit control behavior. The Balkan languages (including Romanian), however, lack infinitives. Consequently, with no subjunctive-infinitive rivalry, no obviation effects are found (Dobrovie-Sorin 2001). For this reason, MG, a Balkan language with no infinitives, is not an immediate candidate for obviation.

Additional arguments against Terzi’s (1992) analysis are given by Landau (2004), who invokes a number of diagnostics for distinguishing between pro and PRO. More specifically, he bases his argument on Varlokosta’s (1993) observation that coreference constructions with NC predicates allow a strict reading under VP-ellipsis and de re interpretation, suggesting that the unexpressed embedded subject is a pro, and not a PRO, which only allows a sloppy reading and a de se interpretation. Restricting coreference to a control configuration, as Terzi does, would not be compatible with these findings. Consequently, Landau proposes that sentences such as (28) are structurally ambiguous; they are licensed by two different constructions, a control configuration with a PRO subject and a pro-structure with accidental coreference. The existence of obviation in MG is thus ruled out.

5.2 First attempt: Only no control in MSA

The question regarding the possible analyses of NC predicates such as the MG thelo ‘want’ carries over to the analysis of ʔan complement clauses in MSA. Barring evidence for a syntactic distinction between the two interpretations, a straightforward account is to propose one NC structure for all cases. As is illustrated in the schematic representation in (30), the matrix verb combines with its subject (lexical NP or pro) and with its ʔan-clause complement. This complement clause is preceded by a complementizer/marker ʔan. The clause itself is in a VSO configuration and is headed by a subjunctive verb. Its subject is either a lexical NP or pro.

    1. (30)

Thus, regardless of how clauses, clausal complements and pro-drop in MSA are analyzed in a particular framework, sentences with ʔan complement clauses are structures with two independent subjects (but see below). There are no constraints on the agreement relations between the two predicates, and therefore they do not need to match. Consequently, what can be construed as subject control is in actuality just a case where the two subjects have identical agreement features and reference, and one of them, either the matrix subject in the backward pattern or the embedded subject in the forward pattern (or both), is pro-dropped. This is similar in spirit to the analysis proposed by Habib (2009) for all ʔan clauses in MSA, and by Roussou (2009) for F-subjunctives in Modern Greek, which is also a pro-drop language.

The NC analysis, which builds on the pro-drop phenomenon of MSA, can account for most of the patterns exhibited by ʔan-clause-taking predicates. There is, however, one pattern that poses a challenge to this straightforward analysis – the backward pattern with the embedded subject illustrated in (12a) and repeated here as (31).

    1. (31)
    1. ħaawala_i/j
    2. tried.3SM
    1. [ʔan
    2. AN
    1. yaktuba
    2. write.3SM.SBJ
    1. muħammad-un_i
    2. Muhammad-NOM(M)
    1. maqaal-an].
    2. article-ACC
    1. ‘Muhammad tried to write an article.’
    2. ‘He_j tried that Muhammad_i would write an article.’

This simple example masks a more complex agreement pattern which is only discernable with plural human subjects, for which agreement varies depending on the position of the subject relative to the verb.

Consider the minimal pair in (32), whose members differ only with respect to the agreement marking on the matrix predicate. As expected in VS clauses, in both (32a) and (32b) the embedded predicate taktuba ‘write.3SF’ exhibits partial agreement with its post-verbal plural subject, l-banaat-u ‘the girls’, and accordingly bears 3SF agreement. The matrix predicate ħaawala ‘try’, on the other hand, is singular in (32a) and plural in (32b). The difference in the agreement marking on the matrix predicate correlates with a difference in the interpretation of the two sentences.

    1. (32)
    1. a.
    1. ħaawalat_i/j
    2. tried.3SF
    1. [ʔan
    2. AN
    1. taktuba
    2. write.3SF.SBJ
    1. l-banaat-u_i
    2. the-girls-NOM
    1. maqaal-an].
    2. article-ACC
    1. ‘The girls tried to write an article.’
    2. ‘She tried that the girls would write an article.’
    1.  
    1. b.
    1. ħaawalna_*i/j
    2. tried.3PF
    1. [ʔan
    2. AN
    1. taktuba
    2. write.3SF.SBJ
    1. l-banaat-u_i
    2. the-girls-NOM
    1. maqaal-an].
    2. article-ACC
    1. ‘They_j tried that the girls_i would write an article.’
    2. Not: ‘The girls tried to write an article.’

Sentence (32a) is ambiguous. The understood subject of the matrix clause can either be construed as the embedded subject or as a different singular-feminine unexpressed subject. Sentence (32b), with its plural-marked matrix predicate, can only have a disjoint reference interpretation, where the understood subject of the matrix clause is a plural–feminine referent distinct from the embedded subject. A coreference reading requires the matrix predicate to exhibit partial agreement with the embedded subject, as is the case with simple VS clauses.

The backward patterns exhibited in (32) provide counter-evidence to the NC analysis proposed above, which assigns an identical structure to coreference and disjoint reference, and attributes the distinction to semantico-pragmatic constraints. If coreference and disjoint reference share the same syntactic structure it is not clear what accounts for the absence of a coreference reading in (32b). If coreference is simply co-indexation at the semantico-pragmatic level, what bars the co-indexation between the subject of ħaawalna ‘tried’ and the subject of taktuba ‘write’? Considering the ambiguity of (32a), the absence of the coreference reading in (32b) is even more intriguing. If the two subjects can be co-indexed in (32a), why is co-indexation not possible in (32b)?

Moreover, these data are problematic for an analysis which builds on pro-drop. Assuming that predicates of pro-dropped subjects always exhibit full agreement with the unexpressed subject (see (2)), the analysis proposed above would predict (i) an unambiguous disjoint reference reading of (32a) with a singular-feminine pro-dropped subject (the second reading provided above), and (ii) an ambiguous reading of (32b), where the pro-dropped subject can either corefer with the embedded subject or refer to a different contextually retrieved subject (they). The two predictions are not borne out by the data as (i) (32a) has a coreference reading where l-banaat ‘the girls’ is construed as the subject of ħaawala ‘try’ although it exhibits singular agreement, which means that it cannot have a plural pro-dropped subject, and (ii) (32b) does not have a coreference reading where l-banaat ‘the girls’ is construed as the subject of ħaawalna ‘tried’ although it exhibits plural agreement, which means that it can have a plural pro-dropped subject.
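The mismatch between prediction and data can be laid out side by side. The following sketch simply tabulates the readings reported for (32) against the readings expected under the pro-drop-based NC analysis, on the assumption stated above that a pro-dropped subject always triggers full agreement. The function names and the "S"/"P" encoding of matrix agreement are illustrative only; the embedded subject is taken to be post-verbal, plural, human and feminine, as in (32).

```python
from typing import Set


def observed_readings(matrix_number: str) -> Set[str]:
    """Readings reported for (32a) ("S", partial agreement) and (32b) ("P")."""
    if matrix_number == "S":
        return {"coreference", "disjoint"}
    if matrix_number == "P":
        return {"disjoint"}
    raise ValueError("matrix_number must be 'S' or 'P'")


def pro_drop_prediction(matrix_number: str) -> Set[str]:
    """Readings expected if the matrix subject is pro with full agreement."""
    if matrix_number == "S":
        # A singular pro cannot be identified with plural 'the girls'.
        return {"disjoint"}
    if matrix_number == "P":
        # A plural pro could refer to 'the girls' or to another plural group.
        return {"coreference", "disjoint"}
    raise ValueError("matrix_number must be 'S' or 'P'")


for label, number in (("(32a)", "S"), ("(32b)", "P")):
    observed = sorted(observed_readings(number))
    predicted = sorted(pro_drop_prediction(number))
    verdict = "ok" if observed == predicted else "not borne out"
    print(label, "observed:", observed, "predicted:", predicted, "->", verdict)
```

Laid out this way, the prediction fails in both cells: the analysis wrongly excludes coreference in (32a) and wrongly admits it in (32b).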

A difference in interpretation between the forward pattern and the backward pattern is also found with F-subjunctives in Greek (Alexiadou et al. 2010). While the forward pattern in (33a) is ambiguous between coreference and disjoint reference, the backward pattern in (33b) can only have a disjoint interpretation.

    1. (33)
    1. Modern Greek (Alexiadou et al. 2010: 39)
    1.  
    1. a.
    1. O Janis_i
    2. John-NOM
    1. elpizi
    2. hopes
    1. pro_i/j
    2. pro
    1. na
    2. subj
    1. fai
    2. eats
    1. to
    2. the
    1. tiri.
    2. cheese
    1. ‘John_i hopes that he_i/j will eat the cheese.’
    1.  
    1. b.
    1. Pro_*i/j
    2. pro
    1. elpizi
    2. hopes
    1. na
    2. subj
    1. fai
    2. eats
    1. o Janis_i
    2. John-NOM
    1. to
    2. the
    1. tiri.
    2. cheese
    1. ‘He hopes that John will eat the cheese.’

Alexiadou et al. (2010) explain that the impossibility of coreference in (33b) is due to Principle C. The embedded referential subject, Janis, cannot be bound by the matrix pro subject. Coreference with F-subjunctives then is possible only in the forward pattern. With C-subjunctives, on the other hand, a coreference reading is the only option, and it is available in both the forward pattern and the backward pattern (34). The fact that there is no Principle C effect in this case is taken by Alexiadou et al. (2010) as evidence for a movement analysis of control.14

    1. (34)
    1. Modern Greek
    1.  
    1. a.
    1. O Janis
    2. John-NOM
    1. emathe
    2. learned.3S
    1. na
    2. subj
    1. pezi
    2. play.3S
    1. kithara.
    2. guitar
    1.  
    1. b.
    1. Emathe
    2. learned.3S
    1. na
    2. subj
    1. pezi
    2. play.3S
    1. o Janis
    2. John-NOM
    1. kithara.
    2. guitar
    1. ‘John learned to play the guitar.’

Returning to MSA, given the NC analysis we proposed in (30) above, the same explanation can be applied to account for the ungrammaticality of the coreference reading of the MSA example in (32b), namely a Principle C violation. However, unlike MG, MSA does provide a way of expressing coreference with an embedded subject (32a), yet this interpretation cannot be accounted for by the NC analysis proposed above. Thus, the backward pattern suggests that one structure cannot capture all interpretations, and that the OC/NC distinction does have syntactic reflexes. In the following section we probe deeper into the backward pattern by first conducting a corpus-based study of this construction.

5.3 Backward patterns: A corpus study

Our corpus study of the backward pattern focused on two issues: (i) the types of predicates which occur in this construction and (ii) its agreement patterns. The following examples illustrate instances of the backward pattern with ʔaraada ‘want’ (35), ħaawala ‘try’ (36), ʒaruʔa ‘dare’ (37), nasiya ‘forget’ (38), ʔistatʕaaʕa ‘be able’ (39), tamakkana ‘be able’ (40), and kaada ‘almost’ (41). The embedded subjects appear in boldface.

It should be noted that in all the example sentences below the shared subject is followed by additional VP-internal material. Thus, for example, in (35) the embedded verb has two complements: one is realized as a clitic on the embedded verb (yaʒʕala-hu ‘make.3SM.SBJ-it’) and the other, the NP ħaqiiqat-an ‘reality’, follows the subject, and is in turn followed by a PP adjunct whose scope is the embedded clause. The existence of VP-internal material after the subject constitutes evidence against an alternative extraposition analysis, which would place the subject in a matrix position to the right of the embedded clause.

5.3.1 Volitionals

    1. (35)
    1. wa-laakinna
    2. and-but
    1. haaða
    2. this
    1. maa
    2. what
    1. yuriidu
    2. want.3SM
    1. [ʔan
    2. AN
    1. yaʒʕala-hu
    2. make.3SM.SBJ-it
    1. l-baaħiθuuna
    2. the-researchers.PM.NOM
    1. ħaqiiqat-an
    2. reality-ACC
    1. bi-musaaʕadat-i
    2. with-help-GEN
    1. t-taqniyaat-i
    2. the-technologies-GEN
    1. l-ʒadiidat-i].
    2. the-new-GEN
    1. ‘But this is what the researchers want to turn into reality with the help of new technologies.’

5.3.2 Implicatives

    1. (36)
    1. bal
    2. moreover
    1. yuħaawilu
    2. try.3SM
    1. [ʔan
    2. AN
    1. yaʔxuða
    2. follow.3SM.SBJ
    1. baʕdʕ-u-hum
    2. some-NOM-of.them
    1. duuna
    2. without
    1. baʕdʕ-in
    2. some-GEN
    1. bi-t-tartiib-i].
    2. in-the-order-GEN
    1. ‘Moreover, some of them (without the others) try to follow the order.’
    1. (37)
    1. lam
    2. not
    1. yaʒruʔ
    2. dare.3SM
    1. [ʔan
    2. AN
    1. yasʕifa-hum
    2. describe.3SM.SBJ-them
    1. ʔaħad-un
    2. one-NOM
    1. bi-l-ʔirhaab-i].
    2. in-the-terror-GEN
    1. ‘No one dared to describe them as terror.’
    1. (38)
    1. la
    2. not
    1. yansa
    2. forget.3SM
    1. [ʔan
    2. AN
    1. yuʔakkida
    2. emphasize.3SM.SBJ
    1. haaʔulaaʔi
    2. those
    1. ʕala
    2. on
    1. ħirsʕ-i
    2. keenness-GEN
    1. ʔaʕdʕaaʔ-i
    2. members-GEN
    1. l-maʒlis-i
    2. the-council-GEN
    1. l-ʒudud-i
    2. the-new-GEN
    1. ʕala
    2. on
    1. tanfiiðʕ-i
    2. implementation-GEN
    1. tawʒiihaat-i
    2. directives-GEN
    1. r-raʔiis-i].
    2. the-president-GEN
    1. ‘Those (people) do not forget to emphasize the keenness of the new members of the council to implement the directives of the president.’

5.3.3 Modals

    1. (39)
    1. wa-hunaaka
    2. there
    1. tʕuruq-un
    2. ways-NOM
    1. ʔuxra
    2. other
    1. ʕadiidat-un
    2. many-NOM
    1. yastatʕiiʕu
    2. be.able.3SM
    1. [ʔan
    2. AN
    1. yasluka-ha
    2. use.3SM.SBJ-it
    1. l-muħaamuuna
    2. the-lawyers.PM.NOM
    1. li-yaxdumuu
    2. to-serve.3PM
    1. l-ʒumhuur-a].
    2. the-public-ACC
    1. ‘There are many other ways that lawyers can use to serve the public.’
    1. (40)
    1. ʔinnama
    2. only
    1. tatamakkana
    2. be.able.3SF
    1. [ʔan
    2. AN
    1. tatadaxxala
    2. intervene.3SF.SBJ
    1. muʔassasat-un
    2. institution.SF-NOM
    1. fii
    2. in
    1. hadm-i
    2. destruction-GEN
    1. l-muʔassasat-i
    2. the-institution-GEN
    1. l-ʔuxra],
    2. the-other
    1. ʔiðʕa
    2. if
    1. tuɣayyiru
    2. change.3SF
    1. l-ʔiʒtimaaʕ-a
    2. the-society-ACC
    1. taɣyyir-an
    2. change-ACC
    1. ʒaðʕriyy-an.
    2. radical-ACC
    1. ‘An institution is able to intervene in the demolition of another institution, only if it changes the society radically.’

5.3.4 Aspectuals

    1. (41)
    1. wa-binaaʔan
    2. and-based
    1. ʕala
    2. on
    1. haaða
    2. this
    1. yakaadu
    2. almost.3SM
    1. [ʔan
    2. AN
    1. yattafiqa
    2. agree.3SM.SBJ
    1. haaʔulaaʔi
    2. these
    1. l-baaħiθuuna
    2. the-researchers
    1. ʕala
    2. on
    1. ʔan
    2. that
    1. l-niðʕaam
    2. the-system
    1. l-fidiraali…]
    2. the-federal
    1. ‘And based on this, these researchers almost agree that the federal system….’

While instances of the coreferring backward pattern were retrieved for the verb categories listed above, we found that the distribution of this pattern is restricted to a particular set of predicates. Corpus searches of the backward pattern with the following ʔan-clause-taking predicates were unsuccessful: qarrara ‘decide’, xaʃa ‘fear’, rafadʕa ‘refuse’, tarradada ‘hesitate’, taħammala ‘tolerate’ and ʔiqtaraħa ‘propose’. One exception is the following example, with the verb rafadʕa ‘refuse’.

    1. (42)
    1. taðakkaruu
    2. remember.2PM.IMP
    1. ʔanna
    2. that
    1. ʔahl-a
    2. people-ACC
    1. tʕ-tʕaʔif-i
    2. Taif-GEN
    1. dʕarrabuu
    2. beat.3PM
    1. r-rasuul-a
    2. the-Prophet-ACC
    1. wa-ʔahaanuu-hu
    2. and-insulted.3PM-him
    1. wa-maʕa
    2. and-with
    1. ðaalika
    2. that
    1. rafadʕa
    2. refused.3SM
    1. [ʔan
    2. AN
    1. yuhallika-hum
    2. destroy.3SM.SBJ-them
    1. ʔalla!!!!!]
    2. Allah
    1. ‘Remember that the people of Taif beat the Prophet and insulted him yet Allah refused to destroy them!!!!!’

With respect to the correlation between the agreement marking on the matrix predicate and the reference pattern, our corpus findings conform to the generalization stated in (32). When the embedded subject is plural and human and the matrix predicate exhibits PA with it, the unexpressed matrix subject is construed as the embedded subject ((35), (36), (38), (39) & (41)). Conversely, when the matrix predicate is plural, the unexpressed subject is construed as a plural human referent, distinct from the embedded subject. This pattern is found with all ʔan-clause-taking predicates. For instance, compare example (35), where the matrix predicate ʔaraada ‘want’ exhibits PA with the plural embedded subject and reference is shared, with (43), where the matrix predicate exhibits FA with the plural embedded subject and there is disjoint reference.

    1. (43)
    1. laa
    2. no
    1. yuriiduuna
    2. want.3PM
    1. [ʔan
    2. AN
    1. yufsida
    2. spoil.3SM.SBJ
    1. l-mutaʕasʕsʕibuuna
    2. the-fanatics.PM.NOM
    1. maa
    2. what
    1. banaa-hu
    2. built.3SM-it
    1. l-masʔuuluuna
    2. the-administrators
    1. ʔila
    2. to
    1. ħadd-i
    2. limit
    1. l-ʔaan].
    2. today
    1. ‘They don’t want the fanatics to ruin what the administrators have built so far.’

Predicates which were found to be incompatible with the backward coreference pattern do appear in the FA pattern. Following are examples with taħammala ‘tolerate’ (44) and xaʃa ‘fear’ (45). In both cases, although the number–gender agreement marking on the matrix predicate is the same as the number–gender properties of the embedded subject, the understood matrix subject cannot be construed as the embedded subject.

    1. (44)
    1. lam
    2. not
    1. yataħammaluu
    2. tolerate.3PM
    1. [ʔan
    2. AN
    1. yansifa
    2. ruin.3SM.SBJ
    1. l-ʒazaaʔiriyyuuna
    2. the-Algerians.PM.NOM
    1. ħulm-a
    2. dream-ACC
    1. l-misʕriyyiina
    2. the-Egyptians-GEN
    1. fi
    2. in
    1. ð-ðihaab-i
    2. the-going
    1. ʔilaa
    2. to
    1. l-muundiaal].
    2. the-FIFA World Cup
    1. ‘They did not tolerate (the fact) that the Algerians will ruin the dream of the Egyptians to go to the FIFA World Cup.’
    1. (45)
    1. wa-yaxʃawna
    2. and-fear.3PM
    1. [ʔan
    2. AN
    1. yantafiʕa
    2. benefit.3SM.SBJ
    1. l-masiiħiiyuuna
    2. the-Christians.PM.NOM
    1. l-dimuuqraatʕiiyuuna
    2. the-Democrats.PM.NOM
    1. mina
    2. from
    1. l-ʔiqtisʕaad-i
    2. the-economy-GEN
    1. l-huulandii].
    2. the-Dutch
    1. ‘They fear that the Christian Democrats will benefit from the Dutch economy.’

5.3.5 Summary

Instances of the backward pattern attested in the corpus reveal important facts with regard to its distribution and its agreement variation. As for its distribution, our findings suggest that the backward coreference pattern is limited to a set of ʔan-clause-taking predicates, which we will refer to as “backward control predicates” (BC predicates). More specifically, we found instances of backward control with volitionals, implicatives, modals and aspectuals. With these predicates, sentences such as the one given in (46) are ambiguous between coreference and disjoint reference.

    1. (46)
    1. ħaawalat
    2. tried.3SF
    1. [ʔan
    2. AN
    1. taktuba
    2. write.3SF.SBJ
    1. l-banaat-u
    2. the-girls-NOM
    1. maqaal-an].
    2. article-ACC
    1. ‘The girls tried to write an article.’
    2. ‘She tried that the girls would write an article.’

Conversely, no attestations of the backward control construction were found with the following verbs: qarrara ‘decide’, xaʃa ‘fear’, rafadʕa ‘refuse’, tarradada ‘hesitate’, taħammala ‘tolerate’ and ʔiqtaraħa ‘propose’. With these predicates, structures such as the one illustrated with a BC predicate in (46) are unambiguous, with only a disjoint reference reading available (47). A similar situation occurs with F-subjunctives in Greek (see (33) above).

    1. (47)
    1. qarrarat
    2. decided.3SF
    1. [ʔan
    2. AN
    1. taktuba
    2. write.3SF.SBJ
    1. l-banaat-u
    2. the-girls-NOM
    1. maqaal-an].
    2. article-ACC
    1. ‘She decided that the girls would write an article.’

The classification of verbs into BC predicates and predicates which do not allow backward control echoes Landau’s (2000) distinction between predicates which select tensed complements and those which select untensed complements. With the exception of want, all BC predicates belong to the untensed ([–T]) category, while the remaining predicates are identified by Landau as belonging to the [+T] category (see Section 3.3). This observation will become relevant when we propose our analysis in Section 6.

An additional aspect, of course, is the agreement patterns exhibited by the matrix predicate in the backward pattern. The correlation between PA/FA in the matrix clause and the OC/NC distinction is supported by the corpus data. This is precisely the type of evidence that motivates an analysis which introduces a syntactic distinction between the coreference and disjoint reference interpretations. Consequently, we will assume that the NC analysis proposed in Section 5.2 (30) accounts for the forward patterns, as well as the disjoint reference (FA) backward pattern, and turn to an analysis of the backward control construction.
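The corpus generalizations summarized in this section can be restated as a small lookup. The following Python sketch is purely illustrative and is not part of the analysis: the function name `available_readings`, the English labels standing in for the MSA verbs, and the "singular"/"plural" encoding of matrix agreement are all assumptions made for the example. It covers only the backward configuration with a plural human embedded subject.

```python
from typing import Set

# English labels stand in for the MSA verbs listed in the text (assumed encoding).
BC_PREDICATES = {"want", "dare", "try", "forget", "be able", "almost"}
NON_BC_PREDICATES = {"decide", "fear", "refuse", "hesitate", "tolerate", "propose"}


def available_readings(predicate: str, matrix_number: str) -> Set[str]:
    """Readings of V1 [AN V2 SUBJ] when the embedded subject is plural and human.

    matrix_number is the number marking on the matrix verb: "singular"
    (compatible with partial agreement) or "plural" (full agreement).
    """
    if predicate not in BC_PREDICATES | NON_BC_PREDICATES:
        raise ValueError(f"unclassified predicate: {predicate}")
    if matrix_number not in {"singular", "plural"}:
        raise ValueError(f"unknown number value: {matrix_number}")
    # The no-control parse (pro-dropped matrix subject) yields disjoint reference.
    readings = {"disjoint"}
    # Coreference requires a BC predicate that does not bear plural marking.
    if predicate in BC_PREDICATES and matrix_number == "singular":
        readings.add("coreference")
    return readings
```

Under these assumptions, a BC predicate such as ‘try’ with singular marking returns both readings, as in (46), whereas ‘decide’ returns only the disjoint reading, as in (47).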

5.4 Backward control

In what follows we discuss two alternative approaches to the analysis of the MSA backward control construction. First, we examine whether the backward control construction can shed light on the debate regarding control theory. More specifically, we consider whether, assuming the Movement Theory of Control (Hornstein 1999), Wurmbrand & Haddad’s (2016) analysis of raising in MSA can be applied to backward control. Subsequently, in light of the differences between the two constructions and our findings regarding the distribution of the backward control construction, we propose an alternative analysis, which associates the backward coreference pattern in MSA with restructuring.

5.4.1 Overview

The phenomenon of backward control plays an important role in the debate regarding the analysis of control, which, broadly speaking, centers around two opposing approaches: the PRO-based approach (e.g., Landau 2000 and subsequent work) and the Movement Theory of Control (MTC; e.g., Hornstein 1999; Boeckx & Hornstein 2004).

The PRO-based approach to control originates in the theory of Government and Binding (Chomsky 1981 and subsequent work), in which raising and control are given distinct analyses. In both cases the complement clause is an infinitival clause with a thematic yet case-less subject position. Raising is viewed as movement of the embedded subject to receive case in the matrix subject position, while control involves a silent anaphoric pronominal PRO subject which is bound by a local c-commanding antecedent and which does not require case.

(48) a. Bill_i appeared [t_i to leave]. (raising)
  b. Bill_i tried [PRO_i to leave]. (control)

One motivation for the PRO-based analysis of control is the theta criterion, according to which every argument must receive a unique theta role and every theta role must be assigned to a unique argument. Since subjects of control predicates appear to be interpreted in two distinct theta roles, assigned by the matrix predicate and the embedded predicate, the theory assumes that there are two syntactic arguments, the overt matrix subject and the embedded phonologically empty PRO, and each is assigned its own theta role.

Assuming the PRO-based analysis, the structure of backward control should be as illustrated in (49).

(49) PRO_i tried [Bill_i to leave].

Yet this structure is problematic for a number of reasons. First, the anaphoric PRO in the matrix position cannot be bound by the embedded subject with which it is co-indexed. Conversely, the R-expression Bill is bound by the pronominal PRO, while according to Principle C it should be free. Moreover, the embedded subject Bill appears in what is considered a case-less position. An additional related challenge is posed by languages such as MG, where the subject of OC predicates can appear either in the matrix position or in the embedded clause (see (34)). It is not clear how a DP can occur in the same position as PRO (and vice versa), where PRO is specifically defined to be incompatible with DPs.

Other approaches within the Minimalist Program argue for an alternative analysis (Hornstein 1999; Boeckx & Hornstein 2004). According to these approaches, the elimination of D-structure in the Minimalist Program made it unnecessary to maintain the theta criterion; arguments can now bear more than one theta role. Furthermore, the ban on movement into theta positions was eliminated. With these two changes in place, Hornstein (1999) argues that the PRO-based analysis of control is not necessary, and proposes a movement analysis for both raising and control.

Consider the following representations of the derivation of subject control (50a) and subject raising (50b), where unpronounced copies are enclosed in angle brackets.

(50) a. [IP Bill [VP ⟨Bill⟩ tried [IP ⟨Bill⟩ to [VP ⟨Bill⟩ leave]]]] (control)
  b. [IP Bill [VP appeared [IP ⟨Bill⟩ to [VP ⟨Bill⟩ leave]]]] (raising)

In both cases subjects originate in the embedded [Spec VP] position, where they receive a theta role. From there they first move to [Spec IP] to check the D-feature of the lower IP. At this point the derivations diverge. With control predicates the subject moves to matrix [Spec VP] to check the external theta role of the matrix verb, and then it moves to [Spec IP] to check the D-feature of the IP and its own Case. Raising predicates do not assign a theta role to the subject, so the subject skips the higher [Spec VP] position and moves directly to [Spec IP].
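The two derivations just described differ only in whether the subject passes through the matrix [Spec VP] position. The following Python sketch is a rough illustration of that single difference; the function name `mtc_derivation` and the position labels are hypothetical and serve only to make the sequence of steps explicit.

```python
from typing import List, Tuple


def mtc_derivation(predicate_type: str) -> Tuple[List[str], int]:
    """Return the positions visited by the subject and its number of theta roles.

    predicate_type is "control" or "raising" (labels assumed for this sketch).
    """
    if predicate_type not in {"control", "raising"}:
        raise ValueError(f"unknown predicate type: {predicate_type}")
    # The subject is merged in the embedded [Spec VP], where it gets a theta role,
    # and then moves to the embedded [Spec IP] to check the D-feature of the lower IP.
    chain = ["embedded Spec VP", "embedded Spec IP"]
    theta_roles = 1
    if predicate_type == "control":
        # Control verbs assign an external theta role in the matrix [Spec VP].
        chain.append("matrix Spec VP")
        theta_roles += 1
    # In both derivations the final landing site is the matrix [Spec IP].
    chain.append("matrix Spec IP")
    return chain, theta_roles
```

On this encoding, a control derivation yields a four-member chain with two theta roles, and a raising derivation a three-member chain with one.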

The MTC provides a straightforward way to account for the phenomenon of backward control. In fact, this is taken by Landau (2007: 309) to be “perhaps the most interesting contribution of the reductionist camp to the debate on the nature of OC”. The movement theory assumes a chain of subject copies, which begins in the embedded clause and ends in a matrix subject position. Although in the English examples in (50) the spelled out copies occupy the highest subject position, this is not necessarily the only option.

Polinsky & Potsdam (2002), for example, argue that the backward control construction in Tsez, a Nakh-Dagestanian language, is best analyzed by adopting the MTC. They propose that this construction is derived similarly to the English derivation illustrated in (50a) except that in Tsez backward control the subject is not only introduced but also spelled out in the embedded clause, and its subsequent movement to matrix position takes place covertly. They provide evidence for the existence of an unpronounced copy in the matrix position, which includes phenomena such as subject–verb agreement in the matrix clause and the licensing of matrix depictives and reflexives, which require a c-commanding antecedent.

Backward control in Tsez occurs only with two aspectual predicates, -oqa ‘begin’ and -iča ‘continue’, which are ambiguous between raising predicates and control predicates.15 Furthermore, the two predicates can only appear in a backward control construction, while other OC predicates in Tsez are restricted to forward control. The rarity of backward control in general, and its limited distribution in Tsez, lead Landau (2007) to question its significance as counter-evidence to the PRO-based approach.

Alexiadou et al. (2010) counter Landau’s skepticism by providing more and better evidence of backward control from Greek and Romanian. The main difference between backward control in Tsez and backward control in Greek/Romanian is that in the latter it is optional, freely alternating with forward control (see (34) above). This additional evidence indicates that backward control is not as rare a phenomenon as its limited distribution in Tsez would suggest. Similarly to Polinsky & Potsdam (2002), Alexiadou et al. (2010) propose a movement analysis for MG backward control, where copies of the subject occupy different positions in a movement chain. However, unlike the situation in Tsez, in Greek/Romanian the subject can be spelled out either in the embedded clause or in the matrix clause.

Contrary to the attempt to unify raising and control, Alexiadou et al. (2010, 2012) propose a different analysis for MG raising constructions with embedded subjects. These constructions, they argue, are not instances of “real” backward raising. Rather, the embedded subject does not move out of the embedded clause, but engages in long-distance agreement relations with the matrix predicate.16 Nevertheless, “real” backward raising is argued by Wurmbrand & Haddad (2016) to be found in Standard Arabic. Thus, it is tempting to examine whether their analysis can be applied to the case of backward control.

5.4.2 Backward control and backward raising in MSA

Wurmbrand & Haddad (2016) (henceforth W&H) explore the different syntactic patterns in which a class of Standard Arabic predicates referred to as ʔafʕaal ʔal-muqaaraba ‘verbs of appropinquation’ occur.17 This class encompasses three semantic types: verbs of proximity, verbs of hope, and verbs of inception (Wright 2007). Following Haddad (2012), W&H classify them as raising verbs.18

W&H identify four different patterns in which verbs of appropinquation can occur. These patterns are illustrated in (51a)–(51d).19

    1. (51)
    1. a.
    1. ʔawʃakat
    2. were.about.to.3SF
    1. tʕ-tʕaalibaat-u
    2. the-students.PF-NOM
    1. [(ʔan)
    2. (AN)
    1. yanʒaħna].
    2. succeed.3PF.SBJ
    1.  
    1. b.
    1. ʔatʕ-tʕaalibaat-u
    2. the-students.PF-NOM
    1. ʔawʃakna
    2. were.about.to.3PF
    1. [(ʔan)
    2. (AN)
    1. yanʒaħna].
    2. succeed.3PF.SBJ
    1.  
    1. c.
    1. ʔawʃaka
    2. were.about.to.3SM
    1. [(ʔan)
    2. (AN)
    1. tanʒaħa
    2. succeed.3SF.SBJ
    1. tʕ-tʕaalibaat-u].
    2. the-students.PF-NOM
    1.  
    1. d.
    1. ʔawʃakat/ʔawʃakna
    2. were.about.to.3SF/were.about.to.3PF
    1. [(ʔan)
    2. (AN)
    1. tanʒaħa
    2. succeed.3SF.SBJ
    1. tʕ-tʕaalibaat-u].
    2. the-students.PF-NOM
    1. ‘The female students were about to succeed.’

The resemblance between these patterns and the ones in the focus of this paper is clear. The first two patterns are similar to what we referred to here as the “forward patterns”. Agreement between the matrix predicate and its subject depends on their relative position (FA with a pre-verbal subject and PA with a post-verbal subject) and the embedded predicate exhibits full agreement with the matrix subject. The default agreement pattern in (51c) resembles the impersonal construction illustrated in (14) with the verb waʒaba ‘have to’. The crucial pattern for our purposes is the backward pattern shown in (51d).

W&H claim that the backward pattern, where the only expressed subject is found in the embedded clause, is unique to verbs of appropinquation, and is not found with other raising predicates or control predicates. This generalization is refuted by our corpus data, which show that the backward pattern is not unique to this particular class of predicates and, moreover, that it is compatible with various control predicates such as ħaawala ‘try’ in (52).

    1. (52)
    1. liqaaʔ
    2. meeting
    1. c-caadiq
    2. Sadiq
    1. l-mahdi
    2. al-Mahdi
    1. huwa
    2. is
    1. l-waraqat-u
    2. the-card-NOM
    1. l-ʔaxiirat-u
    2. the-last-NOM
    1. ll-atii
    2. that
    1. yuħaawilu
    2. try.3SM
    1. [ʔan
    2. AN
    1. yalʕaba-ha
    2. play.3SM.SBJ-it
    1. l-niðʕaam-u
    2. the-regime.SM-NOM
    1. l-ʔaan].
    2. now
    1. ‘The meeting with Sadiq al-Mahdi is the last card that the regime is trying to play now.’

A key difference between the backward pattern in (51d) and our backward pattern is the agreement marking on the matrix predicate. Verbs of appropinquation, according to W&H, as well as to other sources they cite (e.g., Al-Ghalayini 2003), may appear with either partial or full agreement. In fact, the FA option is the only one accepted by traditional grammar and Arab grammarians. However, the authors acknowledge that although the FA case is the one that conforms with prescriptive grammar, they were not able to find naturalistic instances of this structure in contemporary newspapers.20 They did find instances of the second pattern, where the matrix verb exhibits PA with the embedded subject. Nevertheless, the analysis that they propose assumes that the two agreement patterns are possible.

Crucially, with our predicates the FA/PA agreement alternation is manifested in the corpus only as a correlate to the OC/NC distinction described and illustrated by (32) above. The matrix verb exhibits plural marking (which can only occur with FA) when its understood subject is plural, animate and distinct from the embedded subject. Backward control is only possible when the matrix predicate exhibits PA with the embedded subject. A raising-like analysis of backward control which is based on W&H’s analysis would need to account for the difference in agreement patterns exhibited by backward raising and backward control.

5.4.3 A movement analysis of backward control

The analysis which W&H propose for the verbs of appropinquation is based on Haddad (2012) and on the notion of opacity domains (phases) and cyclic spell-out (Wurmbrand 2013; Alexiadou et al. 2014). An illustrative sketch of the analysis is given in (53).

(53) [TP SUBJ4 T [vP=Phase SUBJ3 [TP=Phase SUBJ2 T [vP=Phase SUBJ1 ~]]]]

According to this analysis, the subject originates in SUBJ1, its base position within the lowest vP, and raises cyclically (through the different numbered positions), creating a chain of copies. At PF one copy is spelled out (or pronounced). The position of the spelled out copy, as well as that of the silent copies, has morphosyntactic reflexes, namely agreement marking on the predicates involved.

The analysis captures the full symmetry which W&H assume for the verbs of appropinquation: for each forward pattern there exists a parallel backward pattern. Let us illustrate this symmetry by considering the derivations proposed for the patterns shown in (51a), (51b) and (51d). We will disregard the default-agreement pattern in (51c), which is not relevant for our purposes.21 The derivations are sketched in (54) (PA stands for “partial agreement”, FA for “full agreement”, DEF for “default agreement”, SUBJ for a spelled-out copy, ⟨SUBJ⟩ for a silent copy, V1 for the matrix verb, V2 for the embedded verb, and ≫ for linear precedence):

(54) a. V1_PA ≫ SUBJ3 ≫ V2_FA ≫ ⟨SUBJ1⟩   (51a)
  b. V1_PA ≫ ⟨SUBJ3⟩ ≫ V2_FA ≫ SUBJ1   (51d)
  c. SUBJ4 ≫ V1_FA ≫ V2_FA ≫ ⟨SUBJ1⟩   (51b)
  d. ⟨SUBJ4⟩ ≫ V1_FA ≫ V2_FA ≫ SUBJ1   (51d)

Pattern (54a) sketches the derivation of the forward raising pattern exemplified by (51a). The subject raises from SUBJ1, its embedded position, where it leaves a copy, to the matrix [Spec vP] position, SUBJ3, where it is spelled out. The matrix verb moves to matrix T. The relative positions of the subject and matrix predicate account for the PA. The parallel backward pattern (54b) is derived in a similar fashion, with one difference: the spelled out copy is in the embedded position SUBJ1, and the silent copy in position SUBJ3. The PA on the matrix predicate is due to its agreement with the silent copy that follows it. The next two patterns follow the same principle. A pre-verbal matrix copy of the subject in SUBJ4 triggers FA on the matrix predicate regardless of whether it is overt, as in (54c), or covert, as in (54d).
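The logic of these four derivations can be made explicit in a short Python sketch. It is an illustration under stated assumptions rather than W&H’s own formalization: the names `Derivation`, `matrix_agreement` and `linearize` are hypothetical, silent copies are rendered in angle brackets, and only agreement on the matrix verb is modeled (the embedded verb is left unannotated).

```python
from typing import NamedTuple


class Derivation(NamedTuple):
    matrix_copy: str  # "SUBJ3" (post-verbal) or "SUBJ4" (pre-verbal)
    spelled_out: str  # "matrix" or "embedded": which copy is pronounced


def matrix_agreement(d: Derivation) -> str:
    """A pre-verbal matrix copy triggers FA, a post-verbal one PA, overt or silent."""
    return "FA" if d.matrix_copy == "SUBJ4" else "PA"


def linearize(d: Derivation) -> str:
    """Render the linear order, enclosing silent copies in angle brackets."""
    high = d.matrix_copy if d.spelled_out == "matrix" else f"⟨{d.matrix_copy}⟩"
    low = "SUBJ1" if d.spelled_out == "embedded" else "⟨SUBJ1⟩"
    v1 = f"V1_{matrix_agreement(d)}"
    # SUBJ4 precedes the matrix verb; SUBJ3 follows it.
    matrix = [high, v1] if d.matrix_copy == "SUBJ4" else [v1, high]
    return " ≫ ".join(matrix + ["V2", low])


# The four combinations correspond to (54a), (54b), (54c) and (54d), in that order.
for d in (
    Derivation("SUBJ3", "matrix"),
    Derivation("SUBJ3", "embedded"),
    Derivation("SUBJ4", "matrix"),
    Derivation("SUBJ4", "embedded"),
):
    print(linearize(d))
```

The point of the sketch is that agreement is computed from the position of the matrix copy alone, so that the choice of which copy to pronounce is independent of the agreement outcome.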

The alternating agreement in the backward pattern is a crucial factor in Wurmbrand & Haddad’s (2016) analysis since it provides evidence for the structural effects of the position of the silent copy in the matrix clause. This, according to Polinsky & Potsdam (2006), is a necessary condition for “real” backward raising, as opposed to cases of long-distance agreement between the matrix and the embedded predicates, as is argued to occur in Modern Greek (Alexiadou et al. 2012).

With BC predicates, however, only the first three derivations in (54) are possible. The pattern in (54d), where the subject is expressed in the lower clause while the matrix predicate exhibits FA, is ungrammatical. Assuming, as this approach does, that the forward pattern is derived by the embedded subject moving to a matrix subject position, it is not clear why the two positions, SUBJ3 and SUBJ4, are available for overt subjects while only the former can host silent copies. Why is it that with raising predicates subjects can covertly raise to [Spec TP] and trigger FA but not with BC predicates? This is indeed problematic for a straightforward application of a movement analysis to control in MSA.

One way to salvage the analysis is to propose that of the four derivations licensed for the raising verbs of appropinquation, only one, namely (54b), applies to BC predicates. This would account for the only configuration that is not accounted for by the NC analysis (PA agreement implies a 3SF pro which cannot be coindexed with a plural subject). Thus, when BC predicates are used as control predicates, their subjects are obligatorily spelled out in the embedded clause and raise covertly only to the post-verbal matrix position. This, in fact, is precisely the type of analysis which Polinsky & Potsdam (2002) propose for obligatory backward control in Tsez.

Let us consider how this applies to the sentence in (55), which is ambiguous between a disjoint reference and a coreference interpretation.

    1. (55)
    1. ħaawalat
    2. tried.3SF
    1. [ʔan
    2. AN
    1. taktuba
    2. write.3SF.SBJ
    1. l-banaat-u
    2. the-girls-NOM
    1. maqaal-an].
    2. article-ACC
    1.  
    1. a.
    1. ħaawalat+pro_j
    1. [ʔan
    1. taktuba
    1. l-banaat-u_i
    1. maqaal-an].
    1. ‘She tried that the girls would write an article.’
    1.  
    1. b.
    1. ħaawalat
    1. l-banaat-u_i
    1. [ʔan
    1. taktuba
    1. l-banaat-u_i
    1. maqaal-an].
    1. ‘The girls tried to write an article.’

As proposed in Section 5.2, the disjoint reference interpretation is licensed by the NC structure in (30); the embedded subject and the pro-dropped matrix subject have distinct indices (55a). The coreference interpretation in (55b), on the other hand, is an instance of backward control. This structure resembles the raising structure in (54b), and, as such, can be assumed to be derived in a similar fashion. The subject is spelled out in the embedded clause, and raises covertly to the post-verbal matrix position SUBJ3 (but, crucially, not to SUBJ4), where it triggers PA on the predicate, and, unlike the subject of a raising predicate, receives a theta role. Non-BC predicates do not have this option, and consequently, a sentence similar to (55) but with a non-BC predicate would not be licensed in a movement construction and would only receive a disjoint reference interpretation (56).

    1. (56)
    1. qarrarat
    2. decided.3SF
    1. [ʔan
    2. AN
    1. taktuba
    2. write.3SF.SBJ
    1. l-banaat-u
    2. the-girls-NOM
    1. maqaal-an].
    2. article-ACC
    1.  
    1. a.
    1. qarrarat+pro_j
    1. [ʔan
    1. taktuba
    1. l-banaat-u_i
    1. maqaal-an].
    1. ‘She decided that the girls would write an article.’
    1.  
    1. b.
    1. qarrarat
    1. l-banaat-u_i
    1. [ʔan
    1. taktuba
    1. l-banaat-u_i
    1. maqaal-an].
    1. ‘The girls decided to write an article.’

Although the aforementioned analysis may be technically feasible, assuming that covert movement of the subject to position SUBJ4 is prevented, it does not seem likely that a language would only exhibit backward control, a construction which is typologically rare, without allowing for the less marked forward control, especially when such a construction is available for raising predicates. Recall that in Tsez, most control predicates appear only in forward control, and backward control is restricted to only two aspectuals. An additional shortcoming of the proposed movement-based analysis is that it does not account for the fact that backward control is restricted to only a subset of the MSA control-like predicates, and, moreover, that those predicates are the ones identified by Landau (2000) as predicates which select for untensed complements. Consequently, in what follows we will explore a different approach which builds on this characterization of BC predicates.

5.5 A restructuring analysis of backward control

5.5.1 Backward control predicates

Our corpus investigations of the backward pattern revealed that only a subset of the predicates which select ʔan clauses can appear in the backward control construction. We refer to this set as BC predicates. In what follows we consider an alternative analysis of backward control that is motivated by the observation that the distinction between BC predicates and non-BC predicates is not arbitrary. Consider the two categories presented in Table 3.

Table 3

Predicates and backward coreference.

| Backward-control predicates | Non-backward-control predicates |
|---|---|
| Volitionals: ʔaraada ‘want’ | qarrara ‘decide’ |
| Implicatives: ʒaruʔa ‘dare’, ħaawala ‘try’, nasiya ‘forget’ | xaʃa ‘fear’, rafadʕa ‘refuse’, tarradada ‘hesitate’ |
| Modals: ʔistatʕaaʕa ‘be able’, tamakkana ‘be able’ | ʔiqtaraħa ‘propose’ |
| Aspectuals: kaada ‘almost’ | |

With the exception of ʔaraada ‘want’, all the verbs which appear in the left column are categorized by Landau (2000) as predicates which select untensed ([–T]) complements, or, in a later formulation (Landau 2015), as non-attitudinal predicates. The right column includes predicates which select tensed [+T] complements, or attitudinal predicates (see Section 3.3).22 This categorization plays an important role in a number of syntactic phenomena across languages.

In languages with non-agreeing (infinitival) complement clauses, Landau (2000) claims, the two predicate categories coincide with the distinction between two types of obligatory control: exhaustive control (EC) and partial control (PC). In a nutshell, EC predicates impose a stricter relation between the controller and the unexpressed subject, while with PC predicates the reference of the unexpressed subject includes the controller, but is not limited to it. For example, the understood plural subject of meet in (57) can be bound by a singular subject when the matrix predicate is the attitude predicate agree, but not when it is the non-attitude condescend.23

(57) a.   James_i agreed [PRO_i+ to meet] thanks to our pressures.
  b. *James_i condescended [PRO_i+ to meet] thanks to our pressures.

Recall that in languages where the complement clause exhibits overt morphological agreement, Landau (2000) predicts that the two categories, [–T] and [+T], would be associated with OC and NC, respectively. This prediction is borne out in the Balkan languages. As Landau (2004) shows, predicates which select C-subjunctives belong to the [–T] category, while those which select F-subjunctives belong to the [+T] category. In MSA, however, this prediction was found not to hold. As was shown in Section 4.1, non-attitudinal ([–T]) predicates are found in no-control constructions. Nevertheless, our corpus investigation revealed that Landau’s classification does capture the distinction between BC predicates and non-BC predicates in MSA (see Table 4 for a summary).

Table 4

Correlation between tense/attitude and control types across constructions.

| Agr | Construction | [–T]/non-attitude | [+T]/attitude |
|---|---|---|---|
| +Agr | Sbj. comp. in MG | C-subjunctives | F-subjunctives |
| +Agr | Sbj. comp. in MSA | backward control | no backward control |
| –Agr | Inf. comp. in English | exhaustive control | partial control |

These correlations are certainly suggestive and most likely play a role in the licensing of backward control in MSA. Nevertheless, as was mentioned in Section 5.4.1, backward control constitutes a real problem for the PRO-based framework, which Landau assumes. Moreover, an attempt to adopt an alternative theory of control, namely the MTC, and to apply W&H’s movement analysis of raising in Standard Arabic to the backward control construction, resulted in a somewhat questionable ad-hoc account.

5.5.2 Restructuring and its challenges

The same predicates which we found to be compatible with backward control belong to a class of verbs identified in many languages as restructuring verbs (Wurmbrand 2001). Broadly speaking, restructuring, which is also referred to as “clause union”, “coherence”, and “complex predication”, describes a situation whereby two (or more) predicates function as a unit with respect to grammatical features such as argument structure, word order, agreement, or case. Consequently, what can be viewed as a subordinate clause does not constitute a boundary for processes which are restricted to apply within a clause. The resulting structure, then, is monoclausal.

A number of properties exhibited by MSA backward control motivate a restructuring analysis. First, as mentioned, the predicates which are licensed in this construction belong to the class of restructuring verbs. Second, under some approaches, restructuring creates a monoclausal structure which has one argument structure, and, more specifically, one subject, which is shared by the two predicates. Under such an approach the partial agreement on the matrix predicate (as well as the embedded predicate) is expected since the two predicates precede their (shared) subject. Third, there are strict adjacency conditions with respect to the linear position of the selecting predicate, ʔan, and the subjunctive, which suggest that these components form a unit. Furthermore, the agreement properties exhibited by the two predicates are identical in this construction. A final motivation for a restructuring analysis is the observation that the embedded clauses of these predicates cannot be temporally modified independently from the matrix clause (see discussion around example (27)). Having only one (semantic) tense associated with a construction is also compatible with restructuring as well as with a monoclausal structure.

Nevertheless, the MSA control construction does not share a number of key properties associated with the restructuring phenomena discussed by Wurmbrand (2001), mostly with regard to Germanic and Romance languages. First, a typical restructuring construction is often characterized as a matrix verb selecting as a complement a less-than-full infinitival clause (usually a VP) lacking a subject. MSA ʔan clauses are headed by subjunctive verbs, which are inflected for subject agreement and mood, and, crucially, the complement in backward control does include a subject. Moreover, different types of phenomena that are often associated with restructuring, such as clitic climbing in Italian and Spanish and long passive/long object movement in German, do not occur in MSA. This leads Habib (2009) to reject a restructuring analysis. In a similar vein, Alexiadou et al. (2010) argue that backward control in Greek and Romanian is not an instance of restructuring by showing that two separate negations as well as independent event modifiers are possible for each predicate.

While the Romance/Germanic restructuring analysis may not be adequate for MSA (and also for Greek and Romanian), other proposals, similar in spirit, exist in the literature, where a different formalization of the main idea of “clause union” or “complex predication” is applied. Herbeck (2014) and Ordóñez (2017) invoke complex predication as an alternative account of what appears to be backward control in Spanish. In their accounts DP subjects which appear to be inside embedded control infinitives actually occupy a matrix [Spec vP] position. The verbal material that precedes the subject is a verbal complex which is formed by (head or remnant) movement to higher projections. Both accounts link the occurrence of this construction with the occurrence of VSO clauses in the language. In addition, as mentioned in Section 5.1, Roussou (2009) proposes a clause-union analysis for OC in Modern Greek. Under her analysis, the complements of OC predicates lack semantic tense, and for this reason clause union is triggered, which in turn requires the variable introduced by the complement to be bound by a matrix argument. One shortcoming of her proposal is that it is not clear how it could handle backward control, where the subject is located in the embedded clause and is not bound by an overt matrix argument.

Grano (2015) takes Roussou’s (2009) restructuring account of OC further and proposes a hybrid raising/restructuring approach. Following Landau’s (2000) observation that the PC/EC distinction is tied to the tensed/untensed distinction, Grano (2015) proposes that PC involves a bi-clausal structure with an embedded (possibly partially) bound PRO subject. EC, on the other hand, has a monoclausal structure; the control predicate realizes a functional head and the complement is a vP projection (rather than a clausal CP complement).24 In lieu of Roussou’s (2009) bound variable solution, Grano proposes that the subject in EC constructions is base-generated in the embedded clause and raises to a matrix subject position.

Essentially, Grano’s (2015) hybrid approach combines the PRO-based analysis of control for PC with a movement analysis of control for EC. However, Grano’s movement analysis is different from the movement analysis that we considered in the previous section in that it builds on the semantic distinction between PC and EC predicates and restricts movement to a restructuring configuration which is licensed only with restructuring verbs. Moreover, Grano argues that the distinct analyses which he proposes for PC and EC are “in harmony” with the control patterns of Greek subjunctives. The same predicates which allow PC in languages such as English are those which select F-subjunctives in Greek. EC predicates, on the other hand, are those which in Greek select C-subjunctives. The association of C-subjunctives with restructuring and movement makes it possible to account for backward control, which is found only with C-subjunctives. Thus, the advantage of the MTC approach with regard to accounting for backward control is exploited, but in a principled fashion.

The similarity between MG and MSA extends beyond the shared typological properties that we first noted in Section 3.2. The bifurcation of the MG predicates into C-subjunctives and F-subjunctives mirrors the classification of BC predicates and non-BC predicates in MSA. Moreover, the constraint against backward coreference with F-subjunctives is also found in MSA when the matrix predicate exhibits FA (see discussion around examples (33) & (32)). This phenomenon lends support to the no-control analysis which is proposed for both languages. In addition, in the backward pattern, control is obligatory in MG with C-subjunctives and is possible in MSA only with BC predicates exhibiting PA. In what follows we build on these similarities and propose an account that is inspired by Grano’s (2015) hybrid approach and his analysis of C-subjunctives in MG, which combines restructuring and subject raising.

6 The analysis

6.1 Overview

Most aspects of the embedded ʔan clause construction are perfectly regular and predictable from the grammar of MSA. Subject–verb agreement between overt matrix subjects and the matrix verb is as expected: FA with pre-verbal subjects and PA with post-verbal subjects. The agreement marking on the embedded verb is subject to purely local constraints: regardless of the reference relationship the embedded subject is engaged in, the verb exhibits PA when the subject is expressed and FA when it is not. Only one component of this construction is puzzling: when the matrix subject is not expressed, the matrix verb does not exhibit invariable full agreement.
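As an informal illustration, the regular agreement generalizations just stated can be written as a small Python function. This is only a sketch: the class name `Phi`, the function name `verb_agreement`, the position labels and the string encoding of agreement (e.g. "3PF") are our own illustrative assumptions, and PA is modeled, as a simplification, as agreement in person and gender with singular number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phi:
    """Agreement features of a subject (hypothetical encoding)."""
    person: int   # 1, 2 or 3
    number: str   # "S" (singular), "D" (dual) or "P" (plural)
    gender: str   # "M" or "F"


def verb_agreement(subject: Phi, subject_position: str) -> str:
    """Return the agreement gloss on a verb, e.g. "3PF" or "3SF"."""
    if subject_position == "postverbal":
        # Partial agreement (PA): person and gender only;
        # number surfaces as singular (simplifying assumption).
        return f"{subject.person}S{subject.gender}"
    if subject_position in ("preverbal", "unexpressed"):
        # Full agreement (FA): person, number and gender.
        return f"{subject.person}{subject.number}{subject.gender}"
    raise ValueError(f"unknown subject position: {subject_position!r}")


girls = Phi(person=3, number="P", gender="F")
print(verb_agreement(girls, "preverbal"))    # FA with a pre-verbal subject
print(verb_agreement(girls, "postverbal"))   # PA with a post-verbal subject
print(verb_agreement(girls, "unexpressed"))  # FA when the subject is not expressed
```

The one case this function does not cover is precisely the puzzling one: a matrix verb with no expressed subject that nevertheless shows PA.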

The challenge, then, is to explain the puzzle posed by the backward control construction illustrated in (32) above. More concretely, the questions that we must answer are (i) why is the configuration in (58a) ungrammatical with control predicates and (ii) what licenses the configuration in (58b).

(58) a. *V_FA,i [AN V_PA NP_i[NOM]]
     b.  V_PA,i [AN V_PA NP_i[NOM]]

Following the insights of traditional Arab grammarians, we assume that fully inflected verbs in MSA are verbs whose subject requirement is fulfilled by an incorporated pronominal, or, in other words, whose subject is pro-dropped. Consequently, a plural or dual verb necessarily has a pro-dropped subject. Singular verbs are ambiguous between fully inflected forms with singular incorporated pronominal subjects, and partially inflected forms, which are the ones that appear preceding a lexical subject in an unmarked simple clause. Under the assumption that fully inflected forms indicate a pro-dropped subject, the ungrammaticality of (58a) can be explained by appealing to Principle C; the embedded subject, as a referring expression, cannot be bound by the phonologically empty pro. A similar account is given by Alexiadou et al. (2010) for the parallel Greek construction.25

As for the configuration in (58b), we propose that the partial agreement on the matrix verb in the backward pattern indicates that the subject of this verb is perceived to be the embedded subject, which follows it. This entails that the backward control construction involves a single subject which is shared by the two predicates and consequently realized only once in the embedded clause. We propose that this is achieved by restructuring: the matrix predicate and the embedded predicate form a complex predicate which “inherits” the argument structure of the embedded predicate.
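The reasoning behind (58a) and (58b) can be summarized in a short Python sketch. It is an informal restatement under stated assumptions, not a formal implementation: the function name `backward_readings` and the three-character agreement strings (person, number, gender) are hypothetical, and the sketch assumes that the pro-drop (disjoint) parse is always available while the PA parse requires a BC predicate.

```python
def backward_readings(matrix_agr: str, embedded_subject: str,
                      is_bc_predicate: bool) -> set:
    """Readings of V [AN V NP] when the matrix subject is not expressed.

    Both arguments are glosses such as "3SF" (person, number, gender).
    """
    readings = set()

    # Parse 1: the matrix verb is fully inflected, i.e. it hosts pro.
    # Principle C bars pro from binding the embedded referring expression,
    # so this parse yields only disjoint reference.
    readings.add("disjoint")

    # Parse 2: the matrix verb shows partial agreement with the embedded
    # subject (singular number, matching person and gender). We assume
    # this parse is available only when a BC predicate restructures.
    m_person, m_number, m_gender = matrix_agr
    s_person, _, s_gender = embedded_subject
    partial = m_number == "S" and (m_person, m_gender) == (s_person, s_gender)
    if is_bc_predicate and partial:
        readings.add("coreferent")

    return readings


# (58a): plural matrix verb, hence necessarily pro: disjoint reference only.
print(backward_readings("3PF", "3PF", is_bc_predicate=True))
# (58b): singular matrix verb with a BC predicate: both readings.
print(backward_readings("3SF", "3PF", is_bc_predicate=True))
```

With a non-BC predicate the second call would return only the disjoint reading, since the restructuring parse is then unavailable.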

6.2 A possible formalization

Habib (2009), in her study of ʔan complement clauses in MSA, rejects restructuring as a possible analysis on the grounds that they do not exhibit long object movement and the embedded verb can assign accusative case. Adopting Wurmbrand’s (2001) typology, Habib proposes that ʔan complement clauses are reduced non-restructuring clauses (i.e., TPs, or “something between VP and CP”) as opposed to ʔanna complement clauses, which she argues are full non-restructuring clauses (CPs). There are, however, more inclusive conceptualizations of restructuring (e.g., Roussou 2009; Grano 2015) according to which a TP or a vP can restructure. Moreover, in a more recent paper Wurmbrand (2015) proposes that restructuring complements are larger than VP and include (at least) a voice projection. We will adopt these views and show how Habib’s proposal can be adapted to account for our findings regarding backward control.

Our point of departure is Habib’s (2009) NC analysis. As an example, consider the derivation of (59a) schematized in (59b) (Habib’s (1)). ʔan under her analysis is not a complementizer (like ʔanna) but rather a functional element which selects for a verb in the subjunctive mood. It resides in the head T position of a TP. The embedded subjunctive verb raises to v and incorporates with it, and the whole complex raises and incorporates with ʔan in T.

    (59) a. yuriidu        [ʔan  yaʔkula      raami  t-tufaħat-a].
            wants.3SM.IND  AN    eat.3SM.SBJ  Rami   the-apple-ACC
            ‘He wants Rami to eat the apple.’
         b. [VP [V’ yuriidu [TP [T ʔan+yaʔkula_i [vP raami [V’ t_i [VP [V’ t_i [DP t-tufaħat-a ]]]]]]]]]

Let us continue by considering the minimal pair in (32), which illustrates the FA and PA variants of the backward pattern. First we focus on the FA version, repeated here as (60), whose only possible interpretation is the disjoint one.

    (60) ħaawalna   ʔan  taktuba        l-banaat-u     maqaal-an.
         tried.3PF  AN   write.3SF.SBJ  the-girls-NOM  article-ACC
         a. ħaawalna+pro_j  [ʔan  taktuba        l-banaat-u_i   maqaal-an].
            tried.3PF       AN    write.3SF.SBJ  the-girls-NOM  article-ACC
            ‘They_j tried that the girls_i would write an article.’
         b. *ħaawalna+pro_i  [ʔan  taktuba        l-banaat-u_i   maqaal-an].
             tried.3PF       AN    write.3SF.SBJ  the-girls-NOM  article-ACC
            Intended: ‘The girls tried to write an article.’

The derivation in (59b) sketches the analysis at the matrix VP level, where the matrix verb occupies head position. In the syntactic tree presented in (61) we extend the derivation above this level to a full (TP) clause and illustrate how it applies to the disjoint reading in (60a). To derive the verb-initial order of the matrix clause the verb raises from matrix V to T, and the pro subject occupies the [Spec TP] position (see Aoun et al. 2010, among others). With regard to reference, the referent of the matrix pro subject is necessarily distinct from that of the embedded subject l-banaat ‘the girls’ due to Principle C. Note that the raising counterpart of (60) is grammatical since [Spec TP] in this construction is occupied by a covert copy of the overt embedded subject, and not a binding pro.

    (61) [Syntactic tree: derivation of the disjoint (no-control) reading in (60a)]

We now turn to the second backward pattern, namely the one in which the matrix predicate exhibits PA with the embedded subject. The PA version of (32) is repeated here as (62).

    (62) ħaawalat   ʔan  taktuba        l-banaat-u     maqaal-an.
         tried.3SF  AN   write.3SF.SBJ  the-girls-NOM  article-ACC
         a. ħaawalat+pro_j  [ʔan  taktuba        l-banaat-u_i   maqaal-an].
            tried.3SF       AN    write.3SF.SBJ  the-girls-NOM  article-ACC
            ‘She_j tried that the girls_i would write an article.’    (no control)
         b. {ħaawalat_i  ʔan  taktuba_i}     l-banaat-u_i   maqaal-an.
            tried.3SF    AN   write.3SF.SBJ  the-girls-NOM  article-ACC
            ‘The girls tried to write an article.’                    (restructuring)

The disjoint reading in (62a) is associated with a 3SF pro-dropped matrix subject, which cannot be coindexed with the plural feminine embedded subject due to number mismatch. This interpretation is licensed by a structure that is similar to the one shown in (61), the only difference being that in the case of (62a) the matrix predicate and its associated pro are singular, rather than plural.

The analysis diverges from the one proposed by Habib (2009) when the coreference reading in (62b) is considered. This backward control configuration, we suggest, is licensed by restructuring. The matrix predicate and the embedded predicate form a complex predicate (indicated in (62b) by curly brackets) which renders the structure monoclausal, with the embedded subject acting as the sole subject of the complex predicate. We propose that this option is restricted to BC predicates, which, as mentioned, belong to the class of restructuring verbs in various languages. The property which sets them apart from other control-like predicates is that they can “attract” the verbal complex in the embedded T, thus forming a verbal complex which “inherits” the argument structure from the embedded predicate.

One way to formalize this derivation in Habib’s (2009) system is illustrated by the syntactic tree in (63). The first step in the derivation is identical to the one illustrated in (61): the embedded subjunctive moves from V to incorporate with v, and then to T, where it incorporates with ʔan. The next step is optional and restricted to BC predicates. The ʔan+subjunctive cluster moves further to the matrix V position, where it incorporates with the BC predicate to form a complex predicate. At this point the complex predicate is a single syntactic unit, made up of a finite verb and a subjunctive verb, both marked with 3SF agreement, and the functional element ʔan.

    (63) [Syntactic tree: formation of the complex predicate by restructuring]

With restructuring in place, the derivation continues (see the tree in (64)). The complex predicate raises to the matrix T position, similarly to the V-to-T movement assumed in the NC construction. Then, in the spirit of Grano’s (2015) raising/restructuring approach, the embedded subject raises to the matrix [Spec vP] position, where it is assigned an external theta role and where it is spelled out. The full derivation of the backward control structure in (62b) is presented in (64).

    (64) [Syntactic tree: full derivation of the backward control structure in (62b)]
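The sequence of movement steps described for (62b) can be traced with a small Python sketch. It is a deliberately crude model under our own assumptions: structural positions are reduced to a flat left-to-right list, the position labels and the helper names `move` and `linearize` are hypothetical, and incorporation is modeled simply as appending the moved material to the right of its host.

```python
# Structural positions in linear (left-to-right) order; labels are ours.
POSITIONS = [
    "matrix T", "matrix Spec-vP", "matrix V",
    "embedded T", "embedded Spec-vP", "embedded v", "embedded V", "object",
]


def move(tree: dict, source: str, target: str) -> None:
    """Move everything in `source` to `target`; the host's material comes first."""
    tree[target] = tree[target] + tree[source]
    tree[source] = []


def linearize(tree: dict) -> str:
    """Read off the surface string from the filled positions."""
    return " ".join(word for pos in POSITIONS for word in tree[pos])


# Starting point for (62b), before any movement.
tree = {pos: [] for pos in POSITIONS}
tree["matrix V"] = ["ħaawalat"]
tree["embedded T"] = ["ʔan"]
tree["embedded Spec-vP"] = ["l-banaat-u"]
tree["embedded V"] = ["taktuba"]
tree["object"] = ["maqaal-an"]

move(tree, "embedded V", "embedded v")            # subjunctive moves from V to v
move(tree, "embedded v", "embedded T")            # and on to T, incorporating with ʔan
move(tree, "embedded T", "matrix V")              # restructuring (BC predicates only)
move(tree, "matrix V", "matrix T")                # complex predicate raises to matrix T
move(tree, "embedded Spec-vP", "matrix Spec-vP")  # embedded subject raises

print(tree["matrix T"])  # the complex predicate as a single unit
print(linearize(tree))   # the surface order of (62)
```

Omitting the third `move` call leaves the ʔan+subjunctive cluster in the embedded T position, which corresponds to the no-control derivation.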

The question that we need to consider now is how to analyze the forward coreference pattern, where we found no evidence of syntactic reflexes of the OC/NC distinction. Recall that in Section 5.1 three alternative answers were discussed: (i) no control, (ii) obviation, and (iii) ambiguity (control/no control).

Let us begin with obviation. Terzi (1992) proposed for MG that the NC analysis is subject to obviation, and therefore only disjoint reference is allowed. A coreference interpretation, where the matrix subject and the understood embedded subject share reference, is licensed only by a control structure. Landau (2004) argues against this proposal by presenting syntactic diagnostics that suggest that the embedded subject can be a pro, thus ruling out a control-only analysis of coreference.

Evidence against an obviation analysis for MSA is found in corpus examples such as the one in (65), where an embedded pronominal subject huwa ‘he’ shares its reference with a lexical matrix subject l-ʔadiib ‘the writer’.26

    (65) ʔinna  l-ʔadiib-a         laa  yastatʕiiʕu  [ʔan  yuqarrira       huwa
         that   the-writer.SM-ACC  not  be.able.3SM  AN    decide.3SM.SBJ  he.NOM
         bi-nafsihi  ʔanna  n-nasʕsʕ-a    sa-yakuuna   fii-hi].
         by-himself  that   the-text-ACC  will-be.3SM  in-it
         ‘The writer cannot decide by himself that the text would be in it.’

Although the selecting predicate ʔistatʕaaʕa ‘be able’ is a BC predicate, this sentence cannot be licensed by our proposed restructuring construction since it involves two overt subjects. Thus, the obviation option must be rejected and, accordingly, the no-control analysis must apply to the coreference pattern.

With obviation ruled out, the two remaining options are either to restrict restructuring to backward control, or to assume that the forward coreference pattern is syntactically ambiguous between no control and restructuring. In what follows we tentatively assume the latter, and show how a raising/restructuring analysis of the subject-initial forward pattern emerges naturally from our proposed analysis of backward control.27

Restructuring in effect takes a number of predicates and forms one syntactic unit which can function similarly to a simple V in a VSO clause. This is the case in (64), where instead of a simple V, a complex {VV} occupies the T position, while its subject is found in its post-verbal subject position. If so, we can assume that complex predicates can also appear in SVO clauses. There is, however, no consensus regarding the analysis of SVO clauses (see Aoun et al. 2010 for discussion). We will assume the analysis mentioned by Habib (2009), according to which SVO clauses are derived by the subject raising from the matrix [Spec vP] to [Spec TP] and triggering full agreement on the predicate in a Spec-Head relation.28 Consequently, we suggest that a complex predicate, undergoing a similar derivation, would surface with FA marking on both of its verbal components.

To illustrate this proposal let us return to our example sentence in its forward pattern variant. If we assume that restructuring is found in both backward and forward control, the coreference interpretation of (66) is syntactically ambiguous. It is licensed by the no-control structure (66a), which is available for all ʔan-clause selecting predicates, and by restructuring (66b), which is licensed only with BC predicates.

    (66) ʔal-banaat-u   ħaawalna   ʔan  yaktubna       maqaal-an.
         the-girls-NOM  tried.3PF  AN   write.3PF.SBJ  article-ACC
         ‘The girls tried to write an article.’
         a. ʔal-banaat-u ħaawalna [ʔan yaktubna+pro maqaal-an]    (no control)
         b. ʔal-banaat-u {ħaawalna ʔan yaktubna} maqaal-an        (restructuring)

This is not the case with the alternative forward pattern illustrated by (67a), where the matrix clause is verb-initial. The two predicates have mismatched agreement and the subject intervenes between them, thus indicating that no restructuring took place. Consequently, this pattern can only be licensed by the NC structure (67b).

    (67) a. ħaawalat   l-banaat-u     [ʔan  yaktubna       maqaal-an].
            tried.3SF  the-girls-NOM  AN    write.3PF.SBJ  article-ACC
            ‘The girls tried to write an article.’
            ‘The girls tried that they would write an article.’
         b. ħaawalat l-banaat-u_i [ʔan yaktubna+pro_i/j maqaal-an]    (no control)

To summarize, the starting point of this proposed formalization is Habib’s (2009) no-control analysis of MSA ʔan clauses, including her assumptions regarding the derivation of VSO and SVO clauses. The one phenomenon which her analysis is missing, namely backward control, is accounted for by the introduction of an additional mechanism, which is available only to a particular subset of predicates, namely BC predicates. BC predicates can “attract” the heads of their selected complement clauses to raise and incorporate with them to form a complex predicate. This complex predicate, then, functions similarly to a simple predicate; in VSO clauses it exhibits partial agreement with its subject, while in SVO clauses there is full subject–verb agreement.
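The forward-pattern part of this summary can be restated as a small lookup, given here as a Python sketch under the tentative assumption adopted above, namely that restructuring is also available in the subject-initial forward pattern. The function name `forward_coreference_structures` and the order labels "SVO" and "VSO" are our own illustrative choices.

```python
def forward_coreference_structures(word_order: str, is_bc_predicate: bool) -> list:
    """Structures that license the coreference reading in the forward pattern."""
    if word_order not in ("SVO", "VSO"):
        raise ValueError(f"unknown word order: {word_order!r}")

    # The no-control structure is available to every predicate
    # that selects an ʔan clause, in either matrix word order.
    structures = ["no control"]

    # Restructuring requires a BC predicate and a subject that does not
    # intervene between the two verbs, i.e. the subject-initial order of (66).
    if word_order == "SVO" and is_bc_predicate:
        structures.append("restructuring")

    return structures


print(forward_coreference_structures("SVO", is_bc_predicate=True))   # as in (66)
print(forward_coreference_structures("VSO", is_bc_predicate=True))   # as in (67)
print(forward_coreference_structures("SVO", is_bc_predicate=False))
```

Only the first call returns two structures, which is the syntactic ambiguity attributed to (66).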

6.3 Open issues

The proposed analysis accounts for all the different reference and agreement patterns exhibited by MSA ʔan clauses. Nevertheless, it introduces a number of open issues which need to be addressed. First, as discussed in Section 5.5.2, although MSA does not exhibit the prototypical properties of (Romance/Germanic) restructuring, such an analysis is motivated by a number of properties which do suggest restructuring. More empirical evidence regarding the monoclausal structure of backward control (and possibly forward control) would further support (or weaken) our restructuring proposal. Grano (2015), for example, employs diagnostics involving inverse scope, the licensing of negative polarity items and the interpretation of antecedent-contained deletion (ACD) to distinguish between restructuring and non-restructuring constructions. Similarly, Polinsky & Potsdam (2002) provide evidence for covert subject raising in Tsez by examining whether constructions which require c-commanding antecedents are licensed when the subject is spelled out in the embedded clause. Such diagnostics, when applicable to MSA, can be used to test our hypothesis regarding backward control and also answer the question regarding the optional occurrence of restructuring in the forward pattern.

Our analysis made use of incorporation as a mechanism for deriving restructuring. Nevertheless, in our proposed formalization we appealed to the notion of incorporation at the conceptual level, without fleshing out the mechanism behind the process. A similar approach to restructuring, yet fully couched in a theoretical framework, is proposed independently by Wurmbrand (2015). This approach, Wurmbrand argues, combines the insights of the complex head approaches, to which our proposal belongs, with the advantages of the competing VP-complementation approaches. More specifically, in this approach, which extends to a variety of typologically distinct languages, restructuring is formalized as the incorporation of the embedded v into the matrix V, with the embedded V remaining in the complement clause. Although the two approaches are similar in spirit, our analysis is not straightforwardly adaptable to Wurmbrand’s (2015) system. The two predicates in MSA do seem to form one inseparable syntactic unit (or complex head). Corpus searches did not reveal instances of the backward coreference pattern with material intervening between the matrix predicate and ʔan. Consequently, we proposed that their functioning as one unit accounts for the agreement patterns which they exhibit. This is not immediately transferable to Wurmbrand’s (2015) system, where incorporation is of a more abstract nature.

Finally, the proposed analysis focused only on the reference relationships between the matrix subject and the embedded subject. However, as was mentioned regarding (6b) and illustrated in (23) and (24), control and no control are also found between embedded subjects and matrix objects, albeit only with a forward pattern, where the missing argument is in the embedded clause. Consequently, the same question can be asked in this context: is there a syntactic distinction between the two interpretations? The type of argumentation that was used here to support the proposed analysis does not apply in this case, since the matrix object does not trigger agreement on the matrix predicate. We have not found other evidence for a syntactic distinction. These issues remain open for future work.

7 Conclusion

So, is control a part of the grammar of Modern Standard Arabic? The search for predicates which enforce coreference between the subject of an embedded subjunctive clause and a matrix argument was unsuccessful. A corpus investigation of likely candidates retrieved instances of disjoint reference for all candidates (with one exception, but see discussion in Section 4.1). These findings contradict generalizations and predictions regarding the correlation between semantic tense, agreement and control. In a sufficiently large corpus, even modals and aspectual predicates, which were predicted to exhibit control behavior, were found to allow for free reference (or no control).

Nevertheless, although no obligatory control predicates were found in MSA, the backward pattern, where the single expressed subject occurs in the embedded clause, revealed morphosyntactic reflexes of the control vs. no control distinction. Furthermore, coreference between the expressed embedded subject and the unexpressed matrix subject was found to be restricted to a set of predicates which we referred to here as backward control (BC) predicates. Thus, we concluded that a single no-control structure can capture all the patterns and interpretations attested except one: backward control.

The phenomenon of backward control plays an important role in the debate regarding the analysis of control, which, broadly speaking, centers around two opposing approaches: the PRO-based approach and the Movement Theory of Control (MTC). While backward control is especially challenging for the PRO-based approach, the MTC provides a straightforward way to account for it. Assuming the MTC, according to which control and raising are derived in similar fashions, we first attempted to adapt Wurmbrand & Haddad’s (2016) movement analysis of forward and backward raising in MSA to backward control. The difference between backward control and backward raising with regard to the agreement exhibited by the matrix predicate rendered a unified analysis of both constructions somewhat questionable and ad hoc.

A different approach emanated from the similarity between BC predicates in MSA, obligatory-control predicates in Modern Greek, and restructuring verbs in Romance and Germanic languages. More specifically, we proposed that BC predicates in MSA can optionally restructure with the embedded subjunctive and form a complex predicate which denotes a single event and has one argument structure. With one argument structure the single subject is construed as the subject of both predicates, thus giving rise to the control (or coreference) interpretation and accounting for the agreement marking on the matrix predicate. We sketched a formalization of this analysis, avoiding as much as possible theory-specific notions and details. Our analysis, therefore, sheds new light not only on the specific MSA constructions we focused on, but on more fundamental questions of control, raising, and restructuring in natural languages.