Intervention effects in clefts: a study in quantitative computational syntax

Clefts are understood as biclausal structures involving the movement of a clefted constituent from a lower clause, where it is generated, to a higher clause, where it is interpreted. Though both are grammatical, subject and object clefts differ in acceptability in experimental settings. This degradation is ascribed to the fact that the object needs to cross an intervening subject, thus triggering intervention effects. In this paper, we show that intervention effects are also present in grammatical configurations, and give rise to lower-than-expected frequencies. Based on sets of features that play a role in the syntactic computation of locality, we compare the theoretically expected and the actually observed counts of features in a corpus of thirteen syntactically annotated treebanks for three languages (English, French, Italian). We find the quantitative effects predicted by the theory of intervention locality: object clefts are less frequent than expected in intervention configurations, while subject clefts are roughly as frequent as expected. We also find that the size of the effect is proportional to the number of features that give rise to the intervention effect. These results provide a three-fold contribution. First, they extend the empirical evidence in favour of the feature-based intervention theory of locality. Second, they provide theory-driven quantitative evidence, thus extending in a novel way the sources of evidence used to adjudicate theories. Finally, the paper provides a blueprint for future theory-driven quantitative investigations.


Introduction
Studies in comparative syntax have shown that natural languages vary in the strategies adopted in conveying discourse properties (Rizzi 1997; Endo 2007; Abels 2012; Shlonsky 2015).
One such strategy is the displacement of constituents bearing relevant informational properties (since Chomsky 1977), such as interrogative or focalized elements. This type of displacement is widely adopted among natural languages, and well documented in Romance and Germanic (Ross 1967; Cinque 1990; Cecchetto 2000; Ott 2014). 1 We refer to displacement of syntactic elements as a phenomenon involving a dependency relation whereby a constituent is interpreted simultaneously in two different positions (Belletti 2018). Let us consider the Italian declarative in example (1), in which lo studente 'the student' and il libro 'the book' are generated respectively to the left and to the right of a lexical verb and interpreted as the subject and the object of the verb leggere 'to read', representative of the canonical SVO (subject, verb, object) word order of Italian (Dryer & Haspelmath 2013).
(1) Lo studente ha letto il libro.
the student.m.sg has read the book.3.m.sg
'The student has read the book'
If one constituent must be displaced for discourse reasons, the structure and/or the word order may change. In Italian, for example, relative clauses (Alexiadou et al. 2000; Andrews 2007) show the relevant relativized element (the subject in (2)a; the object in (2)b) located at the beginning of the syntactic structure, followed by a functional element, a complementizer, che 'that'.
(2) Relative Clauses
a. Lo studente che < lo studente > ha letto il libro.
the student.m.sg that has read the book.m.sg
'The student that read the book'
b. Il libro che lo studente ha letto < il libro >.
the book.m.sg that the student.m.sg has read.
'The book that the student has read'
The relativized element moves from its locus of generation, marked in angle brackets in the examples in (2), to a landing site higher than the complementizer che 'that'.
The displaced elements in (2) are now interpreted in two positions: the position in which the constituents are phonetically overt and the "copies" of their argumental structure. The dependency created by the source position and the landing site has been the object of study in a wealth of literature on parsing (Rizzi 1990; Gibson 1998; Gibson & Warren 2004; Lewis & Vasishth 2005; Friedmann et al. 2009; Villata et al. 2016 inter alia).
1 In many natural languages, the displacement co-occurs with a specialized overt particle/marker, for example in Gungbe (Aboh 2004), Abidji (Hager-M'Boua 2014), Japanese (Saito 2012). Theoretical accounts of the same nature have also been provided for in-situ components, such as interrogative elements in Chinese (Huang 1982; Cheng & Rooryck 2000), where the marked syntactic components do not seem to "superficially" move (Bonan 2017).
The current study addresses a particular type of syntactic reordering discussed in the literature under the label of clefts (Jespersen 1937;Lambrecht 2001;Haegeman et al. 2015;Belletti 2015).
Similar to relative clauses, clefts involve the movement of a constituent to the very beginning of the structure, preceding the complementizer che 'that'. What distinguishes clefts from relative clauses is the presence of a copula in front of the displaced constituent and the semantics conveyed by the dislocated element, which expresses, among other properties, focalisation, a further dimension of the semantics and discourse of a syntactic constituent (Kiss 1992; Rizzi 1997; Beck 2006; Bianchi et al. 2015). Another difference between clefts and relative clauses is that clefts are propositions and relative clauses are not. Examples from Italian are provided in (3).
(3) Cleft clauses
a. È lo studente che < lo studente > ha letto il libro.
Is the student.m.sg that has read the book.m.sg
'It is the student that has read the book'
b. È il libro che lo studente ha letto < il libro >.
Is the book.m.sg that the student.m.sg has read.
'It is the book that the student has read'
Cleft structures are single propositions with biclausal syntax (that is, they exhibit two inflected verbs), containing a syntactic gap linked to the clefted element in a long-distance dependency. 2 In (3), the clefted element has been displaced to a higher clause, preceded by a copula element (and its impersonal subject when required, e.g. French, English) and followed by a complementizer.
Different types of syntactic constituents can be clefted in these structures, such as subjects (e.g. lo studente 'the student' as in (3)a), objects (e.g. il libro 'the book' as in (3)b), and non-core elements/adjuncts (e.g. temporal and locative items such as in biblioteca 'in the library' or l'anno scorso 'last year'). We will refer to them as subject, object and adjunct/adverbial clefts throughout this paper. From a syntactic point of view, clefts, together with relatives, belong to the set of structures involving long-distance dependency relations. The cross-linguistic and intralinguistic availability and acceptability of these structures depend on many factors (Rizzi 1990, 2013; Gibson 1998). We study here the predictions of an intervention theory of locality (Rizzi 2004; Friedmann et al. 2009).
In a nutshell, the intervention theory of locality says that a long-distance dependency between two elements in a sentence is difficult, and sometimes impossible, if a similar element structurally intervenes between the base-generation position and the landing site. For example, in object-oriented relative/cleft clauses, as in (2)b and (3)b, the relativized/clefted DP object il libro 'the book' crosses a similar element (an intervener) in syntactic terms, namely the DP subject lo studente 'the student' generated higher, while in subject relative/cleft clauses, as in (2)a and (3)a, the relativized/clefted subject does not cross any relevant intervener.
According to the intervention theory of locality, the crucial property is not the amount of material that can be considered as intervening, but rather its quality. Ungrammatical structures or slower parsing effects arise if the ultimate landing site and the intervener share features relevant for locality (see Rizzi 1990 for a first theoretical account). Intervention does not block the movement in adult grammar in cleft/relative structures, but these long-distance relations are difficult for specific populations of speakers, as observed in early grammars and language pathology (Rizzi 1990; Grillo 2008; Friedmann et al. 2009; Martini et al. 2020). Even in adult grammar, despite the grammaticality of these structures, previous studies show that quantitative differences can be observed between structures with and without intervention (Samo & Merlo 2019).
In this work, we aim to add a quantitative dimension to the established qualitative descriptions of intervention effects in cleft structures, adopting the tools of Quantitative Computational Syntax (Merlo 2016, Gulordava & Merlo 2020). Using large-scale resources and simple computational models, we verify the quantitative predictions of linguistic proposals. This methodology assumes that underlying grammatical properties surface quantitatively, once independent influences of use are properly factored out. The guiding general hypothesis is that the specific properties triggering ungrammaticality in a given environment will still be disfavoured when a structure is grammatical in other environments (Bresnan et al. 2001).
We investigate three languages in which cleft structures naturally occur with different productivity (French, Italian and English) in treebanks syntactically annotated under the guidelines of Universal Dependencies (Zeman et al. 2020). We observe variation in the quantitative properties of clefts. This result provides further insights into the nature of grammatical structures extracted from corpora, and the principles that govern preferences and dispreferences for specific subtypes of reorderings of constituents.
Our methodology requires comparing the distribution of the features under investigation in those contexts where no intervention effects occur to the distributions of the same features in those contexts where intervention effects occur. Plausible no intervention contexts are "canonical" orders, as will be discussed in detail in sections 3 and 4. In these sections, we present the formal steps to quantify intervention effects and elaborate predictive counts. Section 5 will present the evidence drawn from corpora to be compared with the expected results. Section 6 shall present a further puzzle given by non-argumental clefts. Finally, sections 7 and 8 discuss and conclude.

Cleft structures and asymmetries between subject and non-subject
A wealth of literature has investigated the nature of cleft structures and many descriptions, typologies and interpretations of their properties have been proposed (Prince 1978; Meinunger 1998; Roggia 2008; Dufter 2009; Reeve 2012; Frascarelli & Ramaglia 2013; Haegeman et al. 2015; Belletti 2015; Karssenberg & Lahousse 2018; De Cesare & Garassino 2018; Chesi & Canal 2019). In this work, we adopt the label 'clefts' to investigate the it-clefts shown in example (3). A large body of literature assumes that the clefted constituent in cleft structures has been extracted from a lower clause (Kiss 1998), via A'-movement (but see Doetjes et al. 2004 for discussion), and may encode different properties of information structure such as a typology of foci (Belletti 2015) or (given) topics (Doetjes et al. 2004; Karssenberg 2017; Karssenberg & Lahousse 2018). Cross-linguistic studies have shown that natural languages allowing clefts vary in the frequency of usage of these structures (Roggia 2008; Dufter 2009). For example, clefts represent a preferred answer strategy in French for questions on a specific constituent (Belletti 2010). Clefts also represent interesting loci of micro-variation: it has been shown that Swiss German speakers produce a higher frequency of cleft structures than German speakers from Germany, in specific registers (Stark 2014).
Cleft sentences exhibit a well-documented asymmetry between subject and non-subject clefts.
Experimental studies have shown that there is a tendency for subject clefts to be easier to parse than object clefts, and many grammatical accounts have proposed that these two types of clefts have slightly different syntactic and discourse properties (Bever 1974; Dick et al. 2004; Lobo et al. 2019; Chesi & Canal 2019 and references therein).
We discuss a simplified theory, compatible with several analyses, where no distinctions are made between focus and topic positions (see, however, Belletti 2015 for a detailed cartographic derivation of cleft structures). This necessary simplification is due to the nature of our data, as we cannot account for any discourse features (i.e. topics, foci) of the naturally occurring examples extracted from syntactically annotated treebanks. We quantify asymmetries in grammatical clauses between subject-oriented clefts and object-oriented clefts (Friedmann et al. 2009;Skopeteas & Fanselow 2009;Dick et al. 2004;Aravind et al. 2018;Lobo et al. 2019). The simplified derivations consider only the movement of the argument to a higher unlabelled position, as given in (4)-(5).
The crucial difference between the two derivations is clear: in (4), no intervention is at play, while in (5), the object has to cross the subject to reach the relevant position.
In configurations of the type in (5), objects cross subjects in grammatical sentences on their way to a dedicated functional projection. The grammaticality in A'-movement is given by the fact that the moved element and the intervener are not in an identity relation (the features of the moved element and those of the intervener do not fully match), since the displaced constituent bears discourse features (e.g., +Topic, +Focus, etc.). 3 The intervention-based account we investigate here is centred on representational theories of locality (Friedmann et al. 2009; Belletti et al. 2012; Belletti 2015; Martini et al. 2020), in which only morpho-syntactic features (such as number or person, as will be discussed in detail in section 3 and subsection 4.1) play a role.
The main reason for this choice is the nature of our data. Its syntactic annotation, based on Universal Dependencies (Zeman et al. 2020), limits our investigation to a grammatical theory.
We defer the direct investigation of those other theories, either memory-based or similarity-based, that account for processing effects through a set of features (e.g., animacy, definiteness, etc.) that in our annotations would be unrestricted or unattainable (Gibson 1998; Gordon et al. 2001; Gibson & Warren 2004; Warren & Gibson 2002; Lewis & Vasishth 2005).
Memory-based accounts explain the asymmetry between subjects and objects in clefts or relative clauses as deriving from a different amount of material stored in memory (Gibson 1998;Gordon et al. 2001;Lewis & Vasishth 2005) or from the joint costs of integrating them within the parsing structure being built (Gibson 1998;Gibson & Warren 2004;Warren & Gibson 2002). The structural integration costs are based on an accessibility scale of referentiality (Ariel 1990).
Similarity-based processing accounts argue that a limitation on memory is due to similarity-based interference (Gordon et al. 2001). This account only partially overlaps with syntactic locality (Rizzi 1990; Friedmann et al. 2009). Similarity-based processing approaches are defined on (morpho-)syntactic features (type, person, number, gender, case), but, unlike grammatical accounts, also on extra-syntactic features, for example animacy, and assume that all features contribute equally to memory interference. We cannot retrieve the distribution of features such as animacy or referentiality from our data in a simple and reliable way.
Our goal in this paper is to detect whether we can also observe asymmetries in distributions, and in the distribution of features, between (4) and (5) in grammatical clauses extracted from corpora.
3 The interaction of locality and functional projections has been central in the study of the comparative dimension of cartographic maps in both theoretical (Haegeman 2012; Abels 2012; Rizzi 2013) and developmental perspective (Friedmann et al. 2020; Moscati & Rizzi 2021). We here adopt standard considerations on intervention effects in the spirit of experimental syntax (Friedmann et al. 2009; Belletti et al. 2012; Martini et al. 2020), without assuming any further properties on the status of the landing site.

Quantifying intervention effects in grammatical clefts
In this work, we assume that underlying grammatical properties surface quantitatively (Merlo 2016, Gulordava & Merlo 2020). Our methodology requires comparing the distribution of the features under investigation in those contexts where no intervention effects occur to the distributions of the same features in those contexts where intervention effects occur. This quantitative prediction will be made more specific later in the section. It is made explicit here to allow us to illustrate which objects of our counts we need to define precisely. In this work, the core notions are the concept of similarity, central to the notion of intervention, the representations of the elements (subjects and objects) whose similarity needs to be calculated, and the linking hypothesis that connects these definitions to the counts. We provide here some definitions that will be needed to formulate our quantitative expectations.
• Features The head of the cleft and the intervener are represented as vectors of movement-relevant features. Features are (type:value) pairs, such as (gender: feminine).
• Similarity The head of the cleft and the intervener are similar if their features match.
• Feature match A feature match, match_f(C, I), is true iff, for a given feature f, the head of the cleft C and the intervener I are both instantiated and have the same value. If either the head of the cleft C or the intervener I, or both, are not instantiated for f, then match_f(C, I) is false. 4
• Linking hypothesis A stronger intervener creates greater unacceptability and hence surfaces less often in a corpus in a match configuration. An intervener's strength depends on the number of matching features.
In this work, we make use of observational data provided by corpora, and operate on counts.
We refer here to the notion of observed counts and expected counts.
Observed counts Observed counts are the counts in the corpus.
Expected counts Expected counts are the counts of the features that we expect based on their distribution in settings where intervention is not at play and, therefore, they do not interact with each other. In other words, the expected counts are the counts we would expect given the prior joint probability of the features, the probability to co-occur without intervention.
We build our predictions on the distribution of a series of morpho-syntactic features in structures where no intervention has taken place, namely declarative ("canonical") orders, such as the Italian SVO sentence given in (1). Specifically, an object-oriented cleft clause brings into play the object of the verb and its features, the noun phrase that is being clefted, and the subject of the sentence and its features, the intervener. Precisely, let $f_{so}^{C}$ be the counts of a feature value in a subject and object pair in a sample of size $S$. Let $T$ be the total number of observed counts.
Then, the expected counts of the subject and object pair feature value occurring in a sentence are calculated as

$$\text{Expected}(f_{so}) = \frac{f_{so}^{C}}{S} \times T$$

Following Merlo (2016) and related works, we use corpus counts in the spirit of the computational quantitative syntax framework: differentials in observed and expected counts are the expression of underlying grammatical properties. In this respect, our quantitative hypotheses below are to be contrasted with an H 0 hypothesis that would predict that grammatical properties are uncorrelated with observed counts in a corpus, because corpus counts are effects of usage, while grammar makes no predictions about them, and as such there is no expectation of distribution of counts beyond the observed ones.
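As a worked illustration of the expected counts, the sketch below assumes that the expectation is the relative frequency of a subject-object feature configuration in the canonical sample (its count divided by the sample size S), scaled by the total number of observed counts T. The function name `expected_counts` and all numbers are invented for the example; they are not corpus results.

```python
def expected_counts(canonical_counts, total_observed):
    """Scale relative frequencies from the canonical sample to T observations.

    canonical_counts: dict mapping a configuration label to its count in the
                      no-intervention (canonical SVO) sample; their sum is S.
    total_observed:   T, the total number of observed counts.
    """
    sample_size = sum(canonical_counts.values())  # S
    if sample_size == 0:
        raise ValueError("the canonical sample is empty")
    return {
        config: count * total_observed / sample_size
        for config, count in canonical_counts.items()
    }


# Invented toy numbers: 1000 canonical subject-object pairs, 50 observed clefts.
canonical = {"number_match": 600, "number_mismatch": 400}
expected = expected_counts(canonical, total_observed=50)
print(expected)  # {'number_match': 30.0, 'number_mismatch': 20.0}
```

By construction the expected counts sum to T, so they can be compared directly with the counts observed in the cleft sample.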
We can now formulate our specific hypotheses. The first hypothesis is directly derived from the discourse asymmetries discussed in previous sections. No intervener blocks the movement of the subject in subject clefts, while a subject acts as intervener for the object in object cleft configurations. 5 This asymmetry (no intervention, intervention) should be reflected in different raw frequencies. Therefore, we expect that the counts of the observed subject clefts in UD treebanks should be greater than the counts of object clefts, in the three languages under investigation.
H 1 : The raw counts of subject clefts in UD treebanks are expected to be greater than the raw counts of object clefts.
5 As noted by an anonymous reviewer, subject clefts also reflect the standard word order of the languages (SVO), in contrast, for example, with object and adjunct clefts (cf. Aravind et al. 2018). A different formal model should be implemented to uncover asymmetries between simple reorderings versus non-reordering, which is beyond the scope of the paper.
A second hypothesis concentrates on the relation between expected and observed counts. Our expected counts are estimated on syntactic structures where apparently no movement has occurred (such as canonical, non-reordered SVO structures). Comparing these expected counts to observed counts, we detect how movement, matching or mismatching configurations and locality effects interact with frequencies. Therefore, we expect to find trends between the frequency distribution of clefts and the number of matches, with mismatching configurations preferred to matching configurations in object cleft environments.
H 2 : In object clefts, the observed counts of feature-matching configurations are expected to be lower than their expected counts, and the observed counts of mismatching configurations should be equal or higher than their expected counts.
The relation between expected and observed counts in subject clefts represents the control group for H 2 . Despite being a construction where a word order displacement has been triggered, no intervention takes place in subject cleft environments, since the subject does not cross similar elements towards the targeted peripheral positions. We therefore predict that the observed counts of subject clefts will not differ from the expected counts.
H 3 : In subject clefts, the observed counts of both feature-matching and mismatching configurations are expected to be approximately equal to their expected counts.
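The comparison behind H 2 and H 3 can be sketched as a ratio of observed to expected counts per configuration. The function name `compare`, the 10% tolerance band for "approximately equal", and the toy counts are assumptions made purely for illustration; the text does not specify a threshold or a statistical test at this point.

```python
def compare(observed, expected, tolerance=0.10):
    """Label each configuration by how its observed count relates to expectation.

    tolerance is a hypothetical band around 1.0 within which the ratio
    observed/expected is treated as "as expected".
    """
    labels = {}
    for config, exp in expected.items():
        if exp <= 0:
            labels[config] = "undefined"  # no expectation to compare against
            continue
        ratio = observed.get(config, 0) / exp
        if ratio < 1 - tolerance:
            labels[config] = "lower"
        elif ratio > 1 + tolerance:
            labels[config] = "higher"
        else:
            labels[config] = "as expected"
    return labels


# Invented toy counts, not corpus results.
expected = {"match": 30.0, "mismatch": 20.0}
object_clefts = {"match": 18, "mismatch": 32}
subject_clefts = {"match": 31, "mismatch": 19}

print(compare(object_clefts, expected))
# {'match': 'lower', 'mismatch': 'higher'}
print(compare(subject_clefts, expected))
# {'match': 'as expected', 'mismatch': 'as expected'}
```

In this toy setting the object-cleft pattern is the one H 2 predicts (matches below expectation, mismatches at or above it), and the subject-cleft pattern is the one H 3 predicts.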
These quantitative hypotheses will be tested on corpora. In the next section, we present the materials for the investigation of the distribution of a subset of features in canonical orders, which will represent the basis for the calculation of the expected counts in clefts. These counts are estimated based on the observed counts of subject and object co-occurrences in canonically ordered SVO constructions.

Collecting expected counts: feature distributions in canonical orders in UD treebanks
To validate our hypotheses, we must establish the expected counts of morpho-syntactic features in a language based on the distribution of the features in canonical/unreordered environments. In

Features and intervention effects
To detect the nature of the intervener and the material of the long-distance dependency, we choose to investigate the morpho-syntactic features of type, number, person and gender.
The notion of type goes back to the core formulation of intervention locality theory and has been shown to be active in the acquisition of object relative clauses (Rizzi 1990; Friedmann et al. 2009). This feature has two values: an element can be a head, like pronominal elements (e.g. elle 'she' in French), or a maximal projection (e.g. la présidente 'the president' in French), which we refer to as xp. Secondly, the person feature (1st, 2nd, 3rd) was noticed early on (Bever 1974) to be a mitigator in parsing difficult structures involving A'-movement (Chesi & Canal 2019). The third feature under investigation is the morpho-syntactic feature of number (singular or plural), which has been well studied because it is strictly related to the richness of the verbal morphological system (Bentea 2015). Finally, the manipulation of the two-valued gender feature (feminine, masculine) shows interesting results in language acquisition studies in those languages where gender is morpho-syntactically realized (Belletti et al. 2012).

Feature values
The values of type are xp (nouns, proper nouns) and head (pronouns); the feature person has three values: 1st, 2nd and 3rd person; the feature number can take two values: singular and plural; finally, the feature gender also distinguishes between two values, feminine and masculine. For undetermined features that we cannot automatically retrieve from syntactically annotated corpora, we will use the value u. In Figure 1, we show some examples of cleft clauses with these features in the three languages under investigation. We select these features to explore several dimensions of variation.

Calculating Expected counts
As presented in section 3, hypotheses H 2 and H 3 follow a common schema that requires calculating the observed counts of a feature in the corpus and comparing them to the counts we would expect if intervention were not at play. The collection of counts is done on syntactically annotated corpora, presented in subsection 4.3.1.

Materials & Methods
To establish a priori expected counts, we first need to observe the distribution of features of subjects and objects where no intervention is at play. Declarative canonical sentences showing subjects and objects provide the right configuration. With the term canonical, we refer to the standard ordering of constituents in which informational properties are clause-related or about the subject (Rizzi 2015, Belletti & Rizzi 2017). In the three languages under investigation, the order SVO of the core elements (Subject (S), Verb (V) and Object (O)) is canonical. 6 Our query retrieves all the occurrences of subjects preceding objects according to the different combinations of features and values in a treebank. The status of null subjects (NS) (Rizzi 1982) in Italian is not investigated. This choice is due to the fact that we need to detect SVO structures where both elements are phonetically overt. Indeed, searching for sentences with null subjects would alter the results: as noted early on by Grimshaw & Samek-Lodovici (1998: 195), the antecedent of pro, the syntactic element filling the subject position (Rizzi 2015), has a topic discourse status. Similarly, Frascarelli suggests that "[a] thematic NS is a pronominal variable, the features of which are valued (i.e., 'copied through matching') by the local Aboutness-Shift Topic" (Frascarelli 2007: 694). See also Frascarelli & Hinterhölzl (2007) and Bianchi & Frascarelli (2010). Moreover, the lack of a subject (and a subsequent subject position) will not automatically disambiguate the discourse status of the sentence (e.g. subject relative or imperative clauses). 7
Our material is extracted from syntactically annotated treebanks for French, Italian and English, following the guidelines of the Universal Dependencies annotation scheme (Zeman et al. [...] tokens; legal, news, wiki); the French treebank is Gsd 2.5 (Guillaume et al. 2019; 16342 trees/400396 tokens; blog, news, reviews, wiki); finally, the English treebank is Gum 2.6 (Zeldes 2017; 5961 trees/113374 tokens; academic, fiction, nonfiction, news, spoken, web, wiki).
6 [...] 1 (accessed 13.05.2021).
7 An example is extracted from the Italian treebank ISDT (ID: isst-tanl-3247): Come prima cosa, 2 volte al giorno, prendi un appuntamento con te stessa. 'First of all, twice a day, reserve a meeting with yourself'. The mentioned sentence, whose inflected verb is homophonous with the indicative present, represents a case of an imperative that we cannot automatically discriminate.
All the materials are extracted with the Grew-match tool maintained by Inria in Nancy (http://match.grew.fr). The query looked for a variable x annotated with a combination of morpho-syntactic features (type, person, number, gender) and a subject dependency, and a variable y annotated with a combination of morpho-syntactic features (type, person, number, gender) and the syntactic label object, such that x precedes y. Table 1 shows some examples of naturally occurring clauses in English extracted from the morpho-syntactically annotated corpora presented in sub-section 4.3.1. We work at a morpho-syntactic level and not at the interpretative level. For example, we coded French il expletives as head, sing, 3rd, masc.
[Table legend, partially recoverable: [...] (0), pM = person feature, match (1) or mismatch (0).]
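The logic of the query described above can be approximated in plain Python over CoNLL-U text, the ten-column tab-separated format used by Universal Dependencies treebanks. This is a sketch under stated assumptions and not the Grew-match query used in the study: the helper names are hypothetical, the requirement that subject and object share the same governor is an assumption added for the example, and the sample sentence is a hand-annotated toy version of example (1), not treebank data. Nouns without a Person annotation are left as u in this sketch.

```python
def parse_feats(feats):
    """Turn a FEATS string such as 'Gender=Masc|Number=Sing' into a dict."""
    if feats in ("", "_"):
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))


def feature_vector(upos, feats):
    """Map a token to (type, person, number, gender); 'u' = undetermined."""
    values = parse_feats(feats)
    if upos == "PRON":
        kind = "head"
    elif upos in ("NOUN", "PROPN"):
        kind = "xp"
    else:
        kind = "u"
    return {
        "type": kind,
        "person": values.get("Person", "u"),
        "number": values.get("Number", "u"),
        "gender": values.get("Gender", "u"),
    }


def subject_object_pairs(conllu_sentence):
    """Return feature vectors for every subject x that precedes an object y."""
    tokens = []
    for line in conllu_sentence.strip().splitlines():
        cols = line.split("\t")
        # Skip comments, multiword-token ranges (1-2) and empty nodes (1.1).
        if line.startswith("#") or len(cols) != 10 or not cols[0].isdigit():
            continue
        tokens.append(cols)
    pairs = []
    for subj in tokens:
        if subj[7] != "nsubj":
            continue
        for obj in tokens:
            same_head = obj[6] == subj[6]  # assumption: same governing verb
            if obj[7] == "obj" and same_head and int(subj[0]) < int(obj[0]):
                pairs.append((feature_vector(subj[3], subj[5]),
                              feature_vector(obj[3], obj[5])))
    return pairs


# Hand-annotated toy version of example (1); not taken from a treebank.
ROWS = [
    ("1", "Lo", "il", "DET", "_", "Gender=Masc|Number=Sing", "2", "det", "_", "_"),
    ("2", "studente", "studente", "NOUN", "_", "Gender=Masc|Number=Sing", "4", "nsubj", "_", "_"),
    ("3", "ha", "avere", "AUX", "_", "Number=Sing|Person=3", "4", "aux", "_", "_"),
    ("4", "letto", "leggere", "VERB", "_", "Gender=Masc|Number=Sing", "0", "root", "_", "_"),
    ("5", "il", "il", "DET", "_", "Gender=Masc|Number=Sing", "6", "det", "_", "_"),
    ("6", "libro", "libro", "NOUN", "_", "Gender=Masc|Number=Sing", "4", "obj", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)

pairs = subject_object_pairs(SAMPLE)
print(len(pairs), pairs[0][0]["type"], pairs[0][1]["number"])  # 1 xp Sing
```

Counting the resulting pairs by feature configuration over a whole treebank would give the canonical-order counts from which expected counts are derived.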
[...] pronoun il, we only look at the real morpho-syntactic realization of the feature and therefore we will not provide a further coding for you as impersonal element. 8
[...] method would assume that these distributions are similar (or even identical) to the one we need to impute. This is a strong assumption that we wish to avoid. We adopt, instead, a uniform distribution, as the most conservative option.

Distribution of features in SVO and expected counts
A uniform distribution makes the least assumptions about the distribution itself, and implements a maximum entropy model of unknown values. In calculating feature matches, a uniform distribution yields the most entropic matching distribution of two features, and hence constitutes the hardest case to beat for all our hypotheses (see section 3).
Our H 2 predicts both higher-than-expected and lower-than-expected counts, our H 3 predicts equal counts, and a possible H 0 predicts no match between observed counts and expected counts. In all these cases, the maximum entropy distribution will be the hardest case to refute, on average.
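The maximum entropy property of the uniform distribution can be checked numerically for a two-valued feature such as number. The sketch below is illustrative only: the skewed distribution is invented, and the match probability assumes that the values of the two elements are drawn independently from the same distribution, which is an assumption of the example rather than a claim of the text.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a distribution given as value -> probability."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def match_probability(dist):
    """Probability that two independent draws from dist have the same value."""
    return sum(p * p for p in dist.values())


uniform_number = {"sing": 0.5, "plur": 0.5}
skewed_number = {"sing": 0.8, "plur": 0.2}  # invented for comparison

print(entropy(uniform_number))            # 1.0
print(round(entropy(skewed_number), 3))   # 0.722
print(match_probability(uniform_number))  # 0.5
```

For a two-valued feature, the uniform distribution makes match and mismatch equally likely, so it does not bias the expected counts towards either outcome.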

Intervention effects in cleft structures in UD treebanks
This section presents, first, the results of the collection of cleft structures and their manual analysis, classifying them into subject, object or adjunct clefts, to answer H 1 (subsection 5.1).
Then, it compares expected and observed counts in object cleft structures, verifying whether the difference between expected and observed counts predicted in H 2 is confirmed (in subsection 5.2).
Third, subsection 5.3 deals with expected and observed counts in subject clefts: we should not see any asymmetries as predicted by H 3 . The section concludes with a discussion of frequency in cleft structures.

Collecting clefts
For the collection of the observed counts, we exploit all the available UD treebanks to obtain higher frequencies of cleft structures. 9

Materials
We make use of four French treebanks: the above mentioned Gsd

Search
All the materials have been extracted with the Grew-match tool maintained by Inria in Nancy [...] were language specific, looked for a clefted element x governing a copula dependency on an element y, a complementizer z (e.g., fr. que and qui, it. che, en. that, which, etc.) and a language-specific lexical pronoun w (e.g. fr. c', en. it), such that w directly precedes y, the element y linearly precedes x, and the clefted element x directly precedes the complementizer element z. 12 The manual analysis excluded clear cases of non-clefted elements that the query retrieved, such as the English sentence it is a good thing that the details of Members attendance are corrected (English, ID: lines-ud-dev-doc5-3740).
9 The size of the corpus for expected and observed counts is different, but we need to enlarge the number of corpora in order to retrieve a structure as rare as clefts. On the other hand, SVO is a frequent structure and we decided to use the most complete and heterogeneous treebank in each language, as discussed in section 4.
10 https://universaldependencies.org/treebanks/fr_spoken/index.html accessed June 14, 2020.
We did not exclude cases of (it-)cleft look-alikes involving restrictive relative clauses from our counts (Cesare et al. 2014; Karssenberg 2017). These structures are composed of a presented "clefted" XP, as given in (6) for Italian, followed by a restrictive relative clause (Karssenberg & Lahousse 2018: 516). These configurations make use of a restrictive relative clause, which involves the type of movement and the type of intervention effects investigated in this work.
Hallmarks of movement can be detected via reconstruction effects, as given in (6), in which the possessive pronoun is bound to the subject of the restrictive relative clause. In practice, we found only one case of a "presented" object followed by a relative in English, It is a very good article which no meat-eater can read (English, EWT-at0033), two for Italian and five for French. In (7), two naturally occurring examples are shown, one from Italian, (7)a, and one from French, (7)b. 13

(7) a. È un incremento che neppure gli assessori sanno spiegare
Is an increase that even.neg the city-councilors know explain
'It is an increase that even the city councilors are not able to explain' (Italian, tut-2968)
b. C'est un outrage que nous n' acceptons pas
It.is an outrage that we neg accept neg
'It is an outrage that we don't accept' (French, Spoken-Rhap_D2006-2)

12 The list of queries can be found in Appendix E.
13 We here report the other four cases of cleft look-alikes in French: Mais c'est une hypothèse que je ne veux pas prendre en considération. 'but it is a hypothesis that I don't want to take into consideration' (GSD-fr-ud-train_09322); et euh c'est un endroit que j'aime pas 'it is a place that i don't like' (Spoken-Rhap_D0006-40); donc ça c'est un endroit que j'aime pas trop euh 'it is a place that i don't like too much' (Spoken-Rhap_D0006-43); C'est un site que nous n'oublierons jamais 'it is a site that we will never forget' (GSD fr-ud-train_05615). The other Italian occurrence will be discussed in sub-section 5.2, example (8).

Results
Our results confirm similar distributions of clefts in corpora in Italian and French as previously observed in spoken corpora (Roggia 2008) and for English, French and Italian in the parallel treebanks of the European Parliament proceedings (Dufter 2009: 90). Table 3 shows that cleft structures represent a very small proportion of the data, less than 1% in the three languages under investigation (0.008% in French, 0.006% in Italian and 0.003% in English). The different distributions may suggest that, in French, cleft formation is more productive than in Italian and in English. A plausible reason is that cleft structures in French are a preferred strategy in answering questions (Belletti 2010). Table 4 shows the manually-collected total raw counts of cleft structures and their partition into the subtypes that are of interest to us here: subject (subj), object (obj), adjunct (adj).
The distribution in French shows a higher productivity of subject clefts (0.47) over object clefts (0.13). A good proportion of clefts are also adjunct clefts (0.40), whose nature is investigated in section 6. Similar distributions are observed in Italian and in English. In both languages, subject clefts (respectively 0.22 and 0.26) are produced with a higher frequency than object clefts (respectively 0.09 and 0.08), but with a lower proportion than adjunct clefts (respectively 0.69 and 0.66). These results were also predicted in Belletti (2010): French adopts the strategy of clefts for focalizing verbal arguments (subject, object) in a more productive way than Italian (or English).

detailed counts of the feature matching configurations are given in the appendices. We now turn to verifying the more specific hypotheses. Recall:

H 2 : In object clefts, the observed counts of feature-matching configurations are expected to be lower than their expected counts, and the observed count of mismatching configurations should be equal or higher than their expected counts.

Numerically, this calculation will correspond to 0.27 (the proportion of four-feature matches in Italian, see Table 3) multiplied by the size of object-oriented clefts, namely 19 (see Table 4), which gives us an expected count of 5.13.
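The arithmetic behind this expected count can be written out as a short Python sketch. The function name `expected_count` is hypothetical; only the two input values (0.27 and 19) and the resulting count come from the text.

```python
def expected_count(match_proportion: float, n_clefts: int) -> float:
    """Expected number of clefts showing a given feature-matching configuration.

    match_proportion: proportion of that configuration in canonical sentences
    n_clefts: total number of clefts of the relevant type
    """
    return match_proportion * n_clefts


# Italian, four matching features: 0.27 * 19 object clefts
italian_four_matches = expected_count(0.27, 19)
print(round(italian_four_matches, 2))
```

The same multiplication applies to any other language and number of matching features, given the corresponding proportion and cleft count.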

Whether these expected and observed counts confirm or reject the hypothesis is established by a binomial test. The binomial test gives us the probability of k successes in n independent trials, given a base probability p of an event. So, for example, the binomial distribution tells us the probability of a four-feature match in Italian in a subject-oriented cleft. The event in this case is the cleft construction (whose probability is indicated, for example, as 0.006 in Italian). The quantity n is the number of four-matching features in a canonical configuration, and k is the number of four-matching features in a subject-oriented cleft. If certain conditions are met, the binomial distribution can be approximated by the normal distribution and a significance test can be performed. We calculate the cumulative probability distribution: if the observed counts are larger than the expected counts, the probability that the counts are exactly as observed or greater; if the observed counts are smaller than the expected counts, the probability that the counts are exactly as observed or smaller. The z-score gives us the (one-tailed) probability of counts exactly as observed, or greater/smaller. We first operate on object clefts in subsection 5.2 and then turn to investigating subject clefts in 5.3, where no intervention effects are active.
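The one-tailed cumulative probability described above can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not the code used for the tables: the function names are hypothetical, the treatment of the case where the observed count equals the expected count (upper tail) is an assumption, and the values in the usage lines are toy numbers rather than counts from our data.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def one_tailed_binomial(k: int, n: int, p: float) -> float:
    """Probability of a count exactly as observed or more extreme.

    If k is below the expected count n * p, the lower tail P(X <= k) is
    returned; otherwise the upper tail P(X >= k) is returned (assumption:
    a count equal to the expectation is treated as the upper-tail case).
    """
    expected = n * p
    if k < expected:
        return sum(binomial_pmf(i, n, p) for i in range(0, k + 1))
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))


# Toy values only: 2 successes observed where 10 * 0.5 = 5 are expected.
print(one_tailed_binomial(2, 10, 0.5))
# Toy values only: 8 successes observed, above the expectation of 5.
print(one_tailed_binomial(8, 10, 0.5))
```

Computing the exact tail in this way does not depend on the conditions required by the normal approximation, which is why it remains usable with counts as small as those found for clefts.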

Hypothesis 2: object clefts
We clearly observe cases of intervention effects in object clefts. Comparing expected and observed counts, asymmetries between feature-matching and feature-mismatching configurations clearly emerge. The calculations of expected counts and actual observed counts, the probabilities of these observations under a binomial distribution and their statistical significance are shown in Table 5 for the three languages.

Table 5: Expected counts and observed counts for object clefts. Numbers in parentheses are the ceiling or floor rounding to the nearest integer. p is the prior probability of the event. Binomial p indicates the probability of the observed counts under a binomial distribution (the binomial test). z-p is the statistical significance of the binomial probability. n.v. indicates that conditions are not met for a valid calculation of statistical significance. The z-p gives us the (one-tailed) probability of exactly the observed, or greater/smaller counts than the expected counts, for α = 0.5.

Results
The expected interaction is numerically clear in all three languages under investigation. As can be observed, the counts of matching configurations are lower than expected and the counts of mismatching configurations are higher than expected. In French and Italian, the situation is extremely clear. If all the features of the dislocated object match those of the subject, the number of observed counts (k) decreases. For example, as can be seen in Table 5, prior probabilities suggested that, due to the proportion of cleft clauses found in our French corpora and the distribution of features found in SVO sentences, we should expect around 3.45 object clefts having four matching features; however, 0 are observed (similarly in Italian: 5.13 expected, 1 observed, given in (8), a case of a cleft look-alike).
(8) È una campionessa che la Gran Bretagna rimpiangerà a lungo.
Is a champion.f.s that the Great Britain regret.fut for long
'She is a champion that Great Britain will regret for a long time' (ID: isst-tanl-2357)

The differential between expected and observed counts decreases with 3 matching features, while with 2 matching features expected and observed counts are similar, as predicted. Finally, in full mismatching configurations (0 matches) we can observe a high number of observed clefts in all languages with respect to the expected counts (French 0.23 expected, 4.5 observed; Italian 0.19 expected, 1.5 observed; English 0.66 expected, 2 observed). An example of a full mismatching configuration in French is given in (9).
(9) C'est du bon boulot que vous m' avez fait, les gars!
It.is of good job that you to-me have.2pl made the boys
'It is a good job that you have done for me, pals' (ID: gsd-ud-train-13282)

We now turn to subject clefts in section 5.3. In this case, expected counts should be similar to observed counts, since no intervention effect plays a role.

Hypothesis 3: subject clefts
Analogously to the object clefts, we here investigate only those subject clefts with transitive verbs and therefore sentences with both subjects and objects overtly realized. 14 Interestingly, the distribution of transitive and intransitive verbs in French and Italian is similar. In French, out of 86 clefts, 56% (k = 49) are transitive, while in Italian the proportion is equally distributed: 22 out of 45 (49%). English shows an uneven distribution with 75% of transitive verbs among the retrieved clefts (k = 18). Differently from object clefts, we should not observe clear asymmetries between expected and observed counts, but rather the number of the observed counts should be approximately equal to their expected counts. Table 6 shows the results.
14 We considered copular sentences as transitive, following Moro 1997, so that we can analyse the case of matching and mismatching configurations between subjects and objects. Reflexive verbs were considered transitives.

Table 6: Expected counts and observed counts for subject clefts. Numbers in parentheses are the ceiling or floor rounding to the nearest integer. p is the prior probability of the event. Binomial p indicates the probability of the observed counts under a binomial distribution (the binomial test). z-p is the statistical significance of the binomial probability. n.v. indicates that conditions are not met for a valid calculation of statistical significance. The z-p gives us the (one-tailed) probability of exactly the observed, or greater/smaller counts than the expected counts, for α = 0.5.

Results
As expected, mismatching configurations in features do not increase the frequency contrary to

Discussion
As predicted by intervention, a clear asymmetry emerges between subject and object clefts.
Moreover, as predicted by a feature-based specification of intervention, when intervention is at play (object clefts), the differences are gradual and suggest that the number of matching features matters. The same (or an opposite) correlation does not arise in subject clefts. Figure 2 shows the correlations on the aggregated data. In object clefts, we find a negative correlation between the number of matching features and the difference between observed and expected counts. We do not find a significant correlation in subject clefts.
All the hypotheses based on the theoretical account of intervention effects are confirmed. On the basis of the elements presented in subsections 5.2 and 5.3, we have observed that the asymmetry

Another important contribution of this analysis is the observation of the frequent occurrence, especially in Italian and English, of another type of cleft construction, adjunct clefts, whose locality properties were not discussed in Belletti (2015). Section 6 investigates the nature of these structures and the intervention effects they show.

A further puzzle: adjunct clefts
One observation concerning the data of Table 4

Not all adverbial clefts are expected to trigger intervention effects. The subject is an intervener in terms of locality only if the adverbial element undergoes movement, as it is extracted from the syntactic structures and it is not directly merged in the higher structure of the cleft (the layer of the copula). 16 We can represent this difference as (10) and (11). In (10), we observe modifiers that are not extracted and clefted, but might simply be generated in the IP of the higher clause (Cinque 1999). Even if it surfaces as a cleft, the configuration might be considered a simple predicative structure of the type (it is clear that…) or (it is obvious that…). No intervention is at play, since the related sentential adverb (e.g., clearly, obviously, etc.) is directly generated in the IP of the higher clause and does not cross any intervener.

(10) IP generated adverbial

On the other hand, there are cases in which adverbial/complement elements are generated in the IP of the lower clause and then move, like objects, to the relevant left peripheral positions and are thus clefted (Schweikert 2005). The derivation is given in (11).

15 We acknowledge the limitation of our dataset. The dataset, however, was built with a semiautomatic procedure and manually curated, so that even a small amount of data will have enough signal to detect the pattern we are hypothesizing. The development of a finer-grained method able to automatically retrieve clefts (and other complex structures) from unannotated corpora in a larger set of languages will be left to future studies.
16 Some kinds of adjunct elements appear to exhibit forms of intervention. Temporal and locative elements are able to satisfy certain types of subject requirements, in the so-called locative inversion in English (Rizzi & Shlonsky 2007) and locative/temporal inversion in other Germanic languages (Samo 2019). It appears, then, that adverbial elements are able to undergo both A and A'-movement (Rizzi 2004), but an adverbial crossing an intervening subject does not trigger the same effects of similarity that are triggered with a crossing object. This difference can perhaps be ascribed to the fact that, feature-wise, adverbials and subjects share only a very partial subset of features, unlike subjects and objects.
To remove the confound produced by the clauses of type (10), in collecting our counts, we manually analyse the results and only consider the cases of adverbials that unambiguously have undergone movement from the lower clause. 17 Adverbials do not bear any person marking and not every adverbial is marked with number or gender features, so that the only possible feature that could result in intervention effects with all adverbials is the feature type: adverbs can be realized as head elements (e.g. never in English) or as a maximal projection xp bearing the features of nominal elements (e.g. the day before yesterday in English) in the three languages under investigation here. We conjecture, then, that the intervention effect is weaker than a full feature match, because only one feature matches, but stronger than the equivalent one-feature match in object clefts because the matching feature is a larger proportion of the smaller total number of features at play. 18 We can thus formulate a fourth hypothesis to be investigated in this section: adjuncts exhibit intervention effects in clefts, but in a weaker form than objects.
H 4 : In adjunct clefts, the observed counts of feature-matching configurations for the feature Type should be marginally lower than their expected counts, and the observed count of mismatching configurations should be equal or marginally higher than their expected counts.
We estimate the expected counts of adjuncts with a method that differs from the one presented in the previous sections. This is necessary to reach reliable counts since not every canonical structure shows the presence of both subject and overt adverbials, as explained in the next subsection.

Expected counts on adverbial clefts
The calculation of intervention in adverbial contexts takes into account only the feature type. The materials we sample to count the feature are, for Italian and French, the same treebanks used before, as reported in section 4; for English, we select the biggest treebank available in terms of trees, namely Ewt 2.5 (Silveira et al. 2014; trees 16622; tokens 254856; blog, email, reviews, social).
17 For example, the French modifier vrai 'true' is not extracted from the sentence as an IP internal head: the example from the treebank spoken, id:Rhap_D0004-23, "mh mh donc c'est vrai que ça fait quand même euh beaucoup de changements" (lit. 'interjection then it's true that it is the same interjection many changes').
18 The precise specification of this similarity calculation requires more empirical support and is left for future work.
The method we use for the estimation of expected counts with adverbial elements needs to be adapted. We cannot calculate expected counts simply on the co-occurrence of subjects and adverbials, analogously to what was done for subjects and objects, because adverbs are not selected by the argument structure of the verb. For adverbs, we used several sources of information to identify the value of the feature type. In some cases, we counted the value of the feature type in terms of PoS tags of the elements governed by the dependency relation advmod in the annotation scheme of UD (Zeman et al. 2020). We considered adpositions, pronouns and particles as head, and we labelled determiners, adjectives, nouns and proper nouns as xp. We did not count a subset of PoS tags (Ncl in appendix), such as non-annotated elements, conjunctions and verbs. The adverbial PoS tag adv is ambiguous between head and xp, but it represents a good portion of the data. In this case, we developed a query to extract all the occurrences of adverbials. 20 Then, a sample of 1000 sentences for every language was selected and analysed manually. Following Cinque (1999), we labelled as xp all the adverbials generated in the specifier position of an IP-internal functional projection, such as those adverbs bearing prepositions or morphemes such as -mente in Italian, -ment in French, and -ly in English.
19 Given that we take into consideration all the subjects in the treebank, we need to account for null subject environments. In Italian, the subject can be omitted in most contexts without triggering ungrammaticality (Rizzi 1982). A form of null subjects in a much reduced set of specific environments (e.g. diary texts) is also allowed in French and English (Haegeman 1990).
20 We excluded (negative) polarity items which might have added some noise to the results and whose status in the syntactic architecture is under debate (Moscati 2012).
Table 7 summarizes the results showing the probability of a subject being xp, head, null and an adverb being xp, head, in the different treebanks. The more detailed relevant tables are provided in the Appendix.
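The PoS-based assignment of the feature type described above can be summarised in a small Python sketch. The mapping of the categories named in the text onto UD universal PoS tag names (ADP, PRON, PART, DET, ADJ, NOUN, PROPN, ADV) is our reading of that description and should be taken as an assumption, as should the function name and the labels "manual" and "ncl"; the toy tag list is invented for illustration and is not treebank data.

```python
from collections import Counter

# Assumed correspondence between the categories in the text and UD UPOS tags.
HEAD_TAGS = {"ADP", "PRON", "PART"}          # adpositions, pronouns, particles
XP_TAGS = {"DET", "ADJ", "NOUN", "PROPN"}    # determiners, adjectives, (proper) nouns


def type_from_upos(upos: str) -> str:
    """Return the value of the feature type for an advmod dependent."""
    if upos in HEAD_TAGS:
        return "head"
    if upos in XP_TAGS:
        return "xp"
    if upos == "ADV":
        return "manual"  # ambiguous between head and xp: decided by hand
    return "ncl"         # not counted (e.g. conjunctions, verbs, unannotated)


# Toy list of tags, for illustration only.
toy_tags = ["ADP", "NOUN", "ADV", "VERB", "PRON", "ADJ"]
print(Counter(type_from_upos(tag) for tag in toy_tags))
```

Keeping the ambiguous adv tag in a separate class mirrors the procedure above, where those cases are resolved on a manually analysed sample rather than by tag.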

Results: adverbial clefts
The expected counts discussed above are compared to the observed counts retrieved from the materials discussed in Table 3 in section 5. The results are given in Table 8. As can be observed, only one fifth of the cleft elements are extracted in Italian and only one element is extracted in English. 21 French is more productive and shows a good portion of extracted elements.
Since English does not provide enough evidence, in what follows, we perform our analysis of adverbials only on French and Italian.
Hypothesis H 4 states that in adjunct clefts, the observed counts of feature-matching configurations for the feature Type are expected to be lower than their expected counts, and the observed count of mismatching configurations should be equal or higher than their expected counts. We manually coded the case of matching or mismatching configurations. Table 9 shows the results of observed and expected counts. As can be observed, there is a marginal decrease in observed matching configurations and a marginal increase in observed mismatching configurations. This is expected, if we recall that only the feature type was taken into account and therefore adverbial elements behave as "intermediate" levels of matching configurations (1 and 2 matches), compared to object clefts in Table 5. The results seem to confirm that the effect of locality is small, but present, with a marginal preference for mismatching configurations.
21 The sentence (Treebank GUM, ID:GUM_interview_mckenzie-7, it was a tournament that I wish we 'd actually gone back to more often).

Discussion
In this paper, we establish several results, predicted by an intervention locality theory of cleft formation and acceptability. The contributions are three-fold. We provide further evidence in support of the predictions of an intervention theory of locality, based on comparisons of quantitative properties of grammatical data. The fact that we find quantitative confirmation also supports our view of the interaction between observational properties of use and grammar, where the latter leads the former. Finally, we provide a relatively simple and precise methodology that can be used as a blueprint for many other theory-driven quantitative studies.

Extending the supporting empirical evidence
The quantitative evidence reported here lends support to the predictions of an intervention theory of locality. First, we establish that the quantitative properties of cleft constructions reproduce the binary distinction visible in the qualitative difference of available interpretations between subject and object clefts. This is confirmed by two results. On the one hand, subject clefts, where no intervention is at play, are more frequent than object clefts, where intervention is at play. On the other hand, object clefts are less frequent than expected in intervention configuration, while subject clefts are roughly as frequent as expected. These findings then contribute to the linguistic and psycholinguistic investigation of the formal encoding of long-distance dependencies, following the theoretical lines laid out in the first formulation of the intervention theory of long-distance dependencies (Rizzi 1990), made gradual and more fine-grained in subsequent work (Rizzi 2004), and verified experimentally in both sentence processing and acquisition (Franck et al. 2015; Villata et al. 2016; Friedmann et al. 2009).

Table 9: Expected versus observed counts for adjunct clefts in French and Italian. p is the prior probability of the event. Binomial p indicates the probability of the observed counts under a binomial distribution (the binomial test). z-p is the statistical significance of the binomial probability. The z-p gives us the (one-tailed) probability of exactly the observed, or greater/smaller counts than the expected counts, for α = 0.5.
Second, we also find that the differential and direction of difference between expected and intervention is a kind of interference at retrieval in memory (Smith et al. 2021 and references therein). We provide new quantitative data that are derived from a feature-based explanation, but that correlate with a gradient of corpus counts. In this way, we expand the set of evidence on which the two approaches could be tested. In particular, the grammar-based approach will need to be developed in detail to formalise better the notion of strength of intervention and the mechanisms giving rise to gradient grammaticality judgements and corpus counts.
While we have not provided in this paper a direct mechanistic model of intervention, the outcomes of our quantitative investigations are relevant for the increasing body of computational research that attempts to reverse engineer current neural network models to establish the boundaries of what they can learn. These studies have concentrated on structural grammatical competence, exemplified by long-distance agreement, relative clauses and islands, phenomena that also trigger locality effects, and have demonstrated that neural networks can learn long-distance dependencies to an interesting extent (Linzen et al. 2016; Wilcox et al. 2018), but do not fully show intervention effects (Merlo & Ackermann 2018; Merlo 2019). The results of the current paper are relevant for this debate as they demonstrate that any discrepancies between the human results and the machine results are not due to lack of sufficient statistical signal in the data, but are more likely to be found in properties of the learning algorithms.

The interaction between usage and grammar
Our work can also be read as a specific proposal on the role of quantitative properties in the theories of grammar. Frequency is a puzzling property of language constructs, whose correlation with other aspects of grammatical representations or other linguistic observations is not clear.
Most linguistic approaches are in agreement in assuming that frequencies are an expression of language use, but their views of the relationship to grammar diverge. 22 Functionalists assume that usage shapes grammar, and that frequency of use is the cause of some prominent linguistic effects (see Bybee 2007; 2010, among many others). In usage-based theories, frequency of linguistic events determines how automatised, how entrenched, how easy to memorise, how robust they are to change (Tomasello 2009;Evans & Levinson 2009;Bybee 2010;Ibbotson 2013). Linguistic structure is the emergent property shaped by more general cognitive principles of categorisation, generalisation, analogy. The mechanisms by which structure emerges are not always clear, but they are assumed to be based on the combined effect of frequency and similarity, which give rise to grammatical schemas and categories.
For example, Bybee (2007) and others have argued that exemplar theory provides a specification of how frequency effects bring about structure in language. Categories are induced from observed instances with overlapping properties that are grouped together in memory. Frequency distributions and their direct relation with structural recursion or their inverse relation with some notions of acquisition or processing complexity have also been explained as an effect of pressure for efficient communication (Dryer 1992; Hawkins 1994; Gibson 1998; Tily et al. 2011; Zipf 1949).
From a generative or cognitive point of view, frequencies are not part of the grammar or the cognitive system (see an early discussion in Pinker 1991, and a few, notable recent exceptions, such as Yang 2016; Yang et al. 2017). This point of view assumes that frequency-based, quantitative properties of text are unrelated to the underlying grammatical representations of language that linguistic theory proposes.
In this paper, we develop a point of view on frequency that tries to reconcile these views and is based on the idea that frequency is neither the independent variable in the explanation nor irrelevant to language. Current large-scale, syntactically-annotated resources for several languages allow us to develop investigations of the correlation between quantitative linguistic properties and theory-driven abstract linguistic representations and operations. Quantitative confirmation of the theory and its precise internal aspects (kind and size of feature set, for example) also supports our view of the interaction between observational properties of use and underlying abstract grammatical principles. As observed in the introduction, following Merlo (2016) and related works, we use corpus counts in the spirit of the computational quantitative syntax framework: differentials in observed and expected counts are the expression of underlying grammatical properties.
22 According to Haspelmath (2006: 16), "Frequency of use is a property of parole or performance, not of language structure or competence, and throughout the 20th century most linguists have shown little interest in explaining structure in terms of use."
This point of view on the interaction between grammar and corpus frequencies is different from a point of view where corpus frequencies are expressions of usage only and determine the shape of the grammar. We think that the usage-based point of view predicts that observed counts are the same as the expected counts. There is no conceptual difference, in that the grammar, being shaped by use, in principle, cannot give rise to expectations that are different from observations.
Lack of a predictive relation is also expected by a point of view where usage is performance and is unrelated to underlying grammatical competence. The point of view that considers competence and performance two separate and not necessarily related aspects, predicts no correlation, or at least only accidental correlations.
We present a view and a method that predict frequencies of cleft sentences based on formal grammatical principles. The predicted counts are the dependent variable in a grammatical model whose independent variable is the complexity of the representations, specifically the complexity of the tree representations and feature-based similarity operations. This methodology assumes that underlying grammatical properties surface quantitatively, once independent influences of use are properly factored out. In so doing, it rejects the usual distinction between competence and performance and it reapportions some aspects of usage and frequency to a theory of competence. More generally, this approach is analogous to the proposal discussed in Bresnan et al. (2001) that a given language's hard constraints can be mirrored in another language as soft constraints. 23 Specifically, Bresnan et al. (2001) observed that the disharmonic configurations for person/arguments avoided in the Salish language Lummi are also statistically avoided, in terms of frequencies, in a corpus of spoken English. We demonstrate in this paper that a similar generalization can be drawn within a single language. The types of similarity in features that create ungrammatical structures (hard intervention effects) are also statistically dispreferred in grammatical structures (soft intervention effects). We here restrict our analysis to cleft structures, but similar methodologies could be implemented in other configurations.
23 …"the same categorical phenomena which are attributed to hard grammatical constraints in some languages continue to show up as statistical preferences in other languages, motivating a grammatical model that can account for soft constraints" (Bresnan et al. 2001: 1).

A method for theory-driven use of observational data
A final contribution of this work is that we provide a relatively simple and precise methodology that can be used as a blueprint for many other theory-driven quantitative studies. It is important to extend theory-driven grammatical work beyond qualitative introspection and the experimental method. Experimental and observational data integrate the nuanced verification of fine-grained theories that is possible with quantitative data, going beyond sometimes coarse qualitative characterisations. But observational data also bring the added richness of naturally occurring and large-scale data, data that is, in the end, our primary source of evidence.

Conclusions
In this work, we have shown that it is possible to provide a fine-grained analysis of clefts and novel evidence based on corpora and corpus counts. The main novel finding is that intervention effects that create ungrammatical structures seem to be present also in grammatical configurations, if one looks at fine-grained quantitative evidence. Further research is required to detect the nature of those elements and their position with respect to principles of locality and extend the findings to more languages.