What the PCC tells us about “abstract” agreement, head movement, and locality

∗For comments and helpful discussion, I thank Ümit Atlamaz, Jessica Coon, Vera Gribanova, Boris Harizanov, Norbert Hornstein, Ruth Kramer, Howard Lasnik, Theodore Levin, Jeff Lidz, David Pesetsky, Carolina Petersen, and Maria Polinsky; audiences at UMD’s S-Lab, NELS 46 (Concordia), ZAS (Berlin), and the Stanford Workshop on Head Movement (September 2016); and the participants in the fall 2016 seminar on Head Movement, Noun Incorporation, and Clitic Doubling at the University of Maryland. All errors are my own.


Introduction
It has become exceedingly common in contemporary linguistic theorizing to come across claims of the following sort: "It may appear that verbs in language L do not agree with their arguments, but that is just an arbitrary fact about the morpho-phonology of L. In other words, the relevant exponents in L just happen to lack segmental content. Syntactically, agreement is operative in L just as it would be in a morphologically richer language." I will refer to this type of analysis as abstract agreement, by analogy with abstract case (Vergnaud 1977; Chomsky 1981). For a representative example of an analysis that resorts to abstract agreement, see Chomsky (2000: 123ff.) on supposed agreement between v and the direct object in English.
The primary goal of this paper is to show that when it comes to agreement in ϕ-features (person, number, gender/noun-class), this type of reasoning is almost always mistaken. I will show that, generally speaking, there is no such thing as abstract agreement; ϕ-feature agreement is only there when you can see it.
Until quite recently, the existence of abstract agreement would have seemed inevitable, as a consequence of the following widely held premises:
(1) a. premise 1: structural case is assigned as a consequence of agreement in ϕ-features (Chomsky 2000; 2001)
    b. premise 2: noun phrases that are not assigned inherent case must receive structural case, or else ungrammaticality arises (Vergnaud 1977; Chomsky 1981)
Taken together, these premises entailed that any noun phrase that could not plausibly be analyzed as a bearer of inherent case had to be the target of an agreement relation; and insofar as there was no morpho-phonological evidence of such a relation (as is the case with, e.g., direct objects in English), abstract agreement had to be at play. By now, however, it has become quite clear that neither (1a) nor (1b) is correct.¹ The case-theoretic underpinnings of abstract agreement have therefore faded away, rendering its existence ripe for re-evaluation.
In the course of this investigation, we will encounter one notable exception to the generalization that there is no abstract agreement. This exception involves the agreement relation that prefigures clitic doubling. Following Rezac (2008a), Roberts (2010), a.o., I take clitic doubling to be an instance of syntactic movement. We will see that syntactic movement is not generally dependent on a prior agreement relation (contra Chomsky 2000, et seq.; see section 6.2). To explain why clitic doubling in particular does require a prior agreement relation, I will offer a novel perspective on the interaction of locality, head movement, phrasal movement, and the Principle of Minimal Compliance (Richards 1998; 2001).
The aforementioned clitic-doubling exception also means that the ban on abstract agreement is unlikely to have the status of a steadfast principle of grammar.I will suggest, instead, that these facts may arise by way of a conservative acquisition strategy with respect to the placement of unvalued ϕ-features on functional heads.

A note on terminology
In the context of this paper, the term agreement refers to transmission of ϕ-feature values (person, number, gender/noun-class) from a noun phrase to a functional head.
Recent years have seen a flurry of putative reductions-to-agreement: attempts to reduce various other linguistic phenomena to the same formal operation hypothesized to underpin agreement. Examples include: Binding Theory and fake indexicals (Kratzer 2009; Reuland 2011; Rooryck & Vanden Wyngaerd 2011), negative concord (Zeijlstra 2004; 2008b), modal concord (Zeijlstra 2008a), noun-modifier concord (Mallen 1997; Carstens 2000; Baker 2008), and the formation of in-situ questions (Bobaljik & Wurmbrand 2014). These reductions are not the primary focus of this paper; though if the paper's conclusions are correct, they cast considerable doubt on the veracity of some of these reductions, in particular those that avail themselves of abstract agreement in ϕ-features (see also section 8).
The PCC (also known as the "*me-lui Constraint") is a restriction governing the person features of different arguments in relation to one another, usually affecting combinations of multiple internal arguments of a single predicate. It is therefore most commonly found with ditransitive verbs. It is demonstrated here using data from Basque.
As a first approximation, the PCC in Basque can be characterized as in (2):
(2) the PCC in Basque (first approximation)
In finite clauses, the direct object of a ditransitive verb must be 3rd person.
As (3a-b) already illustrate, the PCC is fundamentally asymmetric: it restricts the person features of the direct object in the presence of an indirect object, but there is no corresponding restriction limiting the person features of the indirect object in the presence of a direct object. The PCC is asymmetric in another way: it restricts only the person features of the relevant argument, and not, for example, its number features. As Nevins (2011: 944) puts it, there is no Number Case Constraint.
Another noteworthy property of the PCC is that it seems to only arise when there is overt morphology reflecting ϕ-agreement with the relevant arguments. This has led some to view the PCC as a morphological filter (see, e.g., Bonet 1991; 1994). I return to this point in section 5.
This pattern, demonstrated in Basque, is one of a family of effects documented in the literature that have been referred to as "PCC effects." Scholars distinguish (2), which has been referred to as the Strong variant of the effect, from the Weak, Me-First, Total, Super-Strong, and Ultra-Strong varieties. (The latter is also known as Strictly-Descending, and is a conjunction of the Weak and Me-First varieties.) See Haspelmath (2004), Anagnostopoulou (2005), Nevins (2007), Graf (2012), Sturgeon et al. (2012), and Doliana (2014) for discussion. Of course, referring to all of these as PCC effects is a terminological choice; the extent to which they represent a unitary phenomenon is a matter of analysis. For example, the so-called "Total PCC" is just a prohibition on any combination of two weak pronominal objects, and as such may just be a matter of prosody. For the remainder of this paper, I will assume that at least those variants that are sensitive to person features (i.e., all but the Total variant) can be treated as a unitary phenomenon at a sufficient level of abstraction (see also section 4).
These facts have several consequences. First, they show that at least in Basque, the effect in (2) (the PCC in ditransitives) is actually a subcase of a slightly broader pattern:
(9) the PCC in Basque (revised version)
In those finite clauses that have a dat argument located higher than the abs argument, the abs argument must be 3rd person.
Because ditransitives in Basque always adhere to an erg≫dat≫abs structural hierarchy, the effect described in (2) is derivable as a special case of (9). Second, these data show that the effect in question cannot be a morphological filter: the putative auxiliary form in (8b) is morphologically identical to the one in (7). Either the form in question is morphologically licit (in which case the wrong prediction is made for (8b)), or it is not (in which case the wrong prediction is made for (7)). Insofar as there is a meaningful distinction between syntax and morphology, the finer hierarchical organization of arguments relative to one another (which is what distinguishes these two cases) is the purview of syntax, not morphology.
We could, of course, endow one of the datives in (7-8) with a diacritic that is missing on the other, and grant morphology access to this diacritic in evaluating PCC violations. But since there are no actual differences in the morphology between the two types of datives (neither in dependent-marking nor in head-marking), this would amount to a restatement of the problem faced by morphological analyses of the PCC, not a solution to it. And it would make the correlation with structural asymmetries (4-5) accidental (cf. the syntactic analysis, surveyed below).
An alternative would be to grant morphology access to finer structural distinctions of the sort shown in (4-5). It seems to me, however, that this would stand in rather blatant violation of the point of modularizing the grammar in the first place. We could therefore rephrase the point being made here as follows: either the PCC is syntactic in nature, or else there is no meaningful distinction between syntax and morphology qua grammatical modules, in which case we could still say that the PCC is syntactic, without any loss of generality.
The same results rule out accounts of the PCC in terms of usage-based grammaticalization (cf. Haspelmath 2004). These rely on the idea that configurations involving 1st/2nd person animate direct objects together with indirect objects are exceedingly rare in naturally-occurring data; and thus, the relevant morphological combinations fail to undergo grammaticalization. As (7) demonstrates, the target form in (8b) is in no way missing from the grammatical vocabulary of the language; therefore, this cannot be the source of ill-formedness in (8b).
How, then, does the PCC arise in syntax? And why is it sensitive to the structural hierarchy of internal arguments, in the manner shown above (i.e., dat≫abs versus abs≫dat)? A body of work by Anagnostopoulou (2003; 2005), Béjar & Rezac (2003), Nevins (2007), and others, has already provided the answer to these questions. The remainder of this section is devoted to summarizing, by way of example, Béjar & Rezac's (2003) proposal and showing how it derives the PCC in a way that is sensitive to structural hierarchy.
Before proceeding, it is worth pointing out that the Béjar & Rezac account is specifically tailored to derive the Strong PCC, and it is not immediately clear to what extent it can be modified to account for other variants of the PCC (see section 3). However, as will become clear shortly, the only part of the account that is crucial to what follows is that the PCC arises as the result of how minimality applies to syntactic agreement (cf. Nevins 2007 for an alternative account, which extends to other PCC variants, but still depends crucially on how syntactic agreement interacts with minimality). Béjar & Rezac's account is therefore intended, in the present context, as a representative example of the broader family of minimality-based approaches to the PCC.
Let us first consider monotransitives and unaccusative intransitives, that is, configurations where there is only one non-oblique internal argument.² In this scenario, a probe seeking valued ϕ-features will reach the internal argument without impediment, as in (10). This will give rise to what one would typically call "object agreement morphology."
Adding a dative co-argument to (10) located lower than the other DP will not affect ϕ-probing, since minimality dictates that the closer of the two will be targeted, as in (11). However, adding a dative co-argument to (10) that is located higher than the other DP will result in intervention, and the disruption of ϕ-probing, as in (12). (On dative intervention, as well as the more general inability of dative nominals to satisfy probing for ϕ-feature values, see Preminger 2014: 129-175 and references therein.)
We now add one final ingredient, the Person Licensing Condition (Preminger 2011b: 930-934; cf. Nichols 2001; Béjar & Rezac 2003; Bruening 2005; Baker 2008, a.o.), formulated in (13-14).
Together, (13) and (14) predict that a 1st/2nd person non-oblique internal argument in a dat≫abs structure like (12) will be illicit, because its [participant] feature will fail to participate in a valuation relation with the ϕ-probe. In contrast, a non-oblique internal argument will be licit regardless of its person features in the other two structures (10-11), because in those cases, the ϕ-probe can access this argument.⁴ The result is precisely the effect exemplified using the two types of two-place unaccusatives in (4-5, 7-8), and summarized in (9).
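As an informal illustration, the interaction of minimality, dative intervention, and person licensing described above can be sketched in Python. This is a toy model and not part of Béjar & Rezac's proposal: the names `DP`, `probe`, and `is_licit`, the case labels, and the `has_probe` flag are all hypothetical. It assumes that internal arguments are listed from structurally highest to lowest, that only the closest argument is visible to the probe, and that a clause without a ϕ-probe imposes no licensing requirement.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DP:
    case: str    # hypothetical labels: "abs" (non-oblique) or "dat"
    person: int  # 1, 2, or 3


def probe(args: List[DP]) -> Optional[DP]:
    """Return the DP valued by the phi-probe, or None if probing is disrupted.

    `args` is ordered from structurally highest to lowest.
    Minimality: only the closest DP is inspected; a dative there intervenes.
    """
    if not args:
        return None
    closest = args[0]
    return None if closest.case == "dat" else closest


def is_licit(args: List[DP], has_probe: bool = True) -> bool:
    """Toy person-licensing check over a list of internal arguments."""
    if not has_probe:
        # Assumption: with no phi-probe in the clause, nothing is violated.
        return True
    goal = probe(args)
    for dp in args:
        # A 1st/2nd person non-oblique argument must be the probe's goal.
        if dp.case != "dat" and dp.person in (1, 2) and dp is not goal:
            return False
    return True
```

Under these assumptions, `is_licit([DP("dat", 3), DP("abs", 1)])` models the illicit dat≫abs configuration in (12), while the same pair in the opposite order, or a lone absolutive argument, is licit regardless of person.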
A tantalizingly simpler version of the PLC has been put forth in the literature (see, e.g., Baker 2008: 126-150, Béjar & Rezac 2003: 53, a.o.; cf. Nichols 2001: 525-526); it can be stated as follows: a [participant] feature on a DP must participate in a valuation relation. This simpler formulation has proven inadequate, however. As Preminger (2011b) has shown, [participant]-bearing nominals are able to occur in positions that could not have been targeted for agreement. In Basque, for example, the absolutive argument of a ditransitive can be 1st/2nd person, provided that the clause is not one in which agreement is observed:
(15) Basque (Laka 1996)
Gaizki irudi-tzen φ-zai-φ-t [ zuk ni harakin-ari sal-tze-a ].
wrong look-impf 3.abs-√-sg.abs-1sg.dat you.erg me(abs) butcher-art.sg.dat sold-nmz-art.sg.(abs)
'It seems wrong to me for you to sell me to the butcher.'
This is so despite the fact that dative arguments are higher than their absolutive co-arguments in ditransitives like saldu ('sell') (Elordieta 2001), and that non-clitic-doubled datives are interveners for agreement in Basque (Preminger 2009). In other words, the [participant] feature on the absolutive argument in the embedded clause in (15) could not have possibly participated in a valuation relation; and yet no PLC violation arises. The same can likely be said for 1st/2nd person nominative arguments in certain embedded infinitives in Icelandic (Preminger 2011b: 932).
This means that the PLC cannot be reduced to properties of the ϕ-probe or the goal DP unto themselves; instead, it must be viewed as a relational or processual requirement. There was at one point a prominent hypothesis, embodied in Chomsky's (2000; 2001) uninterpretable features system, which stated that constraints like the PLC should always be reducible to a representational property of one (or both) of the elements under consideration. As Preminger (2014) has shown, however, this hypothesis is false: even the obligatoriness of regular ϕ-agreement cannot, in the general case, be reduced to properties of the probe and/or the goal alone, and must instead be thought of as a relational or processual requirement (Preminger 2014: 85-100). The PLC, as formulated in (13-14), is just one more such case.
Before concluding, let me address a putative alternative view of these results. Suppose that we were to concede everything stated in this section so far, but insist that the Person Licensing Condition was itself a morphological filter: the requirement would not be that [participant] on a DP participate in a (syntactic) valuation relation, but rather that, if the verb is capable of reflecting agreement morphology, 1st/2nd person morphology on a DP must be reflected on the verb. The factors determining whether agreement with the non-oblique DP did or did not occur would still be syntactic, but the constraint responsible for the PCC would operate in the morphological component.
The problem with such a view is, again, that it flies in the face of what the separation of syntax from morphology is supposed to accomplish in the first place. Note that the PLC can be satisfied at arbitrary linear and structural distance, as far as the representation handed off from syntax to morphology is concerned; that is because the nominal targeted by the ϕ-probe can subsequently undergo movement to an arbitrarily distant position, as in (16). Importantly, this is not just an artifact of the PLC as a theoretical construct. Basque, for example, allows scrambling of DP arguments, and thus, the DPs involved in PCC effects in Basque can be arbitrarily far away from the finite verb or auxiliary which hosts the relevant morphology.
While there have been recent proposals that allow morphology to traffic in objects like chains, copies, traces, etc. (cf. Marantz 1991; Bobaljik 2008), modular separation should entail some difference in the sets of primitives available to each module. That is not to say that the two sets should be disjoint: there must be some overlap in the primitives of syntax and morphology, otherwise the output of one would be wholly illegible to the other. Heads and their features seem like good candidates to fill this role of "shared vocabulary" between syntax and morphology. But if there was ever a candidate for a primitive that is syntactic but not morphological, it would be the chains/copies/etc. formed by syntactic movement. Consequently, insofar as there is any meaningful distinction between syntax and morphology, relations like those in (16) (those that can cross arbitrary linear and structural distances) should be the purview of syntax and not of morphology.
There is no adequate morphological implementation of the PLC, then, in any meaningful sense of the term. And because the PLC is necessary to derive the kind of hierarchy-sensitive PCC effects surveyed here, there is no viable account of the PCC that is truly morphological, either.

The sensitivity of the PCC to the overtness of agreement morphology, and the consequences of this sensitivity
As briefly noted in section 3, the PCC is known to be restricted to those environments where overt ϕ-feature agreement with internal arguments is found. (The term overt ϕ-feature agreement is meant descriptively, and collapses ϕ-agreement together with clitic doubling. This distinction, and its consequences, are the topic of section 6.) As an example, consider Hebrew ditransitives. In Hebrew, when the dative argument precedes the accusative (=non-oblique) one, the dative is the hierarchically higher of the two arguments (Landau 1994; Preminger 2006). This is demonstrated in (17), where gender is used to disambiguate the antecedent of the bound reflexive:
(17) Hebrew: dat≫acc
… Ha-mehapnet-et ta-cig la-cofe et acmo.
the-hypnotist-F fut.3sg.F-introduce dat.the-spectator.M acc refl.M
'The (female) hypnotist will introduce the (male) spectator to himself.' (lit. 'The (female) hypnotist will introduce [to the (male) spectator] [himself].')
Thus, dative-first ditransitives in Hebrew show the same hierarchical order of internal arguments as their Basque counterparts in (3). However, Hebrew lacks overt agreement with internal arguments; accordingly, no PCC effects arise, as shown in (18). (There is, of course, overt ϕ-agreement with the subject in (18); but that is irrelevant to the distribution of PCC effects among multiple internal arguments of, e.g., a ditransitive verb.)
This example, demonstrating the PCC, involves a finite clause, an environment which, in Basque, is associated with overt ϕ-agreement morphology. In contrast, non-finite environments in Basque, including nominalizations, exhibit no ϕ-agreement morphology, and in particular, no overt agreement with the internal arguments. Crucially, if we take the same verb with the very same combination of arguments, and place it in such an environment, PCC effects disappear. This can be seen in (20), repeated from (15):⁵
(20) Gaizki irudi-tzen φ-zai-φ-t [ zuk ni harakin-ari sal-tze-a ].
wrong look-impf 3.abs-√-sg.abs-1sg.dat you.erg me(abs) butcher-art.sg.dat sold-nmz-art.sg.(abs)
'It seems wrong to me for you to sell me to the butcher.'
Note that this is a fact about overt agreement morphology, not a fact about finiteness. We can see this by comparing Basque to Spanish, for example. Spanish also exhibits overt agreement morphology reflecting the ϕ-features of internal arguments (or more precisely, it has internal argument clitics; see section 6 for further discussion). But unlike Basque, Spanish infinitives retain this agreement morphology. Accordingly, in Spanish, the PCC persists even in infinitives, as in (21a-b).
It has been claimed that some languages that lack overt agreement with internal arguments (notably, English and Swiss German) nevertheless show PCC effects when only weak pronouns are involved (Bonet 1991; Haspelmath 2004; see also Anagnostopoulou 2008). It is worth noting, however, that the judgments in these languages that purport to differentiate sentences with 1st/2nd person weak-pronoun Themes from their 3rd person counterparts are quite subtle (as is sometimes acknowledged in the literature on this topic). This seems quite distinct from the PCC effects in (19, 21a-b), for example, which give rise to very clear judgments.⁶ Moreover, a sensitivity to the strong vs. weak pronoun distinction would be somewhat unexpected were the English and Swiss German facts of a piece with the other PCC facts discussed here. The reason is as follows: while it is true that in some languages, the use of strong pronouns ameliorates PCC effects that would have arisen with their weak-pronoun counterparts, this amelioration seems to be associated with the addition of a PP, or similar oblique structure, around the pronouns in question. This is overtly discernible in French and Spanish, for example. In languages where there is no such structure around strong pronouns, as in Basque (and Icelandic; see fn. 5), the strong vs. weak distinction does not seem to affect the PCC.⁷
Now, crucially, both English and Swiss German are like Basque in lacking any discernible oblique structure around strong pronouns. The fact that the relevant effect in English and Swiss German is nevertheless sensitive to the strong vs. weak pronoun distinction suggests that it is of a different sort, namely, something about the licensing conditions on weak pronouns in these languages. I therefore conclude that it is a reasonable move to tentatively exclude these cases from the empirical domain that falls under the heading of "PCC effects" (at least as it is applied to cases like (3b, 21a-b)).
In sum, real PCC effects seem to come and go along with overt ϕ-agreement with internal arguments, both cross- and intra-linguistically. Let us now juxtapose these facts with the results in section 4. As noted earlier, this sensitivity of the PCC to overtness was a central motivation for the view of the PCC as a morphological filter; but the results of section 4 render such an approach untenable. What we have in the PCC, then, is a syntactic effect par excellence, which nevertheless only arises in the presence of overt agreement morphology. This immediately raises the following question: How can something in narrow syntax be sensitive to the overtness of agreement morphology?
As best I can tell, the only possible answer that maintains the modularity of syntax vs. morpho-phonology is that the mechanisms of agreement and dative intervention, which are implicated in the PCC, are only in place when we can see them. To put this another way, there is no such thing as "null agreement." Importantly, this refers to agreement that is null across the entire paradigm; there is of course no prohibition against particular cells being null in what is otherwise an overt paradigm, as the PCC still arises when agreement with internal arguments has some null cells but is otherwise overt.
Thus, the PCC goes away in the absence of overt agreement morphology (e.g., in Hebrew, and in non-finite clauses in Basque) not because it is a morphological filter. (We already saw in section 4 that it cannot be a morphological filter.) It goes away because in the absence of overt agreement morphology, there is simply no agreement there in the syntax. Not even "abstract" agreement.
I will refer to this as the no-null-agreement generalization:⁸
(22) the no-null-agreement generalization
There is no such thing as morpho-phonologically undetectable ϕ-feature agreement.
I label this a generalization rather than a principle, for reasons that will become apparent in section 7.
(i) Gli hai presentato me/noi.
3sg.dat(weak) have.2sg introduced 1sg(strong)/1pl(strong)
'You introduced me/us to him.'
Nothing in the current discussion rules out the possibility of phonologically null prepositions, so we may hypothesize that the strong pronouns in an example like (i) come wrapped in covert prepositional structure of this sort. Such a move might seem ad hoc in the context of (i) alone, but in fact there is well-documented microvariation concerning the oblique marking of strong pronouns (as well as Differential Object Marking) across different varieties of Romance (see, e.g., Suñer 1988), which we may then take to be variation precisely in the overtness of the relevant prepositional structure.
⁸ If there is indeed no null agreement, it would constitute another reason why we cannot say that any [participant] feature, wherever it may occur, must be licensed by agreement (alongside the reasons discussed in section 4; see the discussion of (15) in particular). That is because there are plenty of environments where 1st/2nd person pronouns can appear and not be targeted by overt ϕ-agreement (objects in languages without object agreement; complements of prepositions in languages where prepositions do not agree; or virtually any environment in languages that lack overt ϕ-agreement altogether). If there can be no null agreement, then we cannot say that 1st/2nd person pronouns in such environments are licensed by agreement.

The clitic-doubling caveat
In describing the distribution of PCC effects in section 5, I abstracted away from an important detail: the distinction between ϕ-agreement in the narrow sense, and clitic doubling. To accurately capture the intra- and cross-linguistic distribution of PCC effects, we need clitic doubling (and/or syntactic cliticization) of internal arguments to also count as "overt agreement morphology." To see why this is an issue at all, let us first review the central properties of the two kinds of relations.

Some background on clitic doubling
The term ϕ-agreement refers to a valuation relation between a functional head H⁰ and a DP, as the result of which the ϕ-feature values associated with the interpretation of the DP ([participant], [plural], etc.) come to be expressed on H⁰. Agreement morphology that arises in this manner is the spellout of valued features on a functional head. There is therefore no particular reason to expect that the exponents of these features will resemble the free-standing pronouns of the language.⁹ Moreover, it is possible for these exponents to exhibit allomorphy, and even suppletion, based on the (other) features of the head H⁰ (see Arregi & Nevins 2008; 2012, a.o.). A widespread example of the latter would be the agreement exponents in one tense/aspect configuration differing from those found in another tense/aspect configuration.
Clitic doubling refers to the occurrence of a D⁰-like morpheme, which is ϕ-feature-matched to the doubled DP, and appears alongside an appropriate host. As such, doubled clitics do not exhibit allomorphy based on the features of their host (Arregi & Nevins 2008; 2012). Furthermore, we may expect that at least in some cases, doubled clitics will bear morpho-phonological resemblance to the free-standing pronouns of the language. Note: I restrict the use of the term clitic doubling to those languages and constructions where the full noun phrase is in argument position, and the relation between the clitic and the full noun phrase exhibits at least some properties characteristic of syntactic movement (see Anagnostopoulou 2006; to appear, for a review).
There are several points concerning clitic doubling that merit mention at this juncture. The first point is that clitic doubling is not, generally speaking, optional; nor is it conditioned by nominal properties like animacy, definiteness, and/or specificity, in the general case. Clitic doubling in (24), for example, is entirely obligatory, irrespective of the properties of the doubled nominals. In languages where clitic doubling appears to be conditioned by such nominal properties (e.g., Porteño Spanish (25a-b)), it is likely not the clitic-doubling operation itself that is sensitive to these properties. Instead, these properties regulate movement of the full noun phrase into a position from which clitic doubling is then both possible and obligatory (Diesing 1992; Uriagereka 1995; Alexiadou & Anagnostopoulou 1997; Sportiche 1998; Suñer 2000; Merchant 2006; Nevins 2011, a.o.).
Porteño Spanish (Suñer 1988
Importantly, these nominal properties (animacy, definiteness, specificity) are known to regulate movement of noun phrases even in languages that lack clitic doubling entirely (cf. Diesing & Jelinek 1993; Diesing 1997, a.o., on Object Shift). Since the possibility already exists for phrasal movement to be sensitive to these properties, it would be redundant to build this sensitivity into the clitic-doubling operation as well (cf. also indiscriminate obligatory clitic doubling, as in (24)).
The second point concerning clitic doubling is that the doubled noun phrase is known to behave, for the purposes of locality, like traces of A-movement (Anagnostopoulou 2003, a.o.), which are known to be non-interveners for other ϕ-agreement and A-movement operations (Holmberg & Hróarsdóttir 2003, a.o.). As an example, consider the status of dative noun phrases in Basque. As shown by Preminger (2009), dative noun phrases that have not been clitic-doubled are interveners for ϕ-agreement in Basque (including agreement in number) when attempting to target an absolutive DP for agreement. In (26a) the matrix finite auxiliary can successfully target the embedded absolutive DP for agreement in number. However, when the benefactive PP in (26a) is replaced with a bona fide dative DP, as shown in (26b) (and note that, crucially, there is no clitic doubling of dative DPs across clausal boundaries), the same agreement relation is rendered impossible.
Now contrast this state of affairs with what we saw earlier with monoclausal ditransitives in Basque, e.g. (24). There, number agreement with a plural absolutive DP was possible (and, in fact, obligatory), despite the presence of a dative DP. This is not because of the relative positions of the two arguments; the dative argument in ditransitives is systematically higher than the absolutive one (see section 4). The reason number agreement with the absolutive goes through in this case is because the dative DP has been clitic-doubled, rendering it a non-intervener for subsequent probing (note the 1st person clitic da on the auxiliary in (24), vs. the absence of any dative clitic whatsoever on the auxiliary in (26b)).¹⁰
The third point regarding clitic doubling concerns its relation to syntactic cliticization (i.e., the occurrence of a clitic that does not seem to double a full noun phrase), such as what we find with object clitics in a language like French. As in the case of sensitivity to nominal properties, here too we can reason by appealing to that which we know is independently necessary. Consider the following: (i) recourse to something like pro is necessary even in languages that lack clitic doubling entirely; and (ii) we already need a mechanism of clitic doubling to account for languages in which (pronounced) noun phrases are doubled by clitics. Given these premises, it would be redundant to assume a separate, third mechanism for (syntactic) cliticization. It can instead be derived directly from these independently-motivated premises, as an instance of clitic doubling of pro. Thus, for example, apparent cliticization of objects in French would amount to clitic doubling of pro (which, in French, would be licensed only under clitic doubling of the object). To reiterate, (i)-(ii) are necessary regardless of what the theory of (syntactic) cliticization is, and together, they are sufficient to derive it.
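The contrast between clitic-doubled and non-clitic-doubled datives for number agreement can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not an implementation of any published analysis: the names `Nominal` and `number_probe`, and the `clitic_doubled` flag, are hypothetical, and the list passed to the probe is assumed to run from the structurally closest nominal to the most distant one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Nominal:
    case: str                     # hypothetical labels: "abs" or "dat"
    number: str                   # "sg" or "pl"
    clitic_doubled: bool = False  # True if the nominal has been clitic-doubled


def number_probe(path: List[Nominal]) -> Optional[str]:
    """Return the number value found by the probe, or None on intervention.

    `path` is ordered from the nominal closest to the probe to the farthest.
    """
    for dp in path:
        if dp.case == "dat":
            if dp.clitic_doubled:
                # Assumption: behaves like an A-movement trace, so it is skipped.
                continue
            # A dative that has not been clitic-doubled intervenes.
            return None
        # The first visible non-dative nominal values the probe.
        return dp.number
    return None
```

On this sketch, a clitic-doubled dative above a plural absolutive, as in the monoclausal ditransitive (24), yields plural agreement, whereas the same configuration with a non-doubled dative, as in (26b), yields no agreement at all.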

Clitic doubling and the PCC
Having briefly surveyed the properties of clitic doubling and how it differs from ϕ-agreement in the narrow sense (in creating new antecedents for binding, in its patterns of allomorphy, and in its effects on subsequent syntactic intervention effects), we can now return to the issue that clitic doubling raises for our characterization of the distribution of the PCC.As noted at the outset of section 6, the generalization that the PCC occurs only where overt agreement morphology with internal arguments is found is only correct if the term overt agreement morphology covers both ϕ-agreement and clitic doubling.The question, now, is why this would be so; after all, the results of section 4 demonstrate quite clearly that what is at issue when it comes to the PCC is the mechanisms of ϕ-agreement and intervention in syntax.Why, then, would clitic doubling also suffice to give rise to PCC effects?
One answer that we can dismiss quite easily is that clitic doubling "counts" as ϕ-agreement for the purposes of PCC distribution because clitic doubling is ϕ-agreement.The differences surveyed in section 6.1 show quite clearly that it is not (and see Anagnostopoulou 2006; to appear, and references therein, on further differences between the two phenomena).
Let us assume, then, that clitic doubling is an instance of movement. This is an explicit part of "Big DP" analyses of clitic doubling (see Torrego 1992; Uriagereka 1995; Belletti 2005; Cecchetto 2005; Craenenbroeck & Koppen 2008; Arregi & Nevins 2012, a.o.); several other approaches to clitic doubling include a movement component, as well (see Sportiche 1998; Roberts 2010; Harizanov 2014, a.o.). But why would clitic doubling, qua movement, behave for the purposes of PCC distribution as though it were agreement? A seemingly promising direction involves the idea that all DP movement is prefigured by a corresponding agreement relation in ϕ-features. This view has become very popular in recent syntactic literature, following Chomsky (2000; 2001). On this view, DP movement is in fact a two-step process:
(27) a two-step approach to DP movement (Chomsky 2000; 2001)
① H⁰ enters into an Agree relation in ϕ-features with the DP α
→ and subsequently/consequently:
② α moves to the domain of H⁰ (=[Spec,HP])
If this were how DP movement always worked, it could explain why clitic doubling "counts" as agreement as far as the PCC is concerned. The explanation would go as follows: clitic doubling, being a movement operation of the relevant kind, invariably requires a prior agreement step; and it is this agreement step that is relevant to the PCC (in the manner described in section 4).
It has become clear, however, that (27) is incorrect, at least as a general requirement. While some instances of DP movement (or A-movement more generally) obey (27), there are other instances that do not. The reader is referred to Preminger (2014: 157-175) for a more comprehensive discussion of the issues, but I will mention one clear counterexample to (27) here. In Icelandic, there are double dissociations between subjecthood and nominative case. In (28), for example, the subject (einhverjum stúdent 'some student.sg.dat') is non-nominative, and the nominative (tölvurnar 'computers.the.pl.nom') is a non-subject.
(See Andrews 1976; Thráinsson 1979; Zaenen, Maling & Thráinsson 1985; Sigurðsson 1989; Harley 1995; Jónsson 1996, among many others, for arguments that this is indeed the correct analysis of an example like (28).)
(28) Icelandic (Holmberg & Hróarsdóttir 2003: 999)
[Einhverjum stúdent]_i finnast t_i tölvurnar ljótar.
some student.sg.dat find.pl computers.the.pl.nom ugly
'Some student finds the computers ugly.'
Of particular interest here is A-movement of the dative einhverjum stúdent ('some student.sg.dat') to subject position. This phrase is not the target of any overt ϕ-agreement in (28); overt ϕ-agreement is controlled by the nominative.¹¹ The standard response to these facts is an appeal to abstract ϕ-agreement, in an attempt to salvage (27). The idea is that a dative subject like einhverjum stúdent ('some student.sg.dat') is still agreed with, as a precursor to its ultimate movement to subject position, but this ϕ-agreement just happens to lack any morpho-phonological content. This, however, is untenable: we have already seen, in section 5, that there can be no such thing as truly abstract (i.e., morpho-phonologically undetectable) ϕ-agreement.
It is worth noting that Icelandic is perfectly well-behaved with respect to those properties that were the topic of section 5. The PCC covaries with the presence of internal-argument ϕ-agreement, and Icelandic does not have internal-argument PCC effects (e.g., in ditransitives). Thus, we would not expect it to have ϕ-agreement with datives, overt or otherwise. There are indeed person restrictions reminiscent of the PCC in Icelandic, but they crucially affect only the nominative argument (see Sigurðsson & Holmberg 2008 for details). This is exactly as one would expect if only the nominative is ever targeted for ϕ-agreement of any kind, and the dative can only serve as an intervener, never as an agreement target. (See Preminger 2014: 129-170 for an independent argument that datives in Icelandic cannot be targeted by ϕ-probes, and instead cause the cessation of probing.) Sentences like (28) therefore provide evidence against the idea that all A-movement (or all DP movement) involves a prior ϕ-agreement relation. Thus, the argument that clitic doubling gives rise to PCC effects because it is DP movement, and all DP movement involves ϕ-agreement, fails.
What we are in search of, then, is a reason why clitic doubling constructions, in contrast to movement in the general case, necessarily involve syntactic ϕ-agreement (which then gives rise to PCC effects).

Clitic doubling as long head movement
I follow Rezac (2008a), Roberts (2010), and others, in assuming that clitic doubling is an instance of head movement, specifically head movement of D. Moreover, this instance of head movement is non-local, in the sense that it "skips" at least one c-commanding head in its path, thus violating Travis' (1984) Head Movement Constraint (HMC). To see this, let us consider what it would look like if clitic doubling did comply with the HMC. Because the clitic originates in the complement position of the lexical verb, HMC-compliant head movement cannot alter the basic constituent structure given in (29). (Notation: "√/V" = the lexical verb root; "D_cl" = the clitic.)
(29) {auxiliary/tense/aspect/finiteness, {transitivity/voice, {D_cl, √/V}}}
Of course, various elements of (29) may be null in a particular construction or throughout a given language; but if they are overt, (29) is the constituent structure predicted by the HMC.
(According to the HMC, heads can only move into the immediately c-commanding head position, and there can be no excorporation, i.e., a subsequent head-movement step would have to pied-pipe the entire complex constituent formed by the previous head-movement step.) What we actually find, however, does not match this predicted constituent structure. (30) is an instance of (syntactic) cliticization, rather than clitic doubling per se; but recall that cliticization is assumed to simply be clitic doubling of pro (see section 6.1). The constituent structure of an example like (30) is the following (cf. (29)):
(31) {{D_cl, auxiliary/tense/aspect/finiteness}, {√/V(, t_D)}}
Thus, clitic doubling (or cliticization) viewed as head movement is movement of D at least as far as v (hence necessarily skipping over √/V), and often further still. In (30)/(31), for example, we see movement of D to T, skipping over √/V as well as v, and possibly other heads (e.g., Asp), too.
The question is not, then, how (30)/(31) could be so given the HMC, since the HMC is not inviolable to begin with. The question is why the HMC is often true, and what it is about cases like (30)/(31) (as well as (32a-b)) that allows them to violate it.¹² I return to these questions in section 7.5, below.
In the meantime, what we can glean from (30)/(31) is that clitic doubling (including instances of syntactic cliticization) has the following general structure (though the landing site can be higher than v⁰, of course):

The "double-pronunciation" problem
An immediate question raised by (34) concerns why D⁰ is pronounced twice: in cases of clitic doubling, the full noun phrase need not (and often, cannot) be pronounced without its determiner also being pronounced in situ (see (23d)). If clitic doubling is movement of D⁰, we might expect it to be pronounced only at its landing site, as is the case with many other instances of movement (including head movement).
To begin to understand the reasons for this "double pronunciation" of D⁰, let us first note that the form of D⁰ when it is a clitic is not always identical to its form when it is a determiner. Much of the early generative work on clitic doubling centered on varieties of Romance, where the two series of forms bear an overwhelming similarity to one another (though there are some instances where Romance determiners do differ from the corresponding clitic form). On the other side of the spectrum, Basque clitics, while similar to the free-standing pronouns in the language, are not particularly similar in form to Basque determiners (see Preminger 2009 and references therein). This is what we might expect if these are indeed two instances of the same element, D⁰, which occur in two different morphological contexts: as the head of the extended nominal projection, and as a clitic adjoined to a verbal head. A particular morphological context may, in a given language, give rise to allomorphy or even suppletion. This is par for the course, e.g., in the relation between finite verb forms and their infinitival counterparts. If finite T⁰ (or some feature borne by it, e.g., [past]) is a trigger for contextual allomorphy of a given verb root, then the relation between the finite and non-finite forms of that verb will be irregular. But if, for a given verb, T⁰ and its features are not an allomorphy trigger, the relation between the finite and non-finite forms of that verb will be morpho-phonologically transparent.
On this view, the hosts of Basque clitics (usually finite auxiliaries, but also a small number of verbs able to carry finite inflection) trigger a great deal of contextual allomorphy in the form of D, resulting in significant differences between the form of pronominal clitics and the corresponding determiners. Crucially, though, this is allomorphy based on the identity of the host (viz. the finite auxiliary), and not based on sub-features of the host (e.g., indicative mood vs. potential mood). Basque clitics indeed show no allomorphy of the latter kind. Thus, this does not conflict with the observations of Arregi & Nevins (2008; 2012) and Nevins (2011), noted in section 6.1, concerning the "tense-invariance" of morphosyntactic clitics.
The hosts of Romance clitics, on the other hand, trigger very little allomorphy of this sort, resulting in a high degree of similarity between the form of clitics and the form of the corresponding determiners. This does not yet account for the double-pronunciation phenomenon; but if we accept the parallel drawn here between finite T⁰ as an allomorphy trigger and clitic hosts as allomorphy triggers, we may ask whether double-pronunciation phenomena of the sort we find in the latter empirical domain are also found in the former empirical domain. And the answer is that they are. In what follows, I will consider Landau's (2006) results concerning the fronting of verbs (and verb phrases) in modern Hebrew. Landau (2006) discusses instances of topicalization in modern Hebrew in which verbs may be fronted, with or without their arguments (henceforth, predicate clefts). (To be precise, certain right-adjoined modifiers may also be fronted in this construction; see Landau 2006: 38n9.) Representative examples are given in (36):
(36) Hebrew (Landau 2006: 37)
a. Li-rkod, Gil lo yi-rkod ba-xayim.
inf-dance, Gil neg fut.3sgM-dance in.the-life
'As for dancing, Gil will never do so.'
b. Li-knot et ha-prax-im, hi kant-a.
inf-buy acc the-flower-pl, she past.buy-3sgF
'As for buying the flowers, she has done so.'
What is crucial for our present purposes is that, in examples like these, the verb stem (-rkod 'dance' in (36a), -knot 'buy' in (36b)) is pronounced twice. Landau's analysis of this instance of double pronunciation is based on the idea that the pronunciation or omission of each copy in a movement chain is negotiated at PF, and in a highly local manner. He assumes there are two different phonological requirements at play: one demanding a host for the affixes associated with T⁰, and one demanding that the left edge of the intonational contour associated with predicate clefting in Hebrew be anchored by the fronted verb. These different phonological requirements each force the pronunciation of a particular copy, resulting in the double-pronunciation effect seen in (36a-b).
I would like to propose a slightly different analysis of facts like (36a-b), one which also extends to the double pronunciation of D⁰ in clitic doubling contexts. The analysis is in some sense inspired by the work of Nunes (2004) and Bošković & Nunes (2007), and, in particular, by their focus on the mechanics of phonological chain reduction (i.e., the suppression of pronunciation of some copies in a movement chain) as the key to understanding doubling phenomena. Their leading idea is that in order to apply phonological chain reduction, the system must first recognize that the different instances of a single syntactic object are indeed copies of one another, and that this recognition can be obscured under certain circumstances. They focus on cases where one of the two copies occurs within a larger morphosyntactic unit, whose internal structure is not accessible to the linearization algorithm. In these cases, PF cannot identify the two instances as copies of the same object, and phonological chain reduction does not apply.¹³
This cannot be the whole story, though: consider that canonical instances of verb movement to T⁰ (e.g., in French) involve morphological merger of the verb with other material, as well (namely, with tense and/or agreement morphology), and yet this does not inhibit phonological reduction of the lower copy (or copies) of the verb in this case. Viewed from this perspective, the question is what sets apart predicate clefts, as well as clitic doubling (viewed as head movement), from more familiar instances of head movement like V/v-to-T.
Instead of this morphology-driven approach, I propose that the conditions on phonological chain reduction of head movement are as follows:
(37) conditions on phonological chain reduction of head movement
Let X⁰ be a head that undergoes movement to Y⁰, and let α be the lower copy of X⁰. α will be phonologically deleted iff either of the following conditions is met:
(i) α and Y⁰ are not separated by a phasal maximal projection (incl. XP)
(ii) X and Y are part of the same extended projection (Grimshaw 2000), and Y⁰ c-commands α in the surface structure (i.e., no constituent containing α but not Y⁰ has undergone subsequent movement to a position above Y⁰)
I readily concede that, even if true, (37) is a rather unwieldy beast. However, let us concentrate first on whether or not it is a correct characterization of the facts, starting with instances of maximally local (i.e., HMC-compliant) head movement. Given condition (37.i), whenever XP is not a phase, reduction will apply. That is because, in maximally local head movement, there is no other maximal projection relevant to (37.i). This is as desired; consider, for example, classic cases of noun incorporation (Baker 1988 et seq.). Here, the complement of V⁰ is NP, which is not a phase, and movement proceeds from N⁰ to V⁰. This correctly predicts that in this scenario, the lower copy of N⁰ will be deleted. The same applies to movement from T⁰ to C⁰, and to movement of the verb root (V⁰ or √⁰) to v⁰.
In the event that XP is a phase, reduction is still predicted to apply, so long as X and Y are part of the same extended projection (in the sense of Grimshaw 2000) and XP has not been moved from [Compl,Y]. This is the case, for example, with any instance of v⁰-to-T⁰ movement that does not involve predicate clefting: both v⁰ and T⁰ are part of the extended verbal projection, and vP has not been moved, meaning T⁰ still c-commands α (the original position of v⁰).
In a language with vP-fronting and v⁰-to-T⁰ movement, fronting of the vP means not only a violation of (37.i) (which never holds of v⁰-to-T⁰ movement), but also of (37.ii) (since the lower copy of v⁰ will be fronted together with the vP, but T⁰ will not be). Consequently, neither (37.i) nor (37.ii) holds in this scenario, and the conditions for phonological chain reduction are therefore not met. The result is the double-pronunciation effect, as is the case in modern Hebrew predicate clefting. (See Landau 2006: 46-50 for arguments that the fronted category in modern Hebrew predicate clefting is indeed vP, rather than VP.) In a language that lacks v⁰-to-T⁰ movement, on the other hand, (37) is rendered irrelevant (as there is no v⁰-to-T⁰ chain to which it could apply). In such a language, vP-fronting will result in pronunciation of the verb only within the fronted verb phrase (as is the case, e.g., in English).
Let us now turn to the following scenario: XP is a phase, and X and Y are not part of the same extended projection. Here, there is no way either (37.i) or (37.ii) could be satisfied. This is so regardless of whether the movement in question is maximally local (i.e., HMC-compliant) or not, since adding another projection between YP and XP would not alter either of the relevant factors. This, I argue, is precisely the state of affairs when it comes to clitic doubling: it is head movement of D⁰, which is part of the extended nominal projection, to a position outside of DP and within the extended projection of the verb (v⁰/T⁰/etc.). Given (37), we predict that clitic doubling will always be just that, namely doubling, because phonological reduction will never apply to the lower copy (the one contained in DP).¹⁴
¹⁴ For cases where different layers of the DP are cliticized using different clitics (Zamparelli 1995), I tentatively suggest that these involve the same head-movement mechanism detailed in the text, but from the outermost structural layer of what are different extended nominal projections. That is, moving the outermost head of a regular DP can result in different morphological spellout than, for example, moving the outermost layer of a partitive expression (assuming that morphology has access to the featural distinctions between the two heads). Cases where cliticization (rather than clitic doubling) is observed would be treated the same, except that the complement of the moving head is pro; see section 6.1. Thanks to a reviewer for raising this issue.
Finally, let us consider a case of non-local head movement in Breton. While vP qualifies as a phasal maximal projection situated between Y⁰ (=C⁰) and α (=v⁰), the two heads are both part of the extended verbal projection, and the c-command relations between them have not been disrupted by subsequent movement. This means that condition (37.ii) is satisfied, and phonological reduction of v⁰ is (correctly) predicted to apply.¹⁵
Let us now return to the theoretical status of the conditions themselves. It would be eminently fair to characterize (37.i-ii) as quite stipulative. I would nevertheless contend that they represent a (modest) step forward in understanding the doubling part of clitic doubling relative to existing accounts, which by and large arrive at this result by brute force. The "Big DP" analysis (Torrego 1992; Uriagereka 1995; Arregi & Nevins 2012, a.o.), for example, is tailored precisely to achieve this desideratum by base generating an already-doubled structure (in which a clitic and the actual to-be-doubled noun phrase form a constituent, from which the clitic subsequently sub-extracts). The same is clearly also the case for true base-generation approaches to clitic doubling (Sportiche 1998, a.o.). While (37) obviously begs for further explanation, it at least captures the behaviors of clitic doubling, predicate clefting, noun incorporation, and more common V/v-to-T-type head movement, an array which neither Landau's (2006) nor Bošković & Nunes' (2007) proposals are able to fully capture. For a proposal that could potentially derive (something like) (37) from more basic properties of syntactic movement and morphological composition, see, among others, Gribanova & Harizanov (2016).
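The predictions of (37) for the five configurations discussed in this subsection can be tabulated mechanically. What follows is only a minimal sketch in Python, not part of the proposal itself: the class `HeadChain`, its three boolean fields, and the function `lower_copy_deleted` are hypothetical names introduced for illustration, and each configuration is reduced by hand to the three factors that (37) mentions (whether a phasal maximal projection separates the two copies, whether X and Y share an extended projection, and whether Y⁰ still c-commands the lower copy on the surface).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeadChain:
    """A head-movement chain X0 -> Y0, reduced to the factors in (37).

    All field names are illustrative assumptions, not the paper's notation.
    """
    name: str
    phase_between: bool                 # a phasal maximal projection (incl. XP) separates the copies
    same_extended_projection: bool      # X and Y belong to one extended projection
    target_c_commands_lower_copy: bool  # Y0 still c-commands the lower copy on the surface


def lower_copy_deleted(chain: HeadChain) -> bool:
    """Return True iff (37) predicts phonological deletion of the lower copy."""
    condition_i = not chain.phase_between
    condition_ii = (chain.same_extended_projection
                    and chain.target_c_commands_lower_copy)
    return condition_i or condition_ii


# The configurations discussed in the text, encoded by hand.
CHAINS = [
    HeadChain("noun incorporation (N-to-V)", False, False, True),
    HeadChain("v-to-T without vP-fronting", True, True, True),
    HeadChain("v-to-T with vP-fronting (predicate cleft)", True, True, False),
    HeadChain("clitic doubling (D to v/T)", True, False, True),
    HeadChain("Breton long head movement (v-to-C)", True, True, True),
]

for chain in CHAINS:
    outcome = "single" if lower_copy_deleted(chain) else "double"
    print(f"{chain.name}: {outcome} pronunciation")
```

Encoding the factors as plain booleans sidesteps any commitment about how phasehood or extended projections are computed; the sketch only checks that the disjunction in (37) yields doubling for predicate clefts and clitic doubling, and reduction elsewhere.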

An A-over-A-like effect blocking head movement
Combining the results of section 7.1 and section 7.2, we have in place the essential ingredients of a theory of clitic doubling as long head movement of D⁰ out of its containing DP, with, for example, v⁰ serving as the landing site. We are now finally in a position to address the central goal of section 6, namely, answering why it is that clitic doubling "counts" as syntactic ϕ-agreement for the purposes of the PCC. Here, I build on proposals by Hornstein (2009: 72-74) and Roberts (2010: 33-40).
The central idea is that Bare Phrase Structure (Chomsky 1994) and Iterative Downward Search together yield an A-over-A-like effect, which under normal circumstances precludes head movement altogether. Crucially, however, we will see that this effect abates under particular conditions. Let us begin with the contribution of Bare Phrase Structure (BPS). The aspect of BPS relevant here is its conception of projection and, in particular, the fact that non-terminal levels of projection (previously thought of as "αP" and "α̅") are now viewed as additional instances of the very same syntactic object that constitutes the head (previously thought of as "α⁰"). Like many others, I continue to employ the pre-BPS notation for the sake of perspicuity; but the denotatum is a structure that, as far as the grammar is concerned, has the properties characterized on the right-hand side of (42). Accordingly, the maximal projection ("αP"), for example, cannot be distinguished from the minimal projection ("α⁰") in featural terms. The two are, by hypothesis, one and the same syntactic object, and it is logically impossible for there to be any featural distinction between an object and itself.
The two can therefore only be distinguished relationally, by inspecting whether a given instance of the object in question dominates and/or is dominated by other instances of the same object (α).
Let us now turn to Iterative Downward Search (IDS). The idea here is that a syntactic probe seeking a viable goal will scan the structure iteratively, using a search algorithm that has at least the following properties (for related ideas, see Kitahara 1994; Takano 1994; Koizumi 1995; Müller 1996; Kitahara 1997; Müller 1998):¹⁶
(43) adequacy conditions on IDS algorithm
a. If y asymmetrically c-commands x, then the algorithm for IDS will encounter y before it encounters x.
b. If y asymmetrically dominates x, then the algorithm for IDS will encounter y before it encounters x.
An example of an algorithm that meets (43a-b) is given in (44).¹⁷
¹⁶ Definitions: (i) y asymmetrically c-commands x iff y c-commands x and x does not c-command y. (ii) y asymmetrically dominates x iff y dominates x and x does not dominate y.
¹⁷ I assume that there is no actual freedom with respect to the search algorithm employed by the mental grammar. That is, the grammar employs exactly one such algorithm, and what we know about this algorithm is that it meets the conditions in (43a-b).
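A search procedure of the kind characterized by (43a-b), following the steps of the example algorithm (44), can be sketched in Python. This is an illustration under stated assumptions rather than a claim about the mental grammar: the `Node` class, its fields, and the function name are hypothetical; each node stands for a maximal projection in pre-BPS notation; and, in keeping with BPS, a projection and its head share a single feature set. The viability test is deliberately given access to features only, never to the label or to structural position.

```python
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, List, Optional


@dataclass
class Node:
    """A maximal projection ("XP" in pre-BPS notation).

    Under BPS, XP and its head are the same object, so one feature set
    serves for both. A node with no specifiers and no complement is a
    head that is also its own maximal projection.
    """
    label: str                       # for the reader only; the search never inspects it
    features: FrozenSet[str]
    specifiers: List["Node"] = field(default_factory=list)
    complement: Optional["Node"] = None
    is_phase: bool = False


def iterative_downward_search(
    sister: Node, is_viable: Callable[[FrozenSet[str]], bool]
) -> Optional[Node]:
    """Search downward from the probe's sister; return the first viable goal."""
    xp = sister                                  # start at the probe's sister
    while True:
        if is_viable(xp.features):               # dominating node is checked first (43b)
            return xp
        for zp in xp.specifiers:                 # specifiers before the complement (43a)
            if is_viable(zp.features):
                return zp
        if xp.is_phase:                          # phase boundary: halt with no goal
            return None
        if xp.complement is None:                # nothing further down: halt with no goal
            return None
        xp = xp.complement                       # continue with the complement


def has_phi(features: FrozenSet[str]) -> bool:
    """A purely featural search criterion (an assumption of this sketch)."""
    return "phi" in features


# A branching DP: the search halts at DP and cannot reach D0 inside it.
np_ = Node("NP", frozenset({"N"}))
dp = Node("DP", frozenset({"D", "phi"}), complement=np_)
vp = Node("VP", frozenset({"V"}), complement=dp)
print(iterative_downward_search(vp, has_phi).label)

# A non-branching D: head and maximal projection coincide, so it is returned.
bare_d = Node("D", frozenset({"D", "phi"}))
vp_bare = Node("VP", frozenset({"V"}), complement=bare_d)
print(iterative_downward_search(vp_bare, has_phi).label)

# A phase with no viable goal at its edge blocks the search.
cp = Node("CP", frozenset({"C"}), complement=vp, is_phase=True)
print(iterative_downward_search(cp, has_phi))
```

Because the criterion cannot tell a maximal projection from its head, the only way for the sketch to return a head is for that head to be non-branching, i.e. to be the first instance of the relevant object that the search encounters.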
(44) example of IDS algorithm
a. Let P be a syntactic probe, and let XP be P's sister
b. query: Is XP a viable goal? If so, halt with "XP" as the search result
c. For every specifier ZP of XP, query: Is ZP a viable goal? If so, halt with "ZP" as the search result
d. query: Is XP a phase? If so, halt with no goal
e. query: Does X⁰ have a complement? If not, halt with no goal
f. Return to step (b), using the constituent in [Compl,X] as the new "XP"
It might appear that the example algorithm in (44) is categorically unable to return a head (an "α⁰") as its output, since all the non-failing halting conditions (the ones that do not say "no goal") involve returning an "αP." But this is illusory; it is an artifact of the pre-BPS notation used in (44). For example, in step (44f), the constituent in [Compl,X] may itself be a head (i.e., non-branching). When the algorithm loops back to step (44b), all it can do is check whether the constituent in question matches the featural search criterion. It cannot determine, using featural means, whether it is an "αP" or an "α⁰" (see the discussion of Bare Phrase Structure, above). Thus, if the head in question matches the featural search criterion, it will be returned as the output of the algorithm.
Let us also make the following assumption concerning the search criterion employed in IDS:
(45) condition on IDS search criterion
The criterion used to determine whether a given node counts as a viable goal for the probe must be featural.
This condition prevents, for example, a search criterion that incorporates relational information (e.g., "x is not dominated by another projection of the same head").
Consider, now, the combination of (43b) and (45), as well as the consequences of BPS, discussed earlier. Condition (43b) entails that if a head has a projection other than itself, that projection will be encountered before the head. BPS entails that there is no featural basis on which different projections of the same head could be distinguished from one another. And condition (45) states that the criterion for what constitutes a viable goal must be featural. Taken together, the result is that if a head has a projection other than itself, IDS could not possibly yield the head as its search result. In particular, it will never be able to skip a maximal projection but still deem the head of that projection a viable goal, except in the trivial case where the head is the maximal projection. Following Hornstein (2009) and Roberts (2010), I will refer to this as an A-over-A-like locality condition on IDS. But note that there is no appeal here to a sui generis A-over-A principle; the effect is derived directly from the premises stated above.
If true, this locality condition would rule out the theory of clitic doubling sketched earlier in this section, which was based on (long) movement of D⁰ alone out of its containing DP. In section 7.5, I will suggest that this condition, like other locality conditions in syntax, is subject to the Principle of Minimal Compliance (Richards 1998; 2001), which means that it only holds once for any pair of relata. But before turning to that, I consider another possible response to this state of affairs.

The complementary locality conditions on head movement and phrasal movement
Taken at face value, the A-over-A-like condition identified in section 7.3 seems to rule out head movement altogether (except in the trivial case that the head is also the phrase). For Hornstein (2009), this suggests that head movement might be better modeled as a PF phenomenon, completely outside the purview of syntax. This is a fairly common position concerning head movement (Chomsky 1995; Brody 2000; Abels 2003, among many others), but it is often contested on the grounds that some instances of head movement appear to have interpretive effects (Lechner 2006; Hartman 2011, a.o.). Since it is trivially true that head movement affects pronunciation, if it turns out that it also affects semantic interpretation then it must occur in the part of the derivation that feeds both form and interpretation, i.e., syntax. In what follows, however, I suggest a different reason why we should be skeptical of attempts to remove head movement from the syntactic component.
As noted in section 7.1, Travis' (1984) Head Movement Constraint (HMC) is counterexemplified by several kinds of head movement. Nevertheless, it is beyond question that the HMC often holds; Emonds (1970; 1976), Travis (1984), and others would not have been able to make the observations they made if this were not the case. Let us contrast this with the state of affairs when it comes to phrasal movement. Here, the literature recognizes a condition known as anti-locality (Bošković 1994; Murasugi & Saito 1995; Bošković 1997; Ishii 1997; 1999; Saito & Murasugi 1999; Abels 2003; Grohmann 2003; Kayne 2005; Abels 2012). Specifically, there appears to be a ban on phrasal movement that is too local; there is a minimal amount of structural distance that phrasal movement must traverse. For the purposes of the current discussion, I will assume Abels' (2003) version of the constraint, which simply bans movement from the complement position of a given head to the specifier of the same projection (though see section 7.5 for a refinement of this proposal).

What is less often noted, however, is that these two locality conditions, on head movement and on phrasal movement, stand in a complementary relation to one another (or near-complementary, once exceptions to the HMC are considered). The picture that emerges is that phrasal movement cannot be maximally local, while head movement (in most cases) must be maximally local. One case where this complementarity is explicitly noted is in the work of Pesetsky & Torrego (2001), who assume that it holds without exception, and build it into their Head Movement Generalization (47). In other words, (47) states that if H⁰ attracts a feature on XP, then XP will move to [Spec,HP], unless XP is the sister of H⁰, in which case X⁰ will head-move to H⁰. This is an idealization, in that (47) entails the strict and invariant HMC, which, we have already seen, is not quite right. And even abstracting away from this issue, (47) does not derive the complementarity in question, it merely asserts it. In the remainder of this paper, I will propose a theory of the locality of head movement and its relation to phrasal movement that derives their (near-)complementarity. What I wish to emphasize here, however, is that if this complementarity of locality conditions is real, it constitutes an argument in and of itself that head movement should remain part of syntax. Modularizing the grammar is vacuous unless different modules make use of different primitives, and access different kinds of information. Phrasal movement clearly resides within syntax proper, since it often has semantic effects as well as phonological ones. It would be quite an odd coincidence, then, if a different operation, situated in a different module (e.g., PF), ended up satisfying complementary conditions to those that phrasal movement satisfies.
Of course, the strength of this argument hinges on the precise nature of this complementarity. If the complementarity is only approximate (as the aforementioned deviations from the HMC might suggest), then the coincidence would be less pronounced, and it would perhaps be less dubious to situate the two types of movement in different modules. I will argue, however, that we can do better. In the next subsection, I will present a theory for the locality of head movement and its interaction with phrasal movement, which, while allowing certain deviations from the HMC, has Abels-style anti-locality as its consequence. Crucially, the theory in question requires a computation that makes reference to both types of movement, and therefore requires them both to reside in the same computational module.

Head movement meets the Principle of Minimal Compliance (PMC)
The PMC
Richards (1998; 2001) argues for a principle that regulates the way the grammar enforces syntactic locality constraints in general. Consider, first, the following pair:
(48) a. *[Which book]_k did the journalist spread the rumor that the senator wanted to ban t_k?
b. ?[Which journalist]_i t_i spread the rumor that the senator wanted to ban [which book]_k?
One might be tempted to explain the contrast between (48a) and (48b) in terms of whether or not the lower wh-phrase, which book, undergoes movement. The idea would be that (48b) is well-formed because in this example, the wh-phrase in question has not undergone movement out of the Complex NP island. What Richards shows is that such an explanation is, at best, insufficient:
(49) Bulgarian (Richards 1998 […])
In contrast to its English counterpart, the pair-list question in (49b) involves overt movement of both wh-phrases, including the one that originates within the Complex NP island, and yet it is well-formed. It is also not the case that Bulgarian simply lacks the Complex NP Constraint. As (49a) illustrates, such movement is illicit in Bulgarian, too, when not accompanied by movement of a second wh-phrase (cf. the English (48a)).
Richards shows that locality conditions such as Subjacency (or whatever subsumes Subjacency as an explanation of the Complex NP Constraint) need only be satisfied once with respect to a given landing site. Once a single Subjacency-compliant wh-chain has terminated in a given CP periphery, subsequent wh-chains landing in the same position are exempt from similar locality conditions. Note that this explanation generalizes to the English data in (48a-b), as well, on the assumption that apparent in situ wh-phrases in English pair-list questions do move, albeit covertly (see also Nissenbaum 2000: 197-201). The converse crucially does not hold: the putative explanation of (48b) based on lack of movement of the second wh-phrase could not possibly generalize to (49b).
Importantly, such interactions are only possible among multiple wh-phrases landing at the same clausal periphery. Violations of island constraints are not ameliorated if they target a CP periphery that is not itself targeted by a separate, island-respecting movement chain. Compare (49b) with (50).
The same principle, Richards's Principle of Minimal Compliance (51), also explains why, in pair-list questions in Bulgarian, the two wh-phrases exhibit standard superiority effects, but in tuple-list questions involving more than two elements, there are no superiority effects among the (n-1) non-highest wh-phrases (see Richards 2001: 282). I will adopt a slight variation on (51) which, as far as I can tell, performs equally well with respect to the Bulgarian data discussed here, but which generalizes more readily to the head-movement scenario that is our current focus (cf. also Richards 2001: 199):
(52) Principle of Minimal Compliance (revised version)
Once a probe P has successfully targeted a goal G, any other goal G′ that meets the same featural search criterion, and is dominated or c-commanded by G (=dominated by the mother of G), is accessible to subsequent probing by P irrespective of locality conditions.
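The structural clause of (52) can be made concrete with a small sketch. The Python below is illustrative only: the `Node` class, the helpers `dominates` and `pmc_accessible`, and the toy tree are assumptions introduced here, not part of the proposal. The sketch also takes for granted that G′ already meets the same featural search criterion as G, and encodes only the requirement that G′ be dominated by the mother of G.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """A syntactic node with a pointer to its mother (hypothetical toy class)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, *kids: "Node") -> "Node":
        for kid in kids:
            kid.parent = self
            self.children.append(kid)
        return self


def dominates(a: Node, b: Node) -> bool:
    """True if a properly dominates b."""
    n = b.parent
    while n is not None:
        if n is a:
            return True
        n = n.parent
    return False


def pmc_accessible(g: Node, g_prime: Node) -> bool:
    """Structural clause of (52): G' is dominated or c-commanded by G,
    i.e. dominated by the mother of G. Featural match is assumed."""
    mother = g.parent
    return mother is not None and g_prime is not g and dominates(mother, g_prime)


# Toy tree: Top -> [M, W]; M -> [G, Z]; G -> [A]; Z -> [B]
top, mother, outside = Node("Top"), Node("M"), Node("W")
g, sister = Node("G"), Node("Z")
inside_g, inside_sister = Node("A"), Node("B")
top.add(mother, outside)
mother.add(g, sister)
g.add(inside_g)
sister.add(inside_sister)

print(pmc_accessible(g, inside_g))       # A is dominated by G
print(pmc_accessible(g, inside_sister))  # B is c-commanded by G
print(pmc_accessible(g, outside))        # W lies outside the mother of G
```

Stating the condition as "dominated by the mother of G" lets a single upward walk cover both the domination case and the c-command case.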

The A-over-A-like condition meets the PMC
Let us now reconsider head movement in light of the Principle of Minimal Compliance (PMC). Section 7.3 ended with the observation that Bare Phrase Structure (BPS), combined with Iterative Downward Search (IDS), specifically (43a) and (43b), appears to ban head movement altogether. No featural search criterion could possibly be satisfied by a head without also being satisfied by the maximal projection of that same head, and the latter will be encountered by the probe first. That is, if α⁰ and αP are distinct, it is αP that must be targeted. This was referred to as the A-over-A-like locality condition on probe-goal relations.
If this is a locality condition, then we predict that it would be subject to the PMC. Consequently, it is only the first relation targeting α that should be subject to this A-over-A-like condition. After the condition has been satisfied once, subsequent relations between the same probe and (some projection of) α are, by hypothesis, exempt from it. Therefore, we predict that it should be possible for a probe H⁰ to target the head of αP to the exclusion of other material in αP, as long as this is not the first relation initiated by H⁰ that targets (some projection of) α.
All of this is not enough to make head movement possible, however. Recall: (i) probes must search for their goals using a featural search criterion (45); (ii) the phrasal node ("XP") is, by hypothesis, featurally identical to the head ("X⁰"); and (iii) the phrasal node is unambiguously closer to the probe than the head is, in terms of the explicit iterative downward search algorithm (IDS) given in (44). So if movement of heads to the exclusion of the rest of their phrases is indeed attested, the impetus for moving the head alone cannot come from the search criterion. It must have a different source.
There is a persistent intuition in the syntactic literature that there is something fundamentally superfluous about phrasal movement. After all, the featural properties of the moving constituent are determined by its head; and so, if movement is a response to the featural needs of some higher attractor, phrasal material outside of the attracted head is not implicated in the mechanism that drives movement in the first place.¹⁸ In line with this intuition, I propose the following condition:
(53) Minimal Remerge: If X⁰/Xᵐⁱⁿ is movable, move only X⁰/Xᵐⁱⁿ.
Importantly, the antecedent of the conditional in (53) is often false. Without a previous relation in place between the probe and XP, a relation which would adhere to the A-over-A-like condition and satisfy the PMC (52), there would be no way for X⁰/Xᵐⁱⁿ to move on its own. The condition in (53) can only wield its influence when a previous relation of this sort is already in place.¹⁹
In sections 7.5.3-7.5.4, I will discuss in detail how (53) interacts with the A-over-A-like locality condition and the PMC, both in local configurations (where the relevant probe is the immediate sister of XP) and in non-local ones (where the probe is more structurally distant). In the meantime, let us note the following: once a probe H has employed a featural search criterion f to target XP, subsequent relations involving f between H and X will necessarily target the head X⁰ alone, because of (53).
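The interplay just described (first relation targets XP, later relations target X⁰) can be sketched procedurally. The following Python is a minimal illustration under stated assumptions: the `Proj` and `Probe` classes, the functions `search` and `find_head`, and the feature labels are all hypothetical names introduced here. The search routine is a simplified top-down stand-in for IDS, not the algorithm given in (44), and the PMC is reduced to a record of which (item, criterion) pairs the probe has already targeted.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Proj:
    """A projection of a lexical item. Under BPS, every projection of an
    item carries the same features (hypothetical toy representation)."""
    item: str
    feats: frozenset
    children: List["Proj"] = field(default_factory=list)


def search(node: Proj, f: str) -> Optional[Proj]:
    """Simplified top-down search: return the first node bearing f.
    A phrase is always reached before its own head (A-over-A-like effect)."""
    if f in node.feats:
        return node
    for child in node.children:
        hit = search(child, f)
        if hit is not None:
            return hit
    return None


def find_head(phrase: Proj) -> Proj:
    """Descend through projections of the same item to the minimal one."""
    node = phrase
    while node.children:
        node = next(c for c in node.children if c.item == node.item)
    return node


class Probe:
    def __init__(self, name: str):
        self.name = name
        self.satisfied = set()  # (item, criterion) pairs already targeted

    def relate(self, domain: Proj, f: str) -> Optional[Proj]:
        goal = search(domain, f)
        if goal is None:
            return None
        key = (goal.item, f)
        if key in self.satisfied:
            # Locality already satisfied once (PMC), so Minimal Remerge
            # applies and only the minimal projection is targeted.
            return find_head(goal)
        self.satisfied.add(key)  # first relation: must target the phrase
        return goal


phi = frozenset({"phi"})
d_head = Proj("D", phi)
np = Proj("N", frozenset({"n"}))
dp = Proj("D", phi, [d_head, np])
v_head = Proj("V", frozenset({"v"}))
vp = Proj("V", frozenset({"v"}), [v_head, dp])

h = Probe("H")
first = h.relate(vp, "phi")   # targets the whole DP
second = h.relate(vp, "phi")  # targets the D head alone
print(first is dp, second is d_head)
```

Note that `find_head` returns its argument unchanged when the goal is non-branching, so in that case the first and the subsequent relation pick out the same constituent.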

Locality, c-selection, and anti-locality
The relation that most often plays the role of satisfying the PMC, and thus rendering X⁰ movable, is c-selection. In configurations where a head H attracts a feature borne by its complement, the complement XP already stands in a c-selection relation with H (indicated in (54) with a wavy line):
(54) [tree diagram: H and its complement XP, linked by c-selection]
Being a relation between H and XP, c-selection satisfies the A-over-A-like locality condition. The PMC then dictates that subsequent relations between H and anything dominated or c-commanded by XP are licit. This means that X⁰ is now movable, and Minimal Remerge (53) can now wield its influence. And its influence will be to rule out phrasal movement of XP, and to only permit head movement of X⁰:
(55) [tree diagrams a-b: movement out of [Compl,H]; only head movement of X⁰ is permitted]
The reader will notice that (55a-b) is an anti-locality effect. In particular, it recapitulates Abels' (2003) version of the constraint, but with one important difference. Abels' version bans movement of an element from [Compl,H] to [Spec,HP], full stop. The system just presented predicts there would be one specific instance in which such movement would be licit, namely, when (unlike in (55)) the constituent in [Compl,H] is non-branching:
(56) [tree diagram: non-branching constituent in [Compl,H]]
This is because, on the current view, the effect in (55) arises through the interplay of c-selection, the PMC, and Minimal Remerge. What Minimal Remerge (53) mandates is that the minimal movable projection of an X be the constituent that undergoes movement. In a scenario like (56), where the moving X only has one level of projection in the first place, the effects of Minimal Remerge are vacuous. To put it another way, there is no penalty on moving the entire constituent in [Compl,H] in (56) because there is no smaller projection of X that could have moved.
18 Some examples are Chomsky (1995: 262ff.) ("The operation Move […] seeks to raise just F [the formal feature being attracted; O.P.]"), and Donati (2006: 29-30) ("Merge just enough material for convergence").
19 A reviewer points out that (53), when combined with the PMC (52), might erroneously predict that an example like the Bulgarian (49b) would be ill-formed. That is because the second, tucking-in instance of wh-movement should be free, by the PMC, to ignore the A-over-A-like condition and instead obey Minimal Remerge (53). That this is not so may ultimately be a matter of pied-piping effects in wh-movement (or whatever underlying mechanism subsumes these apparent effects; see Cable 2007; 2010). Alternatively, it may be that for the purposes of the PMC (52), each instance of the A-over-A-like condition is waived only with respect to nodes dominated by the original XP targeted, and not to those c-commanded by it. (This would require complicating the formulation of (52) accordingly, but perhaps the different nature of this locality condition vs. Subjacency-like conditions could be leveraged to derive this distinction.) I leave the choice between these possibilities for future work.
The question, of course, is whether this deviation from Abels' version of anti-locality is in fact warranted. One consideration that bears on this question involves Matushansky's (2006) theory of head movement. On Matushansky's approach, head movement involves a non-branching constituent undergoing regular syntactic movement into a specifier position, followed by m-merger between this specifier and the adjacent head. In light of this, consider movement of an intransitive V⁰ (or root) to v⁰. In this case, the step prior to m-merger is movement of the lower head to [Spec,vP]:
(57) [tree diagram: V⁰ (or root) moving from [Compl,v] to [Spec,vP]]
Because the moved constituent in (57) is non-branching, this movement would be in violation of Abels' version of the anti-locality constraint: it is movement of the entire [Compl,v] constituent into [Spec,vP].²⁰ Thus, if we want to maintain Matushansky's approach to head movement, we cannot maintain Abels' version of the constraint. Importantly, however, the version of anti-locality derived here is compatible with such movement (cf. (56), above).

The clitic-doubling caveat: long head movement revisited
Recall that clitic doubling is, by hypothesis, long head movement of D (section 7.1). Given the system laid out in sections 7.5.1-7.5.2, any instance in which a probe H triggers movement of a head X must be prefigured by another syntactic relation between H and XP (the maximal projection of X) involving the same featural search criterion. This is necessary in order to satisfy the PMC with respect to the relevant A-over-A-like locality condition (section 7.3). In cases of maximally local head movement, it was c-selection between H and XP that filled this role. But in cases of long head movement, H and XP do not stand in a sisterhood relation, making c-selection between the two impossible (cf. Chomsky 1994; 1995).
I propose that in cases of clitic doubling, the cliticization host H first enters into agreement with the full DP. It is this agreement relation that satisfies the PMC, enabling subsequent movement of the D head, on its own, to H:
(58) [tree diagram: H agrees with the full DP; D then moves to H]
This explains why it does not matter, for the purposes of PCC distribution, whether a given instance of agreement morphology is an instance of agreement proper (i.e., feature valuation on a functional head), or the result of clitic doubling. That is because clitic doubling still depends on establishing a prior agreement relation of the former kind. Syntactic agreement is therefore implicated in both types of agreement morphology, and as already shown in section 4, it is also the key to understanding PCC effects, especially in light of their sensitivity to finer syntactic hierarchy.²¹
An immediate question raised by this view of clitic doubling is its status with respect to the no-null-agreement generalization discussed in section 5, and repeated from (22):
(59) the no-null-agreement generalization
There is no such thing as morpho-phonologically undetectable ϕ-feature agreement.
20 We may rightly ask whether there truly are instances of non-branching verb heads of this sort. Unaccusatives certainly wouldn't fit the bill, since they involve an argument base-generated in [Compl,V]. However, it has been shown that at least some unergatives are truly intransitive, i.e., lack even so much as an implicit object in [Compl,V] (Preminger 2012). Depending on the analysis of weather predicates, they may constitute another example of a complementless V. Finally, if we take seriously the theory of category-less roots undergoing categorization in syntax, roots of result-nominals would stand in the same configuration as (57) relative to their categorizing n⁰, and these roots are uncontroversially argumentless.
In many (perhaps, most) cases of clitic doubling, there is no overt morpho-phonological expression of a prior agreement relation. In (60), for example, there is no overt exponence of an agreement relation between the cliticization host (v⁰) and the object. (The verb displays agreement with the subject, but there is no agreement with the object independent of the clitic.) And this is paradigm-wide, i.e., it is not a matter of the particular ϕ-features of profesor ('professor').
(60) Leísta Spanish (Bleam 1999: 45)
Le vi al profesor ayer.
cl I.saw a-the professor yesterday
'I saw the professor yesterday.'
Instances of clitic doubling do exist where, alongside the clitic itself, one finds overt agreement with the doubled argument, for example clitic doubling of subjects in certain Northern Italian dialects (Poletto 2000). While this may provide circumstantial support for the idea that clitic doubling is prefigured by syntactic agreement as in (58), it does not change the facts of (60) and many cases like it.
There is no way around the fact that if, as argued above, agreement with the object is implicated in cases like (60), then it is an agreement relation that stands in violation of the no-null-agreement generalization. And this, in turn, means that this generalization cannot be a steadfast, combinatorial principle of grammar. But we cannot abandon (59), either; recall that it was a necessary component of any adequate account of the distribution of PCC effects, given the evidence that the PCC was a fundamentally syntactic effect (sections 4-5).
Are we at an impasse, then? In the next section, I suggest that the answer is no, and that this apparent tension can be resolved by viewing (59) not as a grammatical principle unto itself, but as the outcome of a particular kind of acquisition strategy affecting the placement of unvalued ϕ-features on functional heads.

The nature of the no-null-agreement generalization: A conservative acquisition strategy for unvalued ϕ-features
In earlier sections, we saw evidence in favor of the no-null-agreement generalization, which states that there is generally no such thing as agreement that is morpho-phonologically null across the entire paradigm. Without this generalization, there would be no way to account for the distribution of the PCC, given that the latter is a syntactic effect par excellence (section 4), and yet it comes and goes with the presence of overt agreement morphology (section 5). In section 7, we saw an argument that clitic doubling involves a prior agreement relation between the cliticization host and the full DP of which the clitic is a subpart. Crucially, this agreement relation often goes unexponed, in apparent violation of the aforementioned generalization.
The solution I put forth is to view the aforementioned generalization not as a steadfast principle of grammar, but as the result of a particular kind of acquisition strategy. Before spelling out the strategy in detail, it is worth pointing out that were this generalization a principle of grammar per se, it would raise the same kind of modularity issue discussed in relation to the PCC in section 5. Consider: if agreement is a syntactic operation, then it occupies a part of the grammar where reference to the morpho-phonological content of terminals is impossible. The problem would be even more severe, in fact, because the principle would have to be trans-derivational: it is not the morpho-phonological content of a particular terminal in a particular derivation that is at issue, but rather the fact that some cells in a paradigm must be overt.
Instead, what I propose is that the no-null-agreement generalization, and its exceptions, arise because of how language acquisition proceeds. Specifically, the learner begins with the assumption that there are no unvalued ϕ-features on any functional heads (this includes T⁰ and v⁰). Recall that this does not pose any case-related problems (not even for languages with rich and easily evident case morphology), given the evidence that has accumulated in recent years against structural case being assigned by agreement (Preminger 2014, a.o.; see also section 1). There is then a specific and, crucially, limited set of triggers that would cause the learner to revise this hypothesis, and posit unvalued ϕ-features on a given functional head:²²
(61) triggers for learner to posit unvalued ϕ-features on a head H⁰
a. overt morpho-phonological covariance between the exponents of ϕ-features on H⁰ and the exponents of ϕ-features on DP
b. long-distance head movement (of a D head) to H⁰
Crucially, the list in (61) is anything but open-ended. It absolutely cannot include, for example, the existence of a binding or fake-indexical relation involving H⁰ (cf. Kratzer 2009). If such phenomena were also triggers for positing unvalued ϕ-features on H⁰, a proper account for the distribution of PCC effects would be rendered impossible. Recall that the PCC arises wherever there is overt agreement or clitic doubling (section 5); the presence of binding and/or fake-indexicals in a given construction does not suffice to give rise to PCC effects. The conclusion, already argued elsewhere on independent grounds (Preminger 2013; Preminger & Polinsky 2015), is that phenomena of the latter sort do not involve syntactic agreement in ϕ-features.
22 Two reviewers independently suggest a scenario that might be useful in further clarifying the relevant notion of overt paradigm. Suppose we had a language where verbs generally exhibited agreement morphology (whether ϕ-agreement or clitic doubling) controlled by internal arguments. And suppose furthermore that, in that language, there were one or more verbs which, exceptionally, did not show this morphology (i.e., their form was constant regardless of the ϕ-features of the internal arguments). On the one hand, we might expect the PCC to abate with these particular verbs (since their own agreement paradigms are, in fact, not overt paradigms). On the other hand, we might expect that the relevant generalization is established per-category (say, v⁰), not per every root-based allomorph of that category. This is an empirical question, but unfortunately at the time of this writing, I have not been able to find such a case and test it.
This proposal provides us with a "roadmap" for how a language with PCC effects is acquired. (Or, to be more precise, how a construction with a particular inventory of functional projections that ends up generating PCC effects is acquired.) The learner starts with the assumption that there are no unvalued ϕ-features on the relevant functional projection (say, v⁰). Very quickly, however, she will be driven to revise these assumptions, either because v⁰ shows morpho-phonologically overt covariance in ϕ-features with the direct object (as is the case in Basque, for example; Arregi & Nevins 2008; Preminger 2009; Arregi & Nevins 2012), or because there is a D associated with the direct object that cliticizes to (i.e., undergoes long head movement to) v⁰, as in Spanish. In the latter case, the learner can deduce with certainty that there must be a prior agreement relation between v⁰ and the direct object, for the reasons discussed in section 7.5.
Importantly, misidentifying clitic doubling as "pure" agreement, or vice versa, will be fairly innocuous at this stage, since in either case the learner will end up positing unvalued ϕ-features on the relevant functional head. This is a desirable property: while agreement and clitic doubling are clearly different phenomena, the kind of data that distinguish the two are fairly subtle (see section 6.1 and references therein). It is not unreasonable to assume that, in the course of language acquisition, one may initially be identified as the other; and, in fact, this may be the etiology of one type of language change, wherein pronominal clitics are reanalyzed as markers of agreement (i.e., valuation of formal features on a probe; see, for example, Gelderen 2011).
Either way, once the learner has posited unvalued features on v⁰, the PCC then arises as a direct consequence of agreement and intervention, as discussed in section 4.
On the other hand, the learner acquiring a language that lacks agreement morphology with internal arguments will never be driven to posit unvalued ϕ-features on v⁰. Consequently, as discussed in section 5, the PCC will not arise in such a language.
Finally, let us consider once more the status of pro arguments. Essentially the same acquisition profile obtains here: if the learner encounters agreement morphology on some head H⁰ in the verbal projection, but there is no overt argument corresponding to that morphology, she may conclude that this morphology is (i) "pure" agreement (i.e., valued ϕ-features on H⁰); or (ii) a D head adjoined to H⁰. In both cases, the learner will then be driven to posit unvalued ϕ-features on H⁰. This is trivially true in the former scenario; in the latter case, given that D is not part of the extended verbal projection, and that the only mechanism for syntactic cliticization is the one identified here, encountering a D head adjoined to a verbal projection would constitute unambiguous evidence for unvalued ϕ-features on H⁰ (in accordance with (61b)). Again, the overall result (regardless of the distinction between "pure" agreement and cliticization/clitic doubling) is the positing of unvalued ϕ-features on H⁰ and, consequently, the emergence of PCC effects.
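The acquisition strategy described in this section amounts to a simple decision procedure, which the following Python sketch tries to make explicit. It is a toy model under stated assumptions: the `Evidence` record, its field names, and the functions `posited_heads` and `pcc_possible` are hypothetical, and the inputs are schematic stand-ins for the kinds of data discussed above rather than analyses of actual languages. The learner starts with no unvalued ϕ-features on any head and revises this only in response to the two triggers in (61); binding and fake-indexical relations are recorded but deliberately ignored.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class Evidence:
    """One observation about a functional head (hypothetical toy record)."""
    head: str
    overt_phi_covariance: bool = False       # trigger (61a)
    long_head_movement_of_d: bool = False    # trigger (61b)
    binding_or_fake_indexical: bool = False  # not a trigger


def posited_heads(observations: Iterable[Evidence]) -> Set[str]:
    """Return the heads on which the learner posits unvalued phi-features."""
    posited: Set[str] = set()  # conservative starting hypothesis: none
    for ev in observations:
        if ev.overt_phi_covariance or ev.long_head_movement_of_d:
            posited.add(ev.head)
    return posited


def pcc_possible(head: str, posited: Set[str]) -> bool:
    """PCC effects presuppose unvalued phi-features on the relevant head."""
    return head in posited


overt_agreement = [Evidence("v", overt_phi_covariance=True)]
clitic_doubling = [Evidence("v", long_head_movement_of_d=True)]
binding_only = [Evidence("v", binding_or_fake_indexical=True)]

for name, data in [("overt agreement", overt_agreement),
                   ("clitic doubling", clitic_doubling),
                   ("binding only", binding_only)]:
    print(name, pcc_possible("v", posited_heads(data)))
```

Because both triggers lead to the same outcome, misclassifying clitic doubling as agreement (or the reverse) leaves the result of `posited_heads` unchanged, which is the innocuousness property noted above.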

Conclusion
This paper began by surveying some of the evidence that the Person Case Constraint (PCC) is sensitive to the kind of fine-grained hierarchical distinctions that characterize syntax proper. This means that in any system where there is a meaningful modular distinction between morphology and syntax, the PCC is part of the latter module. I then surveyed, in broad strokes, what a syntactic account of the PCC that is capable of deriving this sensitivity would look like (building on Anagnostopoulou 2003; Béjar & Rezac 2003; Anagnostopoulou 2005, a.o.), based on mechanisms of syntactic agreement and intervention.
Next, I turned to the fairly well known fact that the PCC seems coupled to the existence of overt agreement morphology with the arguments involved: as this morphology comes and goes, so does the PCC effect itself. This was shown to hold even intra-linguistically, as demonstrated by the distinction between finite and non-finite environments in Basque. I then juxtaposed this with the earlier results concerning the fundamentally syntactic nature of PCC effects, and its account in terms of agreement and intervention. Assuming that syntax does not make direct reference to the morpho-phonological content of terminals, this led to the conclusion that contexts that do not exhibit the PCC simply lack agreement with internal arguments altogether. I labeled this the no-null-agreement generalization.
An important caveat to this characterization involves clitic doubling: even though clitic doubling is a species of movement (as evinced by its ability to repair Weak Crossover violations), it behaves, for the purposes of the PCC, as though it were agreement. I then showed that we cannot maintain that all movement (or even just all DP movement or A-movement) is prefigured by ϕ-agreement. This therefore cannot be what underpins the clitic-doubling caveat.
I argued that a more promising alternative can be found by investigating the interplay of Bare Phrase Structure (Chomsky 1994), Iterative Downward Search (Kitahara 1994; Takano 1994; Koizumi 1995; Müller 1996; Kitahara 1997; Müller 1998), and the Principle of Minimal Compliance (Richards 1998; 2001). In particular, the idea is that movement always "strives" to move only the head, but this is seldom possible because Bare Phrase Structure and Iterative Downward Search together yield an A-over-A-like locality condition that demands that the entire phrase be the target of the syntactic relation. Crucially, however, when the attractor already stands in some prior relation with this phrasal node (e.g., c-selection, agreement), this satisfies the relevant locality condition. It follows, given the Principle of Minimal Compliance, that subsequent syntactic operations involving the same relata need not adhere to the same locality condition again, which is what enables movement of the head alone.
Clitic doubling, qua long head movement, cannot be prefigured by c-selection because it does not involve a sufficiently local configuration (namely, sisterhood). Some other syntactic relation must therefore be what satisfies the A-over-A-like locality condition in this case. I proposed that syntactic agreement is the relation that plays this role. Clitic doubling thus "counts" as syntactic agreement for the purposes of the distribution of PCC effects because it invariably involves an initial agreement step. The agreement involved in clitic doubling, however, seemed to pose a challenge for the no-null-agreement generalization, since in many clitic-doubling languages, there is no morphology indicating valuation appearing alongside the clitic itself.
I then showed how this picture could arise as the result of a conservative acquisition strategy, where the learner does not posit unvalued ϕ-features on functional heads unless and until faced with a particular kind of positive evidence. This type of strategy could give rise to the no-null-agreement generalization, as well as its clitic-doubling caveat.

[Diagram label: ✗, blocked by closer DAT-DP]

(…: 207)
a. [Kathe mitera]_i sinodhepse [vP t_i (t_v) [to pedhi tis_i]].
[every mother].nom accompanied [the child hers].acc
'[Every mother]_i accompanied [her_i child]_k.'
b. ?*[I mitera tu_k]_i sinodhepse [vP t_i (t_v) [to kathe pedhi]_k].
[the mother his].nom accompanied [the every child].acc
'[His_k mother]_i accompanied [every child]_k.'

As a further example of clitic doubling, consider the Basque sentence in (24):

(…: 396)
a. La_i oían [a Paca / a la niña / a la gata]_i.
cl hear.past.3pl a Paca / a the girl / a the cat
'They listened to Paca / the girl / the cat.'

*Kakvo_k kazva tozi služitel na [žurnalistite, kojto_i [t_i razsledvat t_k]], če
what tells this official to journalists who investigate that
komunistite sa zabludili redaktorite im?
communists aux deceived editors their
Intended: 'What_k does this official tell journalists who_i [t_i are investigating t_k] that the communists have deceived their editors?'

Principle of Minimal Compliance (orig. version; Richards 1998: 601)
For any dependency D that obeys constraint C, any elements that are relevant for determining whether D obeys C can be ignored for the rest of the derivation for purposes of determining whether any other dependency D′ obeys C.