The standard approach to Non-Distinctiveness, i.e., the “sameness” relation between constituents forming a chain under Copy Theory, involves an indexing mechanism that marks as non-distinct the syntactic objects created by the Copy operation. I argue that Non-Distinctiveness is better explained as an inclusion relation between the features of constituents in a phrase marker. A representational mechanism of chain formation based on this definition is shown to offer analytical and conceptual advantages with respect to wh-copying, non-identical wh-doubling and anti-reconstruction phenomena.
According to
(1) | a. | Cosmo was arrested. |
b. | [TP Cosmo [T’ was [VP arrested Cosmo]]] |
Since
(2) | a. | K = [TP wasi [VP arrestedj Cosmok]] |
b. | K = [TP wasi [VP arrestedj Cosmok]] | |
L = Cosmok | ||
c. | K = [TP Cosmok [T’ wasi [VP arrestedj Cosmok]]] |
As the index allows recognizing both occurrences of
(3) | Indexical-S |
Two constituents α and β are non-distinct if and only if they are assigned the same index/marking through an application of the Copy operation (or any other derivational procedure). |
There are two main theoretical problems with this approach to Non-Distinctiveness. First, it violates the Inclusiveness Condition in (4).
(4) | Inclusiveness Condition |
Any structure formed by the computation is constituted of elements already present in the lexical items. No new objects are added in the course of computation apart from rearrangements of lexical properties. |
Since indexes (or any kind of markings) are not inherent properties of any lexical item, the condition in (4) bans them.
While many useful proposals in the literature seem to depart from Inclusiveness, failing to obey (4) is particularly significant in the case at hand. Satisfaction of Inclusiveness is supposed to be one of the key advantages of Copy Theory over
Second, Indexical-S involves no real theory of Non-Distinctiveness; it is just a marking mechanism, an inductive device to get the right chains without further complications. In contrast, a true theory of Non-Distinctiveness should be able to explain on independent grounds (i) what kind of elements count as non-distinct for grammar and (ii) what kind of criteria are taken into consideration in such a calculus.
To put it in different terms, once Copy Theory is assumed, Non-Distinctiveness should be regarded as a theoretical problem that is similar to defining the identity conditions on ellipsis, a classic topic in linguistic theory since, at least, Ross (
The structure of the paper is as follows. In section 2, I introduce the premises that allow defining Non-Distinctiveness as an inclusion relation between the features of constituents in a phrase marker.
I pursue a theory of Non-Distinctiveness that relies on certain assumptions on grammatical features. The premise in (5) serves as a starting point for discussing them.
(5) | Syntactic objects are abstract sets of features without any phonological content. |
This is the (generalized) Late Insertion hypothesis.
Notice that the assumption in (5) treats features as the most basic unit of syntactic computation, so some definitions are in order. I follow Gazdar et al. (
(6) | Valued feature ( |
|
a. | A valued feature is an ordered pair <Att,Val> where | |
b. | Att is drawn from the set of attributes, {A, B, C, D, E, …} | |
c. | and Val is drawn from the set of values, {a, b, …} |
The set of attributes contains classes of features (e.g., Number or Gender), while the set of values contains morphosyntactic properties pertaining to these classes (e.g., singular, plural; or feminine, neuter). If a given lexical item expresses a particular property in a language (e.g., plural), it inherently carries the corresponding attribute (e.g., Number). Therefore, there are no privative syntactic features under these assumptions; any difference in behavior between two tokens of a constituent is due to opposing values of certain features.
Following Adger (
(7) | Unvalued feature | |
a. | An unvalued feature is an ordered pair <Att,∅> where | |
b. | Att is drawn from the set of attributes, {A, B, C, D, E, …} | |
c. | and ∅ needs to be replaced with an element from the set of values, {a, b, …} |
For simplicity, syntactic features that do not participate in processes based on valuation will be either replaced with ellipses (…) or represented as values {Val}. For instance, a categorial V-feature may be represented as {V}, and not as an ordered pair <Cat,V>.
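The definitions in (6) and (7) can be rendered as a small computational sketch. The following Python fragment is illustrative only: the attribute and value names are hypothetical, and `None` stands in for the empty value slot ∅ of an unvalued feature.

```python
# A minimal sketch of the feature formalism in (6)-(7): a feature is an
# ordered pair <Att,Val>, modeled here as a Python tuple. The labels
# "Number", "plural" and "Case" are illustrative, not part of the proposal.

def make_feature(att, val=None):
    """Build an ordered pair <Att,Val>; None stands in for the empty
    value slot of an unvalued feature (cf. (7))."""
    return (att, val)

def is_valued(feature):
    """A feature counts as valued iff its Val slot is filled (cf. (6))."""
    att, val = feature
    return val is not None

number = make_feature("Number", "plural")   # valued: <Num, pl>
case = make_feature("Case")                 # unvalued: <Case, ∅>

assert is_valued(number)
assert not is_valued(case)
```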
As stated in (7c), an unvalued feature <Att,∅> requires replacing the empty set ∅ with a value. I follow Chomsky (
(8) | Activity Condition ( |
A Goal G is accessible for Agree if G has at least one unvalued feature. |
The usual instance of activity/inactivity involves φ-agreement and Case assignment. A DP carrying an unvalued Case feature <Case,∅> is active for φ-agreement, so it may be a Goal for a Probe requiring φ-features. As a consequence of agreement, the Probe values the Case feature of the DP, turning it inactive for further φ-related operations. I assume that these mechanisms also hold for left-peripheral features.
(9) | The Activity Condition applies for both A and A’-dependencies. |
From now on, I will use the Greek letters κ and ω to designate activity-features for A and A’-dependencies, respectively. For concreteness, κ is simply an abbreviation for classic abstract Case, while ω is an attribute that allows a constituent carrying a left-peripheral value (e.g., Wh) to be targeted by a Probe in the C-domain.
According to (9), a wh-pronoun like
(10) | Who seems to be happy? |
The derivation of (10) involves four occurrences of who.
(11) | a. | [CP Who1 [TP |
b. | Who1{<κ, |
A principled definition of Non-Distinctiveness must determine the type of linking principle that binds together the copies of who in (11). Let V1, V2, V3 and V4 stand for the sets of values of who1, who2, who3 and who4, respectively; the relations holding between these sets are given in (12).
(12) | a. | V4 = V3 (i.e., {…} = {…}) |
b. | V3 ⊂ V2 (i.e., {…} ⊂ { |
|
c. | V2 ⊂ V1 (i.e., { |
The relations of identity between sets in (12a), and proper inclusion in (12b) and (12c) may be unified as a single type of relation: (improper) inclusion. That is because (i) for every set A identical to a set B, A is a subset of B (i.e., if A = B, then A ⊆ B), and (ii) for every set A that is a proper subset of a set B, A is a subset of B (i.e., if A ⊂ B, then A ⊆ B). In other words, the values of the copies of
(13) | a. | V4 ⊆ V3 (i.e., {…} ⊆ {…}) |
b. | V3 ⊆ V2 (i.e., {…} ⊆ { |
|
c. | V2 ⊆ V1 (i.e., { |
Given my assumptions, such an inclusion relation will arise systematically for every new copy of a constituent. Hence, it may be exploited to define Non-Distinctiveness. Call this definition Inclusion-S.
(14) | Inclusion-S |
A constituent β is non-distinct from a constituent α if for every value of β there is an identical value in α. |
The definition in (14) is a more formal version of a very intuitive idea: if the features of α contain the morphosyntactic information encoded in the features of β, then β is indistinguishable from (part of) α. Therefore, Inclusion-S involves an asymmetric comparison between two constituents, in which one of them may be underspecified with respect to the other.
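Read extensionally, Inclusion-S is simply (improper) set inclusion over value sets. A minimal sketch, with illustrative feature values:

```python
# A sketch of Inclusion-S (14), assuming each constituent is reduced to
# the set of its feature values; the sample values are illustrative.

def non_distinct(beta, alpha):
    """(14): beta is non-distinct from alpha iff every value of beta
    has an identical value in alpha, i.e., (improper) set inclusion."""
    return beta <= alpha

higher = {"D", "wh", "nom"}   # fully valued higher copy
lower = {"D", "wh"}           # underspecified lower copy

assert non_distinct(lower, higher)        # inclusion may be proper ...
assert non_distinct(higher, set(higher))  # ... or improper (identity)
assert not non_distinct(higher, lower)    # the comparison is asymmetric
```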
Notice that (14) does not introduce any specification of the structural conditions that two constituents must satisfy in order to be evaluated as non-distinct. This is undesirable on both empirical and conceptual grounds. In principle, there are two requirements that seem almost unavoidable for any two elements forming a movement dependency: (i) they must be in a c-command relation, and (ii) they must be local. By adopting a locality constraint based on
(15) | Two constituents α and β are part of the same chain if | |
a. | α c-commands β, | |
b. | β is non-distinct from α (by Inclusion-S), | |
c. | there is no δ between α and β such that (i) β is non-distinct from δ, or (ii) δ is non-distinct from α. |
These conditions define chain links of two members; chains with more than two members are formed by transitivity.
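The conditions in (15) can be stated as a simple procedure over a phrase marker. The sketch below makes two simplifying assumptions not in the original text: occurrences are listed from highest to lowest, and each occurrence c-commands all later ones, so (15a) reduces to linear order. Labels and feature values are illustrative; the sample input mirrors the four occurrences of Cosmo in (18) below.

```python
# A sketch of the chain-link conditions in (15).

def non_distinct(beta, alpha):
    """(15b)/(14): inclusion of value sets."""
    return beta <= alpha

def chain_links(occurrences):
    """Return index pairs (i, j) that form chain links according to (15).
    occurrences: list of (label, value_set), highest first."""
    links = []
    for i, (_, a) in enumerate(occurrences):
        for j in range(i + 1, len(occurrences)):
            b = occurrences[j][1]
            if not non_distinct(b, a):                   # (15b) fails
                continue
            blocked = any(                               # (15c): intervener
                non_distinct(occurrences[k][1], a)       # delta ~ alpha
                or non_distinct(b, occurrences[k][1])    # beta ~ delta
                for k in range(i + 1, j)
            )
            if not blocked:
                links.append((i, j))
    return links

# Cf. (18): valued copies carry an illustrative "nom" value.
occs = [("Cosmo1", {"D", "nom"}), ("Cosmo2", {"D"}),
        ("Cosmo3", {"D", "nom"}), ("Cosmo4", {"D"})]
assert chain_links(occs) == [(0, 1), (2, 3)]  # CH1 and CH2, as in the text
```

Note that the locality clause (15c) is what blocks the link between Cosmo2 and Cosmo4 here, since Cosmo3 intervenes.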
A demonstration of the functioning of (15) is in order. Consider once again the passive sentence in (1), repeated for convenience in (16). In this example, both copies of Cosmo are computed as members of the same chain.
(16) | a. | Cosmo was arrested. |
b. | [TP Cosmo1 [T’ was [VP arrested |
|
c. | [TP Cosmo1{<κ, |
First,
The next example is the active sentence in (17). It has three occurrences of the constituent Cosmo.
(17) | a. | Cosmo arrested Cosmo. |
b. | [TP Cosmo1 [T’ T [VP |
|
c. | [TP Cosmo1{<κ, |
Here, (i)
A more complex case is posed by the sentence in (18a). It contains four occurrences of the constituent Cosmo.
(18) | a. | Cosmo said that Cosmo was arrested. |
b. | [TP Cosmo1 [T’ T [VP |
|
c. | [TP Cosmo1{<κ, |
This sentence contains the chains CH1 = {Cosmo1, Cosmo2} and CH2 = {Cosmo3, Cosmo4}, both formed in a similar way to the one in (16). Notice that Inclusion-S would make some erroneous predictions in this case if not combined with a locality constraint like (15c). For example, the set of values of
The last example has two A’-dependencies, one in the main clause and the other in the embedded clause, which involve six occurrences of the wh-pronoun who.
(19) | a. | Who said who was arrested? |
b. | [CP Who1 [C’ C [TP |
|
c. | [CP Who1{<κ, |
This sentence contains two chains CH1 = {who1, who2, who3} and CH2 = {who4, who5, who6}. As already pointed out, the conditions in (15) calculate chain links of two elements, so chains of more than two members must be formed by transitivity. In the case at hand,
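The transitivity step can be sketched as a closure over the pairwise links delivered by (15); the link pairs below are illustrative stand-ins for the six occurrences of who in (19).

```python
# A sketch of chain formation by transitivity, as in (19): pairwise
# links are merged into equivalence classes (a simple union-find).

def chains_from_links(n, links):
    """Group n occurrences (indexed 0..n-1) into chains by taking the
    transitive closure of the pairwise links."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for i, j in links:
        parent[find(j)] = find(i)

    groups = {}
    for k in range(n):
        groups.setdefault(find(k), []).append(k)
    return sorted(groups.values())

# who1-who2, who2-who3 and who4-who5, who5-who6 as pairwise links:
assert chains_from_links(6, [(0, 1), (1, 2), (3, 4), (4, 5)]) == \
    [[0, 1, 2], [3, 4, 5]]
```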
There is still one important aspect of this system that should be discussed before exploring further empirical consequences of adopting it. Inclusion-S (cf. (14)) and its associated set of conditions (cf. (15)) rely on a
(20) | Representational characterization of chains ( |
Chains are read off from S-structures (and/or other syntactic levels), hence chain formation is a mechanism independent from “move α”, and in principle chains do not necessarily reflect derivational properties. |
Adopting this characterization has three main consequences in a minimalist and copy-based theoretical setting. First, given that (20) states that narrow syntactic operations (e.g., Copy, Merge) and chain formation apply independently at distinct computational cycles, it becomes necessary to advance an algorithm of chain recognition that makes no use of narrow syntactic devices but exploits representational properties of phrase markers (e.g., features, geometrical relations between nodes). I take it that Inclusion-S in (14), together with the conditions in (15), offers such an algorithm.
Second, given that chains are supposed to be computed over a representation, and there are no levels of representation other than the interface levels (
Third, if chains are read off from syntactic representations, then there is no need to define them as linguistic objects existing separately from a phrase marker. That is, chains are nothing more than an abstract relation holding between some nodes in a syntactic structure, a relation that ultimately denotes a set CH. Therefore, the conditions in (15) must be understood as an intensional definition for this set.
As already discussed, the Inclusion-S system computes chains over both interface representations independently and in parallel; such a calculus is based only on information encoded in the phrase marker. Since the Copy operation only indirectly determines the form of chains, there may be “mismatches” in how narrow syntax, PF and LF process movement dependencies. For example, it could be the case that narrow syntax generates a set of copies that is not recognized as a chain in one of the interfaces. Conversely, it could also happen that two non-transformationally related constituents comply with Inclusion-S at the interfaces and, therefore, form a chain.
Scenarios like these are not expected under Indexical-S. This marking mechanism establishes a univocal connection between the Copy operation and chain formation during the derivational procedure itself by assigning indexes to the copies. In other words, transformational procedures and chains are inexorably isomorphic under Indexical-S.
This section argues that the “mismatches” predicted by the Inclusion-S system do occur. For conciseness, the discussion focuses on three phenomena that have been accounted for within Copy Theory: (i) Nunes’ (
As mentioned in the introduction, Copy Theory explains the displacement property of language by assuming that only one member of a chain receives pronunciation. Call this general property of chains Uniqueness.
(21) | Uniqueness |
Given a chain CH, only one member of CH is pronounced. |
In a Late Insertion model like the one adopted here, Uniqueness may be regarded as the result of a natural tension between (i) economy-related considerations on the application of Vocabulary Insertion and (ii) the general conditions governing the recoverability of information. That is, the most economical way of pronouncing a chain is applying Vocabulary Insertion to only one of its members.
Even though Uniqueness states a crucial property of movement dependencies under Copy Theory, it is usually regarded as a false generalization. This is because some constructions exhibit more than one overt copy. One of these cases is wh-copying, a phenomenon that has been attested in German, Hindi, Romani, and other languages.
(22) Wen glaubt Hans wen Jakob gesehen hat?
     who thinks Hans who Jakob seen has
     ‘Who does Hans think Jakob saw?’
(23) Wen glaubt Hans dass Jakob gesehen hat?
     who thinks Hans that Jakob seen has
     ‘Who does Hans think Jakob saw?’
Given the semantic similarity between these sentences, it has become standard to assume that they have the same syntactic structure. In particular, the wh-copying pattern is typically analyzed as involving the overt realization of a copy of the wh-pronoun that has been generated through successive cyclic movement and occupies the specifier position of an embedded complementizer. The relevant representation for the sentence in (22) is sketched in (24).
(24) | [CP Weni C |
Under Indexical-S (cf. (3)), the three copies of the wh-pronoun
Nunes (
(25) | Nunes’ ( |
|
a. | Chain Reduction (i.e., the operation deleting chain members at PF) is costly. | |
b. | Chain Reduction applies until the structure is linearizable according to the Linear Correspondence Axiom (LCA). |
|
c. | The LCA “cannot see” inside words. |
According to Nunes, whenever there is a case of multiple copy pronunciation it is because one of the copies has been morphologically reanalyzed as part of a bigger word through an application of Fusion.
(26) | Fusion ( |
[x α] ∧ [y β] ➔ [x/y α,β] | |
where α and β are features of X and Y. |
Regarding a structure like (24), Nunes proposes that the intermediate copy of wen is morphologically fused with the embedded complementizer, becoming inaccessible to the LCA.
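Under the feature-set view assumed here, Fusion in (26) amounts to taking the union of two terminal nodes’ feature sets. A minimal sketch with illustrative labels:

```python
# A sketch of Fusion (26): two terminal nodes are reanalyzed as a
# single node carrying the union of their features. The labels "D",
# "wh" and "C" are illustrative.

def fuse(x, y):
    """[x alpha] and [y beta] -> [x/y alpha, beta] (cf. (26))."""
    return x | y

wen = frozenset({"D", "wh"})   # intermediate copy of wen
comp = frozenset({"C"})        # embedded complementizer
assert fuse(wen, comp) == {"D", "wh", "C"}
```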
The reanalysis-based account of wh-copying allows deriving two defining properties of the construction. First, given that (by assumption) this morphological reanalysis always affects an embedded complementizer, it follows that a pronoun in its base position cannot be spelled-out in wh-copying constructions. This prediction is borne out. Consider the sentences in (27) and (28). The unacceptability of (28) is due to the presence of an overt occurrence of wen in its base position.
(27) Wen denkst Du wen sie meint wen Harald liebt?
     who think you who she believes who Harald loves
     ‘Who do you think that she believes that Harald loves?’
(28) *Wen glaubt Hans wen Jakob wen gesehen hat?
      who thinks Hans who Jakob who seen has
     ‘Who does Hans think Jakob saw?’
Second, given that the morphological reanalysis is based on an application of Fusion, a PF operation targeting terminal nodes, it follows that there cannot be cases of multiple copy pronunciation involving full wh-phrases. This prediction also seems to be true. In (29), for example, the wh-phrase welchen Mann cannot be doubled.
(29) *Welchen Mann glaubst Du welchen Mann sie liebt?
      which man believe you which man she loves
     ‘Which man do you believe that she loves?’
The account of wh-copying based on morphological reanalysis can be straightforwardly implemented under the Inclusion-S system with two conceptual advantages: (i) there is no need to treat the phenomenon as an exception to the Uniqueness principle in (21), and (ii) the explanation does not rely on any specific theory of linearization (e.g., the LCA). Consider once again the representation in (24), repeated for convenience in (30a), this time including the featural content of the occurrences of wen.
(30) | a. | [CP Wen1 C |
b. | [CP Wen1{<κ, |
Narrow syntax generates this phrase marker and delivers it to the interfaces for interpretation. At LF, Inclusion-S generates the chain CHLF = {wen1, wen2, wen3} since (i) the values of wen2 are included in those of wen1, and (ii) the values of wen3 are included in those of wen2.
Nevertheless, something different happens at PF. There, the intermediate copy wen2 undergoes Fusion with the embedded complementizer, as shown in (31).
(31) | PF representation after Fusion |
[CP Wen1{<κ, |
According to Inclusion-S, the copies of wen can no longer form a single chain in this representation: the set of values of the fused node is not included in the set of values of wen1, so two chains arise at PF and each of them receives pronunciation independently.
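The effect of Fusion on PF chain formation can be verified with a small sketch, again reducing Inclusion-S to set inclusion over illustrative value sets: the fused node fails (15b) with respect to the highest copy but satisfies it with respect to the lowest one, so two chains arise.

```python
# A sketch of the PF configuration in (31), with illustrative feature
# sets: after Fusion, the intermediate copy's value set grows, so it is
# no longer non-distinct from the highest copy.

def non_distinct(beta, alpha):
    return beta <= alpha  # Inclusion-S (14)

wen1 = {"D", "wh"}
wen2_fused = {"D", "wh", "C"}  # wen2 after Fusion with the complementizer
wen3 = {"D", "wh"}

assert not non_distinct(wen2_fused, wen1)  # no link wen1-[wen2+C]
assert non_distinct(wen3, wen2_fused)      # link [wen2+C]-wen3 holds
```

Pronouncing the head of each resulting chain then yields two overt copies, as in (22).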
The account of wh-copying based on Inclusion-S offers an additional empirical advantage over Nunes’ (
(32) | [XP [Xº Y1{β} X{α}] [YP Y2{β} … ]] |
In a standard case of head movement, Y2 should not be pronounced. This follows in both proposals from forming the chain CH = {Y1, Y2} and pronouncing only the head of the chain, as usual.
Suppose that Fusion applies to Y1 and X after head movement. This would form a node [Y1+X] that carries both features α and β.
(33) | [XP [Y1+X]{α, β} [YP Y2{β} … ]] |
According to Nunes’ system, this sequence of operations should yield the pronunciation of both members of the chain CH = {Y1, Y2}. That is, if Y1 undergoes Fusion, it becomes inaccessible to the LCA. Therefore, the LCA only “sees” the other member of the chain, i.e., Y2. At this point, there is no need to apply Chain Reduction, as the structure is already linearizable.
Under Inclusion-S, Y2 should remain silent. This is due to the fact that [Y1+X] and Y2 form the chain CHPF = {[Y1+X], Y2} since the set of values of Y2 is a subset of the set of values of [Y1+X] (i.e., {β} ⊆ {α, β}), and, as usual, only the head of the chain receives pronunciation.
This second pattern is the one attested in the literature for every scenario in which head movement feeds Fusion. For instance, Julien (
(34) a. dúnan yé dᴐlᴐ mìn
        guest yé millet.beer drink
        ‘the guest drank millet beer.’
     b. ń mùso má jɛgɛ fèere
        my wife má fish buy
        ‘my wife did not buy fish’
If tense and polarity are distinct heads, a transformational derivation must have caused them to end up in a single syntactic terminal. This derivation is the one already sketched in (32) and (33), i.e., the Polarity head moves to Tense, and Fusion combines them. As (34) shows, such a derivation is supposed to proceed in exactly the same way as predicted by Inclusion-S, i.e., the lowest occurrence of the Polarity head must remain silent. As already discussed, this pattern does not follow from Nunes’ system since it predicts multiple copy pronunciation every time a moved element undergoes Fusion.
To sum up, it has been shown that an account of wh-copying based on morphological reanalysis does not require positing any additional assumptions under Inclusion-S. That is, applying Fusion on an intermediate copy entails the formation of more than one chain at PF. Moreover, unlike Nunes’ (
While there is certain consensus that the proper analysis of wh-copying constructions involves a movement dependency in narrow syntax, there is much more controversy around the pattern exemplified in (35). The obvious difference between this pattern and the one discussed in the previous section is that here the two wh-pronouns are not identical.
(35) Was denkst Du wen sie gesehen hat?
     what think you who she seen has
     ‘Who do you think that she has seen?’
This phenomenon is known as partial wh-movement (or wh-scope marking).
There are two main types of analysis for this phenomenon. The first one postulates that there is a direct dependency between the wh-element in the matrix clause and the thematic position in the embedded clause.
According to the second type of analysis, the wh-element in the matrix clause is related to the whole embedded CP, so there is only an indirect dependency between the matrix wh-element and the embedded wh-phrase.
Consider now the patterns of non-identical wh-doubling in Dutch varieties reported by Barbiers et al. (
(36) Wat denk je wie ik gezien heb?
     what think you who I seen have
     ‘Who do you think I saw?’
(37) Wie denk je die ik gezien heb?
     who think you die I seen have
     ‘Who do you think I saw?’
(38) Wat denk je die ik gezien heb?
     what think you die I seen have
     ‘Who do you think I saw?’
The sentences in (36), (37) and (38) display the orders in which the pronouns can appear in these constructions, i.e., wat … wie, wie … die and wat … die. The remaining combinations, shown in (39), are unacceptable.
(39) a. *Wie denk je wat ik gezien heb?
         who think you what I seen have
     b. *Die denk je wie ik gezien heb?
         die think you who I seen have
     c. *Die denk je wat ik gezien heb?
         die think you what I seen have
What is particularly interesting about these data is that offering a unified account for them in terms of any of the theoretical alternatives in the literature seems quite difficult. In principle, positing that the left-peripheral wh-pronouns in (36), (37), and (38) are expletives, in line with McDaniel (
Barbiers et al. (
(40) | [die |
The second ingredient of the analysis is an operation that allows creating movement dependencies in which only a subpart of a constituent moves. Barbiers et al. (
(41) | a. | K = [XP X … [YP … [die |
b. | K = [XP X … [YP … [die |
|
L = [wie φ [wat Q]] | ||
c. | K = [XP [wie φ [wat Q]] [X’ X … [YP … [die |
For convenience, I reformulate the analysis in (40) and the Partial Copying operation in (41) in more traditional terms. That is, I propose that the pronouns wat, wie and die are characterized by the feature sets in (42).
(42) | a. | wat = { |
b. | wie = { |
|
c. | die = { |
Accordingly, Partial Copying should be understood as an instance of the Copy operation targeting a proper subset of the features of a constituent, i.e., feature movement in the sense of Hiemstra (
(43) | a. | K = [XP X … [YP … { |
b. | K = [XP X … [YP … { |
|
L = { |
||
c. | [XP { |
Given that the sets of features {
(44) | [XP wie [X’ X … [YP … die … ]]] |
Consider now the analysis of (36). The relevant movement dependency involves two steps. First, the wh-pronoun wie moves to the specifier of the embedded CP; then, Partial Copying targets a proper subset of its features, which is spelled out as wat in the matrix clause.
(45) | Analysis of (36) in terms of Partial Copying |
[CP Wat1{ |
The same kind of derivation applies to the sentences in (37) and (38), as represented in (46) and (47), respectively. In (46), Partial Copying applies to the pronoun die in the embedded clause and the copied subset of features is spelled out as wie; in (47), the copied subset is smaller and surfaces as wat.
(46) | Analysis of (37) in terms of Partial Copying |
[CP Wie1{ |
|
(47) | Analysis of (38) in terms of Partial Copying |
[CP Wat1{ |
This type of analysis derives the restrictions on the distribution of wh-pronouns shown in (39). The unacceptable patterns involve a richer pronoun in a higher position that could not have been generated by copying features from the lower one. Since these representations cannot be derived by applying copy operations, they are predicted to be ungrammatical.
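This restriction can be stated as a simple subset test: if Partial Copying may only target a subset of the source’s features, the derivable and underivable orders in (36)-(39) fall out directly. The feature sets below are illustrative stand-ins for the decomposition in (42).

```python
# A sketch of the restriction derived in (48): the partially copied
# (higher) element can never be featurally richer than its source.
# The feature labels "Q", "phi" and "D" are illustrative.

wat = frozenset({"Q"})
wie = frozenset({"Q", "phi"})
die = frozenset({"Q", "phi", "D"})

def derivable(higher, lower):
    """Partial Copying: the higher element must be a (possibly improper)
    subset of the lower source."""
    return higher <= lower

assert derivable(wat, wie)      # wat ... wie (cf. (36))
assert derivable(wie, die)      # wie ... die (cf. (37))
assert derivable(wat, die)      # wat ... die (cf. (38))
assert not derivable(wie, wat)  # *wie ... wat (cf. (39a))
assert not derivable(die, wie)  # *die ... wie (cf. (39b))
assert not derivable(die, wat)  # *die ... wat (cf. (39c))
```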
(48) | a. | *[CP wie{ |
(cf. (39a)) |
b. | *[CP die{ |
(cf. (39b)) | |
c. | *[CP die{ |
(cf. (39c)) |
While Barbiers et al. (
(49) | a. | [CP Wati … [CP wiei … [VP |
(cf. (36)) |
b. | CH = {wati, wiei, wiei} | ||
(50) | a. | [CP Wiei … [CP diei … [VP |
(cf. (37)) |
b. | CH = {wiei, diei, diei} | ||
(51) | a. | [CP Wati … [CP diei … [VP |
(cf. (38)) |
b. | CH = {wati, diei, diei} |
Under standard assumptions, these chains should behave exactly in the same way as any chain CH = {XPi, XPi, XPi} consisting of three occurrences of the same constituent. This prediction is not borne out. As discussed, chains are supposed to comply with the Uniqueness property in (21) while, on the contrary, non-identical wh-doubling involves pronouncing two wh-elements. To derive this pattern from a single chain, Barbiers et al. adopt Nunes’ (
In principle, this solution seems attractive as Dutch also displays wh-copying patterns.
(52) Wie denk je wie ik gezien heb?
     who think you who I seen have
     ‘Who do you think I have seen?’
However, there is an additional property distinguishing non-identical wh-doubling from a regular wh-movement dependency. Moving a wh-pronoun across negation is perfectly possible in Dutch (cf. (53)), while non-identical wh-doubling in the same context is unacceptable (cf. (54)). This asymmetry is unexpected if both sentences involve a non-trivial chain connecting a thematic position in the embedded clause with the specifier position of the matrix clause.
(53) Wie denk je niet dat zij uitgenodigd heeft?
     who think you not that she invited has
     ‘Who don’t you think she has invited?’
(54) *Wat denk je niet wie zij uitgenodigd heeft?
      what think you not who she invited has
     ‘Who don’t you think she has invited?’
Barbiers et al. do not offer an explanation for the contrast between (53) and (54). Instead, they point out that negation also creates an intervention effect in wh-copying constructions in both Dutch (cf. (55)) and German (cf. (56)). As discussed in the previous section, the standard assumption is that wh-copying is a phonological variant of regular long distance wh-movement. Therefore, negation is not supposed to produce any effect in these constructions.
(55) *Wie denk je niet wie zij uitgenodigd heeft?
      who think you not who she invited has
     ‘Who don’t you think she has invited?’
(56) *Wen glaubst du nicht wen sie liebt?
      who think you not who she loves
     ‘Who don’t you think she loves?’
Since there seems to be no unified analysis that explains negative intervention effects in all these cases, the authors conclude that the contrast between (53) and (54) is not sufficient evidence to reject an account of non-identical wh-doubling according to which both overt wh-pronouns are members of the same chain.
It must be noticed, however, that many German speakers readily accept sentences like (57), in which wh-copying across negation is attested.
(57) Wen glaubst du nicht wen sie gesehen hat?
     who think you not who she seen has
     ‘Who don’t you think she has seen?’
The only way to account for the otherwise contradictory contrast between (56) and (57) is to assume that there are two alternative derivations generating wh-copying patterns, one that is sensitive to negative intervention and one that is not.
I propose that the derivation that is not sensitive to negative intervention is the one discussed in the previous section, i.e., these are regular wh-movement dependencies in which an intermediate copy is morphologically reanalyzed as part of an embedded complementizer at PF.
On the other hand, I argue that non-identical wh-doubling constructions and the cases of wh-copying that are sensitive to negative intervention can be analyzed in a unified way by combining (i) a derivation based on Partial Copying and (ii) the Inclusion-S system. As already discussed, Partial Copying elegantly explains the distribution of wh-pronouns in non-identical wh-doubling constructions. However, under Indexical-S, the operation does not straightforwardly derive (i) the fact that two members of the same chain receive pronunciation and (ii) the negative intervention effect. I contend that these properties find a principled explanation under Inclusion-S.
Consider again the structure in (45), which corresponds to the sentence in (36). This representation is generated in narrow syntax and delivered to the interfaces, where chain formation is calculated according to Inclusion-S in (14) and its associated conditions in (15). At PF,
(58) | Chain formation at the interfaces according to Inclusion-S (cf. (36)) | |
a. | [CP wat1{ |
|
b. | CHPF1 = {wat1}; CHPF2 = {wie2, wie3} | |
c. | CHLF1 = {wat1}; CHLF2 = {wie2, wie3} |
In more explicit terms, Inclusion-S predicts that constituents that are transformationally related through Partial Copying must form distinct chains. That is, this system states that two constituents α and β are non-distinct if (i) α c-commands β and (ii) α contains the information encoded in β. However, Partial Copying systematically creates configurations where the features of the c-commanding element are contained in the lower constituent. Therefore, partial copies are always computed as distinct elements at the interfaces. This is attested once again in the derivations in (46) and (47), which correspond to the sentences in (37) and (38), respectively. In the former, the features of
(59) | Chain formation at the interfaces according to Inclusion-S (cf. (37)) | |
a. | [CP wie1{ |
|
b. | CHPF1 = {wie1}; CHPF2 = {die2, die3} | |
c. | CHLF1 = {wie1}; CHLF2 = {die2, die3} | |
(60) | Chain formation at the interfaces according to Inclusion-S (cf. (38)) | |
a. | [CP wat1{ |
|
b. | CHPF1 = {wat1}; CHPF2 = {die2, die3} | |
c. | CHLF1 = {wat1}; CHLF2 = {die2, die3} |
Extending Partial Copying to capture wh-copying patterns is conceptually simple.
(61) | Analysis of (52) in terms of Partial Copying |
[CP Wie1{ |
Once again, applying Partial Copying entails forming more than one chain at the interfaces, i.e., the features of
(62) | Chain formation at the interfaces according to Inclusion-S (cf. (52)) | |
a. | [CP wie1{ |
|
b. | CHPF1 = {wie1}; CHPF2 = {wie2, wie3} | |
c. | CHLF1 = {wie1}; CHLF2 = {wie2, wie3} |
Given that CHPF1 and CHPF2 in (58), (59), (60) and (62) are separate chains at PF, both of them should receive pronunciation independently according to Uniqueness (cf. (21)). That is, there is no need to assume that a morphological reanalysis operation applies in these cases. Instead, Partial Copying entails doubling under Inclusion-S.
Consider now the LF chains CHLF1 and CHLF2 in (58), (59), (60) and (62). In each of these cases, the trivial chain CHLF1 lacks a thematic interpretation, but its only member satisfies the formal requirements of the interrogative complementizer, i.e., it functions as an expletive. On the other hand, the chain CHLF2 does receive a θ-role as one of its members occupies a thematic position; however, none of the wh-elements pertaining to CHLF2 is in a spec-head configuration with C
As usually assumed (e.g.,
(63) Wen hat Luise wo gesehen?
     whom has Luise where seen
     ‘Where did Luise see whom?’
(64) | Wen |
In a similar fashion, the head of each of the wh-chains CHLF2 in (58), (59), (60) and (62) must be licensed by establishing a covert dependency with the interrogative complementizer in the matrix clause. In the representation in (65), for instance, this relation holds between C
(65) | [CP Wat1 |
(cf. (36)) |
As Beck (
(66) | *[SC Wat1 |
(cf. (54)) |
The effect in (66) belongs to a natural class of phenomena together with instances of intervention triggered by negation and some other quantificational elements at LF. For instance, the multiple wh-question in (63) is acceptable as long as no negative element like niemand intervenes between the in-situ wh-pronoun and the interrogative complementizer, as (67) shows.
(67) a. *Wen hat niemand wo gesehen?
         whom has nobody where seen
        ‘Where did nobody see whom?’
     b. Wen hat wo niemand gesehen?
        whom has where nobody seen
        ‘Where did nobody see whom?’
Similar intervention patterns are attested in many languages. For instance, French allows moving a wh-pronoun to the left periphery (cf. (68a)) or interpreting it in-situ (cf. (68b)).
(68) a. Qui as-tu vu?
        whom have-you seen
     b. Tu as vu qui?
        you have seen whom
        ‘Whom have you seen?’
However, applying overt wh-movement seems to be the only available option if a negative element appears between the wh-pronoun and the left periphery.
(69) a. Qu’est-ce que Jean ne mange pas?
        what that Jean ne eats not
     b. *Jean ne mange pas quoi?
         Jean ne eats not what
        ‘What doesn’t John eat?’
Discussing potential accounts of LF intervention effects goes beyond the aims of this paper. My purpose is simply to show that adopting Inclusion-S together with Partial Copying yields configurations in which one of the resulting chains must be licensed through covert dependencies, and that independent phenomena for which these covert dependencies are originally postulated display the same type of negative intervention.
In sum, the distribution patterns of wh-pronouns in non-identical wh-doubling constructions in Dutch are elegantly explained by appealing to Partial Copying, as proposed by Barbiers et al. (
Reconstruction has been an important source of evidence for Copy Theory. Assuming that movement involves two (or more) occurrences of the same constituent allows explaining the unacceptability of sentences like (70) as violations of Condition C.
(70) | a. | *Which argument that Cosmo1 is a genius did he1 believe? |
b. | *[DP Which argument [CP that Cosmo1 is a genius]] did he1 believe | |
[DP |
There are, however, some cases that do not follow straightforwardly from Copy Theory. For instance, if the complement CP
(71) | [DP Which argument [ADJ that Cosmo1 made]] did he1 believe? |
According to Lebeaux (
(72) | LATAR |
Apparent violations of Condition C follow from the absence of the constituent containing the relevant R-expression in some members of the movement chain. |
LATAR amounts to assuming that the adjunct containing the offending R-expression appears only in the overt member of the chain. Since the pronoun does not c-command the coindexed R-expression in any chain member, no violation of Condition C arises.
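As a toy illustration (not part of the proposal itself), the Condition C check that LATAR exploits can be stated over coindexation and c-command; the list-based encoding and helper names below are hypothetical simplifications.

```python
# Toy Condition C check: an R-expression must not be c-commanded by a
# coindexed pronoun in any chain member. Encodings are hypothetical.

def violates_condition_c(nodes, c_commands):
    """nodes: list of (kind, index) pairs; kind is 'pron' or 'r-expr'.
    c_commands(i, j) says whether nodes[i] c-commands nodes[j]."""
    for i, (kind_a, idx_a) in enumerate(nodes):
        for j, (kind_b, idx_b) in enumerate(nodes):
            if kind_a == "pron" and kind_b == "r-expr" \
                    and idx_a == idx_b and c_commands(i, j):
                return True
    return False

# (70b): the lower copy contains "Cosmo1", c-commanded by "he1" -> violation.
with_copy = [("pron", 1), ("r-expr", 1)]
print(violates_condition_c(with_copy, lambda i, j: i == 0 and j == 1))  # True

# (73): under LATAR the adjunct (and its R-expression) is absent from the
# lower chain member, so the pronoun c-commands no coindexed R-expression.
without_adjunct = [("pron", 1)]
print(violates_condition_c(without_adjunct, lambda i, j: False))  # False
```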
(73) | [DP Which argument [ADJ that Cosmo1 made]] did he1 believe [DP |
More recently, LATAR has been extended to capture phenomena involving A-movement (
(74) | [DP The claim that Cosmo1 was asleep] seems to him1 to be correct. |
Adapting Lebeaux’s proposal, Takahashi & Hulsey (
(75) | [DP The [NP claim that Cosmo1 was asleep]] seems to him1 to be [DP |
The same kind of analysis may be advanced for sentences such as (76). Here, the grammatical subject is interpreted as the logical subject of the predicate
(76) | [DP His1 picture of the president2] seemed to every man1 to be seen by him2 to be an intrusion. |
This apparent contradiction may be accounted for under LATAR. What is generated near the predicate
(77) | [DP His1 picture of the president2] seemed to [every man]1 [DP |
If this approach to
(78) | a. | CH = {[DP which argument that Cosmo made], [DP which argument]} | (cf. (71)) |
b. | CH = {[DP the claim that Cosmo was asleep], [DP the]} | (cf. (74)) | |
c. | CH = {[DP his picture of the president], [DP his], [DP his], [DP his]} | (cf. (76)) |
Compare the way Indexical-S and Inclusion-S generate these chains. Under Indexical-S (cf. (3)), two (or more) constituents form a chain only if they receive the same index through the Copy operation. Therefore, to explain the differences between members of the same chain in (78), it would be necessary (i) to generate two (or more) strictly identical copies bearing the same index, and then (ii) to apply an additional operation on the higher copy introducing the constituent that contains the relevant R-expression. Consider the following sample derivation. A constituent αP is generated in a position where it is c-commanded by a pronoun; since αP does not contain an R-expression, Condition C is respected (cf. (79a)). Later in the derivation, αP moves to a position where it c-commands the pronoun; both copies of αP share the same index (cf. (79b)). As a third step, a βP containing an R-expression is inserted into αP, as in (79c). At this point, the pronoun and the R-expression can be coreferential, and since both occurrences of αP share the same index, they form a chain.
(79) | a. | [XP X … [YP Pronoun1 … [ZP … [αP α]i ]]] |
b. | [XP [αP α]i [X’ X … [YP Pronoun1 … [ZP … [αP α]i ]]] | |
c. | [XP [αP α [βP |
The derivational step in (79c) corresponds to the operations that are called
As is well known, cyclicity has been a theoretical desideratum in generative syntax since at least Chomsky (
(80) | Extension Condition ( |
Syntactic operations must extend the tree at the root. |
Therefore, if an extensionally equivalent and cyclicity-respecting implementation of LATAR is offered, it should be preferred on conceptual grounds.
The definition of Non-Distinctiveness based on Inclusion-S has two important traits that make a cyclic implementation of LATAR possible. First, Inclusion-S does not require structural isomorphism between chain members; it only states a condition on their morphosyntactic values. Second, Inclusion-S does not require chain members to be related through the Copy operation.
Consider the cases involving anti-reconstruction in A-movement in (74) and (76). The sentence in (74) is repeated for convenience in (81) with a description of the features of the relevant constituents. A bare determiner Dmin/max is base-generated low down in the structure inside a small clause. This constituent does not carry a full set of valued φ-features as some of these are only inherently valued in the NP domain (e.g., Number, Gender). After T is merged, the Spec,T position is filled with a base-generated full-DP with a complete set of valued φ-features. This DP agrees with T and receives nominative Case.
(81) | [DP The [NP claim that Cosmo1 was asleep]]{<κ, |
The base-generated full-DP and the bare determiner comply with the conditions to form a chain according to Inclusion-S. That is, (i) they are in a c-command relation, (ii) the values of the full-DP contain the values of Dmin/max (i.e., {…} ⊆ {
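For illustration only, the chain-formation conditions of Inclusion-S can be rendered as a subset check over feature bundles. The feature names and values below are hypothetical placeholders, not the paper's actual feature inventory.

```python
# Illustrative sketch: chain formation under Inclusion-S as a subset check.
# Feature bundles are modeled as dicts; all names and values below are
# hypothetical placeholders standing in for morphosyntactic features.

def non_distinct(full, reduced):
    """True if every valued feature of the reduced constituent is also
    present, with the same value, in the full constituent (Inclusion-S)."""
    return all(attr in full and full[attr] == val
               for attr, val in reduced.items())

def form_chain(alpha, beta, c_commands):
    """alpha and beta form a chain if alpha c-commands beta and beta's
    feature values are included in alpha's."""
    return c_commands(alpha["id"], beta["id"]) and \
        non_distinct(alpha["feats"], beta["feats"])

# A full DP with valued phi-features and Case, and a bare D with fewer values.
full_dp = {"id": "DP1", "feats": {"cat": "D", "num": "sg", "case": "nom"}}
bare_d = {"id": "D2", "feats": {"cat": "D"}}

# Toy c-command relation: the Spec,T position c-commands the low position.
cc = lambda x, y: (x, y) == ("DP1", "D2")

print(form_chain(full_dp, bare_d, cc))   # True: the two form a chain
print(form_chain(bare_d, full_dp, cc))   # False: no c-command, no inclusion
```

Note that nothing in the check requires the two constituents to be structurally isomorphic or transformationally related, which is the property exploited in the cyclic implementation of LATAR.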
Consider now the sentence in (76), repeated for convenience in (82). Here, a bare possessive determiner merged low down in the structure undergoes successive cyclic A-movement to a position just below the quantifier
(82) | [DP His1 picture of the president2]{<κ, |
According to Inclusion-S, the full-DP and the copies of the possessive determiner form the chain in (78c).
The sentence in (71), repeated for convenience in (83), involves an additional derivational step. Here, the DP
(83) | [CP [DP Which argument [ADJ that Cosmo1 made]] {<κ,∅>, <ω,Q>, …} [C’ C |
To value its κ-feature, the higher DP probes the structure for a matching Goal. By hypothesis, it looks for an active element matching both κ and ω-features, so the DP
(84) | [CP [DP Which argument [ADJ that Cosmo1 made]] {<κ, |
According to Inclusion-S, these two non-transformationally related DPs form the chain in (78a).
As seen, Inclusion-S makes it possible to generate non-isomorphic chains without assuming any countercyclic operation. Moreover, the principles restricting this implementation of LATAR are no different from the ones assumed by Takahashi & Hulsey (
(85) | Only caseless chain members (i.e., “traces” of A-movement) may be Dmin/max. |
Regarding interpretability, Fox (
(86) | Trace Conversion ( |
|
a. | Variable Insertion: (Det) Pred ➔ (Det) [Pred λy(y=x)] |
b. | Determiner Replacement: (Det) [Pred λy(y=x)] ➔ the [Pred λy(y=x)] |
This rule transforms a wh-phrase into a definite description with anaphoric value.
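For concreteness, the two steps of Trace Conversion can be mimicked over a toy pair representation of a determiner and its nominal predicate; the string-based encoding and function names below are hypothetical simplifications, not part of Fox's rule.

```python
# Illustrative sketch of Trace Conversion over a toy (Det, Pred) pair.
# The representation and helper names are hypothetical simplifications.

def variable_insertion(det, pred, var):
    # Step (86a): (Det) Pred -> (Det) [Pred λy(y = x)]
    return det, f"[{pred} λy(y = {var})]"

def determiner_replacement(det, pred_with_var):
    # Step (86b): (Det) [Pred λy(y = x)] -> the [Pred λy(y = x)]
    # The original determiner is discarded in favor of "the".
    return "the", pred_with_var

def trace_convert(det, pred, var="x"):
    det, pred = variable_insertion(det, pred, var)
    return determiner_replacement(det, pred)

# The lower copy of "which argument" becomes a definite description:
print(trace_convert("which", "argument"))
# -> ('the', '[argument λy(y = x)]')
```

The output is the definite description with anaphoric value mentioned in the text: the wh-determiner is gone, the nominal restriction is kept, and a bound variable has been inserted.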
(87) | Anti-reconstruction effects in A’-movement of DPs are restricted to non-arguments of nominal predicates. |
Adopting Trace Conversion also allows ruling out an unwanted consequence of base-generating chain members under Inclusion-S. Consider the structure in (88). If a DP as
(88) | [CP [DP Which woman]{<κ, |
This unwanted result is ruled out because applying Trace Conversion would yield the uninterpretable operator-variable dependency in (89). As the pair in (90) shows, lexical identity of the NP is a requisite for a definite expression to receive an anaphoric interpretation.
(89) | *Which woman λx. Elaine met the girl x |
(90) | a. | The neighbor of [every comedian]1 always takes advantage of [the comedian]1. |
b. | *The neighbor of [every comedian]1 always takes advantage of [the postman]1. |
The unacceptability of (89) shows that there are semantic mechanisms imposing identity conditions on chain members. Importantly, these mechanisms do not seem to depend on any narrow syntactic device (e.g., the Copy operation). Presumably, some other independently motivated principles also introduce constraints on the properties of unpronounced chain members.
To sum up, Inclusion-S offers a straightforward way of capturing anti-reconstruction effects under LATAR. Moreover, it allows getting rid of countercyclic operations such as Late Merger and Wholesale Late Merger, a very welcome result from a conceptual point of view.
Copy Theory is based on the idea that elements forming a chain are non-distinct. In this paper, I offered a definition of the Non-Distinctiveness relation based on the featural content of constituents in a phrase marker: Inclusion-S. According to it, two constituents are non-distinct for the purposes of chain formation if the morphosyntactic properties of one of them constitute a subset of the morphosyntactic properties of the other. This condition is part of a representational algorithm of chain recognition that applies independently and in parallel at both interface levels.
Apart from offering a principled definition of Non-Distinctiveness, Inclusion-S introduces a number of empirical and conceptual advantages over a mere indexing mechanism. As discussed, it makes it possible to understand wh-copying as a phenomenon in which a morphological reanalysis operation affects how chains are computed at PF. That is, LF takes a set of occurrences of a wh-pronoun to form a single chain, while the same elements form two (or more) chains at PF, which derives the doubling pattern.
Something similar has been argued to happen in non-identical wh-doubling constructions in Dutch. In this case, however, both interfaces form two chains from a set of wh-elements. The distribution of the pronouns
Finally, anti-reconstruction phenomena have been used to show that non-isomorphic constituents may be part of the same chain in certain contexts. This follows from Inclusion-S, as it predicts that two elements may form a chain even if they are not derivationally related through the Copy operation.
Although, see Martin & Uriagereka (
There are alternative ways of introducing these indexes. For example, Nunes (
See Nunes (
Numerical indexes on copies are introduced for expository purposes only.
Defining chains in terms of representational conditions as in (15) does not imply a theoretical commitment. An equally valid implementation of Inclusion-S is as an operation based on
This aspect of the system makes it
According to early proposals in the minimalist framework (e.g.,
(i)
[TP T [vP IA<κ,
This configuration poses problems for both the Agree system and Inclusion-S. First, the structure in (i) yields a scenario of
There are, at least, two ways of avoiding these issues while maintaining the assumption that the internal argument moves to receive accusative Case. The first alternative is following Harley (
(ii)
[TP T [VoiceP EA<κ,∅> [Voice’ Voice [vP IA<κ,
(iii)
[TP T [vP EA<κ,∅> [vP IA<κ,
See Brody (
The other important generalization about non-trivial chains is that, in most cases, the head of the chain must be the element that receives pronunciation. For discussion of this generalization and its exceptions in terms of a Late Insertion model, see Muñoz Pérez (
See Kennedy (
The present discussion focuses on German data. For examples from other languages, see Fanselow (
As stated by Halle & Marantz (
An alternative solution is independently advanced by Matushansky (
While I follow Nunes' implementation based on Fusion for explicitness, it should be noted that Inclusion-S does not strictly require combining
I am grateful to Jonathan Bobaljik (p.c.) for this observation.
I follow Chomsky (
Barbiers et al. (
See Hiemstra (
I remain agnostic whether this covert dependency involves LF-movement (
Convergent evidence that bare D-heads can occupy the tail position in a movement chain comes from extraction asymmetries in preposition-stranding contexts in English. See Stanton (
An alternative derivation may imply introducing the complement NP in any position above the pronoun
While the configuration in (81) seems to require
There are reasons to believe that
A reviewer observes that if Trace Conversion is an operation that truly replaces the wh-determiner with a definite determiner, then the inclusion relation between the features of the operator and the variable should no longer hold at LF. There are two ways to deal with this issue. The first is assuming that at some point the relation between the operator and its variable becomes purely anaphoric, i.e., Trace Conversion transforms a Non-Distinctiveness relation into a bound anaphora relation. The second is taking Trace Conversion to be an interpretative rule that does not modify the phrase marker. For this alternative, see Fox (
Base-generating constituents in Spec,C may also lead one to predict patterns like those in (39), at least in an underlying representation. This unwanted result may be avoided if, as Fox (
(i)
[CP wie{Q, φ, …} … [CP wat{Q, …} …]]
(cf. (39a))
These pronouns would form the chain CH = (wie, wat) under Inclusion-S. However, once Trace Conversion transforms the wh-pronoun
Stanton (
For comments and discussion, I am grateful to Andreas Blümel, Andrés Saab, Fernando Carranza, Guadalupe Herrera, Hedde Zeijlstra, Jairo Nunes, Jovana Gajić, Laura Stigliano, Marika Lekakou, Matías Verdecchia, Pablo Zdrojewski, Romina Trebisacce, Viola Schmitt, and three anonymous reviewers. I would especially like to thank Dennis Ott, Radek Šimík and the audience of the Workshop ‘What Drives Syntactic Computation’ at DGfS 37. This work was supported by a doctoral fellowship granted by CONICET.
The author has no competing interests to declare.