My own suspicion is that a central part of what we call “learning” is actually better understood as the growth of cognitive structures along an internally directed course under the triggering and partially shaping effect of the environment.

  —Noam Chomsky (1980: 3).

1 Introduction

The emergence and consolidation of the field of Evolutionary Linguistics in the last few decades has brought to the fore, with new impetus, the question that most worried Ferdinand de Saussure when he took the first steps into the discipline of Linguistics as we know it today: What kind of object is language, such that it deserves specialized scientific attention? The situation is aptly captured by Bolhuis et al. (2014: 1), who claim that there is a general “lack of clarity regarding the language phenotype,” which inevitably leads to a corresponding “lack of clarity regarding its evolutionary origins.” The picture gets even more complicated if one takes into account, for example, Ray Jackendoff’s contention that even among those who share a similar ontological commitment about language (e.g. those who share “the contemporary view of language, which goes beneath the cultural differences among languages”; Jackendoff 2010: 63), a basic mutual understanding is made difficult by the volatility of each researcher’s theoretical biases (see, for a similar statement, Tallerman & Gibson 2012: 15–26). A first rough approximation to this ontological issue may be made based on the recognition of two main axes of disagreement.

On the one hand, opposing the view that Jackendoff refers to as “the contemporary view of language,” for many the socio-cultural dimension of language must still be privileged. According to this view, languages are just external, socially shared codes of sorts, which somehow get accommodated within an a priori uncompromised neural substrate in the early experience of children. Clearly enough, the way brains are organized and learn is taken as an obvious determinant of the structure of languages themselves. But the position relies on the premise that there are by nature no a priori expectations concerning how languages are or the extent to which they can vary, beyond general, language-neutral ones. The reigning Vygotskyan and Piagetian psychological traditions of the first half of the 20th Century can be taken as paradigmatic specimens of this position. For Lev Vygotsky, for example, language belonged to cultural development, a layer of human development different from and dependent on ontogeny proper, which brings about the underlying “natural functions” for the former to take place (Vygotsky 1986). Similarly, Jean Piaget signaled the “formal operational stage” as the upper limit of cognitive development proper, and he conceived of language as an aspect of “intellectual development,” a form of knowledge accumulation rooted in cognitive development but different from it (see, for example, Piaget 1962). Vygotsky and Piaget are now venerable historical figures, but their lessons continue to be very influential in many quarters (see below for the imprint of Vygotsky on our own approach). Besides, whether their influence is explicitly recognized or not, the Vygotskyan/Piagetian tradition is clearly kept alive in current approaches of the likes of William Labov, Terrence Deacon, Michael Tomasello, Simon Kirby, or Morten Christiansen and Nick Chater, to cite a few (see, for example, Deacon 1997; Tomasello 2001; Kirby et al. 2007; Labov 2007; Christiansen & Chater 2008). Curiously enough, it has even permeated some relevant representatives of Chomskyan formal linguistics, notably Jan Koster (Koster 2009). The following quotation from Christiansen and Chater aptly epitomizes this stance, although other use-based or performance-oriented models (e.g. Hawkins 1994; 2004) would also make suitable points of reference:

Language has adapted through cultural evolution to be easy to learn to produce and understand. Thus, the structure of human language must inevitably be shaped around human learning and processing biases deriving from the structure of thought processes, perceptuo-motor factors, cognitive limitations, and pragmatic constraints. Language is easy for us to learn and use, not because our brains embody knowledge of language, but because language has adapted to our brains. (Christiansen & Chater 2008: 490)

It is important to stress that “language” is used in this fragment in a neo-Saussurean way, i.e. as an object external to the human organism, with which the latter maintains a “symbiotic” relation (the expression is also from Christiansen & Chater 2008, on the same page).1 According to the congenial approach of Kirby and coworkers, biological predispositions for language are not of a linguistic kind, and the form of languages is mostly a function of how cultural transmission works. The neo-Saussurean spirit of this current of linguistic thought is particularly transparent in Koster, who basically equates language and words, and claims:

Under any meaningful definition, a language minimally contains words. Words are man-made cultural objects (no matter how many biological constraints there are on possible words) and they do not belong to any particular individual but are external in the sense that they are stored in media, in the memory parts of brains, in dictionaries and other books, etc. In that sense, words are E-elements that belong to a speech community. If I die, the Dutch language will in all likelihood survive. (Koster 2009: 8–9)

According to Koster, language consequently relates to biology just as an object of an “applied biology” of sorts (Koster 2009: 64), in the sense that it obviously results from the application of human natural resources, without however being one such resource in itself. The most prominent feature shared by all these socio-culturally biased views on language is, on our reading, that they grant that speakers are, borrowing Labov’s (2007) terms, efficient “transmitters” and “diffusors” of linguistic conventions (old and new), yet without exerting much influence on the exact character of said conventions beyond the minimal encoding requirements that the brain imposes as an all-purpose receptacle. This assumption, incidentally, is perceived as questionable by some researchers in the field of cultural evolution; see Charbonneau (2015; 2016).

The opposite side of this first axis of disagreement is, obviously enough, what Jackendoff refers to as “the contemporary view of language,” or Kirby et al. (2007: 5241) as the “orthodox evolutionary/biolinguistic approach.” Both expressions aptly capture the shift that took place in the mid 20th Century from externalist views of language to an internalist stance like the one brought about by the ground-breaking work of Noam Chomsky (Chomsky 1986). Language is conceived of within this alternative conception as a bona fide organ of the brain, details of which remain refractory to kinds of observation routinely applied as regards other organs (Chomsky 1980), but nevertheless a system to be related with biology as usual.

Anderson and Lightfoot’s (2002) vindication of the language organ, for example, is based on the suggestion that cognition must be generally integrated into physiology (the study of organic functions), and language deciphered as a specific form of cognitive physiology. According to them, this perspective allows linguistics to progress as a branch of biology despite uncertainties regarding the ultimate anatomical bases of the language organ, for there exists no motivation “for a sharp delineation between functional organization that is associated with discrete anatomy and that whose relation to physical implementation is less obvious and discrete” (Anderson & Lightfoot 2002: x). Whatever the situation one actually confronts, physiology seems to provide “an appropriate level of abstraction at which significant generalizations about biological determined function can be stated” (same page; see Gallistel 2010, for a similar stance). An important associated tenet of this “contemporary biolinguistic orthodoxy” is that the external stimuli to which the language organ is reactive do not constitute an object in themselves, or at least not one to which relevant linguistic properties can be coherently ascribed. If anything, “E(xternal)-language” shows properties only derivatively from those of the mind/brain itself (Chomsky 1986). This issue is crucial to our argument, so we shall presently turn to it.

As for the second axis of disagreement, it pertains to the general issue of modularity as debated in the field of psychology, particularly since the publication of Fodor (1983). For many, language shows all the signatures of a well-defined and self-contained form-function unit, parallel to other similar brain specializations (face recognition, motor planning, early vision, etc.). Steven Pinker and Ray Jackendoff are paradigmatic supporters of this position. For them (Jackendoff & Pinker 2005; Pinker & Jackendoff 2005), language comprises a series of highly specialized components (phonology, morphology, syntax) with idiosyncratic properties relative to other component parts of cognition (aspects of semantics being reasonably an exception). No correlates of such language-specificities are consequently expected to exist in non-human organisms. However, Chomskyan internalists have advanced toward a mainstream position that envisions language as a composite of different brain specializations, jointly laying down an emergent functionality (like, for example, vision), but which may happen to be, taken individually, unspecific as regards their linguistic dedication. The trend, the origins of which can be traced back to Hauser et al.’s (2002) influential paper, is rightly characterized in Boeckx (2012) as a “mosaic” approach to I(nternal)-language. A salutary effect of this approach is that it has invigorated comparative research to unprecedented levels of detail (Fitch 2011). An unexpected one is that it paves the way to a convergence of sorts between Chomsky’s internalism and cognitive, externalist-inclined approaches, traditionally opposed to the biolinguistic orthodoxy (see Croft & Cruse 2004). What one now observes is that they lead together (programmatically or not) to the dissolution of a well-delimited concept of language.

But to be fair, the situation is more convoluted regarding both axes than the previous clear-cut distinctions may suggest, for many middle ground positions exist that complicate the picture. For example, some pluralist positions underscore the necessity of telling apart the external and the internal concept of language, granting existence to both their referents and relating them to different kinds of evolutionary dynamics. This position was defended, for example, by Balari & Lorenzo (2013: Chapter 1) for methodological reasons (but discarded afterwards; Balari & Lorenzo 2015a), as well as by Bickerton (2014: Chapter 9), whose diagnosis of the overall situation is summed up in the following passage, which focuses on the particular case of syntax:

In retrospect it seems bizarre that nobody, throughout this debate, proposed a principled and systematic distinction between those parts of syntax that were biologically given and those that had to be acquired through acculturation into one of the many thousands of speech communities. (Bickerton 2014: 274)

The seal of pluralism can also be guessed in some current approaches that try to blend variationist sociolinguistics with Chomskyan formal linguistics—for some representative papers, see Cornips & Corrigan (2005).

Such pluralist stances, however, are somehow on the verge of becoming a bizarre position themselves, for they carry with them the danger of giving breath to a strong “culture/biology” divide (Michel & Moore 1995; Oyama 2000; see also Epstein 2016, for some valuable comments regarding the case of language), which keeps apart the behavioral/psychological dimension of language, on the one hand, and its biologically proper underpinnings, on the other hand, taken as if they were different (somehow mysteriously connected) realms for which a more integrative ontology appears to be a misguided reductionist project. It is obvious enough that such a form of dualism ultimately refers to the classic “mind/body” counterpart, a reason strong enough to be suspicious of such a route, beyond its practical advantages when taken just methodologically. The route is however taken, for example, in Boeckx & Benítez-Burraco (2014), who respect the distinction between languages (external systems of grammatical conventions), on the one side, and their biological underpinnings, on the other side. Consequently, their position runs into a neo-Piagetian stance of sorts (Boeckx 2014), in that biologically speaking, for them language (in the latter sense) boils down to a “readiness” to acquire and use languages (in the former sense), a state of the human brain that does not entail an organ proper beyond the brain itself. Such a position suspiciously looks like a return to a form of dispositional analysis in a new disguise (see Chomsky 1980: 48–49, for a critical appraisal of dispositions).

Things are not less complex regarding the second axis of differentiation. As a matter of fact, in the aftermath of Hauser et al.’s (2002) paper, a new kind of dualism has surreptitiously emerged that tells apart what is considered as properly biolinguistic within the human organism (unique) and what is just biological (shared) (Berwick et al. 2013; Bolhuis et al. 2014; Berwick & Chomsky 2016), pinpointing the former as providing the limits of language as a human faculty. Note that this is a move for which no reflexes are found in the study of other non-human organic units at any level of analysis, maybe due to an anthropocentric bias that seems to show up as soon as we confront the study of certain pinnacles of human cognition. Of course, there is nothing wrong with saying that humans are “special” insofar as they have language, a key characteristic of our species—as rightly pointed out by an anonymous reviewer. In any event, there lingers the danger of reification in postulating such a specific vs. non-specific divide, for biology does not respect these kinds of boundaries: Whatever language happens to be, it is language root and branch; biologically speaking, there cannot exist more linguistic or less linguistic parts of language.

The overall situation is admittedly confusing, one that seems to lead to the stagnation, if not the complete collapse, of the enterprise of disentangling the evolutionary origins of language, and with it, the promise of turning linguistics into a bona fide natural science. Not surprisingly, Hauser et al. (2014) close a recent influential paper on the prospects of evolutionary linguistics on a rather pessimistic note, in which they underscore the poverty of our present understanding of the phenotype to be explained, as well as the lack of solid methods and clear evidence, as potentially leading to such a stagnation. In any event, the good news for those of a more optimistic temper is that questions parallel to the ones reviewed up to here (the cause of similar angst in other fields of biology in the past) have been the target of recent eco-evo-devo (short for ecological evolutionary developmental biology) oriented analyses (Gilbert & Epel 2009) that are paving the way to a clarification of the issues of concern. Eco-evo-devo relies on the foundational idea that the organism and aspects of its environment engage in non-trivial developmental units of action, which may be subject to reliable intergenerational transmission and thus acquire evolutionary import in the long run. It thus offers a very suitable model for the integration of the external and the internal also in the case of human language. The main aim of this paper is to work out such an idea. Its organization is the following. Section 2 is first devoted to disentangling the drawbacks of the external vs. internal divide as traditionally and more recently envisioned within the Chomskyan paradigm. Section 3 then focuses on a developmentally inspired alternative that abolishes the distinction in favour of a new “hybrid” concept of language. Thereafter, section 4 explores the impact of this alternative view, if on track, on the conceptualization of language acquisition as a particular, non-exceptional aspect of organic development. Some concluding remarks close the paper.

2 The internal and the external: A first approximation

Orthodox biolinguistics (in the sense of the previous section) is a deeply internalist enterprise, which as a matter of fact has advanced in recent years to an even more radical position (see Chomsky 2016; 2017). From a historiographical perspective, the origins of Chomskyan internalism must be traced back to Chomsky’s (1959) early rejection of Skinnerian psychology, then the dominant interpretation of the processes leading to the complex inventories of abilities exhibited by organisms under natural or artificial conditions. As is well known, behaviorism relied on a radical eliminativism of the mental, thus privileging the causal role of the environment on the psychological make-up of organisms, humans included. Chomsky’s demolition of the behaviorist edifice is a well-known story, which has been told many times. The aspect of the story that we want to stress here is that the expected move that followed towards a strong internalist stance has not been rectified throughout the subsequent decades, as one might have expected once the paradigm tried to synchronize with biological theory under the umbrella term of “Biolinguistics” (Chomsky 2007a; Boeckx & Grohmann 2012). On the contrary, Chomsky’s internalism is now more radical than ever. So, in reviewing the internalism of mainstream biolinguistics, a distinction is in order between what we will refer to as the “garden variety” and the “new wave” of the internalist stance.

2.1 Garden variety internalism

According to Chomsky’s (1986) review, most theoretical approaches to language until the mid-20th Century and even beyond have targeted an externalized concept of language (E-language) as their subject matter. Such a concept (and variants thereof) is a common ground that traverses viewpoints as diverse as those of Saussurean and Bloomfieldian linguistics, Skinnerian psychology, or Quinean and Putnamian philosophy, just to cite a few representative examples (Saussure 1916; Bloomfield 1933; Skinner 1957; Putnam 1967; Quine 1970; see also Lewis 1975 and Dummett 1989). In Chomsky’s own words, these approaches “tended to view a language as a collection of actions, or utterances, or linguistic forms (words, sentences) paired with meanings, or as a system of linguistic forms or events” (Chomsky 1986: 19). Crucial to all instances of the resulting E-language concept was that “the construct was understood independently of the properties of the mind/brain” (Chomsky 1986: 20), a feature that according to Chomsky’s critique leads to a flaw in the corresponding approaches: “Questions of truth and falsity do not arise” (Chomsky 1986: 20) concerning such an object, for any grammar whatsoever that correctly characterizes a given E-language would be as good as any other capable of doing the same job. As a matter of fact, the feature is assumed to be innocuous by the likes of Quine and Lewis, but according to Chomsky this is just the sign that, in treating grammars as arithmetical systems for which there is no particular concern with selecting among extensionally equivalent grammars, their approaches do not qualify as bona fide empirical enterprises. The same conclusion follows, according to Chomsky, from externalism at large: It hardly allows the leap from descriptively to explanatorily adequate theoretical approaches (Chomsky 1965).
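The notion of extensional equivalence invoked here can be made concrete with a minimal sketch. The two toy grammars below are our own illustrative assumptions, not drawn from any of the works cited: one builds strings of the symbol a by left branching (S → S a | a), the other by right branching (S → a S | a). They generate exactly the same strings while assigning them different structures, which is why an approach that only looks at the string set has no grounds for choosing between them.

```python
# Hypothetical toy grammars over the single terminal "a" (illustration only).

def left_branching(n):
    """Structure assigned to a string of n a's by S -> S a | a."""
    tree = "a"
    for _ in range(n - 1):
        tree = f"[{tree} a]"
    return tree

def right_branching(n):
    """Structure assigned to a string of n a's by S -> a S | a."""
    tree = "a"
    for _ in range(n - 1):
        tree = f"[a {tree}]"
    return tree

def terminal_yield(tree):
    """Strip the brackets, leaving only the externalized string."""
    return tree.replace("[", "").replace("]", "")

# Compare the two grammars on strings of length 1 to 5.
left_trees = [left_branching(n) for n in range(1, 6)]
right_trees = [right_branching(n) for n in range(1, 6)]

same_strings = [terminal_yield(t) for t in left_trees] == [terminal_yield(t) for t in right_trees]
same_structures = left_trees == right_trees

print(same_strings)     # True: the grammars are extensionally equivalent
print(same_structures)  # False: they differ in the structures they assign
```

On the E-language view, nothing distinguishes these two grammars; on the I-language view, at most one of them can correspond to the system actually internalized by a speaker.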

A shift of focus towards an internalized concept of language (I-language) is required by such an aim, which is thus a move not just methodologically motivated for a mere question of division of labor, contrary to Lohndal and Narita’s (2009) interpretation; see Lassiter (2010) for discussion. According to the internalist trend, an I-language “is some element of the mind of the person who knows the language, acquired by the learner, and used by the speaker-hearer” (Chomsky 1986: 22). Consequently, it is an object of study of the cognitive and brain sciences, which respectively make “statements about structures of the brain formulated at a certain level of abstraction from mechanisms” and further try to “discover the mechanisms that are the physical realization” of said structures (Chomsky 1986: 22–23). Obviously enough, the shift is not trivial, for the internal structures of concern are supposed not to be just mental reflexes of certain external correlates (i.e. “actions, or utterances, or linguistic forms (words, sentences) paired with meanings”), but generative systems that underlie these and an indefinite number of other similar actions, utterances, or forms, with which the person who knows/learns/uses the language has possibly had no prior acquaintance whatsoever. And maybe more important than that, the shift of focus is clearly non-trivial because it allows us to make sense of features of the design of language without direct external reflexes, beginning with the one that Chomsky identifies as its Basic Property: namely, the fact that it “provides an unbounded array of hierarchically structured expressions,” in his own recent formulation (Chomsky 2016: 4). Classical observations from the field of language acquisition, like the fact that learners easily derive the structure-dependent character of many rules of grammar from a linear flow of speech, without particularly informative cues or external aid (Chomsky 1967; Crain & Nakayama 1987), likewise only make sense if referred to the hardwired modus operandi of a generative I-system, instead of to its E-outcomes.
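The contrast between structure-dependent and linear rules can be illustrated with a minimal sketch of yes/no question formation. The example sentence, the small auxiliary list, and the hand-supplied division into subject, auxiliary, and predicate are all our own simplifying assumptions; no parser is involved, and the sketch is not meant to model any particular theory of acquisition.

```python
# Toy auxiliary inventory (an assumption made for this illustration).
AUXILIARIES = {"is", "can", "will"}

def front_first_aux_linear(words):
    """Structure-independent rule: front the linearly first auxiliary."""
    i = next(k for k, w in enumerate(words) if w in AUXILIARIES)
    return [words[i]] + words[:i] + words[i + 1:]

def front_main_aux_structural(subject, aux, predicate):
    """Structure-dependent rule: front the main-clause auxiliary,
    whatever material the subject phrase happens to contain."""
    return [aux] + subject + predicate

# "the man who is tall is happy", with its constituents given by hand.
subject = ["the", "man", "who", "is", "tall"]
aux = "is"
predicate = ["happy"]
sentence = subject + [aux] + predicate

linear_question = " ".join(front_first_aux_linear(sentence))
structural_question = " ".join(front_main_aux_structural(subject, aux, predicate))

print(linear_question)      # is the man who tall is happy
print(structural_question)  # is the man who is tall happy
```

Both rules give the same result on simple sentences such as "the man is happy", so the linear flow of speech alone does not force a choice between them. Only the structure-dependent rule yields the well-formed question once the subject contains an auxiliary of its own.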

2.2 New wave internalism

Throughout the last decade or so, Chomsky has advanced towards an extreme variant of the internalist stance, which is nicely summarized in Chomsky (2016). In a nutshell, the position boils down to the idea that the natural correlate of the I-language concept must be located even deeper than previously thought in the architectural organization of the human mind/brain, for language is, first and foremost, a thought-composing device, for which externalization mechanisms, though obviously organism-internal, serve a merely secondary or ancillary purpose. In support of this idea, Chomsky elaborates a number of arguments, which may be summed up as follows.

From an intra-linguistic perspective, to begin with, the Basic Property—as defined above—pertains to the (deep) stratum of language that takes expressions as content-bearing entities (Chomsky 2016: 4). Note that hierarchical structure is the crux of the matter of compositionally complex meanings—with its plethora of inclusion, scope, and other asymmetric relations, but a negligible aspect of the externalized reflexes of the corresponding expressions. Complementarily to this formal argument, a functional one is also based on Chomsky’s (2016: 14) perception that externalization is rarely used: “Most use of language use [sic] by far is never externalized.” In other words, most of our linguistic endeavors are exploratory—private reasoning, planning, rehearsal, and so on, maybe preparatory efforts for those (relatively) rare occasions when we share their content with others.

Chomsky derives two extra arguments in support of the position from the following inter-linguistic observations. Firstly, there is not a unique or exclusive system of exteriorization within the reach of languages, as witnessed by the hundreds of currently attested sign languages used by deaf communities worldwide, which indisputably are capable of expressing the same kinds of complex thoughts (Chomsky 2016: 13–14). Secondly, language variation appears to concentrate on the externalization side, namely the superficial packaging of thought atoms in morphological units, and morphophonology. If it exists at all (see Chierchia 1998 and Ramchand & Svenonius 2008), variation seems to be negligible in the thought-related side of language (Chomsky 2016: 125).

Finally, Chomsky also draws an argument from an inter-species perspective (Chomsky 2016: 41ff), for language-articulated thought strongly contrasts—or apparently so—with the simplicity of the content of the calls and other kinds of signals employed by non-human organisms. As for the latter, fast and clear means of externalization seem to be the crucial property of the corresponding systems; contrarily, language seems to prioritize means capable of creating complex, sometimes highly convoluted expressions.

This array of arguments is maybe not exhaustive, but it suffices to understand the rationale behind Chomsky’s conclusion that we should keep apart language proper (or language “in the narrow sense,” as in Hauser et al. 2002) from the different systems (morphology > phonology > phonetics) contingently recruited for externalization. As already stressed, the externalization apparatus is “external” relative to the thought-related, properly linguistic one, yet “internal” to the human organism. Certainly enough, Chomsky has not completely thrown the classic I-language concept out of the window, as correctly pointed out by an anonymous reviewer, but we believe that a suitable characterization of his current thinking on this subject matter is one that differentiates between an “I-externalization,” communication-related component, and an “I-language” proper, thought-related one. Chomsky’s contention is that they are autonomous components of the human mind, the connection between which derives from a recent evolutionary event, which purportedly did not have a significant impact on the latter (Chomsky 2010). The net effect of such a radical re-internalization move is that Chomsky’s biolinguistics looks doomed to belong under an (arguably extant) strictly organism-internal branch of biology, for which the study of environment-organism interfaces can be safely set aside from the central concerns of the approach.2

2.3 Internalism: A first-off critique

Before explaining and justifying in detail our own view on this issue, a brief critical note on Chomsky’s new wave internalism is in order. Note that Chomsky’s long-held position is that language is, above any other functional consideration, an “instrument of thought” (Chomsky 2013: 139). A recent formulation of the tenet is again to be found in Chomsky (2016: 13), where he characterizes language as “a kind of ‘language of thought’—and quite possibly the only such LOT.” As a matter of fact, the idea that language is primarily an instrument for thought instead of an instrument for communication has been part and parcel of Chomskyan linguistics almost from its inception (Chomsky 1968; 1980). In more recent times, an effort has been made by Chomsky himself to relate the view to the likes of Salvador Luria and François Jacob (Chomsky 2010: 55–59). The take-home message is that I-language relates asymmetrically to the interfaces, privileging the semantic interface (Asoulin 2016). The conclusion gives grounds to the more extreme view that language and thought can ultimately be equated and the resulting system kept apart from externalization, a qualitatively different phenomenon. Thus, in many senses, it now looks as though Chomsky’s LOT is just Fodor’s LOT, while Chomsky’s “Externalization” is more aptly seen as the equivalent of Fodor’s “Language (module)”; for some recent textual evidence that this interpretation might be on the right track, see Chomsky (2017).

A closer inspection reveals that the conclusion relies on a premise that is not as principled as it is supposed to be. For Chomsky’s main reason for deciding the primacy of thought over externalization is that the Basic Property holds in the case of (hierarchically) structured thoughts, but not in the case of the (linear) arrangement of units in discourse. The idea is explicitly introduced in Chomsky (2013), where said asymmetry is referred to as T, a principle that “holds generally as a principle of UG”:

(T) Order and other arrangements are a peripheral part of language, related solely to externalization at the SM interface, where of course they are necessary. (Chomsky 2013: 36)

However, the introduction of the property of hierarchical structuring as the “basic” one is somehow unprincipled. Within an alternative value system, the linear arrangement of units in externalized utterances could equally be pinpointed as “basic” by one privileging, for example, communication/pragmatics over conceptual-intentional thought. Within such a value system, priority would certainly be given to the morphophonological interface to the sensory-motor systems, as language would be, firstly and above all, an instrument of communication.

An anonymous reviewer brings to our attention that the “secondary” character of linear or serial order in behavior relatively to the “basic” character of hierarchically organized ideas or thought may be referred to neurophysiological considerations like the ones already suggested by Karl Lashley in 1951 (Lashley 1951). We agree. The complexly integrative character of linguistic expressions may be the seal of higher-order cognitive processes that somehow fade in their behavioral reflexes: a “basic” vs. “secondary” distinction of sorts would then follow. It does not follow, however, that such an empirically grounded distinction contains any legitimating criteria regarding the kinds of “new wave” Chomskyan boundaries, or how the resulting mind specializations become ranked and regarded as more or less linguistic; especially because the distinction might not be as empirically well-grounded as it is traditionally assumed (see, for example, Rosselló 2016).

In the absence of such principled criteria, our claim is that the only functional characterization that language deserves is that it “is for” whatever it may serve. This is a different position relative to Chomsky’s non-communicative but nonetheless functional approach: language “is for” something, it has a “proper” function, namely composing thoughts. Granted, this is one of its commonest uses, but Chomsky’s surreptitiously functionalist stance does not really follow from that, any more than the alternative communicative approach does. Avoiding both stances sounds like a more reasonable choice. And as we presently see it, this non-alignment toward functional considerations does not compromise in any relevant sense the advancement of the biolinguistic approach. Quite the contrary: if, as in the case of Chomsky, functionalism just serves as a last resort tactic for grounding the strong divide between language proper and externalization, then functionalism is of no use once one becomes persuaded that the divide is not worth preserving anymore. Let’s see why we believe that this is the case.

3 The internal and the external: Rethinking boundaries

The interplay between the internal and the external is business as usual in whatever biological corner you choose. However, when it comes to the realm of cognition, such a statement is seen with some skepticism by many, due to its centrality in different failed or suspicious “naturalist” trends in psychology. The recurrent theme of such trends, according to Fodor’s (1981: 229) characterization, is “that psychology is a branch of biology, hence one must view the organism as embedded in a physical environment. The psychologist’s job is to trace those organism/environment interactions which constitute its behavior.” This is why, also according to Fodor, naturalism slumps onto behaviorism and embraces as central the following set of ideas: (a) mental states are individuated by reference to organism/environment relations; (b) such relations constitute the mental; and (c) consequently, one must abandon the characterization of the mental as formal (i.e. computational) (Fodor 1981: 230).

While the position that we articulate in this section is “naturalist” in Fodor’s most basic sense (“psychology is a branch of biology,” “organisms are embedded in a physical environment”), it is one that does not collapse into behaviorism of any conceivable filiation. Rather, the section will hopefully serve as a demonstration that naturalism can easily coexist with a formal/computational view of mind. To that end, the section’s main contribution is based on current ideas on how “environmental cue-organismal response” mechanisms work at large and how such dynamics may be applied to the cognitive realm. As we hope to show, this general model, which entails a new, somewhat challenging “hybrid” concept of language (and of organism), does not challenge in any relevant sense the computational view of mind. Let us start by advancing some empirical motivation for the view.

3.1 Agreement morphology: Why is it there?

Our case concerns a kind of apparently simple unit that has nevertheless proved particularly challenging for linguistic theorizing: namely, agreement affixes like, for example, the English suffix –s associated with the ‘3rd person, singular’ of verbs in the present tense. These kinds of items have been the focus of much discussion within the Minimalist Program (Chomsky 1995; and subsequent works), since they clearly challenge Chomsky’s instrumental thesis that language expresses thought in an optimal way: agreement affixes do not contribute to the thought compositionally expressed by phrases and sentences; they just mimic (redundantly) the feature composition of other concurrent units. The point is easy to capture if one thinks of the plural form of verbs in languages with relatively rich agreement paradigms, like Spanish, where verbal plurality does not entail a plurality of the referred kind of event: for example, construyen ‘they build’ (–n, ‘3rd person, plural’) does not entail more than one building event, but more than one builder, which is independently expressed in the subject. So why are these items so conspicuous (Corbett 2009), if they seem to be alien to language as a thought-constructing device?

Chomsky’s answer to this question is intriguing: in a nutshell, agreement morphology is obviously there, but it should not be. So, syntax must contain a computational procedure to “eliminate/neutralize” the corresponding non-interpretable/unvalued features, preventing them from reaching the thought-related interpretive interface (Chomsky 2000; 2004). The procedure uses a “probe-goal” mechanism of sorts, in the sense that agreement features search for a target within the structural domain they asymmetrically command. By way of illustration, let us think of a point in the derivation of an internal expression when a bundle of tense and agreement features merges with a previously composed verbal phrase, to jointly form a hierarchically higher inflectional phrase (inverted commas are intended to represent that at this level of representation expressions do not contain words, but bare “atoms” devoid of phonological form; Chomsky 2016):

(1) {‘present, 3SG’, {‘arrive’, {‘a’, {‘boy’}}}}

The inflectional bundle—the probe—then starts a search for a matching goal, which actually happens to exist within the accessed domain—namely, the phase {‘a’, {‘boy’}}. As a consequence, agreement features are deleted/valued and the expression becomes freed from uninterpretable material:

(2) {‘present, 3SG’_probe, {‘arrive’, {‘a’, {‘boy’}}_goal}}
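The probe-goal search illustrated in (1) and (2) can be roughly sketched in a few lines of Python. This is only a toy under our own assumptions, not Chomsky’s formalization: the names Atom, Phrase, find_goal and agree, the encoding of unvalued features as None, and the treatment of the complement as the probe’s search domain are all hypothetical conveniences introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


@dataclass
class Atom:
    """A bare 'atom' devoid of phonological form, with a feature bundle."""
    label: str
    features: dict = field(default_factory=dict)


@dataclass
class Phrase:
    """A phrase {head, complement}; the complement is the head's search domain."""
    head: Atom
    complement: Optional[Union["Phrase", Atom]] = None


def find_goal(domain, wanted):
    """Search the domain top-down for the first atom with a value for every wanted feature."""
    if domain is None:
        return None
    if isinstance(domain, Atom):
        ok = all(domain.features.get(f) is not None for f in wanted)
        return domain if ok else None
    hit = find_goal(domain.head, wanted)
    return hit if hit is not None else find_goal(domain.complement, wanted)


def agree(phrase):
    """Value the probe's unvalued (None) features from a matching goal; report success."""
    probe = phrase.head
    unvalued = [f for f, v in probe.features.items() if v is None]
    if not unvalued:
        return True  # nothing to value
    goal = find_goal(phrase.complement, unvalued)
    if goal is None:
        return False  # no matching goal in the domain
    for f in unvalued:
        probe.features[f] = goal.features[f]
    return True


# Example (1): {'present, 3SG', {'arrive', {'a', {'boy'}}}}
boy = Atom("boy", {"person": 3, "number": "SG"})
vp = Phrase(Atom("arrive"), Phrase(Atom("a"), boy))
tp = Phrase(Atom("T", {"tense": "present", "person": None, "number": None}), vp)
valued = agree(tp)
```

In this sketch the inflectional bundle ends up with the person and number values of ‘boy’, mirroring (2). Whether valued features are subsequently deleted, and how the search domain is bounded, are deliberately left out.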

Chomsky’s efforts to preserve his optimal LOT thesis are certainly ingenious. However, they are a far cry from explaining why agreement features are there to begin with. If anything, the suggestion that they have to be somehow eliminated/neutralized makes their being there even more mysterious. In other words, Chomsky’s recent approaches to the matter appear not to have progressed much from the early days when he declared these aspects of languages “imperfections” of the linguistic faculty (Chomsky 1995). The same objection holds for other Chomskyan-inspired alternatives: for example, the idea that agreement features are alien to the thought-composing process and that they rather belong to late-insertion morphological processes on the route to externalization. Not surprisingly, defenders of this view refer to them as “ornamental morphology” (Embick & Noyer 2007). But why should speakers embellish externalized expressions?

Functionalist approaches to language have also tried to make sense of agreement phenomena by associating carriers of agreement features with some practical role, like fixing discourse referents, attenuating the effects of noise in communicative uses of language by creating redundancies, and so on (Barlow & Ferguson 1988). That agreement markers serve these and surely other purposes seems trivially correct. However, the most reasonable interpretation is that they are able to fulfill whatever role (not some specific role, an important qualification) fits with what they are. This is again in agreement with the “neutralist” stance advanced in the previous section: it is their being there in the first place that explains their propensity to serve an open array of language-related services, not the other way around; for a congenial approach, see Biberauer & Roberts’ (2017) “Maximize Minimal Means” idea, suggested there as a third factor bias on acquisition.

Given such a problematic landscape, Balari and Lorenzo (2015a) tentatively outline the idea that agreement material is so conspicuous in languages because it helps to strengthen and stabilize the computational system that underlies the processing of internal linguistic expressions. The thesis put forward thus boils down to the idea that agreement items are there, in the words of Minelli (2003), for their “developmental role,” which also according to Minelli is “prior” to any other role whatsoever that units of the corresponding realms may later acquire in their respective organic contexts. Let us explore here the import of this idea of the priority of developmental role in some detail.

Language takes advantage of the powerful computational capabilities of the human brain, which make it possible for units within linguistic expressions to engage not just in local dependencies, as in {a {boy}}, but also in long-distance ones, as in {said_1 {that a boy betrayed him} very clearly_1}, where boldface and subscripts respectively represent these connections. This corresponds to a high level of complexity according to standard formalizations (details are not relevant at this point). However, according to the same standards, natural languages exhibit a further level of complexity in that expressions are apt to contain units that relate interspersedly, i.e. crossing different long-distance connections, as in {he^2 {said_1 {that the boy betrayed him^2} very clearly_1}}, where superscripts are now intended to express these kinds of connections. In order to process such complex connective patterns, a computational system must incorporate a powerful “working space” component, in which to keep partially composed expressions in active memory until all connective links are resolved. Observing that agreement affixes are typically involved in these kinds of indeterminately distant connections (e.g. a boy_1 who no one_2 except me knows_2 arrives_1), Balari and Lorenzo (2015a) claim that said units have the developmental role of eliciting, exciting and guiding the exercise of the working space of the human computational system, until it attains its proper storage capacity. Note that the idea is compatible with the possibility that later on other functions hitchhike on the same items. But contrary to other proposals, the suggestion offers an explanation of why they are there to begin with. Balari and Lorenzo’s take-home message is that agreement morphology does not just help learners capture correct connective patterns, but that it has a direct impact on the organic growth of the very system that serves to learn and to compute these patterns once learned. This is obviously a strong claim that calls for empirical justification.
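The contrast between nested and crossing long-distance connections can be made concrete with a small Python sketch. It is purely illustrative and rests on our own assumptions: dependencies are encoded as pairs of word positions (our encoding, not part of the cited work), the function names crosses and classify are hypothetical, and the sketch only classifies connective patterns without modelling the working space itself.

```python
def crosses(a, b):
    """True if dependencies a and b interleave, i.e. i < k < j < l after ordering."""
    (i, j), (k, l) = sorted([tuple(sorted(a)), tuple(sorted(b))])
    return i < k < j < l


def classify(dependencies):
    """Return 'crossing' if any two dependencies interleave, else 'nested'."""
    deps = list(dependencies)
    for x in range(len(deps)):
        for y in range(x + 1, len(deps)):
            if crosses(deps[x], deps[y]):
                return "crossing"
    return "nested"


# said(0) that(1) a(2) boy(3) betrayed(4) him(5) very(6) clearly(7)
# local a-boy (2, 3) sits inside long-distance said-clearly (0, 7)
pattern_nested = classify([(0, 7), (2, 3)])

# he(0) said(1) that(2) the(3) boy(4) betrayed(5) him(6) very(7) clearly(8)
# he-him (0, 6) interleaves with said-clearly (1, 8)
pattern_crossing = classify([(0, 6), (1, 8)])

# a(0) boy(1) who(2) no(3) one(4) except(5) me(6) knows(7) arrives(8)
# one-knows (4, 7) sits inside boy-arrives (1, 8)
pattern_agreement = classify([(1, 8), (4, 7)])
```

On this encoding the first and third patterns come out as nested and the second as crossing, which is the further level of complexity referred to in the text.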

Before taking the perspective of development proper, we want to emphasize first that different linguistic phenomena appear to encourage the idea that agreement relations are commonly held disregarding the kinds of functional motivations routinely associated with them. In cases like these, the most suitable interpretation seems to be that agreement is held just for the sake of agreement itself. It is our contention that this actually corresponds to the, so to speak, primeval condition of agreement—i.e. irrespective of its potential for added functional accommodations. The kinds of cases that we have in mind have been the focus of recent attention under different tags, like “omnivorous” (Nevins 2011) or “failed” (Preminger 2014) agreement.

One such case, discussed in Nevins (2011: 949–959), is a pattern in which the verb incorporates an agreement morpheme for number that shows up under the condition that the subject, the object, or both, refer to pluralities. This is the case, for example, of the following verbal phrase in Georgian, in which the plural marker –t does not unequivocally agree with the subject or the object, resulting in “potentially massive ambiguity” (from Nevins 2011: 950; ex. 15):

(3) Georgian (Nevins 2011)
    g‑xedav‑t.
    2OBJ‑saw‑PL
    ‘I saw you all; we saw you all; he saw you all; we saw you.’
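The condition on the Georgian plural marker described above amounts to a simple disjunction, which the following Python sketch spells out. It is a minimal illustration under our own assumptions: the function name plural_marker is hypothetical, person features are ignored, and only number is modelled.

```python
from itertools import product


def plural_marker(subject_plural: bool, object_plural: bool) -> bool:
    """The marker surfaces if the subject, the object, or both are plural."""
    return subject_plural or object_plural


# Enumerate the (subject number, object number) pairs compatible with the marker.
compatible = [
    (subj, obj)
    for subj, obj in product(["SG", "PL"], repeat=2)
    if plural_marker(subj == "PL", obj == "PL")
]
```

Three of the four number combinations are compatible with the marker, which is one way of seeing where the ambiguity of (3) comes from: only the reading with both arguments singular is excluded.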

The Georgian case resembles, as pointed out by an anonymous reviewer, the behavior of “number” in languages, like Halkomelem Salish, in which it is described as a modifier adjoined to the corresponding lexical category, instead of as an inflectional item embodied as a functional head (Wiltschko 2008). The contention may thus be made that agreement items developmentally derive from the former condition and eventually attain the latter, thus stabilizing and regularizing their behavior; or not, as witnessed by the Halkomelem Salish data. Such an interpretation appears to be congenial to the developmental data to be presented below.

Another interesting example is the case of the Quichean agent/focus (‘AF’) construction studied by Preminger (2014: 18–22). In this construction, the phrase referring to an agent is focused, so the transitive verb behaves in an intransitive-like manner: it takes just one agreement marker of the absolutive paradigm, which however shows up with a feature composition that does not uniformly correspond with the patient or the agent. How the marker is chosen depends in this construction on a hierarchy that privileges the first and second person over the third, and the plural third person over the singular. As a result, the verb agrees with the agent in some cases like (4a) below, but with the patient in other cases, like (4b) below (from Preminger 2014: 20, ex. 18; note that the phenomenon is not Quichean-specific, but it is also found, e.g. in Hungarian, as documented in Bárány 2015):

(4) Quichean (Preminger 2014)
    a. ja   yïn  x‑in‑ax‑an           ri   achin.
       FOC  me   ASP‑1SG.ABS‑hear‑AF  the  man
       ‘It was me that heard the man.’
    b. ja   ri   achin  x‑in‑ax‑an           yïn.
       FOC  the  man    ASP‑1SG.ABS‑hear‑AF  me
       ‘It was the man that heard me.’
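The hierarchy governing the choice of marker in the agent/focus construction can be sketched as a ranking function in Python. This is a hedged illustration rather than Preminger’s analysis: the names rank and af_controller are hypothetical, arguments are encoded as (person, number) pairs by our own convention, and ties between equally ranked arguments, which the description above does not address, are simply left undecided.

```python
def rank(person: int, number: str) -> int:
    """First and second person outrank third plural, which outranks third singular."""
    if person in (1, 2):
        return 2
    if number == "PL":
        return 1
    return 0


def af_controller(agent, patient):
    """Return which argument controls the single absolutive marker, or None on a tie."""
    agent_rank, patient_rank = rank(*agent), rank(*patient)
    if agent_rank > patient_rank:
        return "agent"
    if patient_rank > agent_rank:
        return "patient"
    return None  # tie: not specified in the text, so left undecided here


# (4a) 'It was me that heard the man': first person agent, third singular patient
controller_4a = af_controller(agent=(1, "SG"), patient=(3, "SG"))

# (4b) 'It was the man that heard me': third singular agent, first person patient
controller_4b = af_controller(agent=(3, "SG"), patient=(1, "SG"))
```

The marker 1SG.ABS is thus controlled by the agent in (4a) and by the patient in (4b), with no reference to grammatical function.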

Details of the corresponding constructions are not relevant to the purposes of this paper. We simply find them illustrative of the fact that agreement units commonly do not depend upon identification or disambiguation purposes. They apparently are there just for the sake of agreement itself.

From a properly developmental perspective, an illustrative case in support of the idea is already offered in Balari and Lorenzo (2015a) based on child German, where the morphology for subject agreement is highly idiosyncratic relative to its adult German counterpart. Let us extend and elaborate a little on the data analyzed there.

Children use in some cases different endings for the same features, while adults do not (upper part of Table 1), and complementarily, the former unify different features in a single ending, where the latter use specialized contrasts (lower part of Table 1).

Children (age ±2;0)        Adults
first person, singular     first person, singular
-en, -e                    -e
third person, plural       plural (except second person)
-en                        -en

Table 1. Child vs. adult subject agreement in German. Sources: Clahsen (1986); Clahsen & Penke (1992).

Technically speaking, the child system is a “degenerate” and “redundant” one (Edelman & Gally 2001), which makes it of dubious utility for the kinds of adult roles customarily thought to be the proper function of the system, e.g. identifying discourse referents. An observation worth stressing is that children do not behave randomly in this area: they follow their own (i.e. child-specific) regularities. Crucial to Balari and Lorenzo’s main point is that, as pointed out in Bateson & Gluckman (2011), the two properties referred to above (degeneracy and redundancy) are typical of developmental systems at large, so an agreement system exhibiting said properties may also perfectly well subserve the suggested developmental role. In this respect, some extra developmental data from German learners seem to be particularly informative:

  1. the regularities of concern emerge just before children engage in computationally complex constructions like (significantly enough) relativization, which typically widens the distance between long dependent constituents (age 2;6)—see Mills (1985); and
  2. children with so-called “Grammatical Agreement Deficit” (Clahsen 1986; Clahsen et al. 1997; Clahsen & Hansen 1997) concurrently exhibit difficulties with complex computational tasks like verb movement from final to second position, which creates another typical instance of long-distance dependency in German. Interestingly enough, Clahsen & Hansen (1997) report that significant spontaneous recovery in this latter area is observed after intensive training with agreement alone.

Relevant data can also be adduced from a broader cross-linguistic perspective. Going beyond the classical morphological locus of affectation in cases of Specific Language Impairment (SLI) (see, for example, Bortolini et al. (1997) for Italian, or Bedore & Leonard (2001) for Spanish), a large body of research demonstrates that high-level computational tasks like question formation are also affected in the same condition; see van der Lely & Battel (2003) for English, or Hamann (2006) for French; see also Vares (2017) for a detailed study of both domains in an SLI population of Spanish-speaking children. Balari and Lorenzo’s suggestion points to the conclusion that a causal link exists from the former (morphological) to the latter (syntactic) domain.

Now, what exactly is the moral of this linguistic detour? Two details are worth emphasizing before deriving the desired conclusions: firstly, contrary to other properties of linguistic expressions (for example, structural hierarchy), agreement morphology is conspicuously “visible” (or “overt”) in externalized utterances; and secondly, taking advantage of items overtly present in the adult input, learners construct and rapidly internalize idiosyncratic systems of agreement (as clearly attested in the case of child German reviewed above), which according to our hypothesis they immediately put to a developmental use that targets the computational system in charge of composing internal, hierarchically structured expressions. So, we have equally good reasons to attribute to these kinds of units both an external and an internal status, for on the one hand they mimic public models, while on the other hand they self-organize according to a learner-internal logic, and contribute, developmentally speaking, to the constitutive process of the inner machinery underlying complex linguistic computations.

3.2 Redefining boundaries: Enter hybrids

The above conclusion is not, however, particularly challenging if one adopts a broad biological perspective, for the kinds of cases brought to the fore in the previous section are not, at their corresponding level of organization, very different from other well-known cases in other biological realms. Many such cases are reviewed in Sultan (2015: Chapter 3) under the general category of “environmental cue/organismic response” mechanisms. Here we will make use of just one by way of illustration, which, unrelated as it may appear to anything having to do with language, nonetheless offers a good abstract schema for what we think is going on in the case of linguistic systems of morphological agreement.

The case refers to a particular species of sea slug (Elysia chlorotica), the main peculiarity of which is its brilliant green color. The reason for this exceptional coloration for a sea slug is that Elysia feeds on a yellow-green alga (Vaucheria litorea), from which the sea slug obtains chloroplasts that are incorporated into the cells of its digestive tract. Chloroplasts obtained by Elysia during the juvenile stage allow the animal to photosynthesize as plants do, to fulfill its metabolic needs throughout its 10-month life span. Chloroplasts are not transmitted to offspring (Rumpho et al. 2011; Sultan 2015: 32). So, in this case, biotic material external to the organism is assimilated and put into use as organelles within cells, allowing those animal cells to carry out the photosynthetic function. According to Sultan’s own conclusion: “The animal incorporates its (biotic) environment into its own development in the most profound way possible” (Sultan 2015: 32). The parallels with our own concerns are straightforward. Let us explore them in some detail.

Obviously enough, agreement affixes do not qualify as “biotic” in any meaningful sense; however, this is not particularly problematic, for many other examples from Sultan’s book show that abiotic factors (light, temperature, gases) may also participate in development and function and be assimilated in ways similar to biotic counterparts like chloroplasts. Generally speaking, cue/response systems act as mechanisms that allow beneficial plastic developmental, physiological and behavioral adjustments of organisms to proximal environmental conditions (Sultan 2015: 49). Crucially, such mechanisms entail certain perceptual and transduction steps between environmental cues and organismic responses, which play a key interface role in the generation of hybrid (environment/organism) dynamics.

As for the case that concerns us here, the associated perceptual and transduction steps are, to begin with, far from well understood at a fine-grained level of neurological resolution. But this is business as usual when trying to trace the linguistic functions of the brain at that level of analysis (Poeppel & Embick 2005), as is also exemplified by the system of computation that endows languages with Chomsky’s Basic Property (Dehaene et al. 2015). Notwithstanding, the representational/computational approach historically endorsed by Chomskyan linguistics seems apt to provide an abstract perspective (Chomsky 1980) from which to begin securing some relevant preliminary conclusions that, paradoxically enough, seriously call into question Chomsky’s extreme internalism.

According to the prevailing Chomskyan image of the architecture of language, Morphology is a specialized module within the externalization route (Chomsky 1995: 319ff), corresponding to the point where, so to speak, an exchange takes place between the currency of the system of computation (i.e. bare bundles of abstract features purportedly taken from a universal inventory) and a currency of language-specific stems and affixes associated with phonological representations, which in their turn instruct the sensory-motor systems (Halle & Marantz 1993). Thus, Morphology can be conceptualized as a critical transduction subsystem within an overall cue/response system: it identifies signaled pieces of environmental stimuli as carriers of the kinds of features capable of exciting high-level computational activity, i.e. dependencies held at indeterminate distances, potentially embedded within similar dependencies and/or obeying complex crossing patterns. According to the hypothesis put forward in the previous subsection, said environmental stimuli incite and help the computational system grow until it attains a specific complexity level, perhaps one equivalent to that of an enhanced pushdown automaton, according to the standard metric of the Chomsky Hierarchy (see Balari & Lorenzo 2013). The units of concern thus have a role in the developmental economy of the computational system comparable to that of chloroplasts in the development of photosynthesis in the case of the green sea slug. So, if this hypothetical parallel is on the right track, agreement morphemes are as much part of the linguistic environment (E-language) as part of the computational system (I-language), according to Chomsky’s compartments. In other words, just as the sea slug ingests part of its environment and integrates it into its own physiological function, human brains do the same with salient regularities at the level of speech and integrate them into their own cognitive function, transforming them into items of vocabulary (e.g. agreement morphemes) capable of exciting computational activity. Thus, human brains are “hybrids” to the same extent as sea slugs are hybrids (Sultan 2015: 32–33). We shall elaborate on this conceptualization presently. Let us first stress how the suggested approach departs from Chomsky’s most recent internalist musings.

From a Chomskyan perspective, many of our contentions are not particularly challenging. The internalist stance seems immune to the challenge, for Morphology, with the rest of the Externalization apparatus, is not part of the faculty of language properly speaking. Chomsky’s main justification is a functional one, as already explained: language is an instrument of thought and Morphology does not contribute to thought composition; it simply mediates in thought externalization. Morphology is just a piece of a language-related but ancillary aspect of language proper. This ultimately functional justification is sometimes accompanied by the evolutionary claim that complex compositional thought (aka language) has had an autonomous history prior to its connection to externalization, which allegedly corresponds to a relatively recent evolutionary event (±50,000 years ago, according to Chomsky’s 2010 estimate; see also Bolhuis et al. 2014; Chomsky 2017, for the most recent exposition of the idea).

Things, however, look very different when adhering to a developmental perspective like the one adopted here. For according to this view, Morphology may not contribute to the online process of constructing an articulated thought, yet it may be crucially implicated in the developmental process leading to a mature system capable of such constructive processes. When this developmental viewpoint is complemented with an evolutionary perspective (in other words, when an evo-devo perspective is adopted), it seems inescapable to conclude that the kind of developmental role we are suggesting in regard to Morphology might also have had a relevant role in the evolution of the human-typical system of computation. While admittedly fallible, the ultimately developmental grounding of our position has been supported by relevant empirical data (see 3.1). We honestly doubt that the ultimately functional grounding of Chomsky’s internalism can be empirically supported. It rather seems doomed to remain a metaphysical claim.

Let us concentrate now on the suggested “hybrid” conceptualization of language. The concept is specifically taken in the sense of James Griesemer, according to whom it refers to a biologically ubiquitous ontological status, participants of which are “material individuals produced from several provenances and which hinder conceptualizing and tracking stable units of investigation” (Griesemer 2014a: 24; see also Griesemer 2014b). Some of Griesemer’s preferred illustrations have to do with hybrids of biotic materials of different origins, which are nevertheless pinpointed as apt to help disentangle aspects of cultural development and evolution by similar admixtures of biotic and abiotic factors. Generally speaking, Griesemer applies the category to “developmental relations among parts formed by hybridization of system and environment” (our emphasis), which have a “facilitation” effect on “maintenance, growth, development, or construction” tasks that “would otherwise be more difficult or costly without” (Griesemer 2014a: 26). In this sense, the originally exogenous participant behaves as a “scaffold,” which in Griesemer’s words is “an entity, typically exogenous to the system, unit, or object of interest, which interacts in temporary association with a system to facilitate the development of an outcome or effect […] otherwise […] difficult or impossible to achieve” (Griesemer 2014a: 47). Once the effect is achieved, the scaffold may be “removed” or it may become “invisible” (Griesemer 2014a: 26); but while fulfilling the designated role, it is part of a developmental unit of the organism, i.e. an integral part of the organism itself.

Griesemer’s model applies as straightforwardly to the case of the developmental/functional role of chloroplasts in the case of the green sea slug as to the case of the developmental role of agreement affixes in the case of language. In the former case, taking in chloroplasts feeds the digestive tract, facilitating the development of the protective mucous sheath; in the latter, affixes feed the computational system, facilitating the development of a high-resolution working space. What obtains in both cases are hybrids that fulfill a developmental role within the respective organic scenarios. As suggested by Griesemer, such a characterization does not entail that the development of the system would not occur, were such units completely out of sight. This is particularly clear in our linguistic case, as witnessed by the fact that verbal agreement is utterly absent in some languages, like Chinese. But this is a normal expectation, if one takes into account that robust systems of development normally act redundantly (Bateson & Gluckman 2011; see Dove 2012 for the case of grammars). In the case of language, means other than verbal agreement, e.g. nominal case inflections (Baker 2015; Blake 2001), may replace or reinforce its developmental role, which surely may also be supplied by other reinforcing non-language-specific stimulation.3

Obviously enough, when the system of computation attains a steady state, agreement morphology is not “removed” from the structure of verbal words; but it may not be inaccurate to say that it becomes “invisible” from the point of view of thought computation, which is a nice way of accommodating Chomsky’s insights on the issue. The eventuality that added non-developmental roles hitch-hike on these items is then entirely predictable, given Minelli’s (2003) plethora of examples of early structures fulfilling a developmental role (or behaving as “ontogenetic adaptations,” in Oppenheim’s (1981; 1984) seminal term) and then serving other non-developmental tasks in the same organism, e.g. cuticles or flagella serving as the means for restraining a developing form, then “recycled” into protective or motile structures. In this respect, languages offer a very diversified range of phenomena in the area covered in this subsection. For example, the agreement system of English is so impoverished that it can hardly be thought of as serving the kinds of pragmatic functions routinely ascribed to it, yet it is surely practical enough for the excitatory role suggested here. German, in its turn, seems to offer a nice example of how agreement systems may “metamorphose” into versions more suitable to the pragmatic roles at hand. What seems clear is that the early developmental role suggested here offers a more coherent cross-linguistic view of why languages incorporate this kind of grammatical machinery.

At this stage of our developmental take on language, we must admit that the overall model is sketchy and that many important details still call for clarification. Admittedly, even the non-trivial issue remains of how the model could possibly be operationalized in order to generate productive research applications. We believe, however, that nothing of the sort can be satisfactorily approached without a prior exploratory program aimed at disentangling the whole landscape, which is how we understand our current aim. Moreover, nothing said thus far should be read as claiming to contain the right analysis of the alleged pieces of empirical evidence, for which much more field research would ideally be required. So, for the time being, it is not particularly crucial for us whether, e.g., the child German data that we have provided contain the most accurate description of facts possible (provided, obviously enough, that the facts are not overtly misrepresented); rather, we are mostly interested in offering the general “environmental cue/organismic response” model for discussion to the field of language acquisition, having clearly in mind that its superiority relative to other conceivable models is an empirical issue, and thus one to be ultimately decided by means of experiment and observation. In any event, we conceive of our current enterprise as a necessary preparatory step prior to such a desirable state of affairs.

That said, we nevertheless believe that the “environmental cue/organismic response” model is not one expected to eventually defeat “UG-based” models in the arena of empirical research. For one thing, it is not by means of “refutation” that one will prove its superiority and eventually prevail against the other. As we see it, the “environmental cue/organismic response” model is rather a sort of (friendly) eliminativist project relative to “UG-based” models. To clarify this point, let us briefly refer to a particularly suitable variant of the UG-based approach to language acquisition, namely, the “cue-trigger” model (Lightfoot 1999; see Biberauer & Roberts 2017: 137–139, for a brief presentation).

In a nutshell, this model is based on the idea that fragments of E-language contain the designated “triggers” for “cues” encoded by UG. Thus, according to this model, “cues” are part of the genetic endowment of the learner, or, in other words, part of the a priori contribution of the organism to the course of acquisition. Alternatively, according to the “cue/response” model, “cues” originally belong to the environment and are embodied later on (Rupert 2009), thus eliciting the kinds of organic “responses” that transform the organism into a linguistic phenotype, with no language-specific genotypic mediation required. According to the former model, the environment fulfills the expectations of the organism, which accordingly reacts, e.g. by fixing this or that anticipated parameter; according to the latter, the organism just reacts to the environment, acquiring a linguistic form that is anticipated neither in the environment nor in its own structure.

In both cases, however, the attained steady states may be safely described by means of the same UG/parameters jargon. So according to our eliminativist stance, a particular “cue-trigger” unit would simply be read as a way of black-boxing the developmental history of a certain aspect of the steady state lacking obvious environmental reflexes. This corresponds to a well-known strategy for starting a research program, namely, tagging phenomena of interest while avoiding in-depth treatment, which is legitimate under the obvious condition that the tag does not remain in the long run as the putative explanation of what was originally tagged as calling for explanation (Robert 2004; Craver & Darden 2013). Thus, the best way of characterizing a model like the “cue-trigger” one is by saying that it is not false (at least, not outright false) but provisional, i.e. waiting to be truly deciphered, which is how we envision the aim of the “cue/response” one.

Let this reflection suffice as an apt way of introducing the developmentalist challenge (Griffiths & Knight 1998; Longa & Lorenzo 2012; Lorenzo 2013) on language acquisition that we unfold more generally in the next section.

4 Rethinking language developmentally: Skyhooks and scaffolds

The previous section has offered a first approximation to the promiscuity between external and internal factors in language development. Before extending our case in this section, let us stress that our position is stronger than simply denying that language proper is fundamentally external or fundamentally internal; and even stronger than simply claiming that language is partially composed of external materials and partially composed of internal or organic machinery. Our position is that, considering how language grows in the mind of children, no such division between the external and the internal seems to make particular sense. According to our illustrations, children appear to be attuned to aspects of the environmental stimulus, to which they seemingly react brute-causally, given their basic early computational endowment (Marcus et al. 1999; Gervain et al. 2012). We feel tempted to say that children’s computations “hook up to” properties of the world, similarly to our basic conceptual capabilities according to Fodor’s (1998) image. In any event, the properties of concern, once hooked up, do not remain just mimics of their external counterparts, but start a new life of their own, along which they serve as scaffolds to the computational development of children. Complementarily, the system of computation—a bona fide organic entity, according to our view—is in itself, attending to its etiology, as much an internal as an external thing: it is, according to our preferred characterization, a developmental hybrid. Our view is thus not aimed at providing a better arrangement of things external and internal as regards language, but at dissolving the distinction into this hybrid category. The question certainly deserves some further clarification.

Firstly, an anonymous reviewer points out that the following question remains problematic for our approach: Chomsky’s main reason to reject an E-language concept parallel to the I-language counterpart is that the former cannot be coherently defined, unlike the latter; if so, the effort of somehow synthesizing them within an embracing H-language concept—short for Hybrid-language—appears to be doomed to failure, as the resulting category will inherit the incoherence of one of its component parts—namely, E-language. The argument looks at first sight decisive, but it is not. To begin with, as pointed out above, let us stress again that our H-language concept does not purport to refurbish the original E-/I- divide; rather, it entails abolishing it, a move that is supported by the following three criteria for identifying bona fide developmental/organic units (Rupert 2009: 111–118), all of which are well known and considered unproblematic in the philosophy of biology and cognition.

  1. “Nontrivial Causal Spread.” Designated aspects of environmental stimuli have a developmental impact in the shaping of the organic machinery in charge of dealing with them. This makes such aspects suspect of belonging to the same developmental/organic unit. The condition is necessary, yet not sufficient, as witnessed by the fact that it can be said to be fully met by run-of-the-mill parametric approaches (Yang 2002; Baker 2001).
  2. “Generalized Transformation Principle.” Said aspects developmentally fine-tune and enhance the capacities of the organic structure of concern, in ways otherwise difficult or harder to attain. Again, the condition is necessary, yet not sufficient. Note, in passing, that this criterion is not in contradiction with the idea of “Learning Organ” fostered by Gallistel (2000; 2010) and commented on approvingly by Chomsky (2002: 84–86), for example.
  3. “Common Fate.” The transformational causal connection that obtains between the structured organism and the structured environment potentially makes them a single unit of inheritance—thus a single developmental/evolutionary entity; aka “inclusive inheritance” (Danchin et al. 2011). If fulfilled, the condition is sufficient.

Note that as we move from (1) to (3) within this logical framework, nothing remains that possibly resembles the marginal, incoherent, and inert status traditionally attributed to E-language.

Secondly, as also raised by an anonymous reviewer, the charge could be made against our approach that it is but a restatement of the internalist position, so that H-language is actually a refurbished version of I-language. However, we believe that this objection can only be sustained under a view of the I-language concept that makes the internalist position a trivial one. Consider the related issue of the externalist vs. internalist concepts of meaning (Lau & Deutsch 2016). Even the most radical kind of externalist certainly assumes that mind-internal links exist that connect the individual with shared or public aspects of the environment, considered meaningful in themselves. But the fact, say, that H2O does not instantiate in the mind of the individual directly—i.e. as a clear, colorless, odorless, and tasteless liquid—when she has relevant thoughts about water, is not a counterargument to the externalist thesis. Similarly, the fact that non-interpretive agreement morphology needs to be internalized by the speaker in order to exert the developmental role that we have suggested here does not entail that the approach is internalist, except trivially. Non-interpretive agreement morphology items are environmental givens, a priori unexpected by the structure of mind, with the capacity of effecting a constructive influence on an internal mechanism, which generates a loop of mutually transformative influence that turns them into a single integral system. They ultimately compose one and the same developmental system—thus a single organic unit, with no insides and outsides proper. In the case at hand, designated component parts of external sequences (E) have an impact on the internal sequencing mechanisms (I), which paves the way for the processing of more complex sequential patterns—Figure 1. What results is, according to our interpretation, a hybrid system of computation (H), which synthesizes the previously separate contributions of the mind and the environment.

Figure 1 

A schematic representation of the hybrid constitution of the computational system. The figure partially mimics Gottlieb’s (1992) integrative model of probabilistic epigenesis.

With these points of clarification settled, this section is devoted to deepening our developmental interpretation of language, emphasizing how the “scaffold” concept serves to (theoretically speaking) scaffold the interpretation itself. What we basically try to illustrate in the following pages is how the idea of a scaffolded development of language may serve to overcome the biologically unrealistic idea that most of our basic linguistic endowment is somehow preformed in the mind of newborns, including a schedule of its timely unfolding. In other words, we presently try to offer the concept of “scaffold” as a developmentally viable alternative to the “skyhook” of Universal Grammar.

4.1 Scaffolded development

In the previous section, we explored the concept of “developmental scaffold” as recently put forward by James Griesemer to refer to entities typically exogenous to the organism, but which hybridize through their impact on development. Griesemer’s idea is to a certain point continuous with some current approaches within the extended mind paradigm (Clark 1997)—in their turn a vindication of Vygotsky’s seminal ideas (Vygotsky 1986), which stress how cognitive processes take advantage of designated environmental elements, recruiting and transforming them into constitutive parts of relevant computations themselves. In any event, we believe that there exists an important discrepancy between our hybridization-based model and Clark’s and other authors’ extended mind view (Clark & Chalmers 1998). According to the latter, scaffolding effects are directly exerted by environmental items, while according to our own perspective and examples, it is only through processes of hybridization that said effects follow, also paving the way to further purely endogenous scaffolding phenomena. Thus, as suggested by our view, the environment does not exempt the mind from providing architectural and operative specializations relevant to particular tasks: the environment has a developmental role in the complexification of the mind, via processes of mind-environment hybridization. In the following pages we want to endorse the idea and to show that, as a matter of fact, endogenous scaffolding effects are pervasive in the case of language, to the point of inspiring the conclusion that this scaffolding technique is the main force underlying the constructive processes of languages in the minds of speakers. In this subsection we offer grounds for the idea, which will be confronted with prevailing versions of the nativist stance in the next one.

A close inspection of the process leading to the earliest forms of linguistic competence (say, somewhat arbitrarily, the first three years of postnatal life) reveals that children follow a path through which they successively and increasingly extend their abilities into the phonological, lexical and morpho-syntactic domains. The relevant facts and milestones are nicely reviewed in, among others, Kuhl (2000; 2004), Guasti (2002), and Gervain & Mehler (2010), from whom we draw in the following expedient synthesis. Immediately after birth, children exhibit sound-related capacities that exceed what they could have obtained from simple acquaintance with the environmental stimulation. An illustrative well-established observation is that a few hours after birth, they can tell apart languages belonging to different rhythm typologies, even if their mother tongue—with which they are familiar from their intrauterine experience (Ramus et al. 1999)—is not included in the sample. This means that from birth children are sensorially attuned to very basic phonological properties of languages. The observation is very much within the predictive range of Knudsen’s (2004) typology of neural circuitry, according to which circuits located near the sensorial or motor periphery are the ones quickest to unfold endogenously (i.e. independently of experience) and to support the unfolding of other more plastic circuits/abilities. It seems therefore reasonable to conclude that this inborn sensorial attunement to salient properties of the speech flow is the cat flap through which children start approaching the formal subtleties underlying the surrounding stimuli.

This bootstrapping effect is first evidenced—accompanied by other simultaneous, but not so easily noticeable advances (Hirsh-Pasek & Golinkoff 1996)—in the phonological domain, where children attain the capacity of discriminating the most relevant distinctions of their native language at the end of the first year of postnatal life, to the exclusion of other inter-linguistically possible distinctions that they show signs of discriminating before that point (see Werker & Tees 2005, for a synthesis). This means that during the process, the phonological competence of children is not a replica of the caretakers’ counterpart, for the former exceeds the discriminatory capacity of the latter. As a matter of fact, the child’s capacity for discriminating foreign distinctions is subject to progressive decay, showing that children traverse in this domain a series of stages with the signaled property of not being just drafts of the adult system: the child system is as much a system converging on the surrounding adult counterpart as a system departing from other possible instantiations of an adult system. For us, this is part and parcel of saying that the steady phonological stage is a hybrid system in exactly the sense of the previous section, the unfolding of which is exogenously scaffolded by the phonetic instantiations of the native phonological distinctions. The litmus test for the idea is precisely the fact that early systems of phonological competence incorporate distinctions alien to the local adult counterparts. The task of learners is thus different from simply adding distinctions to an internal inventory. It is rather an extractive task of modeling (Mehler 1974), in which exogenous entities exert a scaffolding role (exactly in Griesemer’s sense), the final outcome of which is an external/internal compromise of sorts, i.e. a hybrid.

First word-like units arise at around one year, thus concurrently with the point at which the child’s phonology converges with the adults’ distinctions. Word extraction is a non-trivial task that children confront with the help of a phonological and statistical “self-aid kit,” which allows cracking the continuous speech flow and detecting likely “word candidates” (Kuhl 2000; 2004; Guasti 2002). Abilities other than these need to be recruited as well for establishing presumptive meaningful associations, a task to which the child’s social intelligence seems to critically contribute (Bloom 2000). In any event, the conclusion seems inescapable that the phonological competence already in place exerts a crucial scaffolding influence on the construction of an early lexicon, in this case of an endogenous sort, taking into account, for example, the instrumental role of the identification of typical segmental repetitions or the calculus of phonotactic regularities. Note also that the lexicon is a domain to which the litmus test of hybridization provides a positive result, for an early lexicon is not just a replica of a public or external lexicon. It obviously takes advantage of prominent entities within externalized utterances and points to a convergent steady state of lexical competence relative to other competent speakers, but it also contains word-like units and constructive principles or biases (e.g. the avoidance of homonymy/synonymy, phenomena which are pervasive in adult lexicons) alien to the purported models. A lexicon is thus a self-supporting developmental entity that relies on both exogenous and endogenous scaffolding effects, which makes it an exemplary illustration of a cognitive hybrid.

Children’s adult-like productive word combinations are customarily dated at age two-and-a-half. But chronology is surely an artifact also in this area, where the effect of having attained a certain critical mass of lexical items is commonly pinpointed as a crucial inciting factor (Locke 1997; Bates & Goodman 1999). The case may be satisfactorily resolved in terms of the negotiation between the declarative memory system at the service of learning and storing words and the procedural memory system in charge of learning and subserving patterned word combinations, as in Michael Ullman’s model (Ullman 2001). Details are not important to our purposes in this section. Suffice it to say that, if the idea is on the right track, the same pattern previously suggested regarding other linguistic domains seems to be active as well in this one: on the one hand, the lexicon thus far stored acts as an endogenous scaffold for productive syntax, while on the other hand, externalized utterances provide prominent entities (agreement, case, and so on) that contribute as exogenous scaffolds, according to the view put forward in the previous section. Other endogenous and exogenous influences are probably also relevant to the effect (see Locke 1993 for some suggestions), but they putatively serve to strengthen the same idea: external and internal factors conspire to incite the workings of the computational system and to enhance its associated operative memory until it reaches its species-typical power.

4.2 Fighting skyhooks: A brief reflection

The preceding subsection points to an image of language according to which different customarily more or less agreed upon specialized subsystems find an unexpected source of justification for their “being there:” they play important developmental roles in the processes leading in due course to a full-fledged system of linguistic competence. In this respect, it is important to emphasize that the resulting picture is clearly more apt than other prevailing theoretical models for the accommodation of language to an unnegotiable premise of current developmental biology (Minelli & Pradeu 2014): namely, that contrary to folk intuitions, “development—as nicely put by George Michel and Celia Moore—emerges from earlier conditions; it is not directed toward later conditions” (Michel & Moore 1995: 21). Developmental processes are, appearances notwithstanding, contingent processes, extremely sensitive to here-and-now conditions. The fact that more often than not they lead to expected, predictable outcomes is not a primitive distinguishing feature of development, but itself an outcome of a conspiracy of mechanisms that endow them with robustness. Development, in a nutshell, is neither somehow inscribed into initial or earliest stages, nor does it have some “look ahead” power of sorts.

Some non-trivial consequences directly follow from the previous statements. Firstly, within the suggested framework it makes no sense to posit—along the lines of Borer & Wexler (1987) or Wexler (1990)—that a genetically encoded chronogram exists in charge of the timely unfolding of specific aspects of I-language. Chronograms are artifactual, not a genuine currency of developmental negotiations, milestones and progression. Stages are without exception outcomes of contingent encounters between a previous stage and certain surrounding exogenous and endogenous conditions (Lorenzo & Longa 2009).

The “genetic chronogram” idea is not, to be fair, one around which a generalized consensus has existed within Chomskyan linguistics. However, the most popular alternative to said thesis has traditionally been “radical preformationism,” according to which newborns are endowed with an inborn full linguistic competence that, despite appearances, does not even develop. The idea has received different technical implementations over the years (Hyams 1986; Yang 2002), but all point to Chomsky’s contention that language can be studied not just under the simplifying heuristic of abstracting away the course of development, but taking as an empirically testable assumption that as a matter of fact it does not develop (Chomsky 2000). It is however hard to understand how an assumedly biological approach to language can take as one of its default premises one that transforms language into a complete organic exception.

The advent of the Minimalist Program (MP) and the relaxation of claims about the existence of a linguistic genotype (Chomsky 2005) seemed to bring about the opportunity of correcting the extreme traditional preformationist stance of Chomskyan linguistics (Lorenzo & Longa 2009; Longa & Lorenzo 2012). As a matter of fact, things have not followed this path. The orthodox—or “consensual,” Hornstein et al. (2005)—interpretation of the MP is one that divorces the challenge of explaining why languages show their specific design properties from the challenge of explaining how children attain the knowledge of objects with the designated properties without relevant models or instruction. The concerns are different, according to Chomsky’s (2004; 2007b) interpretation: the former, new goal leads to a level of theoretical adequacy that is “beyond” classical explanatory adequacy of the sort that the latter, traditional goal inspires. Explanatory adequacy relates to the aim of establishing the contents of Universal Grammar (UG); “beyond” that aim, a new one emerges that relates to the aspiration of understanding why UG is the way it is, instead of instantiating any other conceivable form. Chomsky’s view—aptly expounded in Hornstein et al. (2005)—is that the question of design that the MP raises does not touch the question of acquisition, which, according to the referred source, has already received a satisfactory general answer by positing an inborn, yet relatively plastic set of language-specific principles in the mind of children—i.e. the Principles and Parameters Theory (PPT).

It is however dubious, to say the least, that the PPT contains a true answer to the question of acquisition. As noted before, it rather black-boxes it, without providing bona fide explanations about how children really attain their mastery of language. Principles and parameters are descriptive categories of comparative grammar, each of them corresponding, in terms of uniformity, to a different degree of cross-linguistic coverage. But against conventional wisdom, they lack any explanatory power on acquisition matters—which in classic Chomskyan terms means that they do not have explanatory power at all (Chomsky 1965). However, this does not mean that they cannot be interpreted from the perspective of the theory of acquisition. According to one such possible reading, for example, “principles” may point to strongly “entrenched” factors in language development (in the sense of Wimsatt 1986; see Dove 2012 for the case of grammar), while “parameters” may correspond to “canalized,” yet more easily perturbable ones (in the sense of Waddington 1957). Granted. But one should not lose sight of the fact that specific formulations of principles and parameters are descriptively abstract statements of aspects of the steady states of particular languages—with the double entailment that what they properly describe are “outcomes,” and that they are “adultocentric.” This is enough to show that they do not (and cannot) belong to the explanatory currency of a theory of development/acquisition. Again, this is not to deny that they may render an important instrumental service to the better understanding of acquisition, namely, when they assume the role of just tagging relevant phenomena, until a proper developmental understanding (i.e. along the lines suggested in this paper) may be provided. Otherwise, they just pay lip service to the theory of acquisition.

On a more parochial note, it is hard to understand that the MP, customarily thought of as a bona fide biological approach to language (Berwick & Chomsky 2016: Chapter 3), divorces, as a matter of principle, the goals of explaining, on the one hand, the properties of organic design exhibited by language (for which a poor or empty UG is envisioned as an explanatory ideal) and, on the other hand, how the human organism incorporates in its early experience a system with the designated properties, for which UG continues to be pinpointed as an unavoidable requirement (Chomsky 2007b). To say the least, the whole project sounds paradoxical. In contrast to this confounding landscape, the alternative that we have delineated throughout this paper is straightforward, since it conceives of explaining the properties of the outcomes of development and explaining how they are attained as indistinguishable goals, which must be referred to the constraining powers of the interactions between the ongoing constitution of the organism and its typical environment at different developmental stages—perhaps a threat to the “isolationist” image of Chomsky’s LOT thesis. Biologically speaking, the form that a particular organic structure exhibits is but the outcome of its developmental course up to the point at which it is observed. So, the most reasonable (or default) theoretical attitude seems to consist in thinking about development as the ultimate source of explanations about the way a designated structure is. Language cannot be an exception if one takes the biolinguistic program seriously (see Robert 2004, for a suitable point of reference). Formal properties of languages—optimal, as hypothesized by the MP, or otherwise, a non-trivial matter of discovery—cannot be but the reflex of the way they grow in the mind of children—or if one prefers, the way children acquire them.

The views put forward in this paper may serve to offer a biologically informed frame to confront this challenge, without incurring the shortcomings associated with the preformationist stance. According to them, development inspires the partition of language into component parts, the growth of which is to a certain extent independent, yet exercising crucial scaffolding influences on each other. This “near decomposability” (Simon 1996) property is not alien to the MP, when framed within the programmatic proposals of Hauser et al. (2002)—see also Boeckx’s (2012) “I-mosaic” concept, and Balari & Lorenzo’s (2015b) “gradient of language” concept, as alternatives to the idea of a monolithic “faculty.” Our suggestions in this paper contribute to this overall framework with two non-trivial addenda—in lines not uncongenial to current minimalism: firstly, general principles that apply to the constructive processes of these kinds of composite organic structures are likely to be the source of many design properties of the outcomes of language growth; and secondly, the minds of children could be freed from any kind of anticipatory knowledge of the properties of concern—such “principles” of design need not be taken as the propositional content of some kind of a priori knowledge. Moreover, if the hybrid concept of language is on the right track, then the incorporation of environmental factors into explanations will be ready to transcend its timid association with interlinguistic variation customarily accepted by practitioners of the MP (Chomsky 2005). Environmental factors are not to be excluded from the developmental causes of cross-linguistically uniform properties of languages, as defended in this paper as regards the language-typical operative memory regime and the design properties entailed by it—the Basic Property, to start with. The corresponding properties could be safely freed from the unrealistic idea of a blueprint containing inborn linguistic knowledge.

5 Conclusions

“Weak” (or methodological) and “strong” (or ontologically loaded) versions of the E-/I- divide have been suggested in recent conceptualizations of the linguistic phenotype. The main take-home message of this paper is that both versions can now be surpassed. As for the weak version (Lohndal & Narita 2009), we believe that the time is ripe for transcending this “divide and conquer” strategy, for recent achievements of the theory of development pave the way for putting aside the kinds of cautions that motivate it and for aiming at modeling a more integrative view on the linguistic phenotype. As for the strong version (Chomsky 1986; 2016), we contend that it is just unmotivated and wrong. For even focusing on the core, allegedly deepest layer of the faculty of language (i.e. the computational system, as customarily conceived of within the Chomskyan paradigm), what one reasonably envisions is a chain “environmental cues ↔ morphophonological transducers ↔ computational system,” which asks for a conceptualization as an integrated unit of biolinguistic analysis; thus a “hybrid” that relativizes (if not thoroughly defeats) internalism under any possible disguise.

The hybrid concept of language, to be sure, departs from the two main reference frames of linguistic theory of the last century. On the one hand, it departs in very obvious ways from de Saussure’s (1916) characterization of language as a social object (la langue) that properly belongs to speaking communities and only derivatively to speakers. The proper object of linguistics thus shares the same ontological status that Durkheim (1894) had previously attributed to sociological phenomena at large: namely, abstract, external, and supra-individual objects, which de Saussure referred to as “values.” Consequently, de Saussure rejected the idea that languages need to be naturalistically approached and understood. The problem with de Saussure’s project is not so much how it divorced linguistics from biology as the disembodied concept of language itself: abstract objects simply do not exist.

Chomsky’s (1986) biolinguistic approach was specifically aimed at refuting the idea of language as an abstract or Platonic object—defended, among other places, in Katz (1981) or Soames (1984). But the framework was historically motivated as a refutation of the ability of classical learning paradigms to capture the way speakers internalize their knowledge about such external objects (Chomsky 1959). In any event, Chomsky’s alternative did not exactly translate into the abandonment of the learning paradigm, for his conclusion is that languages are learnable inasmuch as most of the speaker’s knowledge is biologically fixed from the start, which actually serves as a learning device (Chomsky 1975). To be fair, such a device is just a black box of something that, if really biological, cannot be but the output of developmental processes. Truly unifying biolinguistics with biology, a goal still lying in a distant future, makes inescapable the goal of unboxing the device. The adoption of the hybrid concept of language defended in this paper has radical consequences in this respect, for it makes dispensable the very idea of “learning:” once the boundary between the external and the internal blurs, the boundary between what learns and what is learned also blurs. Learning was probably a provisional metaphor—and a good one. But development is enough. While this may seem the end of biolinguistics as we know it, we rather think that it is the solid ground from which a bona fide biological approach to language must grow.