1 Introduction

The vast majority of the documented sign languages of deaf communities have a category of verbs that may be referred to as indicating verbs (Liddell 2000), but are more widely known as agreement verbs (or agreeing verbs) in the sign language linguistics literature (e.g., see Mathur & Rathmann 2012 for an overview). An example of an indicating verb in the related sign language varieties used in Britain (British Sign Language, or BSL) and Australia (Auslan, or Australian Sign Language)1 is the sign shown in Figure 1 glossed as PAY.2 In its unmodified form, this sign is produced with the dominant hand moving away from the signer’s body. By modifying the movement and orientation of the dominant hand, the sign may be directed at referents that are physically present in the space around the signer, or towards locations that may be associated with absent referents. When the dominant hand in the sign PAY is moved away from the signer’s body towards the addressee’s location, this produces a form meaning ‘I pay you.’ To create a form that means ‘you pay me’, the dominant hand’s orientation and movement are reversed, so that it moves from near the addressee’s location towards the signer’s body.

Figure 1 

BSL/Auslan indicating verb PAY.

In many descriptions of indicating verbs, it is the modification of the initial and/or final location and/or orientation of the hand(s) that is analysed as a morpheme marking person agreement with the subject and/or object arguments of the verb3 (e.g., Padden 1983; Rathmann & Mathur 2002). This was first proposed for American Sign Language (ASL) by Padden (1983), building on earlier work by Friedman (1975), Kegl (1977; cited in Wilbur 1987), Fischer and Gough (1978) and Meier (1982). The location and/or orientation modifications of the citation form’s formational structure are widely considered to be analogous to the various suffixes that mark person agreement in spoken languages such as Spanish (e.g., yo habl-o ‘I speak’ versus ella habl-a ‘she speaks’). This notion of agreement has also been extended to other aspects of sign language structure, such as auxiliaries (e.g., Mathur & Rathmann 2012), the use of eye-gaze (Neidle et al. 2000), and forms of enactment (known as constructed action or role shift) (e.g., Herrmann & Steinbach 2012), some of which we also touch on in this paper. Our primary focus here is, however, on indicating verb constructions.

Some researchers, however, argue against the analysis of indicating verb signs as marking person agreement with the verb’s arguments (Liddell 2003; Schembri & Johnston 2007). It was Liddell (1995) who first proposed that variation in the directionality of such signs does not mark agreement with a co-occurring noun phrase, but works through the incorporation of a pointing gesture into the form of the sign. A pointing gesture is defined by Kita (2003b: 1) as ‘a communicative body movement that projects a vector from a body part’. This vector indicates a particular physically present referent, or a location associated with an absent one. In the case of indicating verbs, the articulators involved in a specific verb sign are directed towards or away from locations in the space around the signer. Thus, Liddell argues, any movement of the hand(s) in a sign towards such a location projects a vector and signals an association with the referent in the same way as a pointing gesture by a non-signer (cf. Kendon 2004). As such, the primary function of this directionality seems to be reference tracking (Fenlon et al. 2018). Liddell’s (2003) analysis, in which he describes indicating verbs as a fusion of morphemic and gestural elements, draws on Langacker’s (1987; 1991) notion of cognitive grammar which sees speech, sign, and gesture as all part of a broader notion of “language” (cf., Ferrara & Hodge 2018). Since this time, the number of scholars that have moved away from an agreement analysis has grown, though many alternative accounts (e.g., Lillo-Martin & Meier 2011; Wilbur 2013; Wilcox & Occhino 2016) do not accept the pointing gesture analysis in Liddell’s proposal.

In this paper, we build on Liddell’s (2003) cognitive grammar analysis of indicating verbs by introducing to his account some key concepts from Construction Grammar, and by specifically suggesting that these represent a sign language equivalent to ‘multimodal constructions’ found in spoken languages (as proposed by Andrén 2010; 2014; Harrison 2010; Zima 2017a; b; Blackwell et al. 2015 amongst others). We also discuss some issues with the agreement analysis, demonstrate how these can be accounted for under the novel Construction Grammar approach we outline here, and explore how supporting evidence is emerging from corpus-based studies of indicating verbs. We also refer to a wide range of studies in the gesture literature to support an account that describes verb modification in sign languages as constructions of morphemes and deictic gesture. Such comparisons invite us to reconsider the relationship between signed languages, spoken languages and co-speech gesture and thus highlight some key differences between the nature of indicating verbs in sign languages and agreement systems in spoken languages.

2 Verb typology in sign languages

In this section, we will first outline a typology of verb morphology in sign languages. Since the proposal was first made by Padden (1983), verbs in sign languages have been categorised by many scholars (e.g., Meir 2002; Aronoff et al. 2004a; Meier & Lillo-Martin 2010) into three main types that differ with respect to the morphosyntactic expression of arguments: (1) agreement verbs, (2) spatial verbs and (3) plain verbs.

We have already introduced agreement verbs, such as PAY, above. This sign is usually analysed in the literature as an example of a double agreement verb, as it moves between two locations: from a location associated with a subject argument to one associated with an object argument. Some signs, such as THANK in Figure 2, may act as a single agreement verb, moving from a fixed location on the body towards a location associated with only one argument: the object argument. Both double and single agreement verbs may also move from a location associated with an object towards one associated with a subject, as with TAKE and LEARN in Figure 3: these are both examples of a backwards agreement verb (sometimes contrasted with a regular agreement verb like PAY or THANK).

Figure 2 

BSL/Auslan THANK.

Figure 3 

BSL/Auslan TAKE and LEARN.

Spatial verbs, such as BSL/Auslan PUT shown in Figure 4, work in a similar way, but in contrast to agreement verbs, the use of locations in space represents movement between physical locations and is not associated with animate arguments. In addition, a subset of spatial verbs (often referred to as classifier constructions or depicting verbs) include a range of morphemic handshapes (widely known as classifiers in the sign language linguistics literature) that represent different classes of referent.4

Figure 4 

BSL/Auslan spatial verb PUT.

This distinction between agreement and spatial verbs is, however, not based on morphological differences in spatial patterning, only semantic ones (Engberg-Pedersen 1993). Due to the lack of any difference in form, researchers working on several sign languages have found it difficult to distinguish consistently between the use of space to signal person agreement and to express locative relations (Engberg-Pedersen 1986; Johnston 1989; Bos 1990; Johnston 1991; Quadros & Quer 2008).

Unlike agreement and spatial verbs, plain verbs such as BSL/Auslan KNOW in Figure 5 are relatively fixed in form. There are no alterations in the handshape signalling different classes of referent, unlike what we see in some spatial verbs. It is generally also claimed that they cannot have their location modified to show associations between spatial locations and referents in the same way as agreement verbs (although Padden 1983 and others do in fact note that some plain verbs may indeed be modified spatially, so the distinction between plain and agreement verbs is, like that with spatial verbs, somewhat problematic).

Figure 5 

BSL/Auslan plain verb KNOW.

3 Two interpretations of indicating verbs

In this section, we provide two broad analyses of indicating verbs. We focus first on what might be called the “mainstream view” which developed out of Padden’s (1983) tripartite division of verbs in sign languages – that directionality in these verbs is similar to agreement systems in spoken languages. Following this, we introduce our Construction Grammar analysis building on Liddell (2003) and others which explores the role of deictic gesture in these verbs. We will henceforth refer to our proposal as the indicating verb construction account.

3.1 Agreement analysis

3.1.1 Introducing agreement

In general, the term agreement in linguistics, first used in this sense by Bloomfield (1933), refers to the presence of some co-variation in form between different lexical items in a clause that serves to express grammatical relationships such as gender, case, person and/or number. There is, however, much debate and discussion in the linguistics literature about what grammatical phenomena should be included within the category of agreement systems. Within cognitive linguistics, for example, the relationship between agreement systems and other related forms of reference tracking has been explored (e.g., Croft 2001; 2013; Langacker 2008; Kibrik 2013). Given that the boundaries of agreement sometimes seem unclear, some even propose abandoning the term (e.g., Haspelmath 2013). For the purposes of this paper, however, we will focus on a definition of overt morphological expression of agreement that draws on major typological work on the topic (Corbett 2006). This particular definition forms the basis of a “canonical typology” of agreement which attempts to acknowledge the many complications associated with this notion, by proposing that agreement systems cross-linguistically show similarities and differences, making some more “canonical” than others. Corbett (2006), drawing on his earlier work (e.g., Corbett 1988; 1991), adopts a kind of meta-definition that attempts to bring together the essential elements of agreement proposed in the typological literature. In particular, Corbett (2006) argues that the definition first proposed by Steele (1978: 610) captures key aspects of the phenomenon, as in (1).5

(1) Agreement: some systematic co-variance between a semantic or formal property of one element and a formal property of another.

Researchers working on agreement in sign languages (e.g., Casey 2003; Aronoff et al. 2005; Mathur & Rathmann 2010; Lillo-Martin & Meier 2011; Rathmann & Mathur 2011; Costello 2016) tend to either explicitly adopt/assume Corbett’s definition of agreement, or they do not explicitly define the term “agreement” at all.

For Corbett (2006), the element that controls agreement marking is referred to as the controller. In example (2), the controller is the subject noun phrase ‘the laptop’.6 The element whose form co-varies in the presence of the controller is the target, and the target varies to reflect formal or semantic feature(s) of the controller element, such as person, number or gender: here the target verb ‘works’ shows agreement in person and number features. ‘The laptop’ is third person and singular, and so the agreement morphology on the verb reflects this with the suffix –s.

(2) The laptop works.

The domain of the agreement is the clause and there are no conditions for this agreement to take place (e.g., the animacy of the referent of the controller noun phrase is not relevant here, as it might be in some agreement systems, for example, in Miya, or in Turkish; see Corbett 2006).

In a footnote, Corbett (2006: 264) observes (and later Cysouw 2011 supports this observation for agreement/concord) that indicating verbs in sign languages do not always appear to show co-variance between controller and target, unlike what is explicitly suggested by some in the sign language linguistics literature (e.g., Janis 1995). It is this claim that we wish to explore further in this paper.

Beyond the general definition of agreement as provided in (1), Corbett (2006) also presents a detailed overview of the essential features of what he calls “canonical” agreement: the clearest examples in the literature of agreement according to the definition he adopts. Gender agreement in the Italian noun phrase is one such example of canonical agreement, as in examples (3) and (4).

(3) il        nuov-o    quadr-o
    DEF.M.SG  new-M.SG  picture(M)-SG
    ‘the new picture’

(4) la        nuov-a    tel-a
    DEF.F.SG  new-F.SG  painting(F)-SG
    ‘the new painting’

These examples exemplify canonical agreement because the controller (i.e., the noun) is present, it has overt expression of the features of singular gendered nouns in Italian (i.e., the noun endings -o vs. -a), and it triggers consistent patterns of agreement on the targets (e.g., the feminine nouns trigger feminine agreement). In addition, the target (the adjective) has bound morphemes expressing the agreement marking (i.e., an affix rather than a clitic or free morpheme) and represents a clear case of inflection.
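
The co-variance between controller and target in (3) and (4) can be illustrated with a minimal sketch. This is a toy representation for exposition only, not part of Corbett's (2006) formalism: the `Word` class, the `agrees` function and the feature labels are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """A word form with the two features relevant to examples (3) and (4)."""
    form: str
    gender: str  # "M" or "F"
    number: str  # "SG" or "PL"


def agrees(controller: Word, target: Word) -> bool:
    """Return True if the target's gender and number match the controller's."""
    return (controller.gender, controller.number) == (target.gender, target.number)


# Example (3): il nuov-o quadr-o 'the new picture'
quadro = Word("quadro", "M", "SG")
targets_3 = [Word("il", "M", "SG"), Word("nuovo", "M", "SG")]

# Example (4): la nuov-a tel-a 'the new painting'
tela = Word("tela", "F", "SG")
targets_4 = [Word("la", "F", "SG"), Word("nuova", "F", "SG")]

all_agree_3 = all(agrees(quadro, t) for t in targets_3)
all_agree_4 = all(agrees(tela, t) for t in targets_4)

# A masculine adjective with a feminine controller fails the check.
mismatch = agrees(tela, Word("nuovo", "M", "SG"))
```

In this sketch, every target in (3) and (4) passes the check because its features are copied from the noun, whereas pairing the feminine noun with a masculine adjective does not. The point is only that canonical agreement can be stated as a match between features of two elements that are both present in the phrase.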

It might be suggested that Corbett’s (2006) definition of agreement is too narrow. For example, some factors apparently involved in agreement systems, it might be argued, do not reflect semantic or formal properties of the controller. Norwegian, for example, seems to exhibit co-variance that reflects contextually-dependent definiteness. However, Corbett (2012) points out that in Norwegian examples such as (5), there is a mismatch in the marking on the determiner and the noun. This missing co-variance between a controller and target leads to a lack of consensus in the literature whether this is an agreement system or not.

(5) mitt     ny-e        hus
    my.N.SG  new-DEF.SG  house[INDF]
    ‘my new house’

Additionally, Algonquian languages display proximate/obviative agreement. This refers to a system of morphological marking on nouns and pronouns to distinguish between multiple third person referents (e.g., Bliss 2017). In such cases, one third person argument is marked proximate (i.e., as more salient or topical in the discourse) and all others are marked as obviative (i.e., as less salient), as in the Blackfoot example in (6) where John is the topic, and moozw- ‘moose’ is marked as less important in the context. Which argument is marked thus depends not on semantic or formal features of a controller, but on the particular context of the utterance.

(6) John  o-waab-am-aa-an      moozw-an
    John  3.SG-see-TA-DIR-OBV  moose-OBV
    ‘John saw the moose’

Similarly, in languages with number or gender marking, a noun phrase marked for number or gender might reflect the number or female gender of the actual referent(s) (Corbett 2012). We can see this in the French example in (7), uttered by a female speaker.

(7) je    suis     content-e
    1.SG  be.1.SG  happy-F.SG
    ‘I am happy’

Thus, the motivation for number/gender marking may be discourse-context dependent, and not related to any semantic features of the relevant noun phrase. In the French case, however, all nouns and third person pronouns inherently carry one value of grammatical gender: masculine or feminine. As such, Corbett (2006) proposes that there is covert gender marking in the first person pronoun. While it is not shown on the controller, all adjectives it occurs with are targets of agreement and have to come in either masculine or feminine form. In the case of both obviation and number/gender marking as well, once this marking appears on the controller noun phrase, this formal feature is reflected in associated targets, and thus this fits into Corbett’s (2006) notion of covariance.

Corbett’s (2006) work is almost exclusively based on typological studies of spoken languages, and thus it might be suggested that it is incomplete because it does not draw on data from sign languages. It is, however, perfectly possible for sign languages to develop agreement systems that mark formal or semantic properties of a controller in ways that parallel what we see in canonical agreement in spoken languages. For example, Japanese Sign Language has handshapes that represent males and females respectively (Fischer & Osugi 2000). If these handshapes were consistently used in verbal constructions that reflected the gender of the controller noun phrase, and if all nouns were overtly or covertly marked for gender, then we would have a verb agreement system similar to the one we see in the examples above. This kind of co-variance is not what we always see, however, in the spatial modification of indicating verbs in sign languages such as ASL, BSL and Auslan as discussed below.

3.1.2 Verb modification as agreement in sign languages

The majority of linguists who propose an agreement analysis of indicating verbs work within Generative models of language (e.g., Sandler & Lillo-Martin 2006), which propose divisions between the lexicon (in which the vocabulary is stored) and a grammar (conceptualised as a system of rules that generates grammatical combinations of morphemes and words in the language). Under agreement analyses of verb directionality in sign languages, it is the modification of the initial and/or final location and/or orientation of the hand(s) that is analysed as a morpheme marking person agreement either similar to the various suffixes that mark person agreement in spoken languages (e.g., Padden 1983; Rathmann & Mathur 2002), or alternatively as a type of alteration of the verb stem (e.g., Costello 2006). In Generative rule-based accounts, this agreement is realised by the association between a referent and a location in the signing space, sometimes called a referential locus, or R-locus (cf. Lillo-Martin & Klima 1990). The association may be made explicitly, for example, through the production of a pointing sign which is directed towards a location around the signer’s body immediately preceding or following a nominal sign. Verbs are then oriented and moved in space with reference to these R-loci, and this serves to mark agreement with their arguments. Although the majority of sign language researchers appear to accept this analysis, there is also recognition that this form of agreement displays some unusual properties. Lillo-Martin and Meier (2011) discuss how there is little consensus between rule-based accounts about why it is that not all verbs (e.g., plain verbs) show directionality, and of those which do exhibit this behaviour, why they do not all make use of directionality in the same way: why some signs are regular indicating verbs, for example, and others are backwards verbs. 
As explained earlier, it is widely recognised that some forms mark object agreement only (e.g., THANK in BSL) (this is sometimes called defective agreement in rule-based accounts, see Costello 2016). No verbs in any of the sign languages thus far described, however, use directionality to mark subject only. In fact, it is suggested that object agreement in sign languages is obligatory and marking agreement with the subject argument is optional (Meier 1982; Lillo-Martin & Meier 2011), an unusual pattern when compared to spoken language agreement systems.

In terms of the criteria proposed by Corbett (2006), researchers also disagree about the relevant feature for agreement marking in sign languages. Most assume it is person (e.g., Padden 1983; Sandler & Lillo-Martin 2006), although some explicitly reject this and propose novel alternatives, such as location (e.g., Cormier et al. 1999; Zwitserlood & van Gijn 2006) or identity (Costello 2016). Scholars also do not agree about whether sign languages exhibit canonical or non-canonical agreement. Mathur and Rathmann (2010) suggest that indicating verb systems fulfil most of the key criteria for canonical agreement (a claim repeated in Rathmann & Mathur 2011). Lillo-Martin and Meier (2011), however, claim that indicating verbs represent a non-canonical form of agreement. In a detailed and exhaustive description, Costello (2016) argues that this non-canonicity has been overstated, and that sign languages share many of the important canonical features of agreement in spoken languages. We do not wish to dispute that indicating verbs appear to share many of the properties found in agreement systems, but, as Quer (2011) points out (and as Costello 2016 also acknowledges), the degree of canonicity is not the key issue at stake in considering the appropriateness of an agreement analysis. Like Liddell (2011), we wish to argue that it is the nature of the directionality in indicating verbs, and what controls this directionality, that represents the most critical aspect of the debate.

3.1.3 Other types of agreement in sign languages

Other phenomena beyond spatial modification have also been proposed as involving agreement in sign languages. For example, Neidle et al. (including Bahan 1996; Lee et al. 1997; Neidle et al. 2000), Thompson (2006), and Thompson et al. (2006; 2009) argue that eye gaze is a grammatical non-manual marker of verb agreement in ASL. The claim generally is that signers regularly shift their eye gaze towards the location associated with object arguments when producing indicating verbs and that this constitutes a non-manual instantiation of agreement features. For Neidle and colleagues, eye gaze is used as a grammatical marker of agreement with all verb types; Thompson and colleagues argue that eye gaze is a grammatical marker of agreement only with agreement verbs.

Another phenomenon claimed to constitute agreement in sign languages is role shift, also known as constructed action (Metzger 1995; Cormier et al. 2015b) – i.e., a form of enactment, where one or more bodily articulators (including the head, face, eye gaze, arms and torso) are used to mimetically represent the utterances, thoughts, feelings and/or actions of one or more referents (also used in multimodal direct quotation by non-signers, e.g., Stec et al. 2016). It is the use of non-manual markers during role shift that has been considered by some to be part of an agreement system. For example, Kegl (1995) describes what she called a role prominence marker in ASL – specifically a role prominence clitic. She proposes that these non-manual features act as a subject clitic, that the subject noun phrase agrees with this clitic, and that it is interpreted with role prominence such that it marks the person from whose perspective the event is viewed. More recently, Herrmann and Steinbach (2012: 221) argue that non-manual markers including eye gaze change, head position, body lean and/or facial expression act as agreement markers in German Sign Language. They propose that “role shift does not agree with syntactic arguments but with higher level semantic entities, namely the signer and the addressee of a reported utterance”.

Claims about non-manual features as agreement (eye gaze with verbs, and also role shift) are relevant to the indicating verb construction account that we propose here. Under this Construction Grammar analysis, the use of eye gaze during verb modification and role shift/constructed action are not independent phenomena at all but can be accounted for together in one unified analysis, as we explain below in §6.

3.2 Indicating verb construction analysis

In this section, we present our analysis of indicating verbs building on the seminal work on ASL by Liddell (2003). It was Liddell (1995) who first proposed that variation in the directionality of indicating verbs works through the incorporation of a pointing gesture into the form of the sign. Thus, Liddell argues, any movement of the hand towards such a location signals an association with the referent in the same way as a pointing gesture would by a non-signer. We extend this analysis here by proposing, building on work by gesture researchers (e.g., Andrén 2010; Harrison 2010; Zima 2017a; b), that these verbs are typologically unique unimodal constructions (comparable to multimodal constructions in spoken languages).

In this section, we provide a brief introduction to Construction Grammar and refer to work on multimodal constructions to contextualise our description of indicating verbs. Note that in adopting this indicating verb construction analysis, we are assuming a broad notion of “language”, one that includes sign, speech, and gesture (see also Wilcox & Occhino 2016). We adopt here Kendon’s (2014) notion of “semiotic diversity” and use the terms morpheme and deictic gesture as reflecting different ways of making meaning within language (akin to the Peircean notions of symbols and indices in Ferrara & Hodge 2018). We do not, however, intend for our use of “morpheme” and “gesture” to stand in opposition to one another (where one is “linguistic” and the other is not). We consider both aspects to be extremely relevant for a linguistic description of any language – spoken or signed (cf. Kendon 2008; 2014; 2017; Ferrara & Hodge 2018).

3.2.1 Introducing constructions

We use the term construction following the work of Adele Goldberg (e.g., Goldberg 1995; 2003). In this approach to grammar, constructions are symbolic units that constitute a pairing of form and meaning. Furthermore, this theory claims that a construction is the only unit of grammatical representation. Constructions are conceptualized as holistic “conventionalized clusters of features (syntactic, prosodic, pragmatic, semantic, textual, etc.) that recur as further indivisible associations between form and meaning” (Fried 2015: 974), as shown in Figure 6.

Figure 6 

Constructions as links between form and meaning (adapted from Croft 2001: 18).

There is a continuum from schematic complex constructions (corresponding to syntactic rules in other theoretical approaches such as Generative Grammar) to substantive atomic constructions (that is, words from the lexicon). Hoffmann and Trousdale (2013) provide examples from this continuum, which we adapt here – see (8) to (11).

(8) Word construction: apple [apple]: ‘apple’

(9) Idiom construction: She takes him for granted [X TAKE Y for granted]: ‘X doesn’t value Y’

(10) Comparative construction: Kim is taller than you [X BE Adj(comparative) than Y]: ‘X is more Adj than Y’

(11) Resultative construction: She wipes the table clean [X V Y Z]: ‘X causes Y to become Z by V-ing’

The word in (8) is a simple pairing of form and meaning. As Hoffmann and Trousdale (2013) explain, the meaning of (9) is only partly compositional and thus, it is stored in the speaker’s mental lexicon as a unit. This idiom is partly schematic, as it has open positions for subject and object arguments that can be filled with various elements, and partly substantive, as its form is fixed in parts. The construction in (10) is even more schematic, with only one substantive element [than], while (11) is completely schematic, with open slots for the cause X, verb V, complement Y and resulting state Z (which could be an adjective, or a verb).

Constructions are organised in a network, chiefly by taxonomic and part-whole relations. Mental representation of a construction is determined not only by the (non)predictability of the constructional properties, but also by token and type frequency (Bybee 1985). Construction Grammar thus gives usage a fundamental role by proposing that all grammatical systems are based on, and abstracted from, actual language use and depend on domain-general processes including entrenchment (explained below) and chunking (a chunk is a unit of memory organisation, formed by bringing together other units of memory and combining them to form a larger unit; see Bybee 2010). It is not surprising that such an approach can easily accommodate symbolic units that incorporate gestural elements.

Andrén (2010) uses the term multimodal construction to refer to conventionalised constructions that involve both morphemic and gestural elements – specifically uses of headshake co-occurring with speech in signalling negation. Just like unimodal spoken language constructions, he suggests that multimodal constructions range from relatively fixed (e.g., “word like”) constructions to more flexible and productive combinations (e.g., “grammar like”). More recently, Zima (2017a; b) investigates the use of the semi-lexicalised English constructions [V(motion) in circles] and [all the way from X PREP Y]. The data were collected from an audio-visual corpus of spontaneous language samples from a range of discourse types. This work was inspired partly by recent work on multimodal constructions within Construction Grammar (e.g., Steen & Turner 2013) which extends Goldberg’s work to include visible aspects of language. It was also partly inspired by the existing gesture studies literature on motion event descriptions (McNeill & Duncan 2000; Kita & Özyürek 2003; Hickmann et al. 2011). These studies find that such spoken language descriptions are relatively often accompanied by gesture. In the case of [V(motion) in circles], for example, specific motion verbs (e.g., go, swim, fly) occurred preceding the prepositional phrase in circles. Of 202 examples in the audio-visual corpus, just over 60% were accompanied by a gesture, most commonly involving an index finger moving in a circular motion. With the example [all the way from X PREP Y], the phrase all the way from preceded a noun phrase that included a prepositional phrase (e.g., Long Beach to Lancaster). Around 80% of the 199 examples in the dataset were accompanied by a gesture in which the hands initially point to one location in the space around the speaker, then move across to point to another location in space.

Given the relatively high frequencies of specific co-speech gestures with these spoken language phrases, Zima (2014) proposes that this provides evidence that these multimodal constructions are at least partly entrenched as a unit in the minds of these speakers and may be partly conventionalised in the speech community. Construction Grammar proposes two important factors that reflect both this individual entrenchment and socio-cultural conventionalisation of constructions: (1) recurrence and (2) idiosyncrasy. Recurrence refers to the fact that frequency of usage leads to an individual perceiving such co-occurrences as a relatively fixed combination of form and meaning that is stored in memory as a unit. For example, the spoken English motion constructions from Zima (2017b) occurred with gesture 60–80% of the time in the corpus. As a result of recurrence, specific formal and/or semantic/pragmatic properties come to be associated with these units, sometimes in a way that cannot be attributed to the compositional properties of its components. This is the second factor reflecting entrenchment and conventionalisation: idiosyncrasy. Idiosyncrasy means that individual constructions have specific characteristics and that generalising across a class of constructions is not possible. It does not mean, however, that such characteristics are random. For example, specific semantic uses of English motion constructions, such as [all the way from X PREP Y] to refer to actual distance (e.g., all the way from Long Beach to Lancaster) further encouraged use of co-speech gesture (i.e., 86% of all instances of this construction with this meaning were accompanied by gesture) in Zima’s (2017a) study, whereas temporal (e.g., all the way from preschool to college) or metaphoric uses (e.g., all the way from a slow walk to a really fast run) ranged from 56% down to 33%.

3.2.2 Indicating verbs as unimodal constructions of morphemes and deictic gesture

We propose that indicating verbs are constructions in the sense of Goldberg (1995; 2003) – i.e., conventionalised pairings of form and meaning that consist partly of a monomorphemic sign specified for a particular handshape, orientation, and movement combination that has specific phonological, morphosyntactic, semantic and discourse properties, and partly of a deictic gesture which has its own pragmatic properties. Figure 7 shows a slightly revised version of the diagram in Figure 6 above, incorporating deictic gestural properties into the form. Crucially, elements of form can occur in any modality, allowing for both multimodal semiotically-diverse constructions (in the case of speech and co-speech gesture constructions of the type described above in 3.2.1) or unimodal semiotically-diverse constructions (in the case of sign and co-sign deictic gesture constructions as with indicating verbs).

Figure 7 

Revised representation of constructions.

Any linguistic pattern is recognised as a construction if some aspect of its form or function is not strictly predictable from its component parts or from other constructions (Goldberg 2003). If constructions occur with sufficient frequency, they are stored as a unit even if they are fully predictable. As mentioned above, such an account would predict two properties: recurrence and idiosyncrasy. New evidence from the BSL Corpus draws on a study of 1,436 indicating verbs in BSL conversations (for more detail about the BSL Corpus, see Schembri et al. 2013). Fenlon et al. (2018) show that indicating verbs in BSL occurred with directional modifications in approximately 70% of all tokens (68% or 692/1019 examples analysed for subject argument modification, and 72% or 911/1278 for object argument modification).7 Thus, they clearly constitute the majority of instances in the data. This is similar to the rate of recurrence found with English motion constructions from Zima (2017b) – which was 60–80%. Likewise, the study finds that indicating verbs in BSL also show idiosyncratic behaviour, with signs like PUSH less likely to exhibit directionality (5/12) than signs like PAY (20/26) (note that, although 1,436 tokens were coded, signs that showed no directional modification at all were excluded from the analysis) (Fenlon et al. 2018). It is possible that the formational features of the sign PUSH do not lend themselves to as great a potential for directional modification as the sign PAY. On the other hand, the sign OBJECT-TO, which, like the sign PAY, involves the movement of the dominant hand away from the signer, never occurred with directional modification in the BSL Corpus dataset (0/7). The particular patterns with sets of verbs, or even individual verbs, have yet to be explored in any detail for BSL, but it is likely that the idiosyncratic formational features and semantics of particular indicating verb constructions are interacting with their contexts of use here.
This is similar to idiosyncrasies found in the various gestural patterns identified in Zima’s (2017a; b) work. Indeed, the idiosyncrasy in the indicating verb system is such that learners need to know which signs act as indicating verbs and which do not, which forms are single, double or backwards indicating verbs, which specific verbs often show directionality and which exhibit this pattern less often, and how these variable features interact with other aspects of the language.
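
As a purely illustrative aid, the token counts quoted above can be turned into modification rates with a few lines of Python. This is a minimal sketch: the counts are those reported in the text for Fenlon et al. (2018), while the dictionary labels and the function name `modification_rate` are our own hypothetical choices and not part of the original study. Computed values may differ slightly from the rounded percentages quoted in the text, depending on rounding conventions.

```python
# Token counts as quoted in the text (modified tokens, tokens analysed).
# Labels and function names are illustrative assumptions, not from the study.
counts = {
    "subject modification (all verbs)": (692, 1019),
    "object modification (all verbs)": (911, 1278),
    "PUSH": (5, 12),
    "PAY": (20, 26),
    "OBJECT-TO": (0, 7),
}


def modification_rate(modified, total):
    """Return the proportion of tokens that show directional modification."""
    return modified / total


# Recurrence: overall rates; idiosyncrasy: rates for individual verbs.
rates = {label: modification_rate(m, n) for label, (m, n) in counts.items()}

for label, rate in rates.items():
    print(f"{label}: {rate:.1%}")
```

Listing the per-verb rates next to the overall rates makes the two predicted properties easy to compare: the overall figures speak to recurrence, and the spread across PUSH, PAY and OBJECT-TO speaks to idiosyncrasy.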

Thus, we argue that indicating verbs constitute a structured composite construction of sign and co-sign gesture, similar to multimodal constructions of speech and co-speech gesture that have been proposed by gesture researchers within the framework of Construction Grammar (Andrén 2010; Zima 2014). Indicating verbs range from schematic forms (represented in Figure 8 by four types of indicating verb: regular double indicating verbs, backwards double indicating verbs, regular single indicating verbs, and backwards single indicating verbs) to substantive atomic constructions, that is, individual signs with idiosyncratic properties (represented in Figure 8 by PAY, TAKE, THANK and LEARN). The concept of multimodal constructions in English provides a foundation for an understanding of indicating verbs as constructions combining morphemic and gestural elements. The frequent combination of deictic gesture and signs certainly seems to reflect entrenchment in the minds of individual signers and conventionalisation of these combinations in signing communities, and the fact that these particular combinations vary from one sign language to the next, and show some idiosyncratic properties, also matches what would be predicted in a Construction Grammar account. The crucial difference for sign languages is that, in our indicating verb construction account, these are unimodal (rather than multimodal) constructions of morphemes and deictic gesture. For example, in an indicating verb like BSL/Auslan [PAYx>y], the monomorphemic stem PAY is lexically specified for handshape, orientation and movement, but the initial and final locations may involve deictic gesture and are thus variable.

Figure 8 

Schematic representation of four types of indicating verbs and substantive atomic construction examples.

Our indicating verb construction analysis is similar to proposals involving composite utterances of speech and gesture (Enfield 2009) and their application to sign languages, including the notion that pointing constructions in sign languages are symbolic indexicals (Johnston 2013). It is also similar in some ways to Wilcox and Occhino’s (2016) cognitive grammar analysis of indicating verbs as complex symbolic constructions involving a pointing device and what they refer to as Place. However, Wilcox and Occhino claim that Place is a sign language specific feature that is distinct from gesture in non-signers. They suggest that sign languages lack an equivalent to “gesture”, but they do not discuss the possibility that work on multimodal speech/co-speech gesture constructions may be relevant to sign languages, as we propose here. Because Wilcox and Occhino do not offer any new data nor make any specific predictions that could distinguish between their proposal and our unimodal indicating verb construction account, we will not address their work further here.

4 The arguments for verbs as unimodal deictic gesture/morpheme constructions

Thus far, we have considered two broad analyses of directionality in sign language verbs: one account proposes that directionality may mark person agreement, and the alternative indicating verb construction account, which we propose for the first time here, interprets these forms as unimodal constructions of morphemes and deictic gesture. In this section, we attempt to describe which model better fits the patterns we observe with regard to directional modification, by examining the predictions our indicating verb construction model would make. We do this in light of recent findings from sign language corpora, and also in relation to work from other domains including studies of co-speech gesture. We begin by considering recent research on pointing in co-speech gesture. We then present evidence for our indicating verb construction analysis by considering some of the arguments that are made in the literature about the nature of directionality.

4.1 Evidence from pointing in co-speech gesture

In this section, we highlight a number of studies indicating that the use of pointing in co-speech gesture can be systematic, making it a gesture category that is likely to be recruited into multimodal constructions (Andrén 2010). This work also shows how the use of pointing gestures and its relationship with speech exhibit shared properties with verb directionality in sign languages.

Work on pointing gestures by Kita (2003a), Wilkins (2003), Kendon (2004), and Cooperrider (2011) indicates that there are regularities in the use of pointing by non-signers, and that this class of gestures may interact in patterned ways with grammar, culture and conceptual structure. For example, Kendon (2004) explores systematic behaviour in the use of seven specific finger, hand and arm configurations in the co-speech gesture of British English and Italian speakers. Cooperrider (2011) finds that the use of body-directed pointing gestures by American English speakers reflects information structure in the co-occurring speech, with elements exhibiting contrastive stress significantly more likely to co-occur with pointing gestures. Wilkins (2003) claims that different pointing gestures used in central Australia by speakers of Arrernte (an index finger versus a five digits extended hand configuration) show singular/plural distinctions that are absent in co-occurring Arrernte noun phrases and reports that his failure to correctly use pointing behaviour while speaking has been raised by native speaker consultants. Additionally, Wilkins (2003) notes that speakers of Arrernte use a specific pointing gesture meaning ‘motion towards that location’ and that this contrasts with other pointing gestures indicating ‘being at a location’. Similarly, Kita (2003a) discusses claims that some communities, such as the Barai and Yimas, traditionally used lip-pointing with no recorded use of pointing with the fingers. Also, Cooperrider et al. (2018) systematically compared the responses by American undergraduates to a novel communication task with those produced by Yupno people from Papua New Guinea. Speakers in both groups pointed at similar rates, but Yupno participants used nose and head pointing more often than manual pointing, whereas the Americans only used finger pointing.
In addition, Özyürek and colleagues (Özyürek 1998; Özyürek & Kita 2000; Kuntay & Özyürek 2006) have documented a demonstrative pronoun in Turkish (şu) which is unspecified for distance and is used when the addressee’s visual attention is not yet on the object referred to. Once joint visual attention is engaged (e.g., by a pointing gesture or eye gaze towards the object), then bu or o is used instead (roughly equivalent to proximal this and distal that in English, respectively). Lastly, to indicate a specific time of the day, speakers of Nheengatú point to a position along the east-west axis of the sun (Floyd 2016). This pointing behaviour not only co-occurs with spoken verb phrases consistently, it can provide more precise information than what is spoken (i.e., a time of the day) and Nheengatú speakers appear to be sensitive to incorrect variations in form and meaning pairs (e.g., when presented with other possible interpretations in an elicitation task).

Thus, as noted by Özyürek (2012), characterisations of language which only take into account aspects expressed through speech do not offer a sufficiently comprehensive view of the human language capacity. Instead, both speech and gesture should be included in our descriptions of particular languages because the evidence suggests that gestures are an integral part of language. Liddell (2003) argues that this is also true of sign languages, particularly in some aspects of their organisation, such as indicating verbs. If directionality in indicating verbs is a type of co-sign gesture rather than a person agreement marker, then we predict that the use of directionality may have more properties in common with directionality in co-speech gesture than with agreement marking. It is important to note, however, that there is no real analogy for indicating verbs in the speech and gesture package, as it is only in sign languages that symbolic and deictic elements occur in the same modality, and verbs may themselves be modified spatially to reflect associations with present and absent referents.

4.2 What controls directionality in indicating verbs?

The most important argument against an agreement analysis is that patterns of modification appear to be explained by factors other than those which would be predicted under an agreement analysis. Some sign language linguistics scholars propose that the directionality in indicating verbs (i.e., which way signs are directed in space) is a morpheme that marks the grammatical person of the verb’s arguments (Padden 1983; Lillo-Martin & Meier 2011). We will focus here on these accounts, rather than others which explore issues of how to determine which verbs participate in this directional modification (e.g., Janis 1995; Meir 1998; 2002), which – like Padden (1983) and Liddell (2003) – we assume to be lexically determined under a Construction Grammar account. In person agreement analyses (first proposed by Padden 1983), first person is associated with locations on the signer’s body, second person with the addressee’s location, and third person is either the location of some physically present referent that is neither the signer nor the addressee, or at some location in space associated with an absent referent (an “R-locus”). Other analyses claim that there is only a distinction between two persons: first and non-first person. This is because reference to first person is always associated with the signer’s body, but second and third person reference can vary: both the addressee and non-addressed participants may be associated with a number of locations in the signing space around the body (e.g., Meier 1990; Lillo-Martin & Meier 2011). Padden (1983) suggests that the manner in which third person arguments are marked when the referent is absent depends on a number of conditions, including that the third person referent is associated in some way with a specific location in the space around the signer’s body.
For example, in the BSL/Auslan example in (12), the third person subject argument WOMAN precedes a pointing sign that is directed towards a particular location on the signer’s right. This works to create an association between the referent of the noun phrase WOMAN and this location on the right side of the signing space. The verb sign SEND is then produced at the same location on the signer’s right, and directed towards another location away from the signer, resulting in a clause meaning ‘the woman sends flowers to someone’. In this analysis, directing an indicating verb from the initial location assigned to the subject noun phrase to some other location (not here assigned to a particular object argument) is considered analogous to adding an agreement affix, as in the Italian examples in (3) and (4) above.

(12) WOMAN PT→R SENDR→L FLOWER
  ‘The woman sends flowers to someone’.

In (12), signs meaning WOMAN in BSL and Auslan have a fixed location on the body (in one lexical variant in BSL, for example, the extended index finger strokes the cheek), and thus the sign is associated with a locus in the signing space by the use of a pointing sign that follows it. Alternatively, some nominal signs in BSL and Auslan do not have a fixed location on the signer’s body, such as the BSL/Auslan sign CHILD. With this sign, it is possible to produce the sign in a particular locus rather than use a pointing sign to associate it with the locus. Thus, in (13), there is no pointing sign as part of the noun phrase, and the directionality of the sign TELL involves the use of the same locus as the sign CHILD to create a clause meaning ‘the father tells his child’ (in this example, a right-handed signer has moved the sign CHILD away from the citation form’s ipsilateral location to one on the left).

(13) FATHER TELL→L CHILD↓L
  ‘The father tells his child (on the left).’

In (13), we see the closest approximation in sign language indicating verbs to Steele’s (1978) definition of agreement, in that some formal property of this particular instantiation of the BSL/Auslan sign CHILD (i.e., the fact that it is produced here at a locus on the signer’s left) co-varies with some formal property of the associated verb sign TELL→L.

Liddell (2000; 2003) argues, however, that the directionality of indicating verbs is ultimately controlled by the real or imagined location of the referent, not by any feature that might be construed as a formal or semantic property of a controller noun phrase. Note this does not mean that the location of the referent is not a semantic property per se. Previous work (e.g., Schlenker et al. 2013) shows quite clearly that indicating verbs can be used to create an understanding of the location of the referent of a controller noun phrase, and these understandings are part of the meaning of an utterance. Instead, we interpret Liddell’s (2000) claim here in light of Corbett’s (2006) definition of agreement to mean that the location of a referent is not reflected in some relevant grammatical feature of the controller in sign languages, like person, number or gender in spoken language agreement systems. There is no evidence that all the nouns in a language like BSL have an inherent grammatical feature of location, with a fixed set of values, as required by Corbett’s (2006) model of agreement systems provided above. Instead, both the use of space by signs within the noun phrase itself, such as the directionality of the pointing sign in example (12), and the spatial displacement of the noun CHILD in example (13), are best analysed as being controlled by the mental representation of the spatial location of the referent. Consider the BSL/Auslan clause in (14). As is also true of ASL (Liddell 2000), if this were produced in the presence of the referent of the object argument in question (i.e., if the actual mother being referred to was standing on the right of the signer and the addressee), then the sign ASK→R would begin its movement at the chin and move rightwards towards the location of the mother standing nearby.
We can see that, in this instance, the location of the referent of the object noun phrase MOTHER is not a formal nor a semantic property associated with the noun phrase but is instead a transient property of the referent. It is unlike the mother’s gender, for example, which is a property of the referent that is reflected in the lexical semantics of the noun itself in the controller noun phrase that represents the referent in the clause. Thus, it is not the case that spatial modification of the indicating verb ‘agrees’ with any of the linguistic properties of the relevant noun phrase (i.e., its role as an object argument in the phrase, its singular number, the gendered semantics of the noun etc.), and it certainly cannot be said that it ‘agrees’ with the location of the actual referent herself, as the location of the mother in the real world is a property of the referent (because the mother can move to another location), and not a formal nor semantic property of the lexical item MOTHER in BSL/Auslan. Thus, there does not appear to be the same type of covariance here as we see outlined in Corbett’s (2006) definition.

(14) PRO→1 ASK→R MOTHER
  ‘I asked mother (something).’

Unlike the examples in (12) and (13), in many instances there is no relationship between the locus towards which an indicating verb is directed and any properties of the associated noun phrase. In the clause in (14), the specific lexical variant of the sign MOTHER is produced on the ipsilateral side of the forehead. The sign ASK→Y, however, may be directed to any location away from the signer associated with the referent of MOTHER, and not at the location of the sign MOTHER at all. Furthermore, as first discussed by Liddell (2000), if the signer is representing a child asking his or her own mother, then the relative height of the child’s mother in relation to the child may be represented by directing the sign ASK→Y away and up from the signer’s body, as in Figure 9. Thus, the locus towards which the verb ASK→Y is directed here does not reflect any formal property of the associated noun phrase at all (i.e., it is not directed to the ipsilateral forehead location of the sign MOTHER), nor any of its inherent semantic properties (although it can trigger a presupposition about the mother’s relative height, cf., Schlenker 2013), given that the height of the individual concerned is not part of the semantics of the lexical sign MOTHER. Furthermore, in this example, the height of the referent associated with MOTHER is a relative phenomenon, expressed in relation to the signer’s body. Thus, the actual location is determined not by the referent of a single controller, but by the relative height of the body of the referent represented by the argument PRO→1 and the imagined size of the absent referent of the object argument represented by the sign MOTHER.8 We can see this illustrated clearly in example (15) below. If the absent referent of the sign FATHER is taller than the mother, then in this case, the sign TELL→R may be directed downwards to the right from the signer’s body.
The actual referent of MOTHER has not changed in physical height, but she is represented as taller than the referent of PRO→1 in example (14) and as shorter than the referent of FATHER in example (15).

(15) FATHER TELL→R MOTHER
  ‘Father told mother (something).’

Figure 9 

BSL/Auslan ASK→Y.

Indeed, this is also the case for example (16), in which the indicating verb SEND is directed from the imagined location of the absent referent of the subject argument WOMAN to the imagined location of the absent referent of MOTHER. Again, we see two locations that are not associated with the referents of a single controller argument, but of both referents involved in the event.

(16) FLOWER, WOMAN SENDR→L MOTHER
  ‘The woman sent mother flowers.’

Costello (2016) acknowledges that omissions of explicit location assignment (i.e., by the use of a pointing sign or a displaced nominal), such as we see in example (16), are possible, despite the fact that this results in constructions that lack the defining feature of co-variance in Corbett’s (2006) notion of agreement. Costello relies, however, on the notion of linguistic economy to explain this. While economy of effort may partly explain why no explicit location assignment is made, it does not explain what may control the actual choice of location. If one assumes that the directionality is not controlled by semantic or formal features of a single controller noun phrase, but by the location (real or imagined) of the referent(s) of variably present subject and/or object noun phrase(s), then all possible forms of location assignment, and the lack of it, can be predicted by the same mechanism.

Liddell’s (2003) account also appears to be supported by new evidence coming from a corpus-based study of indicating verbs in BSL mentioned above. Based on the claim that signers are directing indicating verbs towards locations associated with present referents, or absent referents imagined to be present, this model should predict an interaction with related sign language phenomena, such as constructed action (i.e., as explained above, the use of articulators such as the head, face or the body to mimetically represent a referent’s actions, utterances or feelings). During periods of constructed action, signers appear to be interacting with absent referents as if they are physically present within the signing space, as in Figure 10, where the sign LOOK is produced with constructed action. Fenlon et al. (2018) examine the co-occurrence of constructed action with verb modification in the BSL Corpus. Constructed action is important in predicting modification of indicating verbs for marking object arguments: we find that the presence of constructed action significantly favours object modification while the absence of constructed action disfavours modification. The results are similar to previous work in Auslan by de Beuzeville et al. (2009), who also find that constructed action plays an important role in predicting verb modification. Fenlon et al. (2018) interpret these findings as lending support to Liddell’s (2000) analysis of these verbs as a fusion of morphemes and deictic gestures. The fact that we observe an increased likelihood of modification during periods of constructed action suggests that signers may be interacting with imagined referents as if they were physically present (see also Cormier et al. 2015b). In contrast, those who propose an agreement analysis of this kind of data offer no account of this phenomenon.
Costello (2016: 262) in fact makes the opposite prediction, suggesting instead that the presence of constructed action in the clause may explain why some indicating verbs do not exhibit directionality, though no data are offered to support this prediction.

Figure 10 

BSL indicating verb LOOK with constructed action.

Additionally, data from the BSL Corpus suggest that modification of indicating verbs generally reflects a signer’s own perspective of events. As Fenlon et al. (2018) show, this is supported by the fact that clauses in the BSL Corpus dataset involving a first person argument significantly favour modification for both subjects and objects in contrast to clauses that do not involve a first person argument at all. In Figures 11 and 12 taken from the BSL Corpus, Figure 11 demonstrates modification for the object argument while Figure 12 does not. A crucial difference here is that Figure 11 involves a first person object while Figure 12 does not. This suggests that signers are not simply modifying verbs from one location in space to another but that they are frequently using their body to represent a referent of one of the arguments and imagining how an action is carried out from this referent’s perspective. Indicating verbs, therefore, may be best explained with reference to mental spaces (Liddell 2003; Janzen 2004). It is not clear, however, how an agreement analysis would account for this pattern (together with patterns from constructed action).

Figure 11 

BSL GIVE-INFORMATION modified for a third person subject and first person object.

Figure 12 

BSL TEASE unmodified for the subject or object.

A number of researchers discuss the pointing nature of the directionality in indicating verbs, while maintaining support for an agreement analysis (e.g., Lillo-Martin & Meier 2011; Rathmann & Mathur 2011; Schlenker 2011; Mathur & Rathmann 2012; Schlenker In press). Schlenker (2011: 234), for example, outlines a formal semantic analysis that integrates the deictic properties of both pronominal signs and indicating verbs into “…the larger domain of anaphoric constructions in natural language” (cf., Kibrik 2013). Schlenker argues that this analysis is compatible with an agreement analysis, but it is equally compatible with the indicating verb system as a reference tracking device that does not actually mark person agreement. Schlenker (In press) and Schlenker and Chemla (2017) extend this analysis to integrate directionality in sign languages and directionality in co-speech gestures. However, incorporating gesture into formal analyses still does not address the issue that associations with the location of a present referent, or with an absent referent conceptualised as present, do not represent semantic or formal properties of the noun phrases that represent the referent, as pointed out by Liddell (2011). Even among those publications which recognise some role for deictic gesture within the indicating verb system (e.g., Lillo-Martin & Meier 2011; Rathmann & Mathur 2011; Schlenker 2011; Mathur & Rathmann 2012), there is little if any reference to research on co-speech gesture. Moreover, Liddell’s (2003) account of grammar, gesture and meaning in ASL provides an account for the full range of possible spatial modification of lexical elements in sign languages (such as those in pronominal signs and noun signs, described above), some of which are left unexplained in the work by those arguing for an agreement analysis (e.g., Lillo-Martin & Meier 2011; Rathmann & Mathur 2011).

5 A review of arguments in the sign language verb agreement literature

In this section, we extend our argument for indicating verbs as typologically unique constructions by examining a range of evidence in the literature on verb agreement. This includes a discussion of conventionality of these constructions within/across sign languages, the role of syntax, and further evidence from acquisition, neuroscience, emerging sign languages, grammaticalisation, and sociolinguistic variation and language change. We end this section by showing how our unimodal Construction Grammar analysis provides a unified account of many phenomena widely considered by others to be instantiations of agreement in sign languages.

5.1 Patterns of directionality in indicating verbs within/across sign languages

One argument for directionality as agreement reflects the suggestion that if the directionality of indicating verbs was gestural, then one would expect to see considerable variation with respect to the locations towards which indicating verbs may be directed (Aronoff et al. 2000; Meier 2002). Yet, the directionality of indicating verbs appears to be constrained: for example, the ASL indicating verb GIVEX→Y is directed towards locations associated with the referent represented by the indirect object and subject noun phrases, but not to the location associated with the referent of the direct object (Aronoff et al. 2000; Meier 2002; Lillo-Martin & Meier 2011).

The nature of the directionality of the sign GIVEX→Y appears to be, in fact, at least partly predictable from its semantics, and has little relation to whether or not the directionality is itself gestural in nature. The sign GIVEX→Y refers to acts in which individuals transfer ownership of an item from themselves to another person. This transfer does not occur between the theme of the verb and the recipient, so there is no reason to think that the path movement in the sign would work this way. Padden (1983) contrasts the transfer within ASL GIVEX→Y with a similar sign, which she called PASS-BY-HAND, which could in fact move from the location associated with the theme to the recipient’s location, as it mimetically represents what the hand is doing when someone picks up an object and hands it to another person. In fact, Taub (2001) argues that much of the directionality of indicating verbs in ASL is iconically-motivated as it typically represents physical interactions between animate referents, or more abstract interactions represented metaphorically as if they were a physical interaction. Taub (2001) explores how each case involves different trade-offs between formational features, iconicity and metaphor (see also Meir et al. 2007; 2013). It is the different combinations of each factor, together with verb semantics, which result in cross-linguistic differences we see across sign languages. This may partly explain some of the constraints that Aronoff, Meir and Sandler (2000) discuss. For example, although the sign BELONG-TO/OWN in BSL is a stative verb that does not involve a typical interaction, this sign is an indicating verb that can move between the locations associated with the theme and goal. Similar arguments can be made for backwards verbs – i.e., verbs that move from object to subject (e.g., TAKE, INVITE, BORROW, etc. in many sign languages) – which are proposed to be potentially problematic for agreement analyses (cf. 
Quadros & Quer 2008; Lillo-Martin & Meier 2011). This argument in Aronoff, Meir and Sandler (2000) also appears to assume that pointing gestures are themselves not conventionalised nor constrained in any way. As discussed in §4.1, this does not appear to be the case.

Meier (2002) and Lillo-Martin and Meier (2011) point out that the set of indicating verbs within a language cannot be predicted from formal or semantic properties alone (e.g., in ASL, HATEX→Y is an indicating verb while LIKE, which involves a movement away from the chest, could be but is not an indicating verb). They also note that the set of indicating verbs differs cross-linguistically. For example, the sign EXPLAIN→Y is an indicating verb in BSL/Auslan, but a sign that might be glossed as EXPLAIN in ASL is not. Additionally, some sign languages, such as German Sign Language, use a person agreement marker (see Steinbach & Pfau 2007; Steinbach 2011). This is a non-specific indicating auxiliary verb that is used in combination with a plain verb to indicate who does what to whom.

Although we argue that indicating verbs represent a fusion of a morpheme with a pointing gesture, this does not mean that the set of indicating verbs should be entirely predictable, nor that it will not vary cross-linguistically. Our indicating verb construction model would predict that the way that the set of indicating verbs behaves in each language is conventionalised, as language-specific constructions, and this is indeed what we find. Liddell (2003) proposes that each sign language’s set of indicating verbs and/or auxiliaries and their properties are listed in the mental lexicon (in our model as constructions, as proposed by Construction Grammar), and thus they may vary from one sign language to the next. In fact, this is something that may not be unique to signed or spoken language, as the use of pointing gestures by non-signers may also vary from culture to culture, as noted in §4.1.

Additionally, Meier (2002), Lillo-Martin and Meier (2011) and Cormier (2002; 2007) argue that expressions of numerosity in the verbal and pronominal systems of ASL and BSL affect the interaction with pointing, and that this may also have bearing on the status of directionality in sign languages. Again, whether an indicating verb can point and/or express numerosity, how it points, what form that takes phonologically and/or how it shows number are all conventionalised aspects of specific indicating verb constructions. The key issue here in deciding whether we have evidence for an agreement system is whether the directionality of the pointing itself (i.e., the varying features of the target) does or does not reflect semantic or formal properties of a controller noun phrase.9

In summary then, our indicating verb construction proposal has advantages over rule-based models, as it does not need additional mechanisms to explain why only a subset of verb signs are indicating verbs, and why there is variation in how directionality is realised in the system (both with individual verbs and classes of verbs, such as regular and backwards indicating verbs).

5.2 Syntactic properties of indicating verbs

Meier (2002), Sandler and Lillo-Martin (2006), Lillo-Martin and Meier (2011) and Wilbur (2013) point out in various ways that the use of indicating verbs appears to have syntactic consequences, and thus must be represented in the syntax of individual sign languages. They claim this would not be predicted by a model which incorporated deictic gesture. For example, Quadros and Lillo-Martin (2010) argue that constituent order interacts with the use of indicating verbs in ASL and Brazilian Sign Language. Their research suggests that, although the basic constituent ordering in these languages appears to be subject-verb-object in clauses with plain verbs, constituent ordering appears more flexible in clauses with indicating verbs. In particular, the orders subject-object-verb and object-subject-verb appear to be acceptable in these clauses. This has since been found to be a more general pattern across sign languages (Napoli & Sutton-Spence 2014). Furthermore, in Brazilian Sign Language, indicating verbs also interact with the order of negative signs. Clauses with indicating verbs are reported to take a preverbal negator, whereas those with a plain verb have a negator in clause-final position (Quadros 1999). Lillo-Martin (1986) and Glück and Pfau (1998) also claim that, as in agreement systems in spoken languages, the presence of an indicating verb in a clause licenses null arguments in ASL and DGS.

Indeed, recent work on indicating verbs in the BSL Corpus dataset suggests that directional modification of these verbs does have syntactic consequences (Fenlon et al. 2018), particularly the verb’s position in a clause. In this study, we found that verb-final and verb-only clauses significantly favour modification (or seem to be neutral in this respect) for the subject and object arguments while non-final verbs consistently disfavour modification. The finding is consistent with similar reports for ASL (Friedman 1976). It may also be related to claims for ASL by Fischer (1975) (where modified verbs are also reported to prefer final clause position) that referents often need to be established in signing space prior to the articulation of the modified verb (claims echoed by Napoli & Sutton-Spence 2014). However, this does not appear to be the case in the BSL data, as arguments were frequently omitted from clauses, and thus the clause-final position favouring modified indicating verbs cannot be fully explained by the need for locative assignment. Where arguments were overt, the significance of phrase-final position may be linked to the fact that this position plays a special role in many sign languages in both form and function (Crasborn et al. 2012) and that the prosodically heavy nature of phrase-final position interacts with the role of prominence (Wilbur 1999).

In terms of the licensing of null arguments, research on variable subject expression suggests that the predictions of the agreement analysis are not borne out by the available data from some of the sign languages we work on. While studies by McKee et al. (2011) indicate that agreement verbs do indeed slightly favour null subjects in Auslan and New Zealand Sign Language (with 57% and 54% respectively of all tokens occurring with no overt subject argument) as proposed by Lillo-Martin (1986) and others, this research also shows that spatial verbs (73% for Auslan and 66% for New Zealand Sign Language) and plain verbs (60% and 53%) also occur most often without an overtly expressed subject argument. It is thus not clear how the agreement analysis accounts for these patterns, and the data suggest that other factors such as co-reference and structural priming (see McKee et al. 2011 for more detail) are also important in influencing variable subject expression (predictions about structural priming would fall directly out of a usage-based constructionist model such as the one we are proposing here).
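The comparison across verb classes can be made explicit with a small tabulation. The following is only an illustrative sketch: the percentages are those cited above from McKee et al. (2011), while the dictionary layout, the label NZSL, and the function name `summarise` are our own assumptions and not part of that study.

```python
# Percentage of tokens with no overt subject, as cited in the text from
# McKee et al. (2011). Layout and names here are illustrative assumptions.
NULL_SUBJECT_RATES = {
    "Auslan": {"agreement": 57, "spatial": 73, "plain": 60},
    "NZSL": {"agreement": 54, "spatial": 66, "plain": 53},
}


def summarise(rates):
    """Summarise, per language, how verb classes compare on null subjects."""
    summary = {}
    for language, by_class in rates.items():
        summary[language] = {
            # Verb class with the highest rate of null subjects.
            "highest_class": max(by_class, key=by_class.get),
            # True if every verb class occurs mostly without an overt subject.
            "all_majority_null": all(r > 50 for r in by_class.values()),
            # Percentage-point gap between agreement and plain verbs.
            "agreement_minus_plain": by_class["agreement"] - by_class["plain"],
        }
    return summary


SUMMARY = summarise(NULL_SUBJECT_RATES)
for language, row in SUMMARY.items():
    print(language, row)
```

On these figures, every verb class occurs without an overt subject in more than half of its tokens, and the agreement class does not show the highest rate in either language, which is the pattern the agreement analysis leaves unexplained.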

Overall, we can see that the use of indicating verbs does appear to have important syntactic effects (although perhaps their role in null arguments has been overstated). There appears to be no a priori reason to assume, however, that the agreement analysis is the only account able to explain this, as Meier (2002), Lillo-Martin and Meier (2011) and Wilbur (2013) claim. After all, it is only the patterns in the use of directionality in indicating verbs that are influenced by the real-world location of present referents or imagined locations of absent referents, and not other linguistic properties of these verbs. In fact, there is a good deal of evidence to suggest that the grammar of individual spoken languages and co-speech pointing gesture also interact in language-specific ways, as noted in the following sections. Within a Construction Grammar model, indicating verb constructions are represented as conventionalised clusters that include both morphosyntactic and gestural features: they are part of the grammar, and thus their interaction with aspects of the syntax of sign languages is unsurprising.

5.3 The acquisition evidence

Meier (2002) claims that the mastery of directionality of indicating verbs for present referents is not reached until around age 3, and that mastery of this system for absent referents is later still. Meier notes that this is similar to the acquisition of complex morphological systems (Slobin 1985) and uses this extended time course of development of these aspects of ASL grammar to argue for the morphological status of directionality in indicating verbs.

However, this does not take into account the time course required for acquisition of co-speech gesture, particularly mastery of the relationship between language and co-speech gesture. As noted by Gullberg, de Bot and Volterra (2008), the development of the adult speech-gesture system is not yet fully described and more studies are needed to understand how this emerges. Some studies do suggest that acquisition of speech and gesture may not be as simple as might otherwise be assumed. In particular, it seems that deictic pointing gestures may not be acquired at the same time or in the same way as other types of gestures. For example, Mayberry and Nicoladis (2000) followed five French-English bilingual children from ages 2;0 to 3;6 and found that the use of iconic and beat gestures correlated with speech development, whereas pointing gestures did not. Morgenstern (2014) also reports this, with a relatively unchanging rate of pointing gesture use in child language data (around 5–10% of all utterances at this age are accompanied by pointing gestures, which is similar to the adults in her study). The use of pointing gestures changes during development, however. Until 24 months, pointing at present referents predominates, but pointing to absent referents begins to develop from this age. By 30 months, there is use of pointing to more than one location to represent different referents during narratives. This continues to expand as children get older, with Colletta (2004) finding that children aged 6 and over use more abstract deictic gestures (and also metaphoric and beat gestures) than younger children who struggle with these types of gestures. Additionally, in their study on the acquisition of the Turkish demonstrative şu described in §4.1, Küntay and Özyürek (2006) find that six-year-old children who had acquired adult-like use of the proximal and distal pronouns bu and o had not yet acquired adult-like use of şu. 
They attribute this to children’s still underdeveloped ability to combine this demonstrative pronoun in speech with conversational management of visual attention required by appropriate use of şu. Likewise, work on first language acquisition of multimodal constructions in hearing children involving negation suggests that children take time to develop adult-like coordination of headshake with spoken utterances of negation (Andrén 2014). Given the complexity of acquisition of speech along with deictic co-speech gestures, we see some evidence of parallels here with the acquisition of indicating verbs in sign languages: children appear to pass through similar developmental stages in the acquisition of co-speech pointing and verb directionality, with the use of directionality with present referents, for example, often reported to precede its use with absent referents (Chen Pichler 2012).

5.4 Neurolinguistic evidence

There are very few studies in the neurolinguistics literature that can contribute to our understanding of the nature of the use of space in indicating verbs. In one study by Capek et al. (2001), deaf native ASL signers were shown sentences containing indicating verb errors. This ERP study found left hemispheric activity in these participants similar to that seen in hearing people reading or listening to syntactic violations in English. However, indicating verb errors in which the verb was directed to a new location, not previously associated with a referent, elicited bilateral responses, suggesting a unique involvement of spatial processing in ASL syntax. This activation of signers’ visual-spatial processing may be explained by Liddell’s (2003) claim that signers imagine a new referent as physically present.

More recent work on the relationship between gesture and spoken languages is relevant here. Xu et al. (2009) find that both symbolic gesture and spoken words activated a common, left-lateralised network of inferior frontal and posterior temporal regions of the brain. They speculated that gestures and spoken words may be mapped onto common, corresponding conceptual representations, and are not, in fact, as distinct as one might otherwise imagine. Recent work by Newman et al. (2015), however, appears to challenge this. In their study, they found both sign language and silent gesture stimuli activated classic left-lateralised language centres in deaf signers; non-signers showed activation only in areas attuned to human movement. Furthermore, in signers, sign language stimuli activated left hemisphere language areas more strongly than gestural sequences. Thus, the authors conclude, sign language constructions engage language-related brain systems and are not processed in the same ways that non-signers interpret gesture. Wilcox and Occhino (2016) and Occhino and Wilcox (2017) use this evidence from Newman et al. (2015) to suggest that the use of phenomena such as verb directionality in sign languages cannot involve gesture because gesture and language appear to activate identical regions in the brains of signers, but not in those of non-signers. In other words, Wilcox and Occhino claim that the evidence provided by Newman et al. (2015) suggests that there cannot be gestural elements in sign languages. However, the study by Newman et al. (2015) focussed on sign language verbs of motion compared to silent gesturing, not co-speech gesture. Thus, it is not clear what relevance this has to our claim that morphemic and gestural elements in indicating verbs work closely together, analogous to multimodal (speech and co-speech gesture) constructions in spoken languages. What is also important here is that Newman et al. 
(2015) found that even actions recognised as non-linguistic by signers were processed in language-like ways. It has already been established that deaf signers’ lifelong exposure to sign languages results in neural reorganisation that leads to changes in motion processing in general (e.g., Bosworth & Dobkins 2002), so it is not clear if what we have here are findings that can really distinguish between brain activation in response to linguistic or non-linguistic phenomena (cf., Van Lancker Sidtis 2006). It is also worth noting that the neural processes involved in gesture production and comprehension in non-signers are themselves not yet well understood (Husain et al. 2009; Emmorey & Özyürek 2014), and that few studies take into account the fact that different gesture types might work differently, with some more linguistically entrenched than others. Overall, neurolinguistic studies conducted thus far appear inconclusive with regard to the gesture versus sign distinction, and thus we question the relevance of this evidence to the debate about the role of gesture in sign language indicating verbs.

5.5 Emerging sign languages and grammaticalisation

Various studies have provided evidence that indicating verb systems arise in fully-fledged sign languages. For example, some studies suggest that directional verbal gestures emerge in home sign systems and in modified forms of Signing Exact English used by deaf children not exposed to ASL, but appear under-developed compared to ASL (Supalla 1991; Goldin-Meadow et al. 1994). Directionality is also seen in studies of hearing non-signers when asked to gesture in a no speech condition (Casey 2003) and recent studies have tracked the evolution of such systems in laboratory settings (e.g., Motamedi et al. 2017). Additionally, there are reports of language change in younger versus older signers of established sign languages (e.g., Engberg-Pedersen 1993 for Danish Sign Language), perhaps comparable to the ongoing grammaticalisation of gonna in English (Tagliamonte & D’Arcy 2009). Note, however, that the BSL Corpus study of the use of indicating verbs failed to find any evidence of the increasing use of modification across different age groups in BSL (Fenlon et al. 2018), suggesting that the indicating verb system is stable in this older, established sign language. In contrast, indicating verbs are claimed to be much more frequent and systematic in sign language “creoles” than “pidgins”, as seen in the first and second cohorts of Nicaraguan Sign Language users (Senghas & Coppola 2001), and in younger sign languages as they develop through different stages – e.g., Israeli Sign Language (Meir 2016). Indicating verbs are apparently rare in some “village sign languages” (i.e., sign languages used by deaf and hearing members of a small close-knit community), such as Al-Sayyid Bedouin Sign Language, Providence Island Sign Language, and Kata Kolok (Aronoff et al. 2004b; Marsaja 2008; de Vos 2012; Nyst 2012). 
In at least one case, the lack of indicating verbs is accompanied by lack of referential pointing to absent human referents in the language generally: de Vos (2012) reports that Kata Kolok signers prefer establishing and maintaining reference via pointing to fingers on the non-dominant hand (i.e., list buoys, see Liddell 2003) rather than pointing to locations in space. In the case of Al-Sayyid Bedouin Sign Language, signers appear to point to locations in space for reference and they modify verbs spatially to represent actual motion and location, but it is reported that they do not use space for verbs of transfer (Aronoff et al. 2004b). Quer (2011: 195) finds it paradoxical that Al-Sayyid Bedouin Sign Language has developed locative marking but not directional marking on verbs in this way. Quer notes: “one could expect that such a basic gesture as pointing would be easily grammaticalised, contrary to fact”. This kind of assumption is what Cooperrider (2011) considers an example of Fauconnier’s illusion of simplicity. Similarly, Wilbur (2013) details many examples of the use of pointing in sign languages, without exploring in detail how many similar uses have been documented in co-speech gesture. Based on all the evidence noted above, there is no reason to necessarily expect indicating verb constructions (as fusions of morphemes and gestural elements) to emerge quickly (or at all) in all sign languages, given that the development/emergence of pointing gestures with or without speech certainly does not “come for free” in non-signers, and the fact that a small number of communities (e.g., Barai in Papua New Guinea, see Wilkins, 2003) are reported to not use index finger pointing at all.

All of the patterns noted above suggest that indicating verbs (1) develop as pointing gestures, (2) are incorporated into verb signs as part of an emerging linguistic system (see Meir 2016 for an interesting proposal that depiction, rather than deixis, is what drives this development), and (3) may continue to develop through analogic processes of language change (although this process may slow considerably in older sign languages like BSL). Increasing conventionalisation provides evidence of an emergent indicating verb construction system in the grammar, but not necessarily an agreement system. Agreement systems generally emerge by means of a related, but distinct, grammaticalisation process from what we see in the emergence of indicating verb systems (Givón 1976; Corbett 2006). Grammaticalisation theories rely on recurrent patterns that regularise and become conventionalised and entrenched over time; this is one of the basic tenets of usage-based approaches to language, including Construction Grammar (Traugott 2008; Gisborne & Patten 2011). The typical grammaticalisation path for agreement systems is that full pronouns become cliticised onto verbs, and then later these cliticised pronouns become inflectional affixes. Although cliticisation has been claimed to be a possible source of or explanation for verb directionality in sign languages (Lillo-Martin & Meier 2011; Nevins 2011), there is actually no evidence that a grammaticalisation pathway involving an intermediate stage of cliticisation is followed in sign languages (cf., Liddell 2003: 72), at least not for indicating verbs.10 Many first person forms in particular are clearly not the result of the fusion of the first person pronoun PRO→1 (directed to the centre of the signer’s chest) and a verb. 
For example, BSL/Auslan REMIND→1 moves from a forehead location to one on the ipsilateral shoulder; BSL/Auslan EXPLAIN→1 reverses its alternating circular movement in neutral space; and BSL/Auslan LOOK-AT→1 is directed to a location on the signer’s face. Likewise, ASL CONVINCE→1 is directed toward the neck. None of these first person object forms are directed towards the location of the first person pronoun, i.e., the centre of the signer’s chest. Furthermore, in Japanese Sign Language, despite the fact that the first person pronoun is directed towards the nose, rather than the chest, indicating verbs do not move towards the face to represent first person object arguments (Mathur 2000). Pronouns, indicating verbs and constructed action all involve similar iconic and deictic uses of gestural space (cf., Meir 2016). The indicating verb system probably develops by means of analogy, as different verbs that share formational features take on the ability to be spatially modified over time. In Figure 8, we can see a specific indicating verb construction that fits into the Vx>y schema: PAYx>y. Another sign in this category would be TEXTx>y. The latter indicating verb, used to refer to sending a text message to someone, appears to have developed from the (nominal) sign meaning ‘mobile phone’ or ‘SMS’ in which the hand moves in a small circular motion as the thumb bends. The new indicating verb emerges by analogy with existing constructional schemas provided by the indicating verb system (cf., Lepic & Occhino 2018). In it, a single bending movement of the thumb occurs as the sign moves from a location associated with the subject argument towards another associated with the object, in the same way as PAYx>y.

5.6 Sociolinguistic variation and language change

Signing communities are sociolinguistically very different from spoken language communities, due to the very low numbers of native signers in most communities and, related to this, interrupted transmission across generations (e.g., Schembri & Johnston 2007). This leads to much apparent idiosyncratic variation with respect to all aspects of language use, including morphology. As relatively young languages (Newport & Supalla 2000), many of the morphosyntactic properties of sign languages do not appear to be highly grammaticalised and thus are often optional. For example, large scale quantitative studies of indicating verbs in data from both BSL and Auslan show that the use of spatial modification in such signs to reflect the location of real or imagined referents occurs in 60–70% of all tokens (de Beuzeville et al. 2009; Fenlon et al. 2018). As noted above, idiosyncrasies such as these are to be expected under a Construction Grammar account, as different individual indicating verb constructions will have different properties.

Additionally, Engberg-Pedersen (1993) proposes that these modifications also interact with language-internal factors such as the frequency of a lexical unit generally, its frequency of occurrence in a context or linguistic construction that is typical of the modification, and the sign’s semantic or formal characteristics when modified. As noted above, frequency – i.e., recurrence – is another key characteristic of Construction Grammar models. The research on indicating verbs in Auslan and BSL relating to frequency is mixed. De Beuzeville et al. (2009) found that different Auslan verbs were modified at different rates, with high frequency forms (e.g., LOOK, SAY, COME, ARRIVE, GO) showing spatial modification significantly more often than less frequent verbs, supporting Engberg-Pedersen’s (1993) claim. Lexical frequency was not found to be a significant predictor of modification of indicating verbs for BSL (Fenlon et al. 2018), although only a subset of indicating verbs was the focus of the BSL study.

Findings relating to indicating verbs and constructed action have been clearer – i.e., indicating verbs in both Auslan and BSL were significantly more likely to be modified for the object when co-occurring with constructed action. As noted above, such a correlation has been found both with Auslan (de Beuzeville et al. 2009) and BSL (Cormier et al. 2015a; Fenlon et al. 2018). Corpus-based approaches such as these will assist us in identifying these language-internal and external influences and thus enable us to more accurately characterise sign language grammars.

6 Accounting for the role of eye gaze: a unified account

As noted above in §3.1.3, other phenomena beyond spatial modification have also been proposed as involving agreement in sign languages – including eye gaze towards object argument locations during verb production, and also eye gaze and other non-manual features used during role shift, or constructed action.

One problem with some of these studies is that, in the case of eye gaze shift with verb modification, other possible explanations for this shift are acknowledged (Thompson et al. 2006) but not fully explored. Thompson et al. claim that eye gaze which marks the point of view of a referent by imitating that referent’s gaze cannot account for the patterns that they find with agreement verbs. They suggest that Liddell’s (2003) account would predict that directed eye gaze should also co-occur with plain verbs and pronouns, and yet this is not found in their data. However, Liddell (2003) does not make this claim, and in fact there is no reason to make this prediction. Instead, it is likely that the use of space in indicating verbs when referring to absent referents, in which the hands are directed towards a location associated with an imagined referent, triggers eye gaze patterns that imitate the subject argument’s gaze because it involves a greater degree of enactment than is required for plain verbs and pronouns.

This analysis is entirely consistent with recent corpus studies of verb modification in sign languages. As reported above, Fenlon et al. (2018) find, in an analysis of indicating verbs from the BSL Corpus, that spatial modification of verbs significantly favoured co-occurrence with constructed action, identified following criteria set out by Cormier et al. (2015b). Similarly, de Beuzeville et al. (2009) also found that modification of indicating verbs significantly favoured the presence of constructed action in Auslan, using one or more of Engberg-Pedersen’s (1993) notions of shifted attribution of expressive elements, shifted reference, and/or shifts of the body, head or gaze, whereby signers take on characteristics of referents in the discourse. Although de Beuzeville et al.’s (2009) notion of constructed action was not restricted to shifts in eye gaze for the purpose of constructed action, eye gaze was not included in that study as a possible marker of constructed action. Given the cross-linguistic similarities in the use of constructed action across sign languages (de Beuzeville et al. 2009; Lillo-Martin 2012), it seems plausible that the use of eye gaze described by Neidle et al. and Thompson et al. could also (or perhaps alternatively) be explained by use of constructed action via eye gaze. However, Neidle et al. do not consider constructed action as a possible explanation for the patterns they describe for agreement verbs and, as noted above, Thompson et al. (2006) appear to treat eye gaze as unrelated to constructed action. Likewise, Herrmann and Steinbach (2012) do not consider correlations (in terms of co-occurrence) between verb modification and role shift at all. Thus, it is very difficult to know if these studies on ASL and DGS are describing the same or different phenomena from those in BSL and Auslan.11

Under agreement analyses, spatial modification of indicating verbs, shifts in eye gaze towards locations associated with referents, and enactment via role shift are considered as independent phenomena, and there have been no attempts to explain why these all occur so often together. The indicating verb construction analysis we propose, instead, provides a unified way to account for verb modification and co-occurrence with constructed action including shifts in eye gaze: indicating verbs point to real or imagined referents and in doing so often use constructed action to directly show this. Similar construction analyses could be applied to the coordination of speech, eye gaze, and pointing gestures in hearing non-signers (Kita 2003b; Sidnell 2006). Again, we stress, however, that there is no real analogy for indicating verbs in multimodal speech and gesture as it is only in sign languages where morphemic and deictic elements occur in the same modality, and verbs may themselves be modified spatially to reflect associations with present and absent referents. It is this unimodal nature of these constructions that makes indicating verbs so typologically unique.

7 Conclusion

In this paper, we have presented arguments for an analysis of indicating verbs, building on Liddell (2000), as a typologically unique, unimodal fusion of morphemes and pointing gestures functioning as a construction that is used for reference tracking. We have explained how certain patterns that have been observed in indicating verbs appear to align with a growing body of research on co-speech pointing gestures and on multimodal speech/gesture constructions. The similarities between gesture and indicating verb systems result in one of the key ways in which the indicating verb system does not resemble agreement marking – i.e., the way it exploits space for deictic reference does not always result in the systematic covariance normally associated with agreement systems (Corbett 2006). We have demonstrated how this commonality fits with what we know from a range of sources – from recent corpus studies, to acquisition, grammaticalisation, and sociolinguistic variation and language change. We have also shown how our indicating verb construction analysis provides a unified way of accounting for relationships found between verb modification, eye gaze and enactment in ways unaccounted for by agreement analyses. This Construction Grammar account also obviates the need for rule-based explanations of why only a subset of verbs in sign languages are indicating verbs, the optionality of directionality, and the problem of backwards verbs. Adopting a constructionist analysis is also advantageous since this commonality between co-speech gesture and indicating verb systems can be captured in sign languages and in multimodal communication involving spoken languages. More detailed comparisons with multimodal descriptions of language use are required to better understand these patterns.