How do the signs of sign language differ from the gestures that speakers produce when they talk? Although signs and gestures are produced in the same modality and can look superficially similar, they have been given different theoretical treatments. Since the 1960s, sign languages have typically been described within theoretical frameworks used to describe spoken languages. In contrast, gesture—and, in particular, co-speech gesture—has traditionally been viewed as external to language (Kendon 2004; Goldin-Meadow & Brentari 2017). Since theories about gesture and sign have developed independently along these diverging lines, it is not surprising that these two forms of communication are treated as fundamentally different. However, few studies to date have directly compared the two to determine exactly how they differ.
We investigate the similarities and differences in pointing signs and pointing gestures with respect to form. We focus on pointing for a few reasons. First, pointing is ubiquitous in both spoken and signed communication and, importantly, is used in both cases for the same broad function of drawing attention to locations or entities; pointing thus presents a critical opportunity to examine how communicative forms differ when used within a sign language versus a spoken language system. Second, although pointing has been described as being part of the pronominal system in sign languages (as well as having other functions) (e.g., Klima & Bellugi 1979; Padden 1983; Lillo-Martin & Klima 1990; Meier 1990; Sandler & Lillo-Martin 2006; Meier & Lillo-Martin 2010), several authors have noted that, to fully evaluate this claim, comparable data on pointing gestures in hearing speakers is critically needed (e.g., Cormier et al. 2013). In this paper, we compare 574 pointing signs produced by British Sign Language (BSL) signers, all of which are assumed to have a pronominal function, to 543 comparable pointing gestures produced by American English speakers. Our aim is to determine how—and to what degree—points in sign language and points in gesture differ in form.
2 Literature review
Our investigation focuses narrowly on pointing to entities—people or things—and excludes points to locations (i.e., locative points) as well as pointing signs functioning as determiners. We distinguish three types of points to entities: self-points (i.e., typically a point to one’s own chest), addressee-points (i.e., typically a point to the person with whom one is communicating) and other-entity points (i.e., a point to some other entity, whether present or non-present). Although these categories roughly correspond to what are typically described as first, second, and third person pronominal points within the sign language literature, we adopt the more theoretically neutral terms self-points, addressee-points, and other-entity points. Throughout the paper, we use the term pointing signs to refer to points in sign languages that are assumed to have a pronominal function, and pointing gestures to refer to comparable points in gesture.
2.1 Why compare points in sign language and gesture?
First, why is it illuminating to compare pointing signs and pointing gestures? Although pointing is widespread in spoken and signed communication and functions to draw attention to entities in both, the two are usually described in strikingly different terms. The pointing signs that we focus on are described as being part of a linguistic system, in particular, a pronominal system, and have been argued to serve many of the functions that pronouns in spoken languages serve (e.g., Meier & Lillo-Martin 2010). Pointing gestures, in contrast, have not been subject to such linguistic analysis, and are typically described as instances of non-verbal communication that can be used concurrently with, or instead of, speech (Kita 2003; Kendon 2004). Nevertheless, points produced by signers and speakers are similar in form (see Figure 1). This similarity is, of course, no accident. Not only are gesturers and signers subject to the same bodily resources and constraints, but they also appear to be subject to the same cultural conventions. For example, in western cultures, pointing to the self is typically performed by pointing to the chest in both co-speech gesture and in western sign languages such as American Sign Language (ASL) and British Sign Language (BSL). In contrast, in Japan, pointing to the self is often performed by pointing to the nose in both gesture and Japanese Sign Language (McBurney 2002). Another reason points in gesture and sign may be similar is that pointing gestures are also a likely source of input for the first generation of signers developing a new sign language (Coppola & Senghas 2010).
The formational similarities between pointing signs and pointing gestures have not gone unnoticed by sign language researchers. There are, nonetheless, different opinions as to the extent to which pointing signs constitute a separate category from pointing gestures. Some sign linguists (e.g., Klima & Bellugi 1979; Padden 1983; Lillo-Martin & Klima 1990; Meier 1990; Meier & Lillo-Martin 2010) argue, with reference to formational and distributional criteria, that pointing signs function like pronominals in spoken language. Other sign linguists (e.g., Cormier et al. 2013) note that pointing signs share properties not only with spoken language pronouns, but also with pointing gestures, which blurs the distinction between the two communicative forms (see also Johnston 2013). What is notable, however, is that these positions have been argued in the absence of much empirical comparative data. The few empirical studies that have looked at other aspects of points in signers and speakers suggest differences on both functional and formational grounds. For example, Zwets (2014) investigated the functions of pointing signs produced by signers of Sign Language of the Netherlands (NGT) vs. pointing gestures produced by speakers of Dutch, and found that signers were more likely to use arbitrary locations in space to refer to non-present referents than speakers. Mesh (2017) compared locative points produced by signers and speakers in San Juan Quiahije Chatino and found that the two groups differed in their choice of handshape for distant targets. Nevertheless, to our knowledge, there are no studies that directly align pointing signs assumed to be functioning as pronominals with comparable pointing gestures in order to characterize and quantify the formational differences between the two. 
Instead, claims about formational similarities between pointing signs and pointing gestures have typically been based on coarse-grained descriptions (e.g., descriptions made across studies or based on cursory observation) and sometimes draw solely on intuition. This gap motivates our study: a fine-grained, direct comparison of spontaneous pointing in signers and speakers. We ask: Which formational features do pointing signs and pointing gestures share, and which features, if any, distinguish the two communicative forms? Here we consider three types of distinctions that have been proposed to mark the transition from gesture to sign: conventionalization, reduction, and integration.
A first potential change that communicative forms undergo over time is conventionalization, that is, the use of a consistent form (i.e., showing less variation in formational features across uses) for a given meaning. Conventionalization of form has been considered an important developmental marker in studies looking at the emergence of sign languages such as Al-Sayyid Bedouin Sign Language (ABSL) (e.g., Sandler et al. 2011). Studies comparing sign language with silent gestures (i.e., gestures produced by non-signers asked to communicate using only their hands and not their mouths, e.g., Goldin-Meadow et al. 1996) have also focused on conventionalization. In Brentari et al. (2012), the handshapes produced by signers of ASL or Lingua dei Segni Italiana in an elicitation task were limited to a specific and relatively small set. In contrast, hearing non-signers performing the same task using silent gesture exhibited significantly more variation in handshape. The authors argue that the variation across signers is grounded within a morphological and phonological system, and is thus more limited than the variation across silent gesturers, which is not similarly constrained. These studies suggest that the signs of a sign language may be more conventionalized than the spontaneous gestures used by hearing speakers. We might therefore expect pointing signs to exhibit more formational consistency across uses and users than pointing gestures.
A second change that communicative forms undergo over time is reduction. Reduction is seen in grammaticalization—in, for example, the transition from content word to function word (Hopper & Traugott 2003). While grammaticalization is considered to be modality independent, the process in sign languages may differ from the process in spoken language since grammaticalization does not always begin with a lexical sign, but may begin with a gesture used by the surrounding speech community (Janzen & Shaffer 2002; Pfau & Steinbach 2006). The grammaticalization of pointing signs from pointing gestures has been proposed as an example of this pathway. Pfau & Steinbach (2006) suggest that, over time, pointing gestures become locative signs, then demonstrative pronominal signs, and finally personal pronominal signs. Indeed, locative pointing signs appear to occur before pronominal pointing signs in an emerging language, Nicaraguan Sign Language (NSL) (Coppola & Senghas 2010), and pronominal points in NSL, the more grammaticalized forms, are articulated more quickly and with less movement than locative points, the less grammaticalized forms. Extrapolating from these findings, we hypothesize that pointing signs may be more reduced than pointing gestures simply because they have been grammaticalized.
The third change that communicative forms can undergo is that they become integrated with other aspects of the language, such as prosodic structure. Sandler et al. (2011b) and Sandler (2012) describe how prosodic structure develops alongside grammatical structure with each generation of ABSL, a sign language emerging in Israel. Over time, prosodic cues effectively group together signs at a syntactic level (e.g., noun phrases are prosodically associated with their predicates) and these cues are more salient in later generations. The fact that prosodic organization (specifically, timing cues) is closely entwined with syntactic structure is important to our study. Pointing signs serving as pronouns indicate arguments within a clause and therefore display a maximal level of syntactic integration (de Vos 2015). If pointing signs are fully integrated within the grammar of sign language (i.e., if they appear in specific sequential slots), then we hypothesize that they should exhibit prosodic characteristics generally found in sign languages. For example, signs appearing at the ends of intonational phrases tend to be longer in duration than signs appearing in other positions (Nespor & Sandler 1999; Wilbur 1999). Pointing signs might then display this lengthening when they appear in phrase-final position. It is not clear, however, whether pointing gestures will display this feature.
In the present study, we investigate whether pointing signs are formationally distinct from pointing gestures and focus, in particular, on whether pointing signs are more conventionalized, more reduced, and more integrated into prosodic structure than pointing gestures. To do so, we analyze the formational features identified by Johnston (2013) in his description of pointing signs in Auslan: handshape, hand use (i.e., whether one or two hands are used and, for one-handed points, whether the dominant hand is used), duration, and (for self-points) contact with the chest.
2.2 Formational features of pointing signs and pointing gestures
In this section, we review work on the particular formational features that are of interest in this study.
2.2.1 Handshape
Pointing signs in sign languages are widely reported to have conventionalized handshapes. For personal pronominal reference, sign languages such as ASL and BSL typically use the index finger handshape (Sutton-Spence & Woll 1999; Sandler & Lillo-Martin 2006). This handshape contrasts with handshapes used for possessive pronominal reference, a B-handshape in ASL and a closed fist handshape in BSL. Variation in handshape has also been documented within pronominal signs. Using a large ASL corpus, Bayley et al. (2002) found that first person points varied more in handshape (81% did not use an index finger handshape) than second person points (34%) and third person points (45%). Similar findings have been reported for BSL (Fenlon et al. 2013).
The index-finger handshape is also the most common form in pointing gestures, at least in Anglophone and European cultures (Cochet & Vauclair 2014; Cooperrider et al. 2018; Flack et al. 2018). Interestingly, however, this form preference is most often attributed to biomechanics in gesture (Povinelli & Davis 1994; Liszkowski et al. 2012), rather than to convention or linguistic function, the causes most often cited in sign languages. But culture-specific ways of pointing have also been described, including using particular handshapes for specific discourse functions (Kendon & Versante 2003; Wilkins 2003) and particular conventions for non-manual pointing (Enfield 2001; Cooperrider & Núñez 2012).
2.2.2 One hand vs. two hands
The citation form for pointing signs is typically one-handed, although two-handed points have been reported in some sign languages (e.g., Johnston 2013). Similarly, although the canonical pointing gesture is one-handed, both one-handed and two-handed points are attested in pointing gestures to the self (Cooperrider 2014).
2.2.3 Hand preference
The preference for one hand over the other is well documented in the sign language literature, with researchers making a distinction between the dominant hand (i.e., the hand more frequently used) and the subordinate hand (Johnston & Schembri 2007). Dominance is typically determined by whether a signer is left-handed or right-handed (Johnston & Schembri 2007): right-handed signers tend to produce one-handed signs with their right hand, and left-handed signers tend to produce one-handed signs with their left hand (but see Sáfár et al. 2010). This preference has also been demonstrated for pointing signs in Auslan, where approximately 90% of points were articulated with the dominant hand (Johnston 2013).
Many studies of gesture report a preference for the dominant hand for gesture generally (e.g., Kimura 1972; Kimura 1973; Kita et al. 2007) and studies looking specifically at pointing in relation to handedness report a moderate correlation between the two (Cochet & Vauclair 2012). However, developmental studies have also reported an association between rate of language development and preference for the right hand for pointing in toddlers (Vauclair & Cochet 2013).
2.2.4 Duration
Another formational feature of interest is duration. Börstell et al. (2016) have found pointing signs to be shorter in duration than lexical signs in Swedish Sign Language (SSL). They suggest that this difference may be due to frequency; pointing signs are generally the most frequent signs in conversations, and more frequent signs tend to be shorter in duration than less frequent signs. In addition, Johnston (2013) found that self-points tend to be shorter than other types of pointing signs in Auslan.
One aspect of pointing duration that has not been considered in these earlier studies is whether duration is affected by phrase position. As described earlier, signs in phrase-final position are generally longer than signs in other positions (Nespor & Sandler 1999; Wilbur 1999). It is not clear whether this effect holds for pointing signs, particularly since points often occur in prosodically weak positions within a phrase (Nespor & Sandler 1999).
Comparable data from gesture is not available, but microanalytic descriptions of pointing gestures indicate considerable variability in their length (Kendon 2004). A point may be articulated with a word, a phrase, a clause, or larger constituents at the discourse level, but we do not yet understand which aspects of the speech that accompanies a pointing gesture (if any) determine the length of that gesture.
2.2.5 Contact
A final formational feature of interest is whether, in a self-point, the hand contacts the chest. Self-points are often argued to be distinct from other types of points in sign languages because they are the only type of point to make contact with a place of articulation (i.e., the signer’s chest) (e.g., Engberg-Pedersen 1993). However, based on a random sample of 50 self-points in the Auslan Corpus, Johnston (2013) found that 10% did not contact the chest, suggesting that contact is not essential in self-points in sign. The feature contact has been used to argue for a person distinction in sign languages such as Danish Sign Language (DSL) (Engberg-Pedersen 1993). Since self-points are the only points that make contact with something, they are formationally distinct from other pointing types and this, in turn, suggests a distinction between first and non-first pronominal points in DSL. McBurney (2002), however, argues that this difference can be explained with reference to phonological well-formedness (i.e., signs articulated at the chest typically make contact at the chest) rather than being indicative of person-marking.
There is a paucity of detailed quantitative data on self-points in gesture, although it has been noted that points to self sometimes do, and sometimes do not, make contact with the chest (Kendon 2010).
In summary, the goal of our study is to characterize and quantify several formational features of pointing signs vs. pointing gestures: handshape, hand use, duration, and contact (for self-points only). We analyze whatever formational differences we find between pointing signs and pointing gestures in terms of the three changes that have been described in evolving languages: conventionalization, reduction, and integration. We use these data to provide insight into the status of pointing in sign languages.
We examined points from two datasets: the BSL Corpus (Schembri et al. 2014) and the Tavis Smiley Corpus (Cooperrider 2014). In this section, we describe both datasets, explain the process by which pointing tokens were identified, and outline our annotation scheme.
3.1 Two corpora
3.1.1 BSL Corpus
For the sign data, we focused on the points produced by 24 BSL signers (13 men, 11 women) from London, produced within the conversation component of the BSL Corpus (Schembri et al. 2014). All participants reported using BSL as their first and preferred language and having learned the language before the age of seven. Participants were filmed in pairs seated next to one another in a studio setting (with three cameras in front of a blue screen); for the conversational component of the corpus, participants were left alone with the camera recording so that they were not directly observed by the researchers. Participants were asked to talk about whatever topic they wanted. To ensure that the data were as naturalistic as possible, participants were paired with individuals of a similar age whom they knew well. A screenshot of the studio set-up is provided in Figure 2.
3.1.2 Tavis Smiley Corpus
For the gesture data, we focused on the points produced by 27 speakers of American English (14 men, 13 women) who participated in interviews that were later broadcast on the Tavis Smiley Show, a television program (Cooperrider 2014). Each interview involved two people seated next to one another in a studio setting. Importantly, although the arrangement appears more formal than the BSL Corpus, the interview was conducted as a casual conversation, with no notes or prompts visible. Moreover, no studio audience was present. One participant, the interviewer Tavis Smiley, is present in all interviews (he is one of the 14 men); the remaining 26 participants were the interviewees. The set-up is the same across interviews, with Tavis Smiley always seated across from, and to the left of, the interviewee (see Figure 2).
In examining these datasets, not only are we comparing speakers to signers, but we are also comparing American to British participants. However, we have no a priori reason to suspect differences between these two cultural groups. For example, recent studies report a preponderance of index-finger pointing in both cultures (Cooperrider et al. 2018; Flack et al. 2018), and corpus data from BSL (Fenlon et al. 2013) and ASL (Bayley et al. 2002) reveal very similar levels of handshape variation in pointing signs in the two languages. Thus, although subtle differences in pointing between American and British participants cannot be ruled out, the broad cultural similarities and existing evidence suggest that the comparison is worthwhile.
3.2 Identifying pointing signs and gestures
In total, we examined 574 points from the BSL Corpus, corresponding to the categories of first person, second person, and third person singular pronouns. Since pointing signs were previously identified as part of a lexical frequency study (Fenlon et al. 2014), selecting signs for inclusion in the study involved searching these annotations for the glosses of PT:PRO1SG, PT:PRO2SG, and PT:PRO3SG. We did not include pointing signs that functioned as determiners (coded as PT.DET). The pointing signs we did include were previously identified in the corpus on the basis of form (i.e., a movement, usually articulated with an index finger, indicating a particular location in space) and meaning (i.e., the token in question indexed a first, second, or third person referent and functioned as a complete noun phrase within an utterance). As pointing signs were the most frequent type of signs in casual conversation, we limited our selection to the first 10 tokens of each pointing sign type for each participant, giving us a possible maximum of 30 tokens per participant. However, some participants did not produce 10 tokens of each type, resulting in fewer than the expected 720 tokens. Examples of 1st person, 2nd person, and 3rd person points in the BSL Corpus are provided in Figures 3 and 4. Note that a point to 1st person and 2nd person is directed at a present entity (i.e., either the signer or the addressee); however, points to a 3rd person referent in the BSL Corpus were typically directed at a non-present entity (to an imagined referent or to an arbitrary locus).
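The selection rule described above can be sketched as follows. This is a minimal illustration, not the authors' actual procedure: the record layout `(participant_id, gloss, start_ms)` and the function names are hypothetical, and ordering tokens by start time is an assumption about what "first" means.

```python
from collections import defaultdict

MAX_PER_TYPE = 10  # cap per pointing sign type, as stated in the text
POINT_TYPES = ("PT:PRO1SG", "PT:PRO2SG", "PT:PRO3SG")  # glosses named in the text


def select_tokens(annotations, max_per_type=MAX_PER_TYPE):
    """Keep the first `max_per_type` tokens of each gloss for each participant.

    `annotations` is an iterable of (participant_id, gloss, start_ms) tuples.
    This layout is hypothetical and only serves the illustration.
    """
    counts = defaultdict(int)
    selected = []
    # Assumption: "first" means earliest by start time within a participant.
    for participant, gloss, start_ms in sorted(annotations, key=lambda a: (a[0], a[2])):
        if gloss not in POINT_TYPES:
            continue  # e.g. determiner points (PT.DET) are excluded
        if counts[(participant, gloss)] < max_per_type:
            counts[(participant, gloss)] += 1
            selected.append((participant, gloss, start_ms))
    return selected


def max_possible_tokens(n_participants, n_types=len(POINT_TYPES), cap=MAX_PER_TYPE):
    """Upper bound on the dataset size if every participant reaches every cap."""
    return n_participants * n_types * cap
```

With 24 signers, three point types, and a cap of 10, the upper bound is 720 tokens, which matches the figure given in the text; the observed 574 tokens reflect participants who produced fewer than 10 tokens of some type.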
In total, 543 points were examined from the Tavis Smiley Corpus. In contrast to our approach with the BSL Corpus, we coded every pointing gesture that occurred for 27 randomly selected speakers (out of 40 total), yielding a dataset of pointing gestures that was comparable in number to our dataset of pointing signs. For the pointing gestures, we defined a point as a movement toward a region in space with the apparent intention of directing the listener’s attention to that region (Cooperrider et al. 2018). To maximize comparability with the BSL dataset, we focused only on pointing gestures to self, addressee, or some other entity (again, this was typically a non-present entity either in an imagined or an arbitrary locus), corresponding to first, second, and third person pronominals in BSL. To meet this criterion, we identified, in each case, the lexical affiliate (i.e., the spoken word associated with the meaning of the pointing gesture) to confirm the point’s function. Such affiliates included possessive pronouns (e.g., my book, your last show), noun phrases (e.g., Bette Midler’s suit, a doctor, a man) and adverbials (e.g., last summer, a few weeks ago). Pointing gestures were only included if they were affiliated with a corresponding personal pronoun or noun phrase. Examples of each type of pointing gesture are provided in Figures 5, 6, and 7 (the lexical affiliate in each case is indicated in bold).
Overall, in our BSL dataset, we analysed a total of 574 pointing signs: 238 self-points, 112 addressee-points, and 224 other-entity points. In our Tavis Smiley dataset, we analysed a total of 543 pointing gestures: 137 self-points, 158 addressee-points, and 248 other-entity points.
3.3 Further annotation for formational features
Using ELAN (Wittenburg et al. 2006), we annotated each token for the following formational features: handshape, hand use (the number of hands used and, for one-handed points, the choice of a right or left hand), duration, and contact.
For handshape, points were coded as belonging to either the 1-handshape or the B-handshape family. Tokens belonging to the 1-handshape family typically featured a handshape with the index finger extended and all other fingers closed or partially closed; in addition, the thumb may also be closed or fully extended. For the B-handshape family, all four fingers are extended and the thumb may be unopposed or opposed to some degree. Several examples of 1-handshape and B-handshapes are provided in Figure 8.
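As a rough illustration of this two-way coding, the sketch below reduces each finger to a simple extended/not-extended flag. That simplification is an assumption made here for clarity: the actual annotation was done by human coders viewing video, partially closed fingers are treated as not extended, and the function name is hypothetical.

```python
def handshape_family(index, middle, ring, pinky):
    """Assign a point to the 1- or B-handshape family.

    Each argument is True if that finger is extended, False if it is closed
    or partially closed. Thumb position is ignored because the text allows
    it to vary within both families.
    """
    if index and middle and ring and pinky:
        return "B"  # all four fingers extended
    if index and not (middle or ring or pinky):
        return "1"  # only the index finger extended
    return None  # configurations not covered by the two families described


# Example: an index-finger point and a flat-hand point.
examples = [
    handshape_family(True, False, False, False),
    handshape_family(True, True, True, True),
]
```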
For hand use, tiers corresponding to the right hand and left hand were created, and points were annotated on these tiers depending on the hand used. Two-handed points were those in which both hands indicated the same referent in tandem (see Figure 9). In addition to observing hand preference patterns in pointing, we referred to the signers’ reported hand preference in the participant metadata collected as part of the BSL Corpus Project. For the gesturers, handedness was determined by searching the internet for pictures or videos of each interviewee signing autographs. Using this method, we were able to determine handedness for all but four of the gesturers.
For duration, we identified the first frame where the hand visibly started moving towards articulating a point. This starting point may occur when the hands are at rest (e.g., in the lap) and beginning to move towards a point, or when the hands have just completed a prior sign/gesture. In the latter case, the first frame in which the hand begins to move towards the next sign (signaled either by a change in path movement or a change in handshape) was taken to be the starting point. The end of a point was considered to be the frame immediately prior to the frame in which the hand moves to articulate the next sign/gesture, or begins to return to rest. Again, the end point could be signaled by a change in path movement or a change in handshape.
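The segmentation rule can be turned into a duration as in the sketch below. The frame rate is an assumption (the text does not report it, so 25 fps is used purely as a placeholder), and the function and parameter names are hypothetical. The point is taken to span every frame from the onset frame up to, but not including, the frame in which the hand moves on.

```python
def point_duration_ms(onset_frame, next_movement_frame, fps=25):
    """Duration of a point in milliseconds.

    onset_frame: first frame in which the hand visibly starts moving
        toward the point.
    next_movement_frame: first frame in which the hand moves toward the
        next sign/gesture or starts returning to rest. The point ends on
        the frame immediately before this one.
    fps: video frame rate (placeholder value, not taken from the text).
    """
    if next_movement_frame <= onset_frame:
        raise ValueError("the point must span at least one frame")
    n_frames = next_movement_frame - onset_frame
    return n_frames * 1000 / fps
```

At the placeholder rate of 25 fps, one frame corresponds to 40 msec, so a point spanning six frames would be measured as 240 msec; the resolution of any duration measured this way is limited to one frame.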
For contact, all self-points were analyzed for whether they made contact with the participant’s chest or not (see Figure 10).
We also coded whether points occurred in final or non-final position in an utterance. Utterances in BSL were identified with reference to prosodic and syntactic structure (following Sandler et al. 2005). That is, we used meaning as a guide to group signs together (i.e., identifying a verb and its arguments) and also referred to prosodic cues, such as a change in facial expression and pauses, to further justify how signs were grouped together. Similarly, we used meaning and prosody to identify whether pointing gestures occupied a final or non-final position in the spoken utterance. The addressee point in Figure 4 and the self-point in Figure 5 were coded as final according to these criteria, whereas the other points in Figures 3, 6 and 7 were coded as non-final.
All the annotation work was completed by the first author and a research assistant. Approximately 13% of the overall data (balanced between the BSL and Tavis Smiley data) was checked by a second researcher, resulting in an agreement level of 94%. This process involved checking all the identified points to determine whether they had been classified appropriately (e.g., given the correct label for handshape).
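The text reports an agreement level but does not name the statistic used. Assuming it is simple percent agreement (the share of checked tokens on which both coders assigned the same label), the calculation could be sketched as follows; the function name and example labels are hypothetical.

```python
def percent_agreement(first_pass, second_pass):
    """Percentage of tokens given the same label by both coders."""
    if len(first_pass) != len(second_pass):
        raise ValueError("both coders must label the same tokens")
    if not first_pass:
        raise ValueError("no tokens to compare")
    matches = sum(a == b for a, b in zip(first_pass, second_pass))
    return 100.0 * matches / len(first_pass)


# Hypothetical handshape labels for four checked tokens.
coder_a = ["1", "B", "1", "1"]
coder_b = ["1", "B", "B", "1"]
agreement = percent_agreement(coder_a, coder_b)
```

Percent agreement does not correct for chance, which matters when one label dominates; a chance-corrected statistic would be an alternative, but nothing in the text indicates which was used.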
In each case, the formational features of pointing signs and pointing gestures were compared using a hierarchical (also known as a mixed effects) regression model. The statistical models were fit using R (R Core Team 2016) and the lme4 package (Bates et al. 2015). Calculating p-values for hierarchical linear regressions is not straightforward since it is not clear how degrees of freedom should be calculated. Instead, following Gelman & Tuerlinckx (2000), we use 95% confidence intervals to determine the direction and magnitude of the effect within each model prediction. Additionally, bootstrap sampling was used to generate confidence intervals for each subject, providing a measure of how accurately the model can predict the formational features of the points produced by each individual. Each model reports the likelihood of the occurrence of a categorical outcome (e.g., whether a B- or 1-handshape was used), or the specific value of a continuous outcome (e.g., point duration), and most are presented with the predictors (sometimes known as fixed effects) of group (signers or gesturers) and point type (self, addressee, other-entity). In each case, we also tested for an interaction between the two predictors. Furthermore, we included intercept adjustments (sometimes known as random effects), as well as slope adjustments (also sometimes known as random effects); both intercept and slope adjustments were by subject. In the text of this article, to maximize clarity and approachability, we use estimates from the model in their natural interpretation space (probabilities, untransformed milliseconds, etc.) along with confidence intervals, rather than coefficient estimates and standard errors. The full model output (including additional statistical information such as coefficients and standard errors) is provided in the appendix.
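To illustrate what reporting estimates in their "natural interpretation space" involves for a categorical outcome, the sketch below converts a log-odds estimate and its standard error into a probability with a 95% interval. It is an assumption-laden illustration rather than the procedure used here: the models were fit in R with lme4, the interval shown is a simple Wald-type interval (estimate ± 1.96 standard errors on the log-odds scale), and the input values are hypothetical.

```python
import math


def inv_logit(x):
    """Map a log-odds value to a probability (numerically stable form)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def probability_with_ci(log_odds, std_error, z=1.96):
    """Back-transform a log-odds estimate and a Wald-type interval.

    Returns (estimate, lower, upper) on the probability scale.
    """
    lower = inv_logit(log_odds - z * std_error)
    upper = inv_logit(log_odds + z * std_error)
    return inv_logit(log_odds), lower, upper


# Hypothetical coefficient and standard error, not taken from the appendix.
estimate, lower, upper = probability_with_ci(log_odds=-2.0, std_error=0.5)
```

Because the transformation is nonlinear, an interval that is symmetric on the log-odds scale becomes asymmetric on the probability scale, particularly near 0% or 100%. An analogous back-transformation (exponentiation) would apply if durations were modeled on a log scale, though the text does not state which scale was used.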
4.1 Handshape
Signers generally preferred 1-handshapes for pointing signs (Figure 11). However, handshape interacted with pointing type: 1-handshapes were used less for self-points than for addressee-points and other-entity points. There was a 65.2% probability of a 1-handshape with self-points (95% CI: 53.1–75.7%), 99.8% probability of a 1-handshape with addressee points (95% CI: 90–100%), and 95.2% probability of a 1-handshape with other-entity points (95% CI: 90.0–97.7%). Overall, the gesturers differed significantly from signers in their choice of handshape (i.e., the mean confidence intervals do not overlap), with gesturers preferring B-handshapes overall. However, as in the signers, the handshape gesturers used for pointing interacted with pointing type, with 1-handshapes more strongly dispreferred in self-points. There was a 6.6% probability of a 1-handshape with self-points (95% CI: 2.7–15.7%), 12.0% probability of a 1-handshape with addressee points (95% CI: 1.5–54.4%) and 33.5% probability of a 1-handshape with other-entity points (95% CI: 21.9–47.4%). Importantly, the signers’ addressee and other-entity points were the only groups of points that were near categorical: they appeared in the 1-handshape form almost all of the time (the range of individual subject predictions for probability of a 1-handshape for addressee points was 94.1–99.8% and, for other-entity points, 91.6–97.2%). The gesturers’ self-points were also near categorical, but took on a different form: they appeared in the B-handshape form almost all of the time (the range of individual predictions for probability of a 1-handshape was 5.1–18.6%, equivalent to a likelihood of 81.4–94.9% of a B-handshape).
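Because handshape was coded as a two-way choice, a probability range for one family converts directly into a range for the other. The short sketch below reproduces that conversion for the gesturers' self-points using the values reported above; the function name is hypothetical.

```python
def complement_range(low_pct, high_pct):
    """Convert a percentage range for one outcome of a binary choice
    into the range for the other outcome.

    The bounds swap: the highest probability of a 1-handshape gives the
    lowest probability of a B-handshape, and vice versa.
    """
    return (round(100 - high_pct, 1), round(100 - low_pct, 1))


# Individual predictions for a 1-handshape in gesturers' self-points.
b_handshape_range = complement_range(5.1, 18.6)
```

Here `b_handshape_range` evaluates to `(81.4, 94.9)`, the B-handshape range given in the text.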
4.2 One hand vs. two hands
We next analyzed how often one vs. two hands were used to point (Figure 12). Signers rarely used two-handed points (0.7–3.5% of points were two-handed, depending on category). This finding was consistent across the three pointing types. For the gesturers, however, we not only found that two-handed points were more likely than they were for signers, but we also found a stronger tendency for two-handed points to occur in self-points (48.3%, 95% CI: 26.0–71.3%) than in addressee points (22.4%, 95% CI: 12.3–37.3%) or other-entity points (8.4%, 95% CI: 2.6–24.1%). There was also more individual variation in the gesturers than in the signers: for self-points in gesturers, the subject predictions (the dots in the figure) ranged from 13.1% to 88.5% probability of being two-handed, whereas the comparable figure for signers was 0.3%–7.3%. Gesturers’ points to the addressee and some other entity also showed greater ranges (addressee: 11.8%–37.2% and other-entity: 1.9%–59.8%) when compared to the signers’ (addressee: 2.4%–8.7% and other-entity: 0.7%–33.5%).
4.3 Right hand vs. left hand
We next analyzed the preference for the right or left hand for one-handed points, as a function of participants’ handedness (Figure 13). Within the signers, there were 21 right-dominant and 3 left-dominant participants; within those gesturers whose handedness could be determined, there were 19 right-dominant and 4 left-dominant participants. Right-handed signers were more reliant on their dominant hand (3.42% probability of using a left-hand point, 95% CI: 1.20–9.39) than were right-handed gesturers (39.67%, 95% CI: 18.60–65.42). In contrast, left-handed signers (89.29%, 95% CI: 37.84–99.13) and left-handed gesturers (86.85%, 95% CI: 34.16–98.82) relied on their dominant hand to the same extent (although there are relatively few data points). Further, there was again more individual variation within the gesturers than the signers: some gesturers used their right hands exclusively whereas others used their left, and many used a combination of both.
4.4 Duration
We next analyzed the duration of signers’ and gesturers’ points, for each of the three types of points (Figure 14). Pointing signs were consistently shorter than pointing gestures, a pattern that was robust across types: self-points were 245 msec in duration for the signers (95% CI: 217–276 msec), compared to 865 msec for the gesturers (95% CI: 740–1009 msec). Addressee points were 228 msec for the signers (95% CI: 196–264 msec), and 752 msec for the gesturers (95% CI: 630–897 msec). Finally, other-entity points were 262 msec for the signers (95% CI: 235–292 msec), and 790 msec for the gesturers (95% CI: 700–891 msec). Although other-entity points appeared to be slightly longer than the other two types for both signers and gesturers, there was no significant difference in duration between self, addressee, and other-entity points within each group (i.e., there was considerable overlap in the confidence intervals for each type of point). Finally, there was generally very little individual variation in duration within each group: individual signers ranged from 217–281 msec for self-points, 212–242 msec for addressee-points, and 250–279 msec for other-entity points; individual gesturers ranged from 747–925 msec for self-points, 728–858 msec for addressee-points, and 725–896 msec for other-entity points.
Next, to look at the effect of clause position, we fitted a hierarchical linear regression model similar to the duration model just described. We added a predictor for clause position, but removed the pointing type predictor (we did not have a sufficient number of points for each pointing type in each clause position to have confidence in a larger model that included both predictors). Figure 15 reveals that pointing signs in non-final position were significantly shorter than pointing signs in final position. On average, non-final pointing signs were 213 msec (95% CI: 197–232 msec), compared to 416 msec (95% CI: 356–486 msec) for points in final position. In contrast, there was no difference between final and non-final pointing gestures. On average, non-final pointing gestures were 798 msec (95% CI: 727–876 msec), compared to 807 msec (95% CI: 646–1009 msec) for points in final position. Individual signers ranged from 205–223 msec for non-final and 365–467 msec for final points; individual gesturers showed a wider range, from 751–902 msec for non-final and 749–819 msec for final points.
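As a rough illustration of the size of these effects, the sketch below computes the ratio of final to non-final mean durations from the estimates reported above. The ratio is a descriptive summary added here for illustration only; it is not a statistic taken from the fitted model, and the variable and function names are hypothetical.

```python
# Mean durations in msec reported in the text, by group and clause position.
estimates = {
    "signers": {"non_final": 213, "final": 416},
    "gesturers": {"non_final": 798, "final": 807},
}

def lengthening_ratio(group):
    """Ratio of final to non-final mean duration for one group."""
    durations = estimates[group]
    return durations["final"] / durations["non_final"]

ratios = {group: lengthening_ratio(group) for group in estimates}

for group, ratio in ratios.items():
    print(f"{group}: final / non-final = {ratio:.2f}")
```

On these estimates, final pointing signs are nearly twice as long as non-final ones, whereas the ratio for pointing gestures is close to 1.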
4.5 Contact (self-points only)
Finally, we analyzed whether self-points contacted the chest. We modeled the probability of contact with the body and included group (signers or gesturers) as the only predictor (Figure 16). Overall, signers’ self-points were more likely to contact the chest (91%, 95% CI: 84.8–95%) than gesturers’ self-points (15%, 95% CI: 7.9–26%). There was remarkably little individual variation in either group, as evidenced by the narrow distribution across subjects within groups: individual signers ranged from 89.8–91.5% probability of contact; individual gesturers ranged from 15.0–18.0% probability of contact.
5 Discussion
Pointing signs are typically considered to be part of a linguistic system, whereas pointing gestures are usually considered to be external to such a system. These differing treatments have been put forward despite pointing signs’ and gestures’ apparent superficial similarities and, in particular, without many empirical studies directly comparing spontaneously produced pointing signs and pointing gestures along the same formational dimensions. Our study focused on a set of general characteristics typically associated with linguistic systems. If pointing signs are part of a linguistic system, we might predict them to be more consistent in form from one use to the next (conventionalization), more reduced (reduction), and more integrated with other aspects of the language system (integration), than pointing gestures, which often accompany pronouns but are not themselves pronouns. Overall, our findings are consistent with these predictions. We discuss this evidence in the next sections, along with alternative interpretations.
5.1 Evidence for conventionalization
As languages mature, the forms used to express a given meaning typically become more consistent—a communicative and social process known as conventionalization. In other words, the form used for a specific meaning becomes more stable and is less likely to vary from one use to the next. Evidence for this process has been found in studies of sign language emergence (e.g., Sandler et al. 2011a), and comparisons of sign language with the silent gestures produced by speakers (e.g., Brentari et al. 2012). Given this evidence, we expected that pointing signs would be more consistent in form than pointing gestures. In line with this prediction, we found more consistency in pointing signs relative to pointing gestures along several formational dimensions. First, although each group showed an overall preference for a particular handshape—1-family for signers, B-family for gesturers—the reliance on this preferred family was slightly stronger among the signers. Moreover, with respect to hand use, signers consistently preferred to use one hand for all pointing types; gesturers were less consistent overall and used more two-handed pointing gestures, particularly for self-points. Finally, signers are known to favor their dominant hand when signing (Johnston & Schembri 2007). Our data not only show that this preference extends to pointing signs, but also that signers favor their dominant hand when pointing more strongly than do gesturers. Together these findings are consistent with the conclusion that variation in form amongst signers is grounded within a linguistic system (in this case, a phonological system), whereas variation within gesturers is not similarly constrained.
The overarching trend toward increased formational consistency in pointing signs is robust, but conventionalization may not always be the driving force toward consistency. The term “conventionalization” implies a social and communicative process whereby certain forms become more stabilized through use. However, other forces can lead to consistency, including biomechanical (e.g., articulatory ease) and pragmatic (e.g., repeated mention across a discourse) pressures. Indeed, some of the commonalities that we observed between signers and gesturers might be explained by these forces, rather than by parallel conventionalization. For example, when pointing to the self, both signers and gesturers show a stronger preference for B-handshapes than when pointing to the addressee or to other entities.8 The B-handshape directs attention to a referent less precisely than the 1-handshape; it indicates the direction in which the referent can be found, but does not focus the listener’s line of regard precisely on the referent (Iverson & Goldin-Meadow 1998; Iverson & Goldin-Meadow 2001; Kendon 2004). Both signers and speakers may thus use the less precise B-handshape in self-points rather than a 1-handshape point simply because points to the self are not in danger of being misinterpreted. In contrast, when pointing to third parties, it may be important for both signers and gesturers to specify the location of the referent more precisely because these locations may be shared with other entities (see Bayley et al. 2002; Fenlon et al. 2013). In other cases, observed consistency in a formational feature may actually have different sources in each group. For example, both signers and gesturers are highly consistent with respect to contact, although with opposite tendencies: while signers’ self-points consistently contact the chest, gesturers’ self-points consistently avoid contact. One possibility is that this consistency is driven by contrasting conventions in the two groups. An alternative is that the high consistency with regard to contact may be driven by conventionalization in signers, but by articulatory constraints (that are currently not well understood) in gesturers.
Importantly, we are not suggesting that regularities in gesture are never due to conventionalization. Emblems are the most obvious example of conventionalization in gesture, but other gestures exhibit a degree of conventionalization too (Kendon 2004). In the case of pointing, some speaking communities have particular conventions for pointing with the face (Enfield 2001; Cooperrider & Núñez 2012), some associate particular handshapes with particular functions (Wilkins 2003), and some observe strict taboos about pointing (e.g., Kita & Essegbey 2001). Conventionalization may thus partly explain some of the patterns we observed in the present study. For instance, the preference we observed for the B-handshape in gesturers (which has not been found in studies focusing on points to locations and objects, e.g., Cooperrider et al. 2018) may reflect a convention whereby the B-handshape is considered more polite, especially in points to persons (Calbris 1990). This politeness convention has also been observed in sign languages (see Berenz 2002). However, the more formal interview setting in the gesture data may have led gesturers to follow this politeness convention more closely than they would have in a less formal setting (i.e., this difference could be attributed to register). Consistency in gesture may thus be driven by conventionalization just as consistency in signing is, although the conventionalization processes that operate in gesture are not yet well understood.
5.2 Evidence for reduction
As the forms of a language—whether spoken or signed—become more like words, and later, more like function words, they undergo reduction—that is, they lose phonetic material (Hopper & Traugott 2003). Evidence for this general process is widely attested in spoken language and in language creation experiments involving gesture (Namboodiripad et al. 2016). Overall, our results indicate that pointing signs are more reduced when compared to pointing gestures, along several formational parameters. For one, pointing signs are markedly shorter in duration than pointing gestures. This finding is robust across individuals within each group and for each pointing type, and is reminiscent of a widely attested pattern in spoken languages (Zipf 1949; Wright 1979), and even in animal communication (Ferrer-i-Cancho et al. 2013), where the more frequent a communicative form is, the shorter it tends to be. Such patterns are thought to stem from a drive to conserve bodily effort in communication (Zipf 1949). The shorter duration of pointing signs may be related either to their frequency, since pointing signs are generally amongst the most frequent signs in BSL conversations (Fenlon et al. 2014), or to their grammatical status. This relationship between frequency and duration has been demonstrated in the SSL corpus overall, where high frequency signs (not just pointing signs) are generally shorter in duration than low frequency signs (Börstell et al. 2016). Indeed, grammatical signs (e.g., pronouns, question signs, conjunctions), which are highly frequent, are shorter than the less frequent content signs (e.g., nouns, verbs, adjectives) (Börstell et al. 2016). Pointing signs thus display formational features usually associated with grammatical signs in sign languages.
Pointing signs were also more reduced along another formational parameter: hand use. Signers generally preferred one-handed points across the board, which take less effort to produce than two-handed points. Gesturers, however, showed a greater tendency to use two-handed pointing gestures. This tendency toward one-handed pointing signs, like the tendency toward shorter pointing signs, is likely related to the drive to conserve effort (Napoli et al. 2014). One interesting question is why conservation of effort seems to impact sign more than gesture. In selecting one or two hands, both signers and gesturers may be subject to competing constraints, but the interplay of those constraints is a matter for future research.
The overarching trend for pointing signs to be more reduced than pointing gestures is thus consistent with the possibility that pointing signs are more grammaticalized than pointing gestures. However, there is an important alternative to consider. In signed communication, points are produced in the same articulatory channel as the primary referential content (i.e., by using the hands). However, in spoken communication, points are produced with the hands, but the primary referential content is produced in another channel: speech. As a consequence, in sign, there is not much “space” for points to be held because they must slot into the same linear string as other signs. But, in gesture, there is usually no more than one gesture per intonation unit (McNeill 1992), and points are thus free to span over this unit or even extend across units.
5.3 Evidence for integration
A third change that communicative forms undergo as they become more linguistic is integration. That is, they can become more integrated with the structures of the broader language system, including prosodic structure. Pointing signs—and, in particular, those points that function as personal pronouns—are often described as being maximally integrated within sign languages (e.g., de Vos 2015), and can function as arguments of a predicate. Since prosodic and syntactic structure are closely connected, we might expect that pointing signs will exhibit prosodic characteristics that are typical of signs more generally. Our test of this possibility was to see whether pointing signs exhibit the pattern of lengthening at the end of the utterance, which has been observed for signs generally (Nespor & Sandler 1999; Wilbur 1999). Our results indicate that pointing signs in BSL do indeed vary according to their position within an utterance, with pointing signs in final position being significantly longer than pointing signs in non-final position. Importantly, this pattern is not observed in the gesturers. The pattern is particularly interesting given that spoken utterances, like signed utterances, exhibit phrase-final lengthening (Cruttenden 1995).
Note, however, that gestures can be integrated with speech in a number of ways. For example, beat and representational gestures tend to co-occur within a constrained time window associated with the pitch-accented syllable in a phrase (Krahmer & Swerts 2007; Loehr 2007; Brentari et al. 2013). But pointing gestures may not be as closely tied to speech as other types of gestures. Studies eliciting points in controlled settings suggest that points are closely aligned with speech (Levelt et al. 1985; Krivokapic et al. 2016), but studies using more naturalistic data hint that they may not be (e.g., de Ruiter & Wilkins 1998). Furthermore, whether gestures are integrated with utterance-level prosodic structure has not, to our knowledge, been studied. Our results show that pointing gestures, at least, are not integrated with utterance-level prosodic structure in the same way that pointing signs are. Here again, it is possible that this contrast between pointing signs and pointing gestures is due to the fact that pointing signs are produced in the same articulatory channel as the rest of the referential content. That is, since pointing signs and the other signs of sign language are produced by the hands, pointing signs may inherit some characteristics of the rhythmic structure of sign language. The interesting observation is that pointing gestures do not inherit these same characteristics even though we know that gesture duration is influenced by the language it accompanies. For example, how long a gesture is held is often dictated by the duration of co-occurring speech (Park-Doob 2010). Thus, gesture duration is, at times, influenced by aspects of the spoken signal despite the fact that the signal is in another modality. Our findings highlight a fundamental difference in the way that points are integrated into signed vs. spoken language systems. Why the duration of pointing gestures is not influenced by utterance-level prosody is a question for future work.
5.4 Are pointing signs more linguistic than pointing gestures?
On balance, our findings suggest that pointing signs are different from pointing gestures in that they are more conventionalized, reduced, and integrated than pointing gestures. These patterns are consistent with the hypothesis that pointing signs may have undergone these changes in form as they became more linguistic. However, it is also possible that pointing gestures fail to exhibit these patterns, not because they are not linguistic, but because they are produced by a different articulator than speech. In other words, some of the differences we see between pointing signs and pointing gestures may stem from the fact that pointing signs are produced in the same channel as other referential content, whereas pointing gestures are produced in a different channel from other referential content. When produced in the same channel as other referential content, points may take on sign-like features. For example, pointing signs may favor the dominant hand because most of the referential content in sign is produced with this hand; self-points may contact the chest because, generally, signs produced at the chest tend to contact the chest (cf. McBurney 2002); and pointing signs may be shorter since they must occur in sequence with other referential content.
This “same channel” hypothesis could be investigated in several ways. First, it might be possible to investigate how points change in form as sign languages mature, as in NSL (Coppola & Senghas 2010). If the “same channel” proposal is correct, some properties of pointing signs, such as reduction, should be evident at the very earliest stages of the language. However, if the reduction we observed is chiefly due to diachronic processes such as grammaticalization or increased frequency, then this reduction ought not be present immediately, but should emerge as the system develops and is transmitted across generations of users. Of course, since these possible mechanisms are not mutually exclusive, it may be that pointing signs are significantly reduced immediately and also undergo further reduction as the system matures.
Second, it is possible to investigate these proposals experimentally. Specifically, we could compare co-speech points with the points produced as part of silent gesture—that is, the gestures hearing speakers produce when asked to use only their hands to communicate (e.g., Goldin-Meadow et al. 1996; Goldin-Meadow et al. 2008). Again, if the “same channel” proposal is correct, points in silent gesture should be immediately reduced; but if diachronic processes are critical, points in silent gesture ought not be immediately reduced, but should become reduced as the communication system is transmitted across users. Interestingly, studies involving silent gesture have found some evidence for reduction, even over the course of a single experimental session (Namboodiripad et al. 2016). We have also suggested that the phrase-final lengthening observed in pointing signs, but not pointing gestures, could be due to the “same channel” constraint. However, since it is not known whether phrase-final lengthening is due to diachronic processes or synchronic processes, the predictions for silent gesture are less clear. One possibility is that this feature would not be present immediately but would emerge as the silent gesturers begin to construct a linguistic gestural system (see Goldin-Meadow 2015).
Returning to the question put forth at the beginning of the paper: how do the signs of sign language differ from the gestures that speakers produce when they talk? Our findings have demonstrated that, although points in sign language and points in co-speech gesture appear superficially similar, there are important differences: pointing signs are formationally more consistent, more reduced in form, and more integrated into prosodic structure, than are pointing gestures. Pointing signs are thus integrated into sign language in a fundamentally different way from the way pointing gestures are integrated into spoken language, perhaps because pointing signs have undergone changes associated with linguistic systems. Alternatively, the differences between pointing signs and pointing gestures may stem from the fact that pointing signs are produced with the same articulators as the signs they accompany, whereas pointing gestures are not. We have suggested further research that could help distinguish between these hypotheses.
In conclusion, to better understand the relation between sign language and gesture, it is important to characterize and quantify the differences between the two with respect to a range of features and a large number of individuals, as we do here. In focusing on a range of features, we can see that pointing signs and pointing gestures are different despite being superficially similar. In particular, pointing signs are more constrained, and the sources of these constraints (i.e., whether due to linguistic processes such as grammaticalization or to producing points with the same articulators as other signs) warrant further research.