1 Introduction
Sign language linguistics is still a young subfield of linguistics (McBurney 2012), but one topic that has received attention for quite some time within this subfield is sociolinguistic variation (Bayley et al. 2015). Within this area of sign language research, there has been a particular focus on lexical variation – that is, the use of different sign forms for the same meaning across groups of signers within the same sign language community (e.g., Bickford 1991; Lucas et al. 2003; Stamp et al. 2014; Safar 2021; Lutzenberger et al. 2023). Less attention has been given to sublexical variation (see, e.g., Israel & Sandler 2011; Lutzenberger et al. 2021; Becker et al. 2023), which has been a more frequent topic in spoken language linguistics (see, e.g., Labov 1972; Eckert 2012).
That Swedish Sign Language exhibits lexical variation has been known for a long time, being mentioned in some of the earliest sign language research in Sweden from the 1970s:
“In [Swedish] Sign Language, some dialectal differences can be distinguished and at every deaf school there is furthermore a characteristic sign vocabulary. In what sense the dialects differ has not been investigated, but it is likely that it concerns the choice of signs rather than differences in sentence structure.”
(Bergman 1977: 11–12) [Our translation from Swedish]
More recently, lexical variation in Swedish Sign Language can be seen in the largest dictionary of the language, the Swedish Sign Language online dictionary (Svenskt teckenspråkslexikon 2024). As of today, over 20,000 signs have been documented and published in this publicly available online dictionary, many of which are form variants expressing the same meaning. For some of the sign entries, signs are marked as specific to a geographic region1 or as being old-fashioned.2 Within the Swedish deaf community, there is metalinguistic awareness of lexical variation in the language, with remarks such as “this is how my (grand)parents used to sign X” or “this is the sign they use for X in Gothenburg” being encountered in conversations, online discourse and popular science discussions about language. Yet, there has not been any systematic investigation of lexical variation in Swedish Sign Language, and it is not known how prevalent variation is in the language community today.
In this paper, we aim to address this issue by looking at lexical variation in Swedish Sign Language (STS; Swedish: svenskt teckenspråk) from two angles: first, we make use of a designated survey-based study in two parts (in-person vs. online data collection) to explore specific domains known to exhibit lexical variation (e.g., signs expressing colors, numbers and toponyms); second, we turn to the STS Corpus to see if similar lexical variation patterns emerge from (semi-)spontaneous interactive discourse. We are guided by the following research questions:
i. What is the degree of lexical variation in STS in terms of the diversity of sign variants used for the same meanings?
ii. Is the lexical diversity in STS changing over time, as exhibited across age groups?
iii. How much do signers overlap in their lexical choices – that is, to what extent do signers prefer the same sign variant for a specific meaning?
iv. Do STS signers who have similar social profiles (e.g., age, gender and school background) exhibit more overlap in their lexical choices than signers who do not?
v. Are there patterns of lexical choices across age groups for specific target meanings/domains in STS (e.g., a preference for borrowed signs for countries, or one- vs. two-handed forms)?
We hypothesize that the amount of lexical variation in STS is decreasing, given the higher degree of contact among STS signers today as a result of technical advances and increased uniformity across schools in the educational system. This hypothesis builds on the authors’ combined experience with STS and the Swedish deaf community: three of the four authors are deaf signers who acquired STS as their first language from birth – all born to deaf parents – and who have documented and taught STS language and linguistics for several decades each; one author is hearing and has over 15 years of experience with linguistic research on STS.
In the following, we will outline some of the previous research on lexical variation in different sign languages (Section 2) and describe our methods for researching lexical variation in STS using both survey and corpus data (Section 3), before presenting our results (Section 4), followed by a general discussion of our findings (Section 5).
2 Background
2.1 Motivations behind variation
For a long time in the history of linguistics, signed languages were mostly excluded from systematic linguistic study (McBurney 2012), likely due to a combination of smaller community sizes and the marginalization of deaf people, and their languages not being recognized as fully fledged languages. Sign language linguistics as a research area emerged in several places around the world in parallel in the 1960s–70s, which led to many sign languages being described systematically in terms of structural features. More recently, technical advancements (e.g., improved video recording/storage and annotation tools) have facilitated the collection and study of sign language data (Fenlon & Hochgesang 2022).
In some of the earliest work in sign language linguistics, the first steps included collecting signs and teasing apart their phonological structure by formally describing their sublexical form components and identifying minimal pairs (e.g., Stokoe 1960; Bergman 1977) – see also Österberg (1916) for an early description of STS signs and structure. By collecting and comparing signs, it becomes apparent that some meanings can be expressed by multiple sign forms within a sign language community, and that the variation may be associated with sociolinguistic groupings. For many sign languages, the establishment of the earliest deaf schools marks the start of the (traceable) conventionalization of the language. Deaf education led to physical meetings of otherwise isolated and geographically dispersed deaf children, at a common location often represented by a (residential) school for the deaf (see, e.g., Brentari 2010). Because of this aggregation of deaf children in schools, which in many countries involved multiple schools across different regions, new linguistic traits specific to each school quickly evolved, leading to “dialects” that were consequently associated with schools rather than a geographic region per se. School-based lexical variation was noted in early lexicographic work on, e.g., American Sign Language (ASL), where schools were said to develop their own unique signs (Stokoe et al. 1965).
Lexical variation within a sign language has been noted in many different sign languages to date (e.g., Bickford 1991; Lucas 2003; Schermer 2004; Vanhecke & De Weerdt 2004; Hanke 2016; McKee et al. 2011; Vermeerbergen et al. 2013; Stamp et al. 2014; Palfreyman 2019; Chen & Gong 2020; Safar 2021; Horton 2022; Hanke et al. 2023; Lutzenberger et al. 2023; Becker et al. 2023). The motivations underlying lexical variation can depend on a number of different factors. While regional variation – whether school-based or not – is a major factor across many sign languages, there is also variation based on factors such as race – e.g., in ASL (Lucas et al. 2009; 2003) – and gender – e.g., in Irish Sign Language (LeMaster 2000) – although these factors are often directly or indirectly related to deaf education, through school segregation of deaf children based on race or gender.
Some researchers have noted that smaller, more local sign language communities – e.g., those sometimes labeled village or rural sign languages (see Hou & de Vos 2022) – seem to exhibit more variation than larger – often labeled urban or deaf community – sign languages that are generally associated with formal deaf education. For example, in Al-Sayyid Bedouin Sign Language, the extensive variation in form across individual signers was claimed to show that the language had not yet conventionalized in terms of a phonological system (Israel & Sandler 2011), and that the size and composition of the language community allow for more variation since meaning can be deduced from shared context and iconicity (Sandler et al. 2011; Meir et al. 2012). Sandler et al. (2011) argued that there were tendencies of familylects, such that members of the same family use similar sign forms, indicating seeds of conventionalization at the level of family-groupings. More recent work on smaller sign language communities in Indonesia (Bali) (Lutzenberger et al. 2023), Guatemala (Horton 2022) and Mexico (Safar et al. 2018; Safar 2021) has shown that there is sociolinguistic structure with regard to within-language variation, which can be disentangled by looking more closely at the social networks of interaction among community members. Lutzenberger et al. (2023) showed that a larger (urban) sign language like British Sign Language (BSL) exhibits more lexical variation overall than Kata Kolok, a village sign language in Bali, but that BSL has more distinct subgroup clusters – i.e., groups of signers with more similarity in signs used, which as groups are more distinct from other groups. Thus, the authors argue against the claim by Meir et al. (2012) that community size directly correlates with lexical variation – that is, the claim that larger communities exhibit less variation is not necessarily generally true.
It is well established that variation in language is influenced by factors both within and outside of the language community. For sign language communities, language contact is known to influence lexical variation. Such lexical influence can stem from different language communities being in direct contact with each other, for example, between neighboring languages (see Quinto-Pozos & Adam 2015) or when – often larger or more dominant – sign languages are learned and used in and for contact situations (see Hiddinga & Crasborn 2011). For instance, the impact of ASL as a global lingua franca (see Kusters 2021) can have an effect on the lexicons of other sign languages, with borrowings – whether conscious or not – entering the language directly or indirectly (i.e., through intermediate languages), something which has been noted in New Zealand Sign Language (NZSL) (McKee & McKee 2020; McKee et al. 2021). Additionally, in many sign language communities, there have been deliberate attempts to standardize the language, often relating to unifying the lexicon, i.e., trying to decide on a single sign form per meaning (McKee & McKee 2011; Adam 2015). However, convergence or leveling may happen even without explicit standardization attempts, through increased mobility and interaction across groups (e.g., dialects) within the community. This was found for BSL by Stamp et al. (2015), who noted a decrease in the use of traditional, regional sign variants among younger signers. Thus, there are many factors underlying lexical variation in a language, but also many reasons for why the lexical variation may change over time.
2.2 Approaches to lexical variation research
In previous research on lexical variation in different sign languages, sometimes specific domains of the lexicon have been targeted, often due to prior knowledge of that domain exhibiting some variation. Some examples of this are numerals (McKee et al. 2011; Stamp et al. 2015; Safar et al. 2018), color terms (Langer 2012), or a mixture of domains (Stamp et al. 2014; 2016; Zeshan & Sagara 2016; Mudd et al. 2020). Variation within a domain may be explained on the basis of sociolinguistic factors – e.g., age or networks of social interaction – but there could also be other motivations for certain concepts exhibiting a higher number of form variants than others. For example, in Japanese Sign Language, lexical variation has been shown to correlate with the color-term hierarchy, such that the more basic color terms, higher up in the hierarchy (e.g., ‘black’), exhibit less lexical variation – i.e., fewer sign variants used for them in a language – and those further down in the hierarchy (e.g., ‘purple’) exhibit more lexical variation (Sagara & Zeshan 2016: 20).
Studies on lexical variation in sign languages have used different methods for collecting adequate data. While some of the studies have been based on corpus data, these tend to involve a special subset of the corpus that was collected through a designated elicitation task, intended to target certain items/domains potentially exhibiting variation – e.g., in BSL (Stamp et al. 2014; 2015) and German Sign Language (DGS) (Langer 2012; Hanke et al. 2023). However, few studies have used the full extent of more naturalistic conversational corpus data to get at more nuanced variation, such as within-signer variation due to phonological, morphological or syntactic context or pragmatics (cf. Bayley et al. 2015). Other studies have collected lexical items through an independent elicitation task, such as a picture-naming task in the field – e.g., in Kata Kolok (Lutzenberger et al. 2023) and Yucatec Maya Sign Languages (Safar et al. 2018; Safar 2021). Yet other work in this area has used crowdsourcing in the community through online interactions, such as eliciting video responses through online surveys – e.g., in DGS (Wähl et al. 2018), Russian Sign Language (Kimmelman et al. 2022) and STS (Kankkonen et al. 2018).
A challenging aspect of lexical variation research is how to compare and quantify the similarity/difference in form within and across individual signs. Traditionally, in sign language linguistics, sign form comparisons have been based on counting the number of phonological parameters (i.e., hand configuration, place of articulation and movement) that two signs share, an approach that has been employed when comparing historical change within – as well as lexical overlap (and potential relatedness) across – sign languages (see Power 2022). One issue with this method is that it disregards potentially shared iconic motivations – i.e., the sign forms being different but representing the same form–meaning mapping. This has led some researchers to additionally take iconicity into account, alongside form parameters (e.g., Konrad 2013; Ebling et al. 2015; Mudd et al. 2020). Besides comparing (dis)similarity between pairs of signs based on their form parameters, Mudd et al. (2020) use entropy as a diversity index of sign forms with the same meaning across groups of Kata Kolok signers. Additionally, they estimate the lexical distance between all signers through a pairwise comparison of each signer’s set of sign variants. Thus, Mudd et al. (2020) employ two metrics that target the variability of sign forms within the language, particularly between deaf and hearing signers, as well as the distance in lexical choices between individual signers. While the entropy scores do not show any differences between deaf and hearing signers of Kata Kolok, nor with regard to age (i.e., no visible change in variation over time, such as leveling), the lexical distance scores show some tendencies of deaf and hearing signers being clustered closer within than across the deaf vs. hearing groupings (Mudd et al. 2020: 70–77).
An additional challenge is that it is difficult to apply categorical boundaries to similar sign forms based only on shared, broad form parameters. For instance, it has been suggested that multiple sign forms denoting the same meaning within a sign language can be defined either as phonological variants that belong to a single lemma, if they differ in only a single form parameter, or as lexical variants, if they differ in more than one form parameter (Cormier et al. 2012) – see also Johnston & Schembri (1999). However, Kimmelman et al. (2022) showed that many signs in Russian Sign Language cannot be categorized based on these criteria, because many form variants are related in chains or complex grid constellations, in the sense that signs A and B and signs B and C would constitute phonological variants, but not signs A and C. Thus, sign variants can be connected in more intricate relational networks than the simple phonological vs. lexical distinction defines. The STS dictionary (Svenskt teckenspråkslexikon 2024) uses a categorization similar to that of, e.g., Cormier et al. (2012) by listing phonological variants as form variants under the same dictionary entry (which can be seen in a pop-up window by clicking the link “phonological variants”), and lexical variants as their own entries – see Figure 1.
Figure 1: Screenshot of the STS dictionary (Svenskt teckenspråkslexikon 2024: 2751) showing how phonological variants are listed in a pop-up window under the main entry (here: the sign for ‘red’), whereas lexical variants are separate entries through a link to “other signs with the same meaning”.
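The chain problem noted by Kimmelman et al. (2022) can be made concrete with a minimal sketch of the one-parameter criterion. The three signs and their parameter values below are hypothetical and chosen only for illustration; the function names are likewise our own and do not correspond to any published tool.

```python
from typing import NamedTuple


class Sign(NamedTuple):
    # The three broad form parameters; the values used below are hypothetical
    handshape: str
    location: str
    movement: str


def n_differences(a: Sign, b: Sign) -> int:
    """Count the form parameters in which two signs differ."""
    return sum(x != y for x, y in zip(a, b))


def relation(a: Sign, b: Sign) -> str:
    """Classify a pair of same-meaning signs by the one-parameter criterion."""
    d = n_differences(a, b)
    if d == 0:
        return "identical"
    return "phonological" if d == 1 else "lexical"


# Hypothetical chain: A and B differ in handshape, B and C differ in movement
A = Sign("flat", "chin", "tap")
B = Sign("fist", "chin", "tap")
C = Sign("fist", "chin", "circle")

print(relation(A, B))  # phonological
print(relation(B, C))  # phonological
print(relation(A, C))  # lexical
```

Because the pairwise relation is not transitive, A, B and C cannot be sorted into lemmas without an arbitrary decision about where the chain is cut.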
In this study, however, we focus on whether or not a variant is present in the dictionary at all, regardless of its listing as an independent (“lexical”) entry or as a form (“phonological”) variant. This is for three reasons: first, to avoid problems due to potential inconsistencies in the lexicographic classification of entries in the dictionary; second, to avoid the categorization difficulties regarding interconnected chains of sign forms; and third, to reflect the nature of the survey design, in which phonological and lexical variants are not distinguished, since it is simply based on available dictionary entries, whether variants or not (see Section 3.1). Thus, we will refer to all phonological and lexical variants of the same meaning simply as sign variants.
2.3 Variation in STS
As illustrated by the quote from Bergman (1977) in the introduction (Section 1), dialectal lexical variation in STS has been known for a long time, with the variation being attributed mainly to the (historical) deaf schools in Sweden (see Figure 2 for a map of the Swedish deaf schools). While lexical variation in STS is known among the signing community members, and is something that may be explicitly mentioned in STS classes, there has surprisingly been very limited research into this area. The one exception is a study by Börstell & Östling (2016), where the authors used STS Corpus data to find examples of signs with skewed frequency distribution based on signers’ age, gender and regional background, and text type. While the authors identified a few sign distributions that conformed to expected patterns – e.g., used in a certain region or by a certain age group – the corpus dataset proved too small to find any larger number of clear cases of sociolinguistic variation. Börstell (2024a) looked at some examples from Börstell & Östling (2016) with a larger dataset of the STS Corpus after more data had been annotated, and could corroborate previous findings. For example, signers’ age was shown to be an influential variable for finding examples of differences in the relative frequency of individual signs or variants. However, these two studies mainly aimed to develop methods for finding and illustrating potential variation patterns in sign frequencies, but did not arrive at any conclusive results in terms of the overall sociolinguistic characteristics of lexical variation in STS. Furthermore, both studies mention the difficulty of investigating lexical variation with a corpus of the size of the STS Corpus, which is small in relation to many corpora of spoken languages – see also Fenlon & Hochgesang (2022) and Kopf et al. (2021) for an overview of sign language corpus resources.
Bergman (1977: 11–12) hypothesized that the dialectal differences in STS would be confined mostly to lexical variation rather than structural differences. To date, there has been little research on types of sociolinguistic variation in STS other than the limited investigations of lexical variation (i.e., Börstell & Östling 2016; Börstell 2024a). Two corpus-based studies on sign duration and signing rate have pointed to age influencing the rate of articulation, but this could be attributed to the physiological effect of aging rather than being social identity markers, and gender and regional background did not affect the speed of signing in the way that age did (Börstell et al. 2016; 2024). To what extent the Swedish deaf schools are still potential sources of lexical variation in STS is less clear, particularly seeing as Swedish deaf education has changed in recent decades, with fewer students in deaf schools and less emphasis on STS in the classroom (see, e.g., Schönström & Holmström 2021), in parallel with an increased use of cochlear implants. However, immigration of deaf people from other countries (see Duggan et al. 2023) may directly influence STS change through language contact.
In this study, we focus on lexical variation with regard to the choice of sign variants as one aspect of sociolinguistic variation in STS, through a combination of analyzing targeted survey data and cross-validating patterns with corpus data. The data collection and analysis methods are described in Section 3.
3 Methodology
3.1 Survey design
In order to target specific domains that are expected to exhibit lexical variation in STS, we designed a survey in an online browser-based interface. The survey consists of a consent form and a number of metadata questions about the respondent’s age, gender and regional background (deaf school and residence) followed by 45 questions, each targeting an individual meaning/concept within five categories: colors, numbers, countries, cities and other (see Appendix A for the full list of meanings). Each category contains a number of meanings within that domain for which there are multiple sign variants documented, with most meanings containing more than one sign variant in the STS dictionary, from which videos shown in the survey were taken. The number of videos available per target meaning ranges from 1 to 6. For instance, the meaning ‘twenty-eight’ is represented by only one sign variant video in the survey, whereas ‘Mexico’ is represented by five sign variant videos – see Appendix A for the exact number of videos per meaning. Whereas four of the five meaning categories are straightforward, the category other contains a diverse set of meanings. The meanings in the other category were chosen to capture sign variation with regard to number of hands used, as there are known one- vs. two-handed variants for each meaning. There is a perception in the STS community that younger STS signers may favor the (symmetrical) two-handed sign variants for these meanings, which could be tested empirically with this survey.
In the survey itself, the task was introduced to participants only at the beginning of the task and was formulated as “We investigate which sign you use for each Swedish word” in written Swedish. After this, each meaning was displayed on its own page, represented by a Swedish word above one or more sign videos, each with an associated response button (see Figure 3). Each video demonstrates a sign variant from the STS online dictionary (Svenskt teckenspråkslexikon 2024) for that meaning, with the videos auto-played and looped continuously. As noted in Section 2.2, we do not treat so-called phonological and lexical variants differently, so any and all sign variants found in the STS dictionary for each meaning were included as response options in the survey. For example, whereas the question about the sign for ‘gray’ shows two sign variants, differing in handshape (so-called phonological variants), the question about the sign for ‘purple’ (Figure 3) shows two videos differing in both handshape and movement (so-called lexical variants), but the survey treats all variants the same, each being displayed as its own video on the screen. Below the videos, there are two free-text fields: one for manually entering a different form by typing a description; the other for additional comments. Figure 3 shows the design of the question about ‘purple’, which has two variants as video-illustrated response options, whereas other meanings had up to six variants displayed as videos – see Appendix A for the number of variants per meaning. A few meanings have only one sign variant displayed as a video. These single-video items were included as a way to collect other possible variants not already documented in the dictionary – either through free responses in the elicitation task (see Section 3.2.1) or from any Other-descriptions given by participants in the online data collection who did not select the sign variant displayed in the video (see Section 3.2.2).
Figure 3: Screenshot of the online survey interface. The target meaning is shown top left (here: LILA ‘purple’), followed by a two-column grid of sign variant response options shown as looped, auto-played videos. Free-text response fields for Other (Annat; for a description of a different form) and Comment (Kommentar; for additional comments) at the bottom.
3.2 Survey data
The survey was used to collect data from members of the STS community in two ways: first, a number of participants were recruited during several in-person elicitation sessions at deaf events, during which an experimenter filled out the survey; second, a call for participants was distributed through social media, thus requiring participants to follow a link and subsequently respond to the online survey themselves. We refer to these two recruitment methods as elicitation (Section 3.2.1) and online (Section 3.2.2), respectively, and their differences affect our further processing of the data (Section 3.4).
3.2.1 Elicitation data collection
The elicitation data was collected at various deaf events, such as senior meetings at local deaf clubs and a national deaf convention. Before the elicitation data collection sessions could be carried out, key contact persons were approached – for example, the chairperson of a local deaf club. We asked whether we could give a presentation on sign language variation and ongoing research at Stockholm University, and collect data from the attendees for our study on sign variation. The experimenters are part of the deaf community and could answer questions from potential participants, who were at times concerned about the formal setting surrounding the survey, and could clarify that the intention was to document variation, not to administer a test of sign language proficiency. At these events, the elicitation session was preceded by a general presentation that included a self-introduction by the experimenter(s), followed by a lecture on deaf Oskar Österberg’s early work on the documentation of STS signs (see Österberg 1916), followed by a discussion about the need to revisit sign variation in STS today. Consequently, participants were recruited on a voluntary basis among those attending these events.
At the elicitation session, participants were asked to sit down with an experimenter (always a deaf STS signer, one of the co-authors) one at a time. The experimenter had a laptop with the online survey open, the screen not visible to the participant. The participant was shown a series of pages in a binder, each page showing a single meaning (see Appendix A). The meanings were represented as Swedish words in a fixed order, and in the case of the colors and countries they were also accompanied by a picture showing the color or the country’s flag. The participant was asked in STS to provide a sign for each meaning, but could skip any item they could not recall or name a sign for. Thus, participants did not themselves see the sign variants shown in the videos in the survey, but rather produced a form based on the word (and, in some cases, image) stimuli. If the participant produced one of the sign variants present in the survey for that question, the experimenter would select that sign variant as the response, but if they produced a different variant, the experimenter would record a form-descriptive gloss in the Other text field. Only a single sign variant was recorded for each meaning and participant, and this would be the sign variant that the participant indicated they use nowadays.
3.2.2 Online data collection
For the online data collection, participants were recruited through a call distributed on social media. Target criteria for participation were that the person was at least 18 years old and had grown up and attended a deaf school in Sweden. Since the elicitation data had garnered many responses from older groups of signers, the online recruitment particularly targeted those between the ages of 18 and 40 in the call for participants, but we did not restrict older participants from responding to the online survey.
The same survey was used as for the elicitation data, except that the participants took the survey individually. Thus, they were in control of advancing through the questions, could view the videos showing the sign variants themselves, and were free to select which of the variants they would use for the specified meaning. If they did not use any of the sign variants shown, they could manually add another form by describing it in the Other field or by adding comments in the separate Comment field. The main difference between elicitation and online data collection is therefore whether or not the participants were exposed to the sign variants in the videos before responding.
3.2.3 Survey data summary
In total, well over 200 participants responded to the survey, either in person or online, but after applying further exclusion criteria to the respondents, the survey data reported in this paper comprises responses from 170 participants – see Table 1. In Section 3.4, we describe the data processing and exclusion criteria of items and participants in detail.
As Table 1 shows, the distribution of age groups is complementary between the elicitation and online data collection. This is due to the fact that mainly older signers participated at the deaf events at which the in-person data collection took place, whereas younger signers were targeted for the online collection to compensate for the initial older age skew. While a few younger signers participated in the elicitation sessions and a few older signers participated online, we decided to filter the two groups to simultaneously split data collection method and age groups cleanly, in order to analyze each dataset separately. Out of the 170 participants, 99 (58%) identified as women, 65 (38%) as men, and 6 (4%) said “other” or did not respond.
Table 1: Number of respondents by data collection method and age group.
| Method | Age group | n |
| --- | --- | --- |
| elicitation | 1931–1940 | 12 |
| | 1941–1950 | 16 |
| | 1951–1960 | 19 |
| | 1961–1970 | 15 |
| online | 1971–1980 | 15 |
| | 1981–1990 | 66 |
| | 1991–2004 | 27 |
| Total | | 170 |
3.3 STS Corpus data
For the corpus data, we used the STS Corpus data (Mesch et al. 2012) as stored in The Language Archive,3 comprising 298 ELAN (.eaf) files (Wittenburg et al. 2006) with 189,679 annotated sign glosses produced by 42 signers from three major regions of Sweden (see Figure 2), engaged in semi-spontaneous conversation or stimuli-prompted narrative retellings. Although the STS Corpus data is relatively large for a sign language corpus (cf. Kopf et al. 2021), it is often too small to find many solid examples of lexical variation (cf. Börstell & Östling 2016; Börstell 2024a). Thus, we mainly use the corpus data as an additional resource to supplement the targeted survey data and to further evaluate the potential of using the STS Corpus for lexical variation research.
3.4 Data processing
The survey data was exported into an Excel (.xlsx) format and manually processed in order to enrich the data with annotations and to link manually entered variants to the correct dictionary entry. The enrichment consisted of annotating additional features of individual lexical items, such as categorizing sign variants for the same meaning with information about a specific trait that we wanted to investigate further. For example, for some of the countries, we were aware that there were both older sign variants and newer (borrowed) sign variants, so each variant was annotated as borrowed or not borrowed. In other cases, the distinctive feature of the sign variant was whether or not the movement was repeated, or whether it used one or two hands. The linking of manually recorded sign variants was done to mitigate the impact of the differences between the elicitation and online data collection designs. In the survey data, many of the sign variants manually recorded by the experimenter were in fact phonetic variations of a sign variant represented by a video in the survey. In such cases, all manually annotated variants – as well as descriptions/comments added by online participants that were sufficiently informative – that could be linked to a sign variant in one of the videos were annotated as such. For instance, if an Other response contained the information “Alternative 1, but articulated on the contralateral side”, then the response would be annotated with the dictionary ID of the sign variant shown as “alternative 1” for that question. That is, all responses that could be linked to a unique dictionary video present in the survey were annotated as belonging to this dictionary ID. Free-text responses that could not be interpreted or linked to any of the stimulus videos were discarded. Thus, the data analyzed in this study relate to which survey video is associated with a participant’s response, whenever possible.
This was done in order to allow for better comparability across the two methods, seeing as online participants are likely to disregard minor phonetic differences and pick the best fit out of the available options. The list of survey items in Appendix A shows the number of sign videos that were shown with each meaning. For the analyses in this paper, we have excluded the seven meanings that only had a single sign video associated with them, since these do not invite a lot of variability.4 Thus, only 38 of the 45 meanings (i.e., those with two or more associated sign videos) are included in the current study.
In order to calculate the degree of variation across both meanings and participants, we use two different metrics of comparison: diversity index and lexical overlap.
The first metric we use is Shannon’s (1948) diversity index, which is a measure of the diversity (or entropy) of types and their distribution in a dataset. We calculate the diversity index as:
$$H = -\sum_{i=1}^{R} p_i \log_2 p_i$$
In this formula, $R$ is the richness – i.e., the number of unique variants in the group – and $p_i$ is the proportion of a single sign variant’s frequency relative to all the sign forms recorded for a specific meaning.5 For example, the meaning ‘ten’ has two sign variants in the survey, for which the variant TIO(Y) occurs 9 times and the variant TIO(55) occurs 3 times in the oldest age group (1931–1940; n = 12). This can be used to calculate a diversity index as:
$$-\left(\tfrac{9}{12}\log_2\tfrac{9}{12} + \tfrac{3}{12}\log_2\tfrac{3}{12}\right) \approx 0.811$$
Thus, in this example, the diversity index for the meaning ‘ten’ is about .811 for the oldest age group. Had all twelve occurrences been of the same sign variant (as is the case for the age group 1961–1970), the diversity index would be 0. We use this index to estimate the degree of lexical diversity within meaning domains (categories) and groups of signers (e.g., age groups), following Mudd et al. (2020) in their research on Kata Kolok. The diversity index used in this way, calculated across the proportional frequency of sign variants for the same meaning within some group, corresponds to the uncertainty of predicting a certain sign variant. For example, how (un)certain are we in predicting the sign variant used for the meaning ‘ten’ when sampling a single response from the oldest age group?
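As an illustration of the diversity index, the following is a minimal Python sketch (the analyses in this paper were done in R, so this is a stand-in and not the code actually used). The function name `diversity_index` and the example list are hypothetical. The sketch assumes base-2 logarithms, which is the base under which a split of .75 and .25 between two variants yields the value of about .811 reported above.

```python
import math
from collections import Counter


def diversity_index(variants):
    """Shannon diversity (entropy, base 2) of a list of sign-variant labels.

    Assumption: base-2 logarithm; the function name is hypothetical.
    """
    counts = Counter(variants)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # One term per unique variant; the richness R equals len(counts).
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical responses: 12 responses split 9 to 3 between two variants.
responses = ["TIO(Y)"] * 9 + ["TIO(55)"] * 3
print(round(diversity_index(responses), 3))  # 0.811

# A group in which everyone uses the same variant has no diversity.
print(diversity_index(["TIO(Y)"] * 12) == 0)  # True
```

With two variants the index ranges from 0 (one variant only) to 1 (an even split), so values are only directly comparable between meanings with the same number of available variants.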
The second metric we use is an overlap coefficient, which is calculated between pairs of participants, comparing the degree of overlap in their responses. This means that for each pair of participants, we take the length of the intersection of their responses – i.e., the number of shared sign variants in their responses – divided by the length of the smaller of the two sets of responses. For example, if participant A is reported as using sign variants RED1, BLUE1, GREEN2 and PURPLE2, and participant B is reported as using sign variants RED1, BLUE2 and GREEN2, the intersection of their responses consists of two shared variants (RED1 and GREEN2; i.e., a length of 2), and the shorter set is that of B (who has responses only for ‘red’, ‘blue’ and ‘green’; i.e., a length of 3), which gives an overlap coefficient of 2/3, or about 0.67. Two participants with no shared sign variants would get an overlap coefficient of 0, and two participants who share sign variants for all meanings for which they can be compared (i.e., that have responses from both) would get an overlap coefficient of 1. We use this metric to allow for variability in the number of comparable responses across participants without penalizing pairs in which a participant skipped more questions than others.
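The overlap coefficient can be sketched in a few lines of Python (again a stand-in for the R code used in the study). The function name `overlap_coefficient` is hypothetical, and returning `None` for a participant without any responses is an assumption, since the text does not say how that case was handled.

```python
def overlap_coefficient(responses_a, responses_b):
    """Size of the intersection divided by the size of the smaller response set."""
    a, b = set(responses_a), set(responses_b)
    smaller = min(len(a), len(b))
    if smaller == 0:
        # Assumption: pairs without comparable responses are left undefined.
        return None
    return len(a & b) / smaller


# The worked example from the text.
participant_a = ["RED1", "BLUE1", "GREEN2", "PURPLE2"]
participant_b = ["RED1", "BLUE2", "GREEN2"]
print(round(overlap_coefficient(participant_a, participant_b), 2))  # 0.67
```

Dividing by the smaller set rather than by the union is what keeps a pair from being penalized when one participant answered fewer questions.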
Processing, analysis and visualizations of the data were done with the programming language R v4.5.1 (R Core Team 2025) and the packages ggdist v3.3.3 (Kay 2023), ggtext v0.1.2 (Wilke & Wiernik 2022), glue v1.8.0 (Hester & Bryan 2024), here v1.0.1 (Müller 2020), lme4 v1.1.37 (Bates et al. 2015), marginaleffects v0.28.0 (Arel-Bundock et al. 2024), readxl v1.4.5 (Wickham & Bryan 2025), rnaturalearth v1.1.0 (Massicotte & South 2025), scales v1.4.0 (Wickham et al. 2025), sf v1.0.21 (Pebesma & Bivand 2023), signglossR v3.0.0 (Börstell 2022), swemapdata v0.0.2 (Börstell 2024b), tidysigns v0.2.2 (Börstell 2024c) and tidyverse v2.0.0 (Wickham et al. 2019).
4 Results
4.1 Lexical diversity
Relating to our first two research questions, we wanted to explore the degree of lexical diversity in our survey data. We use the diversity index described in Section 3.4 to do so, which is a measure of the uncertainty in predicting a specific sign variant from a random sample of all responses to a single meaning. First, we calculate the diversity index for each meaning across all participants – that is, the variability and distribution of forms for each meaning. Figure 4 shows the distribution of diversity by meaning and category, with each dot representing the diversity index of a single meaning across all participants, split by data collection method (elicitation vs. online). As Figure 4 shows, all meaning categories exhibit a degree of variation, which is perhaps unsurprising since the list of meanings (see Appendix A) was sampled with variation in mind in the first place. Among our sampled meanings, there is generally lower diversity in the City category, but higher diversity in the Country and Numbers categories, despite City being the category with the most available variants per meaning on average in the survey (M = 4; SD = 1.5; see Appendix A). This suggests that while there are many sign variants documented for cities in the STS dictionary, most signers appear to converge on a few variants in their responses. Across the data collection methods, it seems as though the diversity is higher in the elicitation data (i.e., in the older age groups) compared to the online data (i.e., younger age groups), but the general pattern across meaning categories is fairly similar. Due to the targeted sampling of meanings that are expected to exhibit lexical variation and the relatively few meanings per category, we cannot draw any major conclusions from these distributions, but we will, however, return to the Country category in Section 4.3.
Figure 4: Distribution of lexical diversity by meaning category. Each dot represents the diversity index of one meaning across all participants by data collection method. Whiskers show the 66% and 95% intervals, with the category median as a dot. Dashed lines show medians across categories by method.
Besides looking at lexical diversity across all participants, as in Figure 4, we also wanted to explore whether lexical diversity in STS is changing over time. Figure 5 shows the diversity index of all meanings (represented by dots) by the age group of the participants. To investigate possible changes in lexical diversity over time, we fit these data with two separate linear mixed effects models, one for each data collection method (elicitation and online). For each model, we take the diversity index as the outcome and age group as the predictor, with meaning as a random effect. The predicted diversity index of each age group for each model is shown in Figure 6, which visually indicates a slight but not entirely linear decrease in diversity over time. However, we should interpret the results across models (and thus age groups across methods) with caution, since the data collection design was very different for the elicitation compared to the online data. In fact, when comparing each of these models to a null model with a likelihood ratio test, we do not see a significant effect of age group on the diversity index for either the elicitation data or the online data.
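The model comparison can be written out as follows. This is only a sketch under two assumptions that the text does not confirm: that age group enters as a categorical fixed effect, and that the random effect of meaning is a random intercept. Here $H_{mg}$ is the diversity index for meaning $m$ in age group $g$, and $\ell_1$ and $\ell_0$ are the maximized log-likelihoods of the model with and without age group, respectively.

$$H_{mg} = \beta_0 + \beta_g + u_m + \varepsilon_{mg}, \qquad u_m \sim \mathcal{N}(0, \sigma_u^2), \qquad \varepsilon_{mg} \sim \mathcal{N}(0, \sigma^2)$$

$$\chi^2 = 2\,(\ell_1 - \ell_0)$$

The test statistic is compared to a $\chi^2$ distribution whose degrees of freedom equal the number of additional parameters in the model that includes age group.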
We were concerned that the difference in age group sizes might skew the diversity index metric, considering some groups had many more participants than others (particularly 1981–1990; see Table 1). As an additional precaution, we validated our statistical models by bootstrapping 1,000 resamples of the data for each data collection method (elicitation vs. online), with each age group being represented by a sample of 10 participants to ensure equally sized groups. By resampling with replacement (i.e., a participant could be drawn multiple times in a sample) and smaller groups, we expect estimates to be lower than observed metrics, but we can evaluate the diversity index by balancing group sizes. Figure 7 shows the results of our bootstrapping estimates from a linear model fit to each resampling iteration. These results pattern nicely with the model predictions in Figure 6, suggesting robust estimates despite imbalanced group sizes. Figure 7 shows that for each data collection method, the older age group(s) are above the dashed lines (the median estimate across age groups), whereas the younger age groups fall below the lines. Thus, the visual pattern suggests a possible decrease in lexical diversity over time, but we did not see a significant effect of age group in our initial models.
Figure 7: Predicted diversity index estimates by age group and data collection method from bootstrapping 1,000 resamples (with replacement) of equal size per age group (n = 10). Whiskers show the 66% and 95% intervals of the estimates, with the age group median as a dot. Dashed lines show estimate medians across age groups by method.
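The resampling step of this validation can be sketched as below, in Python and for a single meaning within a single age group. The sketch covers only the drawing of equally sized samples with replacement and the recalculation of the diversity index; it does not include the linear model that was fit to each resampling iteration. The function names, the seed and the example responses are all hypothetical.

```python
import math
import random
from collections import Counter


def diversity_index(variants):
    """Shannon diversity (base 2, an assumption) of a list of variant labels."""
    counts = Counter(variants)
    total = sum(counts.values())
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def bootstrap_diversity(responses, n_resamples=1000, sample_size=10, seed=1):
    """Diversity index for repeated samples of equal size, drawn with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # random.choices samples with replacement, so one participant's
        # response can be drawn several times within the same sample.
        sample = rng.choices(responses, k=sample_size)
        estimates.append(diversity_index(sample))
    return estimates


# Hypothetical age group: 40 participants, two variants for one meaning.
group = ["VARIANT-A"] * 30 + ["VARIANT-B"] * 10
estimates = bootstrap_diversity(group)
print(len(estimates), min(estimates), max(estimates))
```

Fixing the sample size at 10 for every group is what removes the group-size imbalance, at the cost of individual samples that may miss a rare variant altogether.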
4.2 Lexical overlap across signers
For our third and fourth research questions, we look at lexical overlap between participants – i.e., the proportion of shared sign variants across their responses. We calculate the pairwise overlap between all pairs of participants (see Section 3.4). We then combine these pairwise comparisons with the metadata about each participant’s social profile with regard to age group, gender and school attended, and count how many features they share. That is, if participants A and B are in the same age group and of the same gender, but attended different schools, they share two out of three features. Figure 8 shows the distribution of lexical overlap across all participants for each data collection method, by how many features the two participants in the pair share. Although the differences are very small, the distributions show a tendency for lexical overlap to increase with every added shared social feature. This suggests a cumulative effect of shared sociolinguistic belonging, in the sense that two participants of the same age and gender, who have attended the same school, are more likely to have a higher degree of lexical overlap than two participants who share fewer of these features. Thus, it seems there is a higher degree of lexical overlap within sociolinguistic groupings: the more similar two participants are, the higher the chances are that they will overlap in their sign choices.
Figure 8: Distribution of lexical overlap for pairwise comparisons of participants by number of shared social features and data collection method. Whiskers show the 66% and 95% intervals of the estimates, with the feature group median as a dot. Dashed lines show estimated medians across groups by method.
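Counting shared social features for every pair of participants can be sketched as follows (in Python, as a stand-in for the R code used in the study). The feature names, participant labels and metadata values are hypothetical and serve only to reproduce the example of two participants who share age group and gender but not school.

```python
from itertools import combinations

# The three social features compared for each pair (names are hypothetical).
FEATURES = ("age_group", "gender", "school")


def shared_features(p, q):
    """Number of social features on which two participants have the same value."""
    return sum(p[f] == q[f] for f in FEATURES)


# Hypothetical participant metadata.
participants = {
    "A": {"age_group": "1971-1980", "gender": "woman", "school": "School 1"},
    "B": {"age_group": "1971-1980", "gender": "woman", "school": "School 2"},
    "C": {"age_group": "1991-2000", "gender": "man", "school": "School 2"},
}

# Every unordered pair of participants is compared exactly once.
pair_features = {
    (a, b): shared_features(participants[a], participants[b])
    for a, b in combinations(sorted(participants), 2)
}
print(pair_features)  # {('A', 'B'): 2, ('A', 'C'): 0, ('B', 'C'): 1}
```

Each pair's count can then be joined with that pair's overlap coefficient to obtain distributions of the kind summarized in Figure 8.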
4.3 Patterns in sign variant choices
Having addressed the first four research questions in Sections 4.1–4.2, we now turn to some specific examples and patterns in sign variant choices. We investigate five different properties of lexical variation in turn: 1) movement repetition; 2) older vs. newer numerals; 3) one- vs. two-handed forms; 4) borrowed vs. not borrowed signs; and 5) potential age-related sign preferences in the STS Corpus.
4.3.1 Repeated vs. non-repeated movement
One distinction present in two of the color terms in our survey data concerns whether or not the movement is repeated. For the signs for ‘green’ and ‘red’, there are two different sign variants for each meaning that only differ in whether or not the movement is repeated. Figure 9 shows the distribution of the repeated vs. non-repeated forms across age groups for these two meanings, illustrating a pattern that may suggest that the sign variants with repeated movement are becoming relatively less frequent over time. However, we cannot know whether this is a more general pattern that extends to other meanings with a repetition distinction in sign variants, or whether it is restricted to these specific forms. In fact, we can go to the STS Corpus to try to validate these patterns by looking at the usage in more naturalistic signing. Figure 10 shows the distribution of repeated vs. non-repeated variants for the same signs, including also the sign variants for ‘blue’ for which the non-repeated variants were not part of the survey sign videos. Figure 10 illustrates two important points: first, the pattern is much less clear than in the survey data; second, the relatively low total numbers of occurrences mean that we have opted for absolute numbers of occurrences rather than proportional frequency distributions in the visualization. This demonstrates a major difference between the approaches, in that targeted data collection, whether elicitation or a self-paced survey, results in a much better coverage of individuals and their variation and stratification, whereas corpus data may fall short in terms of representation, due to low frequencies and dispersion of target items (see Börstell 2024a).
Additionally, we do not know if there are usage differences in the choice of repeated vs. non-repeated sign variants for these and other meanings in STS, in that the prosodic or phonological context of a sign may push it towards one form due to assimilation or reduction, or that morphological or syntactic functions alter the likelihood of repetition, such as expressing plurality or emphasis. Thus, while we believe that the survey design is picking up some general preference or familiarity with regard to repeated vs. non-repeated variants for these meanings, we recognize that there is certainly individual and contextual variation at play, and that many sign variants should not be considered simply either–or.
4.3.2 Older vs. newer numerals
For the numerals included in the survey, there are two meanings that each have two variants that are often assumed in the STS community to be associated with different age groups, with one variant being considered older and the other newer – namely the sign variants for ‘ten’ and ‘ninety’. Figure 11 illustrates the two sign variants for each meaning, with older forms in the left column and newer forms in the right column. The letters in brackets of the sign glosses are form descriptions added to glosses to distinguish possible variants, generally handshape descriptors (such as in these cases).
Figure 11: The sign variants for ‘ten’ (top row) and ‘ninety’ (bottom row), by assumed older forms (left column) vs. newer forms (right column). Images from Svenskt teckenspråkslexikon (2024: 11951, 4475, 11914, 11955).
Figure 12 shows the distribution of the assumed older vs. newer sign variants for either meaning, showing a pattern that generally aligns with the expectations, at least when comparing across the data collection methods. For both ‘ten’ and ‘ninety’, there is indeed a pattern of the assumed older variant being relatively more frequent in the older age groups (elicitation data) compared to the younger ones (online data), but there is no obvious pattern of a gradual shift within each dataset. For ‘ten’, the distribution points to the older variant, the two-handed form glossed as TIO(55), having been more or less completely replaced by the newer variant, a one-handed form glossed as TIO(Y) – see Figure 11. The two-handed variant TIO(55) is a motivated form, putting the five fingers of each hand together to make the number ten, whereas the newer variant TIO(Y) comes from a sign originally referring to a currency amount, but which was later extended to simply mean ‘ten’.6 Based on Figure 12, it seems as though TIO(Y) has practically replaced the older variant in the general numeral expression ‘ten’. For ‘ninety’, the pattern might suggest that the older form could be gaining on the newer form in the younger age groups (online data). This is interesting, especially since the older form is more distinct from ‘nine’ than the newer form – i.e., increasing form-distinctiveness between meanings could be a benefit. However, we leave this question for a detailed, future study.
4.3.3 One- vs. two-handed forms
A number of meanings in the survey involve signs with both one- and two-handed form variants, in which the two forms are pairs of the same lexical variant (i.e., lemma). That is, the one- vs. two-handed forms appear in one or more pairs per meaning, in which the one- and two-handed forms in each pair are clearly related. Figure 13 shows the distribution of the one- vs. two-handed variants across meanings and age groups. From this graph, it seems as though there may be a trend in the proportion of two-handed variants becoming more frequent in younger age groups across several meanings (e.g., ‘level’, ‘new’ and ‘welcome’), but also more unclear patterns for other meanings (e.g., ‘princess’ and ‘common’). The pattern looks quite unpredictable within the elicitation data, but more uniform across meanings within the online data, which also has more respondents. For instance, for five of these six meanings (all but ‘welcome’), there is a gradual increase in two-handed forms across the age groups in the online data, although the proportion of two-handed forms varies across meanings.
We test whether age group has an impact on the preference for two-handed forms in these meanings by fitting two mixed effects logistic regression models, one for each data collection method, with two-handedness as the outcome and age group as a predictor (with meaning as a random effect). When comparing each of these models to a null model with a likelihood ratio test, we do not see a significant effect of age group on whether the sign variant is two-handed in the elicitation data, but we do for the online data. Figure 14 shows the model predictions for the probability of using a two-handed sign variant for these meanings, by age group and data collection method. As the figure illustrates, whereas there is no clear pattern in the elicitation data, the online data suggests a gradual increase over time towards a higher use of two-handed forms for these meanings. Whether there is a general tendency for the youngest STS signers to prefer two-handed forms overall – i.e., if it is not only item-specific preference for these meanings, but generalizable to all/most pairs of “lemmas” with a one- and two-handed form variant – is left for future studies. That is, is this distinction and possible changes in preference relevant only for specific signs/meanings (e.g., sign A > sign B), or is it a general pattern of phonological change in the language (i.e., two-handed > one-handed)?
4.3.4 Borrowed vs. not borrowed country signs
Our survey also included signs for countries, which is a domain that is known to include borrowings, as signs for other countries are often borrowed from a sign language of that country/region (see, e.g., Matthews et al. 2009). We can use the survey responses to see whether there is an observable shift in preference for borrowed vs. not borrowed (for instance, older, inherited) sign variants for the countries.
For six of the countries, there are clear cases of borrowed vs. not borrowed sign variants represented in the survey. The borrowed sign variants are generally imported from the country in question, but may appear in multiple form variations for a single country – for instance, there are multiple variants for ‘Mexico’ in the survey labeled borrowed due to using either a repeated or non-repeated form of the Mexican Sign Language sign, whereas the not borrowed variants are forms used in STS before the Mexican sign was imported.
Figure 15 shows the relative distribution of borrowed vs. not borrowed sign variants for these six countries, by age group and data collection method. As Figure 15 illustrates, there appears to be an increase in borrowed sign variants across age groups for all six countries, but with quite different distributions. For instance, in the case of ‘Australia’ and ‘Italy’, there has been a total shift from a general preference for older (not borrowed) variants to the borrowed variants. For ‘Mexico’ and ‘Russia’, there seems to be an increase in the borrowed variants, but without a general shift in the preferred variant overall – that is, the inherited variants are still dominant even among the youngest age groups. For ‘Israel’ and ‘Japan’, the pattern is not as clear, but possibly also pointing to a slow increase in the borrowed variants over time.
We test whether age group has an impact on the use of borrowed sign variants for these countries by fitting two mixed effects logistic regression models, one for each data collection method, with borrowed as the outcome and age group as a predictor (with meaning as a random effect). When comparing each of these models to a null model with a likelihood ratio test, we see a significant effect of age group on whether the sign variant is borrowed in both the elicitation data and the online data.
Figure 16 shows the model predictions for the probability of using a borrowed sign variant for these countries, by age group and data collection method. Thus, there is a statistically significant increase in the use of borrowed signs for – at least some – countries in STS, with younger signers being more likely to use borrowed signs for countries.
4.3.5 Age-related sign distributions in the STS Corpus
As mentioned in previous sections, corpus data can be challenging to use for lexical variation research when the corpus is relatively small (see Börstell 2024a), which is the case for practically all currently available sign language corpora (see Kopf et al. 2021; Fenlon & Hochgesang 2022). However, we explored data from the STS Corpus to try to find examples of age-related change in sign variant frequencies, by matching sign glosses (the annotations representing individual signs) across variants expressing the same meaning.
In the STS Corpus, sign variants of the same meaning generally receive additional disambiguating information in the gloss labels, usually through adding a descriptive identifier in brackets following the main gloss – this could be either a description of whether the sign is one- or two-handed, or which handshape is used in either form. We collected all the 189,679 sign tokens from the STS Corpus and separated the main gloss from any bracket-identifier in order to group sign variants belonging to the same meaning. After removing cases of depicting signs,7 inconsistent glosses and meanings with fewer than 30 tokens across variants, we end up with a list of 20 meanings for which there are two distinct sign variants for the same meaning.8
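The grouping step can be illustrated with a short Python sketch (a stand-in for the R code used in the study). It assumes that a gloss consists of a main gloss followed by a single bracketed identifier, as in TIO(Y) or BLI(L); the actual STS Corpus annotation conventions may be more varied. The function names and the toy token list are hypothetical, and the removal of depicting signs and inconsistent glosses is not covered here.

```python
import re
from collections import Counter, defaultdict

# Assumed gloss shape: MAIN(TAG), e.g. "TIO(Y)" or "BLI(L)".
GLOSS_PATTERN = re.compile(r"^(?P<main>[^()]+)\((?P<tag>[^()]+)\)$")


def split_gloss(gloss):
    """Separate the main gloss from its bracket identifier, if it has one."""
    match = GLOSS_PATTERN.match(gloss)
    if match is None:
        return gloss, None
    return match.group("main"), match.group("tag")


def variant_counts(glosses, min_tokens=30):
    """Token counts per variant, for meanings with exactly two tagged variants."""
    counts = defaultdict(Counter)
    for gloss in glosses:
        main, tag = split_gloss(gloss)
        if tag is not None:
            counts[main][tag] += 1
    # Keep meanings with two distinct variants and at least min_tokens tokens.
    return {
        main: dict(tags)
        for main, tags in counts.items()
        if len(tags) == 2 and sum(tags.values()) >= min_tokens
    }


# Hypothetical token list; the counts are invented for illustration.
tokens = ["BLI(L)"] * 25 + ["BLI(J)"] * 10 + ["TIO(Y)"] * 5 + ["TIO(55)"] * 2 + ["HUS"] * 50
print(variant_counts(tokens))  # {'BLI': {'L': 25, 'J': 10}}
```

In this toy input, TIO is dropped because it has fewer than 30 tokens across its variants, and HUS is dropped because it carries no bracket identifier.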
Figure 17 shows the distribution of sign variants for these 20 meanings across the age groups in the STS Corpus, only one of which is also represented in our survey data – namely the sign variants for ‘ten’. As can be seen from Figure 17, we see a similar pattern for ‘ten’ as we did in the survey data (see Figure 12), but with an even clearer shift in preferred variant over time, moving from a preference for the older two-handed form in the oldest age group (60–82 years old), to the one-handed form being preferred in the younger age groups, even the sole variant used in the youngest age group (20–39 years old). Additionally, Figure 17 shows that some other meanings may in fact be changing with regard to the preferred variant, for instance ‘other’, ‘become’, ‘stupid’, ‘alone’, ‘window’, ‘still, yet’, ‘enter’ and ‘embarrassing’ all showing a pattern of an incremental change across age groups, although not necessarily a complete switch in the overall preference. For example, the meaning ‘become’, with variants using either the index handshape BLI(L) or flat hand BLI(J), is changing in favor of the index variant BLI(L) becoming increasingly more common (especially dominant in the youngest age group, 20–39 years old). However, it is the more frequent variant across all age groups, albeit much more balanced in the oldest group (60–82 years old). Thus, there has not been any reversal of the dominant variant for ‘become’ within the time span of these age groups, but there is a visible directionality in the change towards a specific form becoming increasingly preferred. Such cases may be examples of changes still taking place gradually over the course of now-living generations. For the sign variants for ‘deaf’ – whether produced with a flat hand (J) or index finger (L) – the shift may have already happened earlier, with few uses of the flat-hand variant only seen in the oldest age group. 
An important thing to keep in mind here is that there may also be additional reasons for any signer to choose to use one form over another. Perhaps signers across age groups actively use both forms for these meanings, with the choice depending on phonological or syntactic context, or on semantic/pragmatic differences in their usage. A detailed analysis is needed to tease apart such potential differences in distribution.
Figure 17: Relative frequency of sign variants with two forms across age groups (represented as age at the time of recording) in the STS Corpus. The tags following the glosses indicate form descriptors for each sign variant: ea for one-handed signs; ml for two-handed signs with a manual location; da for balanced two-handed signs; other letters refer to the handshape used in the sign.
In this approach, a visual representation of relative corpus frequencies may reveal patterns – or seeds of patterns – of change in lexical variation and preferences (see also Börstell & Östling 2016; Hanke 2016), which can be used in addition to targeted surveys of specific items of study (cf. Kankkonen et al. 2018; Wähl et al. 2018; Hanke et al. 2020). With the use of corpus data, one can also better quantify possible within-signer variation (whether signers actively use multiple variants) and whether there are contextual influences on the choice of variant, such as preferring specific forms in certain phonological (e.g., surrounding signs), structural (e.g., in compounds or for certain functions) or conversational (e.g., pragmatics and interaction) contexts. Such in-depth investigations should become easier with larger sign language corpora in the future, when there are enough occurrences across stratified groups and contexts within texts/conversations.
5 Discussion
With this study, we aimed to explore patterns of lexical variation in Swedish Sign Language (STS) using multiple approaches: through the use of a targeted survey of meanings known to exhibit variation, with both in-person and online participants, and with additional data from the STS Corpus. Furthermore, the survey data was approached in several different ways, looking at general patterns and changes of lexical diversity (entropy) across meanings and age groups, overlap in lexical choices across participants, and more detailed investigations of variation within individual domains (e.g., numerals or signs for countries) or form-pairs (e.g., one-/two-handed or repeated/non-repeated forms).
5.1 Lexical variation as diversity
While lexical variation has been investigated in several sign languages to date (e.g., Lucas et al. 2003; McKee & McKee 2011; McKee et al. 2011; Stamp et al. 2014; 2015; Safar 2021; Mudd et al. 2020; Lutzenberger et al. 2023), there had not been any systematic study of it for STS, despite the fact that lexical variation – particularly across age groups and deaf schools – has been known in the community and reported on anecdotally in previous research (e.g., Bergman 1977). In our study, we wanted to explore the degree of lexical variation in STS as potentially relating to sociolinguistic groupings of members of the community. We expected that there might be leveling in lexical variation over time, as has been shown in particular domains for BSL (Stamp et al. 2015) and NZSL (McKee et al. 2011). The motivation behind such leveling would be increased convergence due to changes in deaf schools and the centralization of high school deaf education to a single location (cf. Schönström & Holmström 2021), leading to more uniformity in schools attended and a higher degree of mutual interaction among signers – also boosted by technological advances facilitating communication across communities domestically and internationally.
We followed the approach used by Mudd et al. (2020) – related to earlier approaches by Israel & Sandler (2011) – in using entropy as a diversity index for lexical variation. Mudd et al. (2020) did not find any significant differences between potential sociolinguistic groupings (e.g., age, gender and hearing status) in terms of lexical diversity in Kata Kolok. In our study, we found that for both elicitation and online survey data, there is a slight decrease in lexical diversity between the older and the younger age groups for each data collection method (see Figure 5), but that any effect of age group is ultimately not statistically significant in our models (see Figures 6, 7). Mudd et al. (2020) discuss their use of entropy as a diversity index, pointing out that while they, too, saw no statistically significant difference between groups, there are qualitative differences that set the same groups apart. For example, Mudd et al. (2020) noted that within each group, the number and spread of sign variants can be similar to that of the other group – thus having a similar diversity index – but the variants themselves may be qualitatively different, such that each group can be distinguished in terms of which sign variants are observed at all. In our current study, responses are more constrained than the free elicitation by Mudd et al. (2020), since responses are matched to available sign variants found as response options in the survey to maximize consistency across data collections, participants, coding and analysis. As will be discussed below, we do observe age-related differences as patterns of lexical choice preferences.
5.2 Lexical overlap and social profiles
Using age group as a variable influencing lexical variation in terms of diversity, we saw a weak trend of reduced diversity over time, but which could not be supported statistically with our current data. When we looked at the lexical overlap between every combination of participants in our data through pairwise comparisons of their sign choices, there were indications of a cumulative effect of shared sociolinguistic profiles. That is, two participants of similar age, gender and school background are more likely to overlap in their sign choices than participants who are more different – each shared feature results in a slightly higher degree of lexical overlap (see Figure 8). Thus, similar sociolinguistic profiles appear to correlate with similar lexical preferences in the STS community. At the same time, the degree of lexical overlap is very similar across both participants with shared profiles and participants without shared profiles. This leads us to believe – in accordance with our own perception based on observations in the STS community – that much of the lexical variation in STS is restricted to certain lexical items rather than complete sets of parallel signs distinctive to individual groups. Some individual sign variants can be shibboleths of your background – whether regional/school-based, age-related or using older forms inherited from deaf caregivers – but the overall sign repertoire is quite uniform across the community. An important methodological issue to be pointed out here is that the participants in the survey task were explicitly asked to provide the sign variant they use nowadays, which may lead to a higher uniformity than asking for the sign variant they used growing up. 
It is well known from spoken language linguistics that analyzing language change by comparing age groups at a single point in time (so-called apparent time) is complicated due to individuals also changing their own language during the course of their lifetime (so-called age grading) – see, e.g., Eckert (2017). We nonetheless observed some age-related differences in sign preferences for specific items, as will be discussed in the following.
5.3 Patterns and changes in lexical preferences
Since it has been known in the community that deaf schools have (historically) developed their own unique sign variants for certain meanings, we suspected that there may still be observable differences in sign variant choices today, despite a decline in the number of students in deaf schools in Sweden and more possibilities to meet and interact across regions. Previous research on, e.g., BSL (Stamp et al. 2014; 2015) and NZSL (McKee & McKee 2011; McKee et al. 2011) has shown patterns of more variation and/or higher use of older/traditional sign variants in older age groups of signers. In our study, we focused on some specific phonological types and meanings of signs to investigate whether we can observe changes in preferences over time.
In terms of phonological types of signs, we addressed two specific categories: first, pairs of sign variants differing only in whether they are one- or two-handed (with the two-handed form being a mirrored equivalent of the one-handed form); second, sign variants differing only in whether they have a repeated movement or not. For the one- vs. two-handed preference, we did not see any significant effect of age group in the elicitation data, but we did see it in the online data, for which the two-handed preference increased with each successively younger age group. Based on our own anecdotal observations, we suspected that younger signers today would use the two-handed forms more than older signers, and this seems to be supported by our online data. One motivation for this could be efficiency, as the two-handed signs could potentially be shorter in duration as the two hands each move a shorter distance simultaneously if mirroring a one-handed form’s path – e.g., each moving from a center point in front of the body outwards, rather than moving all across from the contralateral to the ipsilateral side. Previous research has shown that younger signers articulate signs with a shorter duration and higher signing rate than older signers in STS and other sign languages (see Börstell et al. 2016; 2024), but the phonetics of such differences has not yet been researched for STS. Tentatively, we can only say that there appears to be a pattern of using more two-handed variants for these specific meanings in the younger age groups, but we cannot say whether this preference is due to a general pattern or whether it is associated with articulatory reduction. Similarly, we also looked at the distribution of repeated vs. non-repeated forms for two meanings (‘green’ and ‘red’).
Whereas the survey data indicated a decrease in repeated forms across age groups, our additional validation check in the STS Corpus illustrated a much more variable picture, with no trend in either direction. Previous research on reduction phenomena within sign language discourse has shown that subsequent occurrences of specific lexical items in the same text can result in form reduction as signs are established and expected (see Lepic 2019; Martinez Del Rio 2023). Thus, while our survey targets the reported preferences in isolation, we expect forms that differ only in parameters such as repetition and number of hands to also vary within signers, as factors such as surprisal and discourse prominence as well as phonological and pragmatic context may influence the choice of sign form. For instance, repeated forms may be more frequent at first mention (full form) than at later mentions (reduced form), and a two-handed form may be more common in a context between two other two-handed signs or when emphasized in discourse. Such questions need to be explored further as sign language corpora grow bigger.
We also investigated whether there is a change in the use of borrowed signs for countries, as has been shown for, e.g., BSL and NZSL (Matthews et al. 2009; Stamp et al. 2014), hypothesized to stem from international influences (see Hiddinga & Crasborn 2011; Kusters 2021) and a conscious move towards respectful sign choices. We, too, see a trend towards a higher use of borrowed signs for countries over time, which is especially clear in the online data. There is variation between different countries across age groups, where some countries are preferentially expressed with an older sign (e.g., ‘Japan’ and ‘Mexico’), whereas others have clearly shifted towards a preference for the borrowed sign across most age groups (e.g., ‘Australia’ and ‘Italy’). There are a few non-country signs that can be observed in the STS community today which are most likely direct borrowings from ASL, similar to findings from NZSL (McKee & McKee 2020; McKee et al. 2021), attributed to language contact and multilingualism (Kusters 2021).9 A future study could address the prevalence of such borrowings.
5.4 Multiple approaches to lexical variation
There are certain methodological concerns to discuss with regard to the data collection. Since the survey data comes from two different collection methods – elicitation and online – we need to be extra cautious in our interpretations of the data. First, the two methodologies are quite different in terms of the task, since the elicitation requires a free response from text or image stimuli, whereas the online survey already prompts and primes the participants with sign variant options. Due to the challenges with sampling participants across a wide range of age groups, the data collection method also fully aligns with the age distribution of participants: the older age groups did the elicitation task; the younger age groups did the online task. Hence, we took extra steps to ensure consistency and some comparability across the datasets, such as manually linking every response with a unique entry in the STS dictionary (Svenskt teckenspråkslexikon 2024), and only analyzing those that were represented as sign videos in the survey itself. Nonetheless, we analyze our data from the two tasks separately – albeit in parallel – to allow a more conservative interpretation of potential changes over time.
Using corpus data to support lexical variation data can be useful to avoid some of the methodological issues that come with a survey design where only single responses are recorded, and may better reflect actual language use in a more spontaneous, naturalistic setting. This approach was also taken by Stamp et al. (2014), by comparing elicitation responses to sign forms used in conversational corpus data in BSL, showing that about 78% of sign forms used in conversation aligned with those elicited in isolation. What we saw from our corpus validation of repeated vs. non-repeated forms is that the picture is more varied when looking at corpus occurrences. Additionally, as discussed in Section 3.3, the STS Corpus is still very small compared to corpora for spoken languages (whether written or spoken language), and often too small to find enough occurrences to be able to confidently point to systematic differences and patterns of change. While we do find some candidates for age-related preferences in sign variants in the STS Corpus as shown in Figure 17 – see also Börstell & Östling (2016) and Börstell (2024a) – we do not have enough data in terms of coverage and dispersal across corpus participants to estimate the degree of lexical variation on the whole. Nonetheless, we argue that the addition of corpus data to support survey data provides a more nuanced picture of lexical variation, such as multiple sign variants being part of a single signer’s active repertoire, which is less obvious from surveys. Corpus data may thus also point to potential distributional differences of form variants that depend on context and function, for instance whether a sign is articulated with or without repetition, which is not reflected in citation form contexts (cf. Figure 10).
Thus, we urge sign language researchers to approach lexical variation from multiple angles to account for some of the complexities in linguistic variation, including within-signer variability and free and contextual variation, and to counteract prescriptive attitudes and the observer’s paradox.
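One way to make such a survey-to-corpus comparison concrete is sketched below in Python. The sketch is illustrative only and is neither the procedure of Stamp et al. (2014) nor the validation code of this study. The data layout, signer identifiers, variant labels and function names are hypothetical. It assumes that alignment is measured as the share of corpus tokens whose form matches the variant the same signer reported in the survey, and it also collects each signer's attested variants per meaning to expose within-signer variability.

```python
from collections import defaultdict

# Hypothetical survey responses: (signer, meaning) -> reported variant.
elicited = {
    ("s1", "green"): "green_repeated",
    ("s2", "green"): "green_single",
}

# Hypothetical corpus tokens: (signer, meaning, variant actually produced).
corpus_tokens = [
    ("s1", "green", "green_repeated"),
    ("s1", "green", "green_single"),
    ("s1", "green", "green_repeated"),
    ("s2", "green", "green_single"),
    ("s3", "green", "green_repeated"),  # no survey response; skipped for alignment
]


def alignment_rate(survey, tokens):
    """Share of corpus tokens matching the same signer's survey response."""
    matched = total = 0
    for signer, meaning, variant in tokens:
        reported = survey.get((signer, meaning))
        if reported is None:
            continue  # token cannot be compared to a survey response
        total += 1
        matched += variant == reported
    return matched / total if total else None


def repertoires(tokens):
    """Set of variants each signer produced for each meaning in the corpus."""
    seen = defaultdict(set)
    for signer, meaning, variant in tokens:
        seen[(signer, meaning)].add(variant)
    return dict(seen)


rate = alignment_rate(elicited, corpus_tokens)
variants_by_signer = repertoires(corpus_tokens)
print(rate)
print(variants_by_signer[("s1", "green")])
```

A signer whose repertoire contains more than one variant for a meaning, such as the hypothetical `s1` here, is exactly the kind of case that a single survey response cannot reveal.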
5.5 Practical and theoretical implications
In this study, we have shown that there are some patterns in the lexical variation observed in STS, but also that there are benefits to using multiple approaches to mitigate methodology-induced differences. For instance, as Stamp et al. (2014) demonstrated for BSL, signers may report one preferential sign form in an elicitation task, but produce different forms in conversational signing. While using corpus data to validate or balance survey data is crucial, it is often practically complicated on a wider scale due to the (currently) small corpora of sign languages. Nevertheless, we suggest that at least attempting to use more naturalistic data is important in order to reflect the variation found in linguistic interaction, which better represents what users produce and are themselves exposed to. One advantage of a survey or interview-based study might be to ask about signs used at some other time (e.g., in adolescence) rather than signs used now, although the accuracy of the responses to such questions may be more uncertain. In our survey, we explicitly asked about signs used nowadays, which should align more closely with the synchronic corpus data, but does not take into account changes that may have taken place during the lifetime (cf. Eckert 2017), whether conscious or not.
As noted by Mudd et al. (2020), entropy as a diversity index does not provide the full picture of which sign forms are actually used. We chose to treat all sign variants similarly, regardless of whether they are so-called phonological variants (differing in a single form parameter) or lexical variants (differing in more form parameters), due to the complex nature of form-continua (see Kimmelman et al. 2022). This choice does, however, mean that we cannot quantify the similarity between individual sign choices, whether on the basis of phonological form or iconicity, factors that have been considered particularly when comparing overlap in signs across sign languages (cf. Ebling et al. 2015). One might hypothesize that sign variants which are far apart (e.g., distinct forms and different iconic mappings) could be less intelligible across the groups that use them, but our observations in the STS community support what Stamp et al. (2016) found for BSL, in that STS signers do not struggle with understanding sign variants that they do not themselves use (when seen in context), unless it is a lexical innovation (e.g., slang) or borrowing (e.g., signs for countries that are less widespread).
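The limitation of entropy as a diversity index can be illustrated with a minimal Python sketch. It assumes Shannon entropy with a base-2 logarithm computed over the variant choices for a single meaning. The function name and the response lists are hypothetical, and the sketch is not the analysis code of this study.

```python
import math
from collections import Counter


def variant_entropy(responses):
    """Shannon entropy (in bits) of the sign variants chosen for one meaning."""
    counts = Counter(responses)
    total = sum(counts.values())
    # Sum of -p * log2(p) over the relative frequency p of each variant.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical responses for one meaning in two groups of signers.
group_a = ["variant_1", "variant_1", "variant_2", "variant_2"]
group_b = ["variant_3", "variant_3", "variant_4", "variant_4"]
uniform_group = ["variant_1"] * 4

entropy_a = variant_entropy(group_a)
entropy_b = variant_entropy(group_b)
entropy_uniform = variant_entropy(uniform_group)
print(entropy_a, entropy_b, entropy_uniform)
```

The two hypothetical groups receive the same entropy value although they share no variant at all, which is why a diversity index alone cannot show which forms are used or how similar they are.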
In our survey, we only included sign variants which were already documented in the STS dictionary, as these were represented as sign video response options. This connects the methodology as well as the results to a lexicographic conundrum: deciding which signs should be included in a dictionary and how established they need to be before they are entered as documented lexical items. Since younger generations tend to be innovative with language and use slang etc. (for youth language in the Swedish context, see Kotsinas 2004), younger STS signers are coming up with new signs that they use in their daily lives, but which may not survive long enough or become established to the extent that they are ever entered into a dictionary. Thus, any dictionary-based sampling of lexical items for a survey may underestimate the degree of lexical variation in the language, particularly with regard to youth language and productive innovations. Tracking lexical innovations could potentially benefit from observing videos posted online (cf. Hou et al. 2022). However, while this has been shown to be a highly useful source of data for spoken languages that exist in a written form, for which data can be harvested and aggregated in large numbers to investigate patterns of variation (see, e.g., Grieve et al. 2018), sign language data would be much more challenging to collect in larger quantities for both technical and availability reasons. Thus, it may suffer from the same problems as the currently available sign language corpora, for which few lexical items are frequent enough that variation patterns can be quantified with statistical modeling. Using crowdsourcing as targeted data collection for variation, particularly innovations, can be a useful complement to corpus-based lexicography (see Kankkonen et al. 2018; Wähl et al. 2018; Hanke et al. 2020).
Understanding lexical variation is important also for other areas of research. For instance, having an understanding of the exact sign variants found in a language and to what extent they are understood as part of either a signer’s active or passive vocabulary is crucial for psycholinguistic experiments. Creating lexical comprehension or naming experiments can be complicated for sign languages, due to the high degree of iconicity for many signs, resulting in participants being able to guess the meaning even if a form is new to them and possibly even from a different sign language (cf. Börstell 2023). Knowing the extent to which individual forms are found and understood is thus critical. This extends to the development of pseudoword/pseudosign experiments (cf. Witte et al. 2025), for which it is necessary to know which phonological forms are found anywhere in the entire lexicon of the language, even if archaic or new.
As discussed by Mudd et al. (2020) and Lutzenberger et al. (2023), frequency may be associated with the variability found in a language, which can affect the way lexical diversity is operationalized across meanings – that is, high frequency concepts may exhibit lower variability. This is an important point not only for evaluating variation synchronically but also for evaluating language change over time and for estimating the rate of lexical replacement (see, e.g., Vejdemo & Hörberg 2016), something which relies on solid diachronic documentation of variation. Only then is it possible to distinguish variation attributable to larger changes in the language system as a whole from changes in production over the lifetime of individuals.
A Appendix
Table 2: List of concepts used in the elicitation and online survey.
| Index | Category | Meaning | Translation | Videos |
| --- | --- | --- | --- | --- |
| 1 | Color | ORANGE | orange | 4 |
| 2 | Color | LILA | purple | 2 |
| 3 | Color | GRÅ | gray | 2 |
| 4 | Color | BLÅ | blue | 2 |
| 5 | Color | RÖD | red | 2 |
| 6 | Color | GRÖN | green | 3 |
| 7 | Color | TURKOS | turquoise | 3 |
| 8 | Numbers | FEMTIO (50) | fifty | 1 |
| 9 | Numbers | TRETTON (13) | thirteen | 4 |
| 10 | Numbers | TJUGO (20) | twenty | 1 |
| 11 | Numbers | TIO (10) | ten | 2 |
| 12 | Numbers | ETT TUSEN (1000) | one thousand | 1 |
| 13 | Numbers | NIO (9) | nine | 1 |
| 14 | Numbers | NITTON (19) | nineteen | 6 |
| 15 | Numbers | TJUGOÅTTA (28) | twenty-eight | 1 |
| 16 | Numbers | ETT HUNDRA (100) | one hundred | 1 |
| 17 | Numbers | SEXTON (16) | sixteen | 4 |
| 18 | Numbers | NITTIO (90) | ninety | 2 |
| 19 | Country | ISRAEL | Israel | 3 |
| 20 | Country | MEXICO | Mexico | 5 |
| 21 | Country | ITALIEN | Italy | 4 |
| 22 | Country | RYSSLAND | Russia | 2 |
| 23 | Country | AUSTRALIEN | Australia | 4 |
| 24 | Country | JAPAN | Japan | 4 |
| 25 | Country | SERBIEN | Serbia | 2 |
| 26 | City | HAPARANDA | Haparanda | 4 |
| 27 | City | HÄRNÖSAND | Härnösand | 5 |
| 28 | City | STOCKHOLM | Stockholm | 2 |
| 29 | City | ARBOGA | Arboga | 4 |
| 30 | City | ÖREBRO | Örebro | 6 |
| 31 | City | GÖTEBORG | Göteborg | 3 |
| 32 | City | BORÅS | Borås | 6 |
| 33 | City | VÄXJÖ | Växjö | 4 |
| 34 | City | LUND | Lund | 2 |
| 35 | Other | LUCIA | Lucia | 2 |
| 36 | Other | LURA | to trick | 4 |
| 37 | Other | VANLIG | common | 4 |
| 38 | Other | OVANLIG | uncommon | 2 |
| 39 | Other | NYTT | new | 2 |
| 40 | Other | PRINSESSA | princess | 3 |
| 41 | Other | HÄR | here | 1 |
| 42 | Other | VÄLKOMMEN | welcome | 2 |
| 43 | Other | TÅRTA | cake | 2 |
| 44 | Other | ÄMNE/TEMA | topic/theme | 2 |
| 45 | Other | NIVÅ | level | 4 |
Abbreviations
ASL American Sign Language
BSL British Sign Language
DGS German Sign Language (German: Deutsche Gebärdensprache)
NZSL New Zealand Sign Language
STS Swedish Sign Language (Swedish: svenskt teckenspråk)
Data availability
Data and code can be found at: https://osf.io/7pdt5.
Ethics and consent
Participation was voluntary and informed consent was obtained from all participants. Responses were anonymized from the start, with participants’ sociolinguistic profiles aggregated at a level that ensures individual anonymity. The study was judged exempt from formal ethical review, since no personally identifiable information was collected.
Funding information
This study was funded by a small grant from Erik Wellanders fond, Institute for Language and Folklore (Isof; Institutet för språk och folkminnen).
Acknowledgements
We thank all the people who participated in either the online or elicitation-based survey, and three reviewers for their helpful suggestions.
Competing interests
The authors have no competing interests to declare.
Author contributions
The project idea was devised by PSA, who also served as PI and wrote the funding application. The elicitation data was collected mainly by EB and TB, with additional data collected by PSA. The data was processed by CB with support from PSA and TB. Additional manual annotations of the data were done by PSA and CB. Statistical analyses and visualizations were done by CB with support from PSA. The text was originally drafted by CB with support from PSA and later reviewed by all authors.
Notes
1. https://teckensprakslexikon.su.se/verktyg/region.
2. https://teckensprakslexikon.su.se/verktyg/mindre-vanliga/alderdomligt.
3. https://archive.mpi.nl.
4. These items were included to get input from signers in the elicitation sessions on which forms they would actually use.
5. We calculate the diversity index (entropy) with a base-2 logarithm, in order to follow Mudd et al. (2020).
6. The original meaning of ‘ten öre’ (öre being the currency subdivision equivalent to cents) is rarely used today – see https://teckensprakslexikon.su.se/ord/04474. There is additionally a bimorphemic sign for ‘ten crowns’, which uses the same handshape as the modern ‘ten’: https://teckensprakslexikon.su.se/ord/19342.
7. Depicting signs are potentially less conventionalized, highly iconic constructions used extensively in narrative texts.
8. For visualization purposes, we excluded meanings with more than two variants in the corpus.
9. An example is the ASL sign for ‘since’ (Hochgesang et al. 2025: https://aslsignbank.com/dictionary/gloss/1066.html) which can occasionally be observed in STS.
References
Adam, Robert. 2015. Standardization of sign languages. Sign Language Studies 15(4). 432–445. DOI: http://doi.org/10.1353/sls.2015.0015
Arel-Bundock, Vincent & Greifer, Noah & Heiss, Andrew. 2024. How to interpret statistical models using marginaleffects for R and Python. Journal of Statistical Software 111(9). 1–32. DOI: http://doi.org/10.18637/jss.v111.i09
Bates, Douglas & Mächler, Martin & Bolker, Ben & Walker, Steve. 2015. Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67(1). 1–48. DOI: http://doi.org/10.18637/jss.v067.i01
Bayley, Robert & Schembri, Adam C. & Lucas, Ceil. 2015. Variation and change in sign languages. In Schembri, Adam C. & Lucas, Ceil (eds.), Sociolinguistics and deaf communities, 61–94. Cambridge: Cambridge University Press 1st edn. DOI: http://doi.org/10.1017/CBO9781107280298.004
Becker, Amelia A. & Hochgesang, Julie A. & Tamminga, Meredith & Fisher, Jami N. 2023. Sociophonetics and signed languages. In The Routledge handbook of sociophonetics, 467–488. London: Routledge 1st edn. DOI: http://doi.org/10.4324/9781003034636-25
Bergman, Brita. 1977. Tecknad svenska. Stockholm: Liber.
Bickford, J. Albert. 1991. Lexical variation in Mexican Sign Language. Sign Language Studies 72(Fall 1991). 241–276. DOI: http://doi.org/10.1353/sls.1991.0010
Börstell, Carl. 2022. Introducing the signglossR package. In Efthimiou, Eleni & Fotinea, Stavroula-Evita & Hanke, Thomas & Hochgesang, Julie A. & Kristoffersen, Jette & Mesch, Johanna & Schulder, Marc (eds.), Proceedings of the LREC2022 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources. 16–23. Marseille, France: European Language Resources Association (ELRA). https://www.sign-lang.uni-hamburg.de/lrec/pub/22006.pdf.
Börstell, Carl. 2023. Lexical comprehension within and across sign languages of Belgium, China and the Netherlands. Glossa: a journal of general linguistics 8(1). DOI: http://doi.org/10.16995/glossa.9902
Börstell, Carl. 2024a. How to approach lexical variation in sign language corpora. In Efthimiou, Eleni & Fotinea, Stavroula-Evita & Hanke, Thomas & Hochgesang, Julie A. & Mesch, Johanna & Schulder, Marc (eds.), Proceedings of the LREC-COLING 2024 11th Workshop on the Representation and Processing of Sign Languages: Evaluation of Sign Language Resources, 222–229. Torino, Italy: ELRA Language Resources Association (ELRA) and the International Committee on Computational Linguistics (ICCL). https://www.sign-lang.uni-hamburg.de/lrec/pub/24026.pdf.
Börstell, Carl. 2024b. swemapdata: A package of spatial data for Sweden – regions and cities. https://github.com/borstell/swemapdata. R package version 0.0.2.
Börstell, Carl. 2024c. tidysigns: Sign language corpus data in a tidy format. https://github.com/borstell/tidysigns. R package version 0.2.2.
Börstell, Carl & Hörberg, Thomas & Östling, Robert. 2016. Distribution and duration of signs and parts of speech in Swedish Sign Language. Sign Language & Linguistics 19(2). 143–196. DOI: http://doi.org/10.1075/sll.19.2.01bor
Börstell, Carl & Östling, Robert. 2016. Visualizing lects in a sign language corpus: Mining lexical variation data in lects of Swedish Sign Language. In Efthimiou, Eleni & Fotinea, Stavroula-Evita & Hanke, Thomas & Hochgesang, Julie A. & Kristoffersen, Jette & Mesch, Johanna (eds.), Proceedings of the LREC2016 7th Workshop on the Representation and Processing of Sign Languages: Corpus Mining. 13–18. Portorož, Slovenia: European Language Resources Association (ELRA). https://www.sign-lang.uni-hamburg.de/lrec/pub/16004.pdf.
Börstell, Carl & Schembri, Adam & Crasborn, Onno. 2024. Sign duration and signing rate in British Sign Language, Dutch Sign Language and Swedish Sign Language. Glossa Psycholinguistics 3(1). DOI: http://doi.org/10.5070/G60111915
Brentari, Diane. (ed.) 2010. Sign languages: A Cambridge language survey. New York, NY: Cambridge University Press. DOI: http://doi.org/10.1017/CBO9780511712203
Chen, Yaqing & Gong, Qunhu. 2020. Dialects or languages: A corpus-based quantitative approach to lexical variation in common signs in Chinese Sign Language (CSL). Lingua 248. 102944. DOI: http://doi.org/10.1016/j.lingua.2020.102944
Cormier, Kearsy & Fenlon, Jordan & Johnston, Trevor & Rentelis, Ramas & Schembri, Adam & Rowley, Katherine & Adam, Robert & Woll, Bencie. 2012. From corpus to lexical database to online dictionary: Issues in annotation of the BSL Corpus and the development of BSL SignBank. In Crasborn, Onno & Efthimiou, Eleni & Fotinea, Stavroula-Evita & Hanke, Thomas & Kristoffersen, Jette & Mesch, Johanna (eds.), Proceedings of the LREC2012 5th Workshop on the Representation and Processing of Sign Languages: Interactions between Corpus and Lexicon, 7–12. Istanbul, Turkey: European Language Resources Association (ELRA). https://www.sign-lang.uni-hamburg.de/lrec/pub/12033.pdf.
Duggan, Nora & Holmström, Ingela & Schönström, Krister. 2023. Translanguaging practices in adult education for deaf migrants. DELTA: Documentação de Estudos em Lingüística Teórica e Aplicada 39(1). 202359764. DOI: http://doi.org/10.1590/1678-460x202359764
Ebling, Sarah & Konrad, Reiner & Braem, Penny Boyes & Langer, Gabriele. 2015. Factors to consider when making lexical comparisons of sign languages: Notes from an ongoing comparison of German Sign Language and Swiss German Sign Language. Sign Language Studies 16(1). 30–56. DOI: http://doi.org/10.1353/sls.2015.0024
Eckert, Penelope. 2012. Three waves of variation study: The emergence of meaning in the study of sociolinguistic variation. Annual Review of Anthropology 41(1). 87–100. DOI: http://doi.org/10.1146/annurev-anthro-092611-145828
Eckert, Penelope. 2017. Age as a sociolinguistic variable. In Coulmas, Florian (ed.), The handbook of sociolinguistics, 151–167. Wiley 1st edn. DOI: http://doi.org/10.1002/9781405166256.ch9
Fenlon, Jordan & Hochgesang, Julie A. (eds.) 2022. Signed language corpora (Sociolinguistics in Deaf Communities 25). Washington, DC: Gallaudet University Press. DOI: http://doi.org/10.2307/j.ctv2rcnfhc
Grieve, Jack & Nini, Andrea & Guo, Diansheng. 2018. Mapping lexical innovation on American social media. Journal of English Linguistics 46(4). 293–319. DOI: http://doi.org/10.1177/0075424218793191
Hanke, Thomas. 2016. Towards a visual sign language corpus linguistics. In Efthimiou, Eleni & Fotinea, Stavroula-Evita & Hanke, Thomas & Hochgesang, Julie A. & Kristoffersen, Jette & Mesch, Johanna (eds.), Proceedings of the LREC2016 7th Workshop on the Representation and Processing of Sign Languages: Corpus Mining. 89–92. Portorož, Slovenia: European Language Resources Association (ELRA). https://www.sign-lang.uni-hamburg.de/lrec/pub/16024.pdf.
Hanke, Thomas & Jahn, Elena & Wähl, Sabrina & Böse, Oliver & König, Lutz. 2020. SignHunter – A sign elicitation tool suitable for deaf events. In Efthimiou, Eleni & Fotinea, Stavroula-Evita & Hanke, Thomas & Hochgesang, Julie A. & Kristoffersen, Jette & Mesch, Johanna (eds.), Proceedings of the LREC2020 9th Workshop on the Representation and Processing of Sign Languages: Sign Language Resources in the Service of the Language Community, Technological Challenges and Application Perspectives. 83–88. Marseille, France: European Language Resources Association (ELRA). https://www.sign-lang.uni-hamburg.de/lrec/pub/20030.pdf.
Hanke, Thomas & Konrad, Reiner & Langer, Gabriele. 2023. Exploring regional variation in the DGS Corpus. In Wehrmeyer, Ella (ed.), Studies in Corpus Linguistics, vol. 108, 192–218. Amsterdam: John Benjamins Publishing Company. DOI: http://doi.org/10.1075/scl.108.07han
Hester, Jim & Bryan, Jennifer. 2024. glue: Interpreted string literals. https://CRAN.R-project.org/package=glue. R package version 1.8.0.
Hiddinga, Anja & Crasborn, Onno. 2011. Signed languages and globalization. Language in Society 40(4). 483–505. DOI: http://doi.org/10.1017/S0047404511000480
Hochgesang, Julie A. & Crasborn, Onno & Lillo-Martin, Diane. 2025. ASL Signbank. Haskins Lab, Yale University New Haven, CT. https://aslsignbank.com.
Horton, Laura. 2022. Lexical overlap in young sign languages from Guatemala. Glossa: a journal of general linguistics 7(1). DOI: http://doi.org/10.16995/glossa.5829
Hou, Lynn & de Vos, Connie. 2022. Classifications and typologies: Labeling sign languages and signing communities. Journal of Sociolinguistics 26(1). 118–125. DOI: http://doi.org/10.1111/josl.12490
Hou, Lynn & Lepic, Ryan & Wilkinson, Erin. 2022. Managing sign language video data collected from the internet. In Berez-Kroeker, Andrea L. & McDonnell, Bradley & Koller, Eve & Collister, Lauren B. (eds.), The open handbook of linguistic data management, 471–480. The MIT Press. DOI: http://doi.org/10.7551/mitpress/12200.003.0045
Israel, Assaf & Sandler, Wendy. 2011. Phonological category resolution in a new sign language: A comparative study of handshapes. In Channon, Rachel & van der Hulst, Harry (eds.), Formational units in sign languages, 177–202. Berlin/Boston, MA: De Gruyter Mouton. DOI: http://doi.org/10.1515/9781614510680.177
Johnston, Trevor & Schembri, Adam. 1999. On defining lexeme in a signed language. Sign Language & Linguistics 2(2). 115–185. DOI: http://doi.org/10.1075/sll.2.2.03joh
Kankkonen, Nikolaus Riemer & Björkstrand, Thomas & Mesch, Johanna & Börstell, Carl. 2018. Crowdsourcing for the Swedish Sign Language Dictionary. In Bono, Mayumi & Efthimiou, Eleni & Fotinea, Stavroula-Evita & Hanke, Thomas & Hochgesang, Julie A. & Kristoffersen, Jette & Mesch, Johanna & Osugi, Yutaka (eds.), Proceedings of the LREC2018 8th Workshop on the Representation and Processing of Sign Languages: Involving the Language Community. 171–176. Miyazaki, Japan: European Language Resources Association (ELRA). https://www.sign-lang.uni-hamburg.de/lrec/pub/18022.pdf.
Kay, Matthew. 2023. ggdist: Visualizations of distributions and uncertainty in the grammar of graphics. IEEE Transactions on Visualization and Computer Graphics, 1–11. DOI: http://doi.org/10.1109/TVCG.2023.3327195
Kimmelman, Vadim & Komarova, Anna & Luchkova, Lyudmila & Vinogradova, Valeria & Alekseeva, Oksana. 2022. Exploring networks of lexical variation in Russian Sign Language. Frontiers in Psychology 12. 740734. DOI: http://doi.org/10.3389/fpsyg.2021.740734
Konrad, Reiner. 2013. The lexical structure of German Sign Language (DGS) in the light of empirical LSP lexicography: On how to integrate iconicity in a corpus-based lexicon model. Sign Language & Linguistics 16(1). 111–118. DOI: http://doi.org/10.1075/sll.16.1.07kon
Kopf, Maria & Schulder, Marc & Hanke, Thomas. 2021. Overview of datasets for the sign languages of Europe. DOI: http://doi.org/10.25592/UHHFDM.9560
Kotsinas, Ulla-Britt. 2004. Ungdomsspråk. Uppsala: Hallgren & Fallgren 3rd edn.
Kusters, Annelies. 2021. International Sign and American Sign Language as different types of global deaf lingua francas. Sign Language Studies 21(4). 391–426. DOI: http://doi.org/10.1353/sls.2021.0005
Labov, William. 1972. Sociolinguistic patterns (Conduct and Communication 4). Philadelphia, PA: University of Pennsylvania Press.
Langer, Gabriele. 2012. A colorful first glance at data on regional variation extracted from the DGS-Corpus: With a focus on procedures. In Crasborn, Onno & Efthimiou, Eleni & Fotinea, Stavroula-Evita & Hanke, Thomas & Kristoffersen, Jette & Mesch, Johanna (eds.), Proceedings of the LREC2012 5th Workshop on the Representation and Processing of Sign Languages: Interactions between Corpus and Lexicon. 101–108. Istanbul, Turkey: European Language Resources Association (ELRA). https://www.sign-lang.uni-hamburg.de/lrec/pub/12017.pdf.
LeMaster, Barbara. 2000. Reappropriation of gendered Irish Sign Language in one family. Visual Anthropology Review 15(2). 69–83. DOI: http://doi.org/10.1525/var.2000.15.2.69
Lepic, Ryan. 2019. A usage-based alternative to “lexicalization” in sign language linguistics. Glossa: a journal of general linguistics 4(1). 23. DOI: http://doi.org/10.5334/gjgl.840
Lucas, Ceil. 2003. The role of variation in lexicography. Sign Language Studies 3(3). 322–340. DOI: http://doi.org/10.1353/sls.2003.0009
Lucas, Ceil & Bayley, Robert & Valli, Clayton. 2003. What’s your sign for PIZZA?: An introduction to variation in American Sign Language. Washington, DC: Gallaudet University Press.
Lucas, Ceil & Bayley, Robert & Valli, Clayton. 2009. Sociolinguistic variation in American Sign Language. Gallaudet University Press. DOI: http://doi.org/10.2307/j.ctv2rh2959
Lutzenberger, Hannah & de Vos, Connie & Crasborn, Onno & Fikkert, Paula. 2021. Formal variation in the Kata Kolok lexicon. Glossa: a journal of general linguistics 6(1). DOI: http://doi.org/10.16995/glossa.5880
Lutzenberger, Hannah & Mudd, Katie & Stamp, Rose & Schembri, Adam. 2023. The social structure of signing communities and lexical variation: A cross-linguistic comparison of three unrelated sign languages. Glossa: a journal of general linguistics 8(1). DOI: http://doi.org/10.16995/glossa.10229
Martinez Del Rio, Aurora. 2023. Repetition reduction across the lexicon of American Sign Language: University of Chicago dissertation. DOI: http://doi.org/10.6082/UCHICAGO.7634
Massicotte, Philippe & South, Andy. 2025. rnaturalearth: World map data from Natural Earth. DOI: http://doi.org/10.32614/CRAN.package.rnaturalearth. https://CRAN.R-project.org/package=rnaturalearth. R package version 1.1.0.
Matthews, Philip W. & McKee, Rachel L. & McKee, David. 2009. Signed languages, linguistic rights and the standardization of geographical names. In Ahrens, Wolfgang & Embleton, Sheila & Lapierre, André (eds.), Names in multi-lingual, multi-cultural and multi-ethnic contact: Proceedings of the 23rd International Congress of Onomastic Sciences August 17–22, 2008, York University Toronto, Canada, 721–732. Toronto: York University.
McBurney, Susan. 2012. History of sign languages and sign language linguistics. In Pfau, Roland & Steinbach, Markus & Woll, Bencie (eds.), Sign language: An international handbook, 909–948. Berlin/Boston, MA: Walter de Gruyter. DOI: http://doi.org/10.1515/9783110261325.909
McKee, David & McKee, Rachel & Major, George. 2011. Numeral variation in New Zealand Sign Language. Sign Language Studies 12(1). 72–97. DOI: http://doi.org/10.1353/sls.2011.0015
McKee, Rachel & McKee, David. 2011. Old signs, new signs, whose signs?: Sociolinguistic variation in the NZSL lexicon. Sign Language Studies 11(4). 485–527. DOI: http://doi.org/10.1353/sls.2011.0012
McKee, Rachel & McKee, David. 2020. Globalization, hybridity, and vitality in the linguistic ideologies of New Zealand Sign Language users. Language & Communication 74. 164–181. DOI: http://doi.org/10.1016/j.langcom.2020.07.001
McKee, Rachel & Vale, Mireille & Alexander, Sara Pivac & McKee, David. 2021. Signs of globalization: ASL influence in the lexicon of New Zealand Sign Language. Sign Language Studies 22(2). 283–319. DOI: http://doi.org/10.1353/sls.2021.0022
Meir, Irit & Israel, Assaf & Sandler, Wendy & Padden, Carol A. & Aronoff, Mark. 2012. The influence of community on language structure: Evidence from two young sign languages. Linguistic Variation 12(2). 247–291. DOI: http://doi.org/10.1075/lv.12.2.04mei
Mesch, Johanna & Wallin, Lars & Nilsson, Anna-Lena & Bergman, Brita. 2012. Dataset. Swedish Sign Language Corpus project 2009–2011 (version 1). Department of Linguistics, Stockholm University. https://teckensprakskorpus.su.se.
Mudd, Katie & Lutzenberger, Hannah & de Vos, Connie & Fikkert, Paula & Crasborn, Onno & De Boer, Bart. 2020. The effect of sociolinguistic factors on variation in the Kata Kolok lexicon. Asia-Pacific Language Variation 6(1). 53–88. DOI: http://doi.org/10.1075/aplv.19009.mud
Müller, Kirill. 2020. here: A simpler way to find your files. https://CRAN.R-project.org/package=here. R package version 1.0.1.
Österberg, Oskar. 1916. Teckenspråket. Uppsala: P. Alfr. Persons förlag.
Palfreyman, Nick. 2019. Variation in Indonesian Sign Language: A typological and sociolinguistic analysis. Berlin/Boston, MA: De Gruyter Mouton. DOI: http://doi.org/10.1515/9781501504822
Pebesma, Edzer & Bivand, Roger. 2023. Spatial data science: With applications in R. 1st edn. New York, NY: Chapman and Hall/CRC. DOI: http://doi.org/10.1201/9780429459016
Power, Justin M. 2022. Historical linguistics of sign languages: Progress and problems. Frontiers in Psychology 13. 818753. DOI: http://doi.org/10.3389/fpsyg.2022.818753
Quinto-Pozos, David & Adam, Robert. 2015. Sign languages in contact. In Schembri, Adam C. & Lucas, Ceil (eds.), Sociolinguistics and deaf communities, 1st edn., 29–60. Cambridge: Cambridge University Press. DOI: http://doi.org/10.1017/CBO9781107280298.003
R Core Team. 2025. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Safar, Josefina. 2021. What’s your sign for TORTILLA? Documenting lexical variation in Yucatec Maya Sign Languages. Language Documentation & Conservation 15. 30–74. http://hdl.handle.net/10125/24970.
Safar, Josefina & Le Guen, Olivier & Collí Collí, Geli & Collí Hau, Merli. 2018. Numeral variation in Yucatec Maya Sign Languages. Sign Language Studies 18(4). 488–516. DOI: http://doi.org/10.1353/sls.2018.0014
Sagara, Keiko & Zeshan, Ulrike. 2016. Semantic fields in sign languages – a comparative typological study. In Zeshan, Ulrike & Sagara, Keiko (eds.), Semantic fields in sign languages, 3–38. Berlin/Boston, MA: De Gruyter. DOI: http://doi.org/10.1515/9781501503429-001
Sandler, Wendy & Aronoff, Mark & Meir, Irit & Padden, Carol. 2011. The gradual emergence of phonological form in a new language. Natural Language & Linguistic Theory 29(2). 503–543. DOI: http://doi.org/10.1007/s11049-011-9128-2
Schermer, Trude. 2004. Lexical variation in Sign Language of the Netherlands. In Van Herreweghe, Mieke & Vermeerbergen, Myriam (eds.), To the lexicon and beyond: Sociolinguistics in European Deaf communities, 91–110. Washington, DC: Gallaudet University Press. DOI: http://doi.org/10.2307/j.ctv2rh28cx.9
Schönström, Krister & Holmström, Ingela. 2021. Four decades of sign bilingual schools in Sweden: From acclaimed to challenged. In Snoddon, Kristin & Weber, Joanne (eds.), Critical perspectives on plurilingualism in deaf education, 15–34. Multilingual Matters. DOI: http://doi.org/10.21832/9781800410756-004
Shannon, C. E. 1948. A mathematical theory of communication. Bell System Technical Journal 27(3). 379–423. DOI: http://doi.org/10.1002/j.1538-7305.1948.tb01338.x
Stamp, Rose & Schembri, Adam & Evans, Bronwen G. & Cormier, Kearsy. 2016. Regional sign language varieties in contact: Investigating patterns of accommodation. Journal of Deaf Studies and Deaf Education 21(1). 70–82. DOI: http://doi.org/10.1093/deafed/env043
Stamp, Rose & Schembri, Adam & Fenlon, Jordan & Rentelis, Ramas. 2015. Sociolinguistic variation and change in British Sign Language number signs: Evidence of leveling? Sign Language Studies 15(2). 151–181. DOI: http://doi.org/10.1353/sls.2015.0001
Stamp, Rose & Schembri, Adam & Fenlon, Jordan & Rentelis, Ramas & Woll, Bencie & Cormier, Kearsy. 2014. Lexical variation and change in British Sign Language. PLoS ONE 9(4). e94053. DOI: http://doi.org/10.1371/journal.pone.0094053
Stokoe, William C. 1960. Sign language structure: An outline of the visual communication system of the American Deaf. In Studies in linguistics: Occasional papers (No. 8). Buffalo, NY: Dept. of Anthropology and Linguistics, University of Buffalo.
Stokoe, William C. & Casterline, Dorothy C. & Croneberg, Carl G. 1965. A dictionary of American Sign Language on linguistic principles. Silver Spring, MD: Linstok Press.
Svenskt teckenspråkslexikon. 2024. Swedish Sign Language Dictionary online. Department of Linguistics, Stockholm University. https://teckensprakslexikon.su.se.
Vanhecke, Eline & De Weerdt, Kristof. 2004. Regional variation in Flemish Sign Language. In Van Herreweghe, Mieke & Vermeerbergen, Myriam (eds.), To the lexicon and beyond: Sociolinguistics in European Deaf communities, 27–38. Washington, DC: Gallaudet University Press. DOI: http://doi.org/10.2307/j.ctv2rh28cx.6
Vejdemo, Susanne & Hörberg, Thomas. 2016. Semantic factors predict the rate of lexical replacement of content words. PLoS ONE 11(1). e0147924. DOI: http://doi.org/10.1371/journal.pone.0147924
Vermeerbergen, Myriam & Nijen Twilhaar, Jan & Van Herreweghe, Mieke. 2013. Variation between and within Sign Language of the Netherlands and Flemish Sign Language. In Hinskens, Frans & Taeldeman, Johan (eds.), Dutch, vol. 3 (Handbooks of Linguistics and Communication Science [HSK] 30/3), 680–699. De Gruyter Mouton. DOI: http://doi.org/10.1515/9783110261332.680
Wähl, Sabrina & Langer, Gabriele & Müller, Anke. 2018. Hand in hand – using data from an online survey system to support lexicographic work. In Bono, Mayumi & Efthimiou, Eleni & Fotinea, Stavroula-Evita & Hanke, Thomas & Hochgesang, Julie A. & Kristoffersen, Jette & Mesch, Johanna & Osugi, Yutaka (eds.), Proceedings of the LREC2018 8th Workshop on the Representation and Processing of Sign Languages: Involving the Language Community, 199–206. Miyazaki, Japan: European Language Resources Association (ELRA). https://www.sign-lang.uni-hamburg.de/lrec/pub/18025.pdf.
Wickham, Hadley & Averick, Mara & Bryan, Jennifer & Chang, Winston & McGowan, Lucy D’Agostino & François, Romain & Grolemund, Garrett & Hayes, Alex & Henry, Lionel & Hester, Jim & Kuhn, Max & Pedersen, Thomas Lin & Miller, Evan & Bache, Stephan Milton & Müller, Kirill & Ooms, Jeroen & Robinson, David & Seidel, Dana Paige & Spinu, Vitalie & Takahashi, Kohske & Vaughan, Davis & Wilke, Claus & Woo, Kara & Yutani, Hiroaki. 2019. Welcome to the tidyverse. Journal of Open Source Software 4(43). 1686. DOI: http://doi.org/10.21105/joss.01686
Wickham, Hadley & Bryan, Jennifer. 2025. readxl: Read Excel files. https://CRAN.R-project.org/package=readxl.
Wickham, Hadley & Pedersen, Thomas Lin & Seidel, Dana. 2025. scales: Scale functions for visualization. DOI: http://doi.org/10.32614/CRAN.package.scales. https://CRAN.R-project.org/package=scales. R package version 1.4.0.
Wilke, Claus O. & Wiernik, Brenton M. 2022. ggtext: Improved text rendering support for ‘ggplot2’. https://CRAN.R-project.org/package=ggtext. R package version 0.1.2.
Witte, Erik & Björkstrand, Thomas & Schönström, Krister & Danielsson, Henrik & Holmer, Emil. 2025. A Swedish Sign Language database of video-recorded sign–pseudosign pairs with matching neighborhood density and phonotactic probability. Journal of Speech, Language, and Hearing Research 68(7). 3291–3304. DOI: http://doi.org/10.1044/2025_JSLHR-24-00761
Wittenburg, Peter & Brugman, Hennie & Russel, Albert & Klassmann, Alex & Sloetjes, Han. 2006. ELAN: A professional framework for multimodality research. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006), 1556–1559. https://aclanthology.org/L06-1082/.
Zeshan, Ulrike & Sagara, Keiko (eds.). 2016. Semantic fields in sign languages: Colour, kinship and quantification. Berlin/Boston, MA: De Gruyter. DOI: http://doi.org/10.1515/9781501503429