Publisher’s note: The original publication contained the incorrect supplementary file. This version of the supplementary file was published on 03/03/2020.
Personal pronouns in German are ambiguous with a weak preference to refer to the most prominent antecedent in the discourse, whereas demonstrative pronouns strongly prefer a less prominent antecedent; here the prominence could be because the antecedent is the subject of the clause (Bosch et al. 2007; Kaiser 2011), agent of the clause (Schumacher et al. 2017), topic of the discourse (Bosch & Umbach 2007; Hinterwimmer 2015) or perspectival center of the narration (Hinterwimmer & Bosch 2018). For example, in (1), the personal pronoun er has a weak preference to refer to Peter, the most prominent referent being the subject (and also the agent and the topic) of the clause, but the demonstrative pronoun der clearly refers to Paul, the less prominent referent being the object of the clause.
This contrastive behavior of German personal and demonstrative pronouns has been studied through corpus, behavioral and ERP studies, as well as from a theoretical perspective. Based on corpus and experimental studies, Bosch et al. (2007) claimed that German demonstrative pronouns avoid subjects as antecedents. However, Bosch & Umbach (2007), Hinterwimmer (2015) and Bosch & Hinterwimmer (2016) later argued that, where subjecthood and topichood diverge, it is in fact the (discourse or aboutness) topic that is avoided and not the subject (see also Ellert 2013 for eye-tracking studies reporting an influence of information structure on personal and demonstrative pronouns). On the other hand, Schumacher et al. (2015), Schumacher et al. (2016) and Schumacher et al. (2017) showed in various experiments that agentivity is also a crucial factor in determining the antecedent of the demonstrative pronoun, and that subjecthood might in fact be an epiphenomenon because in many cases subjecthood and agenthood overlap. Finally, in recent work, Hinterwimmer et al. (2019) have provided more experimental evidence and suggested a modified version of Hinterwimmer & Bosch (2018) that encompasses the existing results. Their account suggests that demonstrative pronouns generally avoid the most prominent discourse referents as antecedents or binders. Interestingly, Bosch et al. (2003) had hypothesized something similar, namely that demonstrative pronouns prefer less salient referents, and had provided initial support for their hypothesis by analyzing the NEGRA corpus of written German.
The emphasis of these studies has been on the demonstratives from the die-paradigm: die (feminine), der (masculine) and das (neuter). However, German also has another set of demonstratives, the diese-paradigm: diese (feminine), dieser (masculine) and dieses (neuter), which have rarely been studied until now. Diese-demonstratives are regarded as proximal demonstratives, with their distal counterparts being the jene-demonstratives (Weinrich 1993). Since die-demonstratives avoid the most prominent referent, in the simplest case where the pronoun and its antecedent are in the same sentence, and the sentence is in canonical word order (SVO) with an accusative verb, this constraint translates to subject-avoidance. Since diese-demonstratives are also demonstratives, it is conceivable that they obey the prominence-avoidance constraint as well.
Fuchs & Schumacher (2018) have recently claimed that although both demonstratives function as attention-orienting devices, diese-demonstratives have a stronger potential to shift attention. Historically, diese-demonstratives have been claimed to prefer the last mentioned entity as the antecedent (Zifonun et al. 1997; for a similar linear order antecedent preference for the demonstrative tämä in Finnish see Kaiser & Trueswell 2008). For example, in (2) from Zifonun et al. (1997) the personal pronoun er and the demonstrative der refer to Peter, but the demonstrative dieser prefers the last mentioned Benz and hence the sentence sounds implausible. On the other hand, German native speakers intuitively associate diese-demonstratives with the formal language register (or style). Overall, however, there has been a lack of experimental evidence supporting either of these judgments about diese-demonstratives. The goal of this paper is to experimentally study the constraints on diese-demonstratives and, at the same time, contrast them with the known constraints on die-demonstratives. Essentially, we want to contrast the antecedent preference for the two demonstratives in formal and informal language.
Since some German native speakers express the intuition that diese-demonstratives are used more often in formal texts such as legal documents, we wanted to test not only linguistic constraints but also whether variation in the formality of language use affects the way diese-demonstratives are used. In the field of sociolinguistic variation, formality is considered the main scale or the most important dimension along which the language register or style varies (Biber 1988; Heylighen & Dewaele 2002), with formal and colloquial being the two extremes of the scale. Style is treated as a speaker-internal constraint because the same speaker may use different styles depending on the internal or external conditions for language use, and a native speaker of a language is considered to be aware of variations in the language through formality:
“We may try to relate the level of formality chosen to a variety of factors: the kind of occasion; the various social, age, and other differences that exist between the participants; the particular task that is involved, e.g., writing or speaking; the emotional involvement of one or more of the participants; and so on. We appreciate that such distinctions exist when we recognize the stylistic appropriateness of What do you intend to do, your majesty? and the inappropriateness of Waddya intend doin’, Rex? [both emphases are from the original text]. While it may be difficult to characterize discrete levels of formality, it is nevertheless possible to show that native speakers of all languages control a range of stylistic varieties. It is also quite possible to predict with considerable confidence the stylistic features that a native speaker will tend to employ on certain occasions.” (Wardhaugh 2006: 51)
Empirical work has also shown that language users are aware of variations in formality at the sentence level (Lahiri et al. 2011) and across different types of texts (Pavlick & Tetreault 2016). To quantify formality, Heylighen & Dewaele (2002) have proposed a measure, the Formality Score (F-score), based on the frequency of occurrence of various word categories in a document. They have shown that the F-score can be used to adequately classify documents into different genres across seven different languages. In applied domains such as natural language processing and artificial intelligence, various quantitative measures of formality have been used to automatically classify documents (Abu Sheikha & Inkpen 2010; Pavlick & Tetreault 2016).
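As a rough illustration of how such a score can be computed, the sketch below follows the formula as it is commonly cited from Heylighen & Dewaele (2002): the percentage frequencies of nouns, adjectives, prepositions and articles are added, those of pronouns, verbs, adverbs and interjections are subtracted, and the result is rescaled to lie between 0 and 100. The function names and the word counts are hypothetical and serve only as an example; part-of-speech tagging is assumed to have been done beforehand.

```python
# Illustrative sketch of the Formality Score (F-score); not code from the study.
# Category names and counts below are hypothetical.

FORMAL_CATEGORIES = ("noun", "adjective", "preposition", "article")
DEICTIC_CATEGORIES = ("pronoun", "verb", "adverb", "interjection")


def to_percentages(counts, total_words):
    """Convert raw category counts to percentages of all words in the text."""
    return {cat: 100.0 * n / total_words for cat, n in counts.items()}


def formality_score(freq):
    """Compute the F-score from percentage frequencies of word categories."""
    plus = sum(freq.get(cat, 0.0) for cat in FORMAL_CATEGORIES)
    minus = sum(freq.get(cat, 0.0) for cat in DEICTIC_CATEGORIES)
    return (plus - minus + 100.0) / 2.0


# Hypothetical counts for a 100-word text.
example_counts = {
    "noun": 30, "adjective": 10, "preposition": 15, "article": 15,
    "pronoun": 10, "verb": 15, "adverb": 5, "interjection": 0,
}
example_score = formality_score(to_percentages(example_counts, total_words=100))
print(example_score)
```

Higher values indicate a more formal text under this measure; a text in which the two groups of categories are equally frequent receives a score of 50.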
On the experimental side, Ricks (2018) reports results from a questionnaire study which shows how variation in register (formal, informal and ethnic) by an electoral candidate can shape the political opinions of voters. In the questionnaire, participants heard short audio excerpts of a political address and were asked to rate their agreeability on a five-point Likert scale, followed by open-ended questions about the virtuousness of the speaker as a member of parliament. The excerpts were either in formal Thai, informal Thai or in the Isan ethnic language. The ethnic variety led to stronger agreeability than the formal variety, and the informal and ethnic varieties gave rise to higher relatedness with the speaker. At the perception level, language formality has also been shown to improve participants’ attention during online experiments (August & Reinecke 2019). In another exploratory sociophonetic study, Winter & Grawunder (2012) show that language formality modulates a number of vocal expressions in Korean. They show that when Korean speakers were asked to produce messages varying in the formality of the situation, the speakers modulated their average fundamental frequency, pitch range, speech rate and breath intake depending on the formality of the situation.
Interestingly, linguistic research on register has shown that register variation can license some syntactic constructions that are normally regarded as ungrammatical. For example, subject drop in (3b), although considered ungrammatical in usual speech, is grammatical in a reduced written register such as recipes (Haegeman 2013). Such register-specific subject drop in English and French has been analysed using core grammatical principles (Haegeman 2013), whereas register-specific object drop in English has been analysed by assuming a register-specific lexicon (Weir 2017). However, linguistic research on register is mostly confined to formal analyses. We are not aware of any experimental work where formality or another dimension of language register is explicitly manipulated to test whether it induces differential linguistic responses from native speakers. For our research goal, it is necessary that we explicitly vary formality and test whether it has any effect on the acceptability of diese-demonstratives.
(3) a. This dish serves four people.
    b. *___ Serves four people.
In our experiments, to manipulate formality, we operationalized language register by exposing native speakers to text written in either formal or informal language. If the intuitions of native speakers about the “formality” of German diese-demonstratives are reliable, we expect to see its effect in their judgments. Since the goal of this paper is also to test whether the subject-avoidance constraint observed for the relatively well-studied die-demonstratives holds for diese-demonstratives, the other manipulation was in terms of available antecedents. Eventually it would be interesting to find out whether the prominence-avoidance constraint on die-demonstratives is also applicable to diese-demonstratives, but here we restrict ourselves to only one of the prominence-lending cues, namely subjecthood.
In the next section we report two forced-choice studies that test the effect of language register between experiments and compare subject vs. object antecedent preference of the two demonstratives within each experiment. In light of the results from these two experiments, we motivate a design for another forced-choice study that evaluates the claim from Zifonun et al. (1997) that diese-demonstratives prefer the last mentioned entity as the antecedent.
Here we report two experiments together since they are very closely related with only the difference of language registers used in them. With these two experiments we tested German native speakers’ preferences in producing diese-demonstratives, die-demonstratives and personal pronouns in a syntactically bound configuration. Since personal pronouns are the most frequent and least restricted pronouns, they provided the baseline for various comparisons. Experiment 1a used the formal and Experiment 1b used the informal language register. In essence, we wanted to test: (i) Do diese-demonstratives prefer formal language over informal language? (ii) Does the subject-avoidance constraint observed for die-demonstratives apply to diese-demonstratives?
In both experiments participants were presented with single sentences in which the place for the pronoun was left empty. Participants were instructed to fill it by selecting one of the options from a drop-down menu, a forced-choice task. The options were two of the three types of pronouns we are considering here, plus weder noch ‘neither’ as the third option. Unlike in the conventional two-alternative forced choice, the third option was included as a way for participants to reject both pronouns. Since we expected that both demonstratives would be perceived as illicit with subject antecedents, the option of rejecting both was necessary, especially in the comparison where both pronoun alternatives were demonstrative pronouns (see comparison C below).
We employed the forced-choice task for our intended comparisons because, apart from the ease of deploying the experiment, “The second benefit of FC tasks is increased statistical power to detect differences between conditions […] FC tasks are the only task explicitly designed for the comparison of two (or more) conditions; the other tasks compare conditions indirectly through a response scale (either yes-no, or a numerical scale)” (Schütze & Sprouse 2014). Since we want to directly compare the preference between these pronouns and also across two language registers, the forced-choice methodology fits our requirements well. Schütze & Sprouse (2014) also point out as one of the limitations that “the [forced-choice] task provides no information about where a given sentence stands on the overall scale of acceptability”. Since our sentences also include the option of having a personal pronoun instead of the demonstratives, we have a baseline to compare to and can indirectly infer the “acceptability” of sentences with diese-demonstratives: how much more or less preferred are demonstratives compared to the least marked option?
Forty-four native speakers of German were recruited in each experiment through Osnabrück University's student mailing list and Facebook posts (Expt. 1a: 36 female, 1 with unspecified gender, mean age = 22.8 years, age range = 18–33 years. Expt. 1b: 38 female, mean age = 23.9 years, age range = 18–50 years). All participants were entered into a lottery that enabled one of them to win 25 euros (for participation during the first week) or one more to win 10 euros (for later participation).
Participants were shown sentences such as (4) in Expt. 1a and (5) in Expt. 1b. Each sentence consisted of a verb of communication that took a subject, an object and a complement clause (dass ‘that’ being the complementizer) as its arguments. The pronoun occurred as the subject of the complement clause and it either referred to the matrix subject or the object. So, effectively, we had antecedent (subject vs. object) as a within experiment factor and register (formal vs. informal) as a between experiment factor, across three comparisons: (A) personal pronoun vs. die-demonstrative, (B) personal pronoun vs. diese-demonstrative, and (C) die- vs. diese-demonstrative. There were 12 experimental items randomly interspersed with 26 filler items across two lists for each experiment. As in (4), the items and corresponding fillers for Expt. 1a were constructed in formal register; similarly, as in (5), the items and corresponding fillers for Expt. 1b were constructed in informal register. The items are listed in the Appendix. The experimental items were counterbalanced for the gender of the possible pronouns (half feminine and half masculine). The antecedent of the possible pronoun was always unambiguous because only one of the subject or object in the sentence matched the gender of the pronoun. The fillers were constructed such that either of the two options (apart from the ‘neither’ option) was grammatically correct but differently acceptable depending on the situation; this care was taken so that they matched the design of our experimental items. The selection options in the fillers were of three types: (i) long vs. short forms of various nouns (e.g. Universität vs. Uni, ‘university’ vs. ‘uni’), (ii) formal vs. colloquial usage (e.g. Polizeibeamte vs. Bulle, ‘police officer’ vs. ‘cop’), and (iii) genitive vs. dative (e.g. des Vortrags vs. von dem Vortrag, ‘the lecture.GEN’ vs. ‘(of) the lecture.DAT’). All fillers also had ‘neither’ as the third option.
Before the experiment began, participants were given instructions which could be translated to English as: ‘In the following, you will see 38 individual sentences, each of which has lost a part. You are asked to complete as much as possible in such a way that the sentence is restored, as it was probably originally written. To do this, from the drop-down menu, select the expression that you think was originally in the text. Note that this is not about grammatical correctness, but rather about stylistic consistency.’ For stylistic consistency they were shown, as an example, a paragraph at the beginning of the experiment. The paragraph had either formal or informal text depending on the experiment. The text for Expt. 1a described a court trial in a formal style, whereas in the text for Expt. 1b a (presumably) young student colloquially describes interactions between her/his peers at a party s/he attended the previous night. The example texts are listed in the Appendix. The experiments were run on the online survey platform SoSci Survey (https://www.soscisurvey.de/) through a single participation URL. Participants were automatically and randomly assigned to one of the two experiments.
All data processing and analyses were carried out in R (R Core Team 2018). We used the Bayesian framework for data analysis. Carrying out data analysis in the Bayesian framework has many advantages over the frequentist framework (please see Vasishth et al. (2018) for detailed reasoning about it). For us the two main advantages were quantifying the uncertainty about the effects through 95% credible intervals around the estimates and the ease of fitting complex models (frequentist models that are fit using tools such as lme4, at times, end up not converging when the model gets complex whereas Bayesian models always converge).
We fit three Bayesian hierarchical logistic regression models to response proportions in each of the three comparisons (A, B and C) as a function of trial number, factors antecedent (reference level object), register (reference level formal), gender of the pronoun (reference level masculine), and interaction of antecedent and register. We included trial number and gender of the pronoun as predictors to explain possible variance because of these variables in the data; at times participants’ responses show systematic effect of the trial order and, even though under our hypotheses we did not predict any effect of the gender of the pronoun, we did not want to rule out the possibility of its effect. The response proportions in the model for comparison A were proportion of die-demonstrative responses, in comparisons B and C were proportion of diese-demonstrative responses. In the models’ random-effects structure the intercept, the two predictors antecedent, register and their interaction varied by items and participants.
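To make the role of the reference levels concrete, the following sketch shows how the fixed-effect part of such a logistic model maps onto predicted response probabilities for the four design cells. It assumes treatment (dummy) coding with object and formal as reference levels, omits trial number, gender and all random effects for brevity, and uses hypothetical coefficient values; it is not the fitted model from the study.

```python
import math

# Illustrative sketch only: coefficient values are hypothetical, and trial
# number, pronoun gender and random effects are left out for brevity.


def inv_logit(x):
    """Map log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def predicted_probability(coef, antecedent, register):
    """Probability of the modelled response for one design cell."""
    a = 1 if antecedent == "subject" else 0   # reference level: object
    r = 1 if register == "informal" else 0    # reference level: formal
    log_odds = (coef["intercept"]
                + coef["antecedent"] * a
                + coef["register"] * r
                + coef["interaction"] * a * r)
    return inv_logit(log_odds)


# Hypothetical coefficients on the log-odds scale.
example_coef = {"intercept": 0.5, "antecedent": -2.0,
                "register": -3.0, "interaction": 1.0}

for antecedent in ("object", "subject"):
    for register in ("formal", "informal"):
        p = predicted_probability(example_coef, antecedent, register)
        print(antecedent, register, round(p, 3))
```

Under this coding the intercept describes the object/formal cell, and each main effect describes the change in log-odds when only that factor moves away from its reference level.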
We fit the models using the Stan modeling language (Carpenter et al. 2017) through the R package brms (Bürkner 2017). We used the default priors of the brms package. Each model included four sampling chains that ran for 5000 iterations with a warm-up period of 1000 iterations. For each predictor we report its mean and 95% credible interval (CrI) under the posterior distribution. The 95% CrI specifies the interval within which we can be 95% certain that the “true value” of the parameter lies given the data and the model specification (Nicenboim & Vasishth 2016). Following Franke & Roettger (2019, July 13), if the 95% CrI does not include zero we consider that there is compelling evidence for the effect of that predictor on response proportions. We also report the posterior probability of the effect being greater than zero or less than zero, depending on whether the estimated parameter for that effect is positive or negative. The posterior probability is calculated from the posterior sample for a parameter generated by the statistical model: it is simply the proportion of the sample that is greater than or less than zero. Since the size of the posterior sample is finite (it depends on the number of chains and iterations, and equals the number of post-warmup samples), the posterior probability for an effect can come out as exactly 1 or 0. Sometimes this value is exactly 1 or 0 also because the numbers are rounded to 2 decimals.
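The summaries described above can be sketched in a few lines. The example below is an illustration in plain Python rather than the R code used for the analysis: it assumes an equal-tailed (quantile-based) 95% interval, and the posterior draws are hypothetical numbers generated only to have something to summarise.

```python
import random

# Illustrative sketch only: the draws below are simulated, not model output.


def quantile(sorted_values, q):
    """Quantile with linear interpolation between neighbouring order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def credible_interval(samples, level=0.95):
    """Equal-tailed credible interval from posterior draws."""
    s = sorted(samples)
    tail = (1.0 - level) / 2.0
    return quantile(s, tail), quantile(s, 1.0 - tail)


def posterior_probability(samples):
    """Proportion of draws on the same side of zero as the posterior mean."""
    mean = sum(samples) / len(samples)
    if mean >= 0:
        return sum(x > 0 for x in samples) / len(samples)
    return sum(x < 0 for x in samples) / len(samples)


# Four chains, 5000 iterations each, 1000 of them warm-up.
n_draws = 4 * (5000 - 1000)

random.seed(1)
draws = [random.gauss(-1.5, 0.6) for _ in range(n_draws)]  # hypothetical effect

lower, upper = credible_interval(draws)
print(n_draws, round(lower, 2), round(upper, 2), round(posterior_probability(draws), 2))
```

Because the proportion is taken over a finite number of draws, an effect whose draws all fall on one side of zero yields a posterior probability of exactly 1.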
For the two experimental factors our main predictions for diese-demonstratives are: (i) if language formality is a valid constraint, we expect diese-demonstratives to be chosen overall more often in Expt. 1a than in Expt. 1b, and (ii) if diese-demonstratives avoid subject antecedents exactly the way die-demonstratives have been observed to in earlier studies, we expect diese-demonstratives to be chosen more often with object antecedents than with subject antecedents in comparisons B and C. Moreover, as a side effect, we expect to replicate the earlier finding with die-demonstratives: they should be chosen more often with object antecedents than with subject antecedents in comparisons A and C.
The response percentages for each option across all comparisons and conditions are listed in Table 1 (see Figure A1 in the Appendix for a visual representation of this table in terms of barplots). Plots summarising the combined proportion of responses for each option across different comparisons are displayed in Figure 1. Results of the data analysis are listed in Table 2. Next we summarise the results while referring to these two tables and the plots.
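Table 1: Response percentages for each option (columns: Comparison, Antecedent, Option, Formal (%), Informal (%)).
Table 2: Results of the data analysis (columns: Comparison, Effect, Estimate, 95% CrI, Post. Prob.).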
Diese-demonstratives were used more often with object antecedents and in the formal register in both comparisons B and C. In comparison B we also found compelling evidence for the interaction between antecedent type and formality, but it should be noted that diese-demonstratives were rarely used in the informal register.
In comparison A, die-demonstratives were preferred with object antecedents but showed no effect of language register; in fact, they were used very rarely in general. In comparison C, die-demonstratives were never used in the formal register but were used fairly often in the informal register, and also more often with object antecedents. However, since for comparison C we kept die-demonstrative and ‘neither’ responses as the reference level to test the effects on diese-demonstratives, it was not possible to also test statistically the effect of register and antecedent type on die-demonstratives.
As far as the predictors for trial number and the gender of the pronoun are concerned, there was no compelling evidence for an effect of trial, but the gender of the pronoun seemed to have an effect in comparisons A and B: feminine die-demonstratives were used more often than masculine ones in A, whereas masculine diese-demonstratives were used more often than feminine ones in B.
The results of the experiments support the intuition that diese-demonstratives require the formal language register to license their use. Moreover, when diese-demonstratives are used they prefer object antecedents over subject antecedents, a subject-avoidance preference that has been observed with die-demonstratives in past research. These two constraints interacted in our experiments, and we observed that the effect of subject-avoidance was much stronger in the formal register. But since diese-demonstratives were used very rarely in the informal register, even with object antecedents, the effect of formality is possibly much stronger.
Although the main purpose of the experiments was to test the preferences of diese-demonstratives, as a control we contrasted their behavior with that of die-demonstratives, which have been studied more often in the past. With die-demonstratives we expected to replicate the object preference. The statistical analysis provided evidence for an object preference, but since die-demonstratives were used so rarely we suggest that this effect is possibly not reliable. However, in comparison C we did see a strong numerical trend suggesting an effect of register, with die-demonstratives preferred in informal over formal language, and a possible effect of subject-avoidance. Since our priority here was to test effects on diese-demonstratives, we had to keep the die-demonstrative responses as the reference level, and hence the numerical trends we observed for die-demonstratives in comparison C could not be statistically tested. In effect, our data weakly support the subject-avoidance hypothesis for die-demonstratives and suggest the possibility that die-demonstratives prefer informal language.
A possible explanation for the absence of a reliable effect could be that demonstrative pronouns from the die-paradigm are dispreferred in the written modality and are more common in the spoken modality (Ahrenholz 2007; Bosch et al. 2007). Portele & Bader (2016) have also observed in their corpus study and completion experiment that, even in an environment where die-demonstratives should be favoured, they were used less often than personal pronouns (see also the results from acceptability studies in Hinterwimmer & Patil (2019) that show similar effects of written modality on die- and diese-demonstratives). This result is also consistent with Weinert (2011), who notes that in spoken language demonstratives are used as frequently as (and are as important as) personal pronouns, but in written language demonstratives are dispreferred.
One interesting pattern that we observed in comparison C was that diese-demonstratives were used very often even with subject antecedents in the formal register. Since we neither explicitly instructed participants to consider the gender-matching subject or object as the antecedent nor probed which antecedent they considered when they performed the forced-choice task, it is in theory possible that some of the participants considered an antecedent that was not mentioned in the sentence. Given the design of the experiments we cannot rule out this possibility. However, we think that this effect could also arise because the constraint of language formality is very strong for diese-demonstratives and licenses (or even promotes) their use in some cases despite the antecedent being the subject. If we look at the responses for comparison B with subject antecedents in the formal register, we again see that the diese-demonstrative is preferred 19% of the time despite a legitimate alternative, the personal pronoun, being available, which is used only about 63% of the time. Yet another factor that could contribute to this pattern is that die-demonstratives are dispreferred in the written modality (in addition to dispreferring subject antecedents), which forced participants to choose the diese-demonstrative (instead of always opting for the ‘neither’ option). However, without further experiments, it will not be possible to straightforwardly claim which factor or factors led to this pattern.
On a side note, we want to point out that although it may seem that the gender of the pronoun had an effect, as we mentioned earlier, we kept gender as one of the predictors so that the statistical model captures the variance in the data, if there is any, due to two different genders of the pronouns. It is interesting that the effect is strong in comparisons A and B, but strangely it is in the opposite direction; moreover, the effect is not strong for comparison C. Given this pattern of effects and the fact that we did not have any prediction about this effect in either direction, it does not seem informative to speculate what might be the reason behind it.
Another effect we observed that was not part of the set of hypotheses we wanted to test concerned the ‘neither’ option: ‘neither’ was chosen more often than we expected, for example, in comparison C in the informal register, and with subject antecedents in the formal register. See Table A1 in the Appendix for model estimates of the statistical analysis for ‘neither’ across the three comparisons. The results show effects of antecedent type and register type: ‘neither’ was used more often with subject antecedents than with object antecedents, and in the informal register than in the formal register. As mentioned earlier, the ‘neither’ option was mainly included as a way for participants to reject both pronouns; since we expected that both demonstratives would be perceived as unsuitable with subject antecedents, the option of rejecting both was necessary, especially in the comparison where both pronoun alternatives were demonstrative pronouns, i.e. comparison C. Interestingly, in comparison C, ‘neither’ was chosen more frequently either in the informal register or in the formal register with subject antecedents, but not when the antecedent was the object in the formal register. This pattern can be easily explained based on our earlier discussion. In the cases where ‘neither’ was frequent, the other two alternatives were considered illicit anyway, because die-demonstratives are dispreferred in general and diese-demonstratives are dispreferred in the informal register and with subject antecedents (see the earlier discussion for details). In the case where ‘neither’ was not used frequently, one of the alternatives was the preferred one, i.e. the diese-demonstrative in the formal register with an object antecedent. This pattern, in fact, justifies the inclusion of the ‘neither’ option along with the two pronouns. There were two other cases in which ‘neither’ was chosen around 20% of the time, both with subject antecedents in the formal register, in comparisons A and B. Although these percentages are not as high as those for ‘neither’ in comparison C and are much lower than those for the preferred option (the personal pronoun, which was chosen around 60% and 80% of the time in these cases), they are nevertheless unexpected since the personal pronoun is a completely suitable option in these cases. We do not have any explanation for this pattern, but it also does not affect our main generalisations about diese-demonstratives.
One factor that we did not explicitly control for in our experiments was the dialectal variation of the participants. We recruited our participants through Osnabrück University's student mailing list and Facebook posts. As a result, the participants were not uniformly distributed across all German-speaking regions (cf. Figure A3 in the Appendix for the distribution of participants), and the majority of them came from the north-western and northern parts of Germany (Osnabrück University is in the north-western part of Germany); hence, in theory, it is possible that our results are limited to the dialects spoken in these regions.
Although Expt. 1a and 1b supported the claim that diese-demonstratives require formal language to justify their use, and that they demonstrate subject-avoidance like the die-demonstratives, it is still possible that the constraint suggested in Zifonun et al. (1997), namely that diese-demonstratives prefer the last mentioned entity as the antecedent, is also in operation. In all the sentences used in the current design the object is the last mentioned entity, and hence the object antecedent preference that we see in the formal register could, in fact, be a preference towards the last mentioned entity. In the next experiment we tease apart these two possible explanations.
In this experiment we test the claim that diese-demonstratives prefer the last mentioned entity as the antecedent (Zifonun et al. 1997). The example that is used in Zifonun et al. (1997) to support this claim, repeated here as (6), involves canonical (SVO) word order. Since in SVO sentences the last mentioned referent is also the grammatical object of the sentence, it is not possible to disentangle the role of object-hood from the role of being the last mentioned entity. Hence, it is possible that the preference for last mentioned entity could, in fact, be the preference for the object of the sentence as we observed in Expt. 1a. In Expt. 2 we carry out a forced-choice study in which we manipulate the argument order to isolate the contribution of linear order and grammatical role.
Forty-nine self-reported native speakers of German were recruited through Prolific (https://prolific.ac/) (23 female, mean age = 29.82 years, age range = 18–54). Three participants were excluded from the final analysis because they failed the “attention-check” tasks. Each participant was paid £2.63.
The experimental items consisted of two-sentence discourses such as (7). Each item had two variations: (a) canonical continuation, such that the matrix clause in the second sentence was in canonical (SVO) word order, and (b) non-canonical continuation, such that the matrix clause in the second sentence was in non-canonical (OVS) word order. For both variations the first sentence, the context sentence, was the same. We created these items by adding a context sentence to the experimental items used in Expt. 1a and changing the gender of the feminine noun to masculine. The gender change was made to make sure that the antecedent of the pronoun was ambiguous between the subject and the object. We used sentences from Expt. 1a to make sure that we preserved the formal register that legitimizes the use of diese-demonstratives. We also added two new sentences to the existing items from Expt. 1a to have 14 experimental items in total. There were 34 additional items from two unrelated experiments. To make sure that the participants were paying attention to the stimuli, we randomly placed two “attention-check” tasks in the experiment. These tasks simply asked the participants to click on one of the two options (for example, ‘Please click on Answer A’).
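The composition of a single participant's session can be sketched as follows. The item counts (14 experimental items, 34 additional items, two attention checks) come from the description above; the rotation of conditions across items and participants (a Latin-square-style assignment), the shuffling scheme and the function name `build_trial_list` are assumptions made for illustration and are not taken from the experiment scripts.

```python
import random

N_ITEMS = 14       # experimental items (from the text)
N_FILLERS = 34     # additional items from two unrelated experiments (from the text)
N_ATTENTION = 2    # attention-check tasks (from the text)
CONDITIONS = ("canonical", "non-canonical")


def build_trial_list(participant_index, seed=0):
    """Return one participant's shuffled list of trials.

    Assumption: conditions are rotated over items and participants so that
    every participant sees each item once, in exactly one condition.
    """
    rng = random.Random(seed + participant_index)
    trials = []
    for item in range(N_ITEMS):
        condition = CONDITIONS[(item + participant_index) % len(CONDITIONS)]
        trials.append({"type": "experimental", "item": item + 1,
                       "condition": condition})
    for filler in range(N_FILLERS):
        trials.append({"type": "filler", "item": filler + 1, "condition": None})
    for check in range(N_ATTENTION):
        trials.append({"type": "attention", "item": check + 1, "condition": None})
    rng.shuffle(trials)  # attention checks end up at random positions
    return trials
```

Under these assumptions each session contains 50 trials, and each participant sees seven items per word-order condition.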
The experiment was programmed and hosted using Ibex Farm (http://spellout.net/ibexfarm/). Participants were shown the discourse and the comprehension question. They were given two options to choose from to answer the questions, effectively a two-alternative forced choice task. For the items from the current experiment they were asked a question such that the participants had to resolve the antecedent of the diese-demonstrative to either the subject or the object of the sentence (for example, see the ‘Comprehension question’ in (7)).
All data processing and analyses were carried out in R (R Core Team 2018). We fit a Bayesian hierarchical logistic regression model to the proportions of object responses as a function of word order (reference level: canonical) and trial number. In the model’s random-effects structure, the intercept and the effect of word order varied by items and participants. The rest of the model-fitting details were the same as in the earlier experiments.
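To make the structure of such a model concrete, the following sketch simulates data from a hierarchical logistic model with by-participant and by-item adjustments to the intercept and to the word-order effect. It is written in Python for illustration only and is not the analysis code, which was run in R. The parameter values are hypothetical, the group-level adjustments are drawn independently (no correlations), and the trial-number predictor is left out for brevity. The participant count of 46 follows from 49 recruited minus three exclusions.

```python
import math
import random


def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_responses(n_participants=46, n_items=14, intercept=1.5,
                       slope=-0.9, sd_intercept=0.5, sd_slope=0.3, seed=1):
    """Simulate object (1) vs. subject (0) responses.

    All numeric defaults except the participant and item counts are
    hypothetical values chosen for illustration only.
    """
    rng = random.Random(seed)
    # Group-level deviations: (intercept adjustment, slope adjustment).
    by_participant = [(rng.gauss(0, sd_intercept), rng.gauss(0, sd_slope))
                      for _ in range(n_participants)]
    by_item = [(rng.gauss(0, sd_intercept), rng.gauss(0, sd_slope))
               for _ in range(n_items)]
    rows = []
    for p, (p_int, p_slope) in enumerate(by_participant):
        for i, (i_int, i_slope) in enumerate(by_item):
            # 0 = canonical (reference level), 1 = non-canonical.
            order = (p + i) % 2
            log_odds = (intercept + p_int + i_int
                        + (slope + p_slope + i_slope) * order)
            response = 1 if rng.random() < inv_logit(log_odds) else 0
            rows.append((p, i, order, response))
    return rows
```

With treatment coding as assumed here, the intercept is the log-odds of an object response in the canonical condition, and the slope is the change in log-odds for the non-canonical condition.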
The responses are plotted in Figure 2, and the results of the data analysis are listed in Table 3. We found that for both canonical and non-canonical word order, participants chose object antecedents more often than subject antecedents. There was also an effect of word order: the preference for the object antecedent was weaker in the non-canonical condition than in the canonical condition.
|Effect||Estimate||95% CrI||Post. Prob.|
|Word order||–0.883||[–1.79, –0.04]||0.98|
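The three columns of the table can be related to one another with a rough check. The sketch below assumes that “Post. Prob.” is the posterior probability that the effect is negative, and it approximates the posterior by a normal distribution whose spread is recovered from the reported 95% credible interval. Both are assumptions made for illustration; with posterior samples at hand, the proportion of negative samples would be computed directly, as in the hypothetical helper `prob_negative`.

```python
from statistics import NormalDist

ESTIMATE = -0.883       # point estimate for word order (Table 3)
CRI = (-1.79, -0.04)    # 95% credible interval (Table 3)


def approx_prob_negative(estimate, cri, level=0.95):
    """Normal approximation to P(effect < 0) from an estimate and an interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    sigma = (cri[1] - cri[0]) / (2 * z)         # implied posterior spread
    return NormalDist(mu=estimate, sigma=sigma).cdf(0.0)


def prob_negative(samples):
    """Proportion of posterior samples below zero."""
    return sum(1 for s in samples if s < 0) / len(samples)


print(round(approx_prob_negative(ESTIMATE, CRI), 2))  # 0.98
```

Under this approximation the implied probability agrees with the reported value of 0.98, although the reported interval is slightly asymmetric around the estimate, so the match is only approximate.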
The goal of Expt. 2 was to test whether diese-demonstratives prefer the last mentioned entity as the antecedent, as proposed by Zifonun et al. (1997), or whether this preference is simply due to the fact that last mentioned referents are usually the objects of the sentence. It is clear from the results that the object antecedent is preferred over the subject antecedent independently of their order of mention, suggesting that it is the grammatical roles of the referents that influence antecedent preferences for diese-demonstratives and not the linear order in which they are mentioned in the sentences.
Although there was a strong preference for the object antecedent over the subject antecedent, it interacted with the word order such that in non-canonical word order the object antecedent was preferred less often than in canonical word order (68% vs. 81%). One possible explanation for this interaction could be that diese-demonstratives behave the same way as die-demonstratives when the subject of the sentence is not in the clause-initial position. The weakening of the non-subject bias for die-demonstratives in non-canonical structures has been reported in a similar forced-choice antecedent selection task (Schumacher et al. 2016) and also in a sentence completion study (Bosch et al. 2007).
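As a worked illustration, the raw proportions of object responses (81% canonical, 68% non-canonical) can be converted to the log-odds scale on which the logistic regression operates. This is simple arithmetic on the reported percentages, not a re-analysis of the data.

```python
import math


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


canonical = 0.81       # proportion of object responses, canonical order
non_canonical = 0.68   # proportion of object responses, non-canonical order

difference = logit(non_canonical) - logit(canonical)
print(round(logit(canonical), 2), round(logit(non_canonical), 2),
      round(difference, 2))  # 1.45 0.75 -0.7
```

The raw difference of about –0.70 log-odds is smaller in magnitude than the model estimate of –0.883. One plausible reason is that the model estimate is conditional on the participant- and item-level adjustments, and conditional effects in hierarchical logistic models tend to be larger in magnitude than effects computed from pooled proportions.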
Another explanation could be that there is in fact an effect of the proximity of the referents in the sentence (Zifonun et al. 1997): since the subject is closer to the demonstrative in the non-canonical structure, it becomes more available as the antecedent of the diese-demonstrative and we see an interaction between subject-avoidance and order of mention. How exactly subject-avoidance and word order interact for diese-demonstratives is indeed an interesting question but, as far as the goals of this paper are concerned, it is sufficient to find that the effect of subject-avoidance is much stronger than that of the order of mention. Overall, Expt. 2 supports our conclusion from Expt. 1 that diese-demonstratives prefer grammatical objects over subjects as their antecedents, a subject-avoidance strategy of the kind that has been reported in the past for die-demonstratives.
In this paper we experimentally tested constraints on German demonstrative pronouns from the diese paradigm. Combined results from Expt. 1a and Expt. 1b showed that demonstratives from the diese paradigm are strongly preferred in a formal language environment. These experiments also provided initial evidence for our supposition that diese-demonstratives prefer object antecedents over subject antecedents, a subject-avoidance hypothesis which has been proposed for demonstrative pronouns from the die paradigm. In contrast, die-demonstratives are strongly dispreferred in a formal language environment. In fact, since die-demonstratives were not used very often even in an informal environment with object antecedents, we suggest that there could be some influence of the written modality such that die-demonstratives are less preferred in written language.
Using results from the next experiment we could tease apart two possible explanations for the antecedent preference for diese-demonstratives: (i) the subject-avoidance hypothesis, and (ii) a preference to resolve the demonstrative to the last mentioned entity. The conclusion from this experiment was that diese-demonstratives avoid the subject antecedent. In sum, given the results of the current experiments, we propose that the use of diese-demonstratives in German is licensed by a formal language register. But in terms of their antecedent preference they show subject-avoidance similar to the way it has been observed for demonstratives from the die paradigm.
Now the question arises: where is the formality distinction between the two types of demonstratives encoded? We assume, following Smith et al. (2010), McConnell-Ginet (2011), Acton (2014), Beltrama (2016) and Burnett (2019), that social meaning is best understood in terms of inferences similar to conversational implicatures. The idea is that socially meaningful variants of expressions such as the use of -in’ vs. -ing in (8) (Labov 2006) index sets of properties, stances or concepts that the speaker wants the hearer to attribute to her (Ochs 1993; Silverstein 2003; Eckert 2008). In each situation, the speaker thus selects the variant that she can reasonably assume to trigger inferences on the listener’s part concerning the speaker’s properties, stances etc. that she deems most useful to her in that situation. Assuming that -in’ indexes the properties incompetent, friendly, while -ing indexes the properties competent, aloof, a speaker will most likely use -in’ more often in informal settings, where being seen as friendly and approachable is more important than being seen as competent, while in formal settings, where being seen as competent is more important than being seen as friendly, it is the other way around (see Burnett 2019 for a formal implementation in terms of social meaning games).
|(8)||a.||I’m workin’ on a new paper.|
|b.||I’m working on a new paper.|
We tentatively assume that diese- and die-demonstratives are variants in the following sense: both are marked pronouns (as opposed to the unmarked personal pronouns of the er/sie/es paradigm) that as such avoid maximally prominent antecedents, where maximal prominence corresponds to subjecthood, and they do not differ with respect to truth-conditional meaning, but with respect to the properties that they index. For expository purposes, let us just take the same properties as the ones suggested for -in’ vs. -ing above. In a rather formal setting where it is more important to be seen as competent than to be seen as friendly and approachable, a speaker will thus use diese more often than die in order to pick up a female non-subject antecedent, while in an informal setting where it is the other way around, a speaker will use die more often than diese to pick up a non-subject antecedent. The picture becomes more complicated if we take into account the potential difference between die and diese with respect to (anti-)logophoricity, which will be discussed in the final section, since the diese pronouns can then no longer be seen as simple variants. For the purposes of this paper, we will set this issue aside, however, and leave it open as a question for future research.
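The reasoning in this paragraph can be made concrete with a toy calculation. The sketch below is not the formalization in Burnett (2019); it is a minimal illustration under stated assumptions: each variant indexes the expository properties mentioned above, each setting assigns hypothetical weights to how much the speaker values being seen as competent or friendly, and the speaker chooses between variants with a soft-max rule. The weights, the `rationality` parameter and the function name are all invented for illustration.

```python
import math

# Expository property assignments, parallel to the -in'/-ing illustration.
INDEXED = {
    "diese": {"competent": 1.0, "friendly": 0.0},
    "die": {"competent": 0.0, "friendly": 1.0},
}


def choice_probabilities(context_weights, rationality=2.0):
    """Soft-max choice between variants given context-dependent weights.

    context_weights says how much the speaker values each property in the
    current setting; rationality controls how sharply utilities separate
    the two variants. Both are hypothetical.
    """
    utilities = {
        variant: sum(context_weights[prop] * value
                     for prop, value in props.items())
        for variant, props in INDEXED.items()
    }
    total = sum(math.exp(rationality * u) for u in utilities.values())
    return {variant: math.exp(rationality * u) / total
            for variant, u in utilities.items()}


formal = choice_probabilities({"competent": 0.8, "friendly": 0.2})
informal = choice_probabilities({"competent": 0.2, "friendly": 0.8})
```

With these hypothetical weights, diese is the more probable choice in the formal setting and die in the informal one, which is the qualitative pattern described above.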
Interestingly, both die and diese can be used not only as pronouns, but also as determiners: die corresponds to the definite determiner, and diese to the demonstrative determiner (in modern German, the distinction between proximal and distal demonstratives is no longer active, i.e. diese is the only demonstrative determiner in active use, with the distal determiner jene sounding rather archaic and stilted). While it would be attractive to relate the properties of the pronouns to the properties of the corresponding determiners, this is far from straightforward. While both complex demonstratives and definite descriptions can be used anaphorically in German just like in English (see, e.g., Abbott 2002 for discussion of their uses in so-called donkey sentences where they pick up discourse referents introduced by indefinites), there does not seem to be any evidence that they tend to avoid subjects: (9a) is just as fine as (9b).
Additionally, at least according to some native speakers’ intuitions, it does not seem to be the case that anaphoric uses of complex demonstratives are more acceptable in formal register, while anaphoric uses of definite descriptions are more acceptable in informal register. Rather, both seem to be acceptable in both registers (but see Hinterwimmer 2019 for discussion of other differences entirely unrelated to the topic of this paper). We will therefore set aside the relation between the pronouns and the corresponding determiners as far as this paper is concerned.
Let us now return to the contrast between die- and diese-demonstratives that was mentioned in the introduction. According to Zifonun et al. (1997), dieser in the second sentence in (10) can only be understood as referring to the Benz because of the preference diese-demonstratives have for the last mentioned entity, resulting in an extremely implausible interpretation. In contrast, der is not constrained in this way and can thus, just like er, be understood as referring to Peter, which results in a plausible interpretation. However, we have seen clear evidence in Expt. 2 that in sentences with fronted objects, diese-demonstratives have a stronger preference to refer to the first mentioned object than to the last mentioned subject. So how can we account for the contrast between der and dieser in (10)?
We suggest that this contrast could arise because die-demonstratives avoid the maximally prominent perspective taker, as proposed by Hinterwimmer & Bosch (2016; 2018), and here Peter is not the maximally prominent perspective taker; diese-demonstratives, on the other hand, possibly just avoid the subject. Hinterwimmer & Bosch (2016; 2018) proposed that die-demonstratives are actually anti-logophoric pronouns avoiding (maximally prominent) perspective takers. Using cases such as (11), they observe that in (11a) der can easily refer to Stefan, while in (11b) such an interpretation is much harder to get. Obviously, this has nothing to do with subject- or topic-avoidance, since the proper name Stefan is not only the subject of the preceding sentence, but Stefan is also the discourse topic. Rather, what seems to matter is that (11a) clearly expresses an evaluation of Stefan by the narrator (or speaker, if it is uttered in oral conversation), while (11b) is most likely understood as expressing a thought of Stefan given as Free Indirect Discourse (see Eckardt 2014 and the references therein for details).
Similarly, in (10), the der-variant of the second sentence clearly expresses an evaluation of Peter by the speaker (assuming that the sentence is uttered in an oral conversation) and thus resembles (11a) in making an external perspective salient. Hence, despite being the subject of the preceding sentence (and presumably also the discourse topic), Peter becomes available as an antecedent for der. Interestingly, the picture changes when we replace (10) by (12). In the absence of a prominent external perspective taker, Peter, being the discourse topic, is maximally prominent and is therefore avoided as an antecedent by der. Now both der and dieser have a strong preference for the Benz, resulting in an equally implausible interpretation.
Effectively, what we are suggesting is that diese-demonstratives just avoid subjects as antecedents, while die-demonstratives avoid the most prominent discourse referents. Alternatively, one could assume that diese-demonstratives likewise avoid the most prominent discourse referents as antecedents, but that for them perspective-taking plays no role in calculating prominence. On the whole, it seems conceivable that the demonstratives from both paradigms behave similarly as far as avoiding the most prominent discourse referent is concerned, but they diverge along the dimensions of language formality and logophoricity.
Here we have tested the behavior of diese-demonstratives only at the single sentence level such that the pronoun and the antecedent occurred in the same sentence, and we used limited grammatical variations: accusative verbs in SVO and OVS word order with only animate antecedents. We know from earlier work with die-demonstratives that, although initially they seemed to be subject to a constraint based on subjecthood (Bosch et al. 2007), later their constraint set was extended to topicality (Bosch & Umbach 2007; Hinterwimmer 2015), agentivity (Schumacher et al. 2017) and perspectival centerhood (Hinterwimmer & Bosch 2018). It is quite possible that we will see a similar extension of the constraint set for diese-demonstratives. In order to decide between these options, clearly more empirical research is required, especially in cases where topicality and subjecthood or agentivity diverge.
As far as the language register is concerned, the formality itself could be motivated by various factors such as seriousness, unfamiliarity, social distance or deference (Irvine 1979), and the use of diese-demonstratives could be influenced differently by each of these factors. Moreover, apart from formal and informal styles, a language could have other styles such as frozen, consultative and intimate, as described by Joos (1961) in “The Five Clocks”, and the use of demonstratives could also differ in these three styles. However, evaluating how the different factors affecting formality influence diese-demonstratives, and whether other variations in style can influence them, is beyond the scope of this paper, and we leave that for future work.
All data, analysis and supplementary files used for the article are available at: https://doi.org/10.17605/OSF.IO/2U3KC
1 In the literature die-demonstratives have also been referred to as d-pronouns or DPros, and diese-demonstratives simply as demonstratives. For the sake of simplicity in contrasting these two we will refer to them as die- and diese-demonstratives.
2 Here we treat diese-demonstratives as linguistic devices for attention orientation and as interfaces to the other cognitive modules, the way die-demonstratives have been proposed to be in Bosch & Hinterwimmer (2016) and Schumacher et al. (2015). Although most of the findings that we have discussed for die-demonstratives are related to across-sentence pronoun resolution, we do not intend to make the distinction in terms of within- versus across-sentence pronoun resolution for diese-demonstratives.
3 Since we had only 12 experimental items in each experiment, as a sanity check, we have included a caterpillar plot for random intercepts for items in the Appendix (Figure A2). The plot is for the model fit for comparison B. Visual inspection of the plot doesn’t show a strong effect of any specific set of items. We also tried analyzing the data after dropping the four items that showed maximum deviation, but the effects remained essentially the same.
4 We thank one of the reviewers for pointing out these effects.
DEM = demonstrative, GEN = genitive, DAT = dative, M = masculine, F = feminine, NEG = negation
We would like to thank our student assistants and lab members — David Brinkhaus, Felix Jüstel, Sara Meuser, Lea Penning, Carina Rothkegel, Janne Schmandt, Magdalena Schmitz and Steffen Vogel — for their help with carrying out the experiments and participation in discussions. The paper improved substantially during the review process — many thanks to the reviewers for their insightful and timely comments. We also thank the Glossa team for facilitating the (open access) publication process.
The work on this paper was funded by the DFG as part of the project Die Bindungseigenschaften von Demonstrativpronomen, komplexen Demonstrativa und definiten Beschreibungen (DFG project number – 248411105).
The authors have no competing interests to declare.
Abbott, Barbara. 2002. Donkey demonstratives. Natural Language Semantics 10(4). 285–298. DOI: https://doi.org/10.1023/A:1022141232323
Abu Sheikha, Fadi & Diana Inkpen. 2010. Automatic classification of documents by formality. In Proceedings of the 6th international conference on natural language processing and knowledge engineering (NLPKE-2010), 1–5. DOI: https://doi.org/10.1109/NLPKE.2010.5587767
Ahrenholz, Bernt. 2007. Verweise mit Demonstrativa im gesprochenen Deutsch. Grammatik, Zweitspracherwerb und Deutsch als Fremdsprache. Berlin, Boston: de Gruyter. DOI: https://doi.org/10.1515/9783110894127
August, Tal & Katharina Reinecke. 2019. Pay attention, please: Formal language improves attention in volunteer and paid online experiments. In Proceedings of the 2019 CHI conference on human factors in computing systems (CHI ’19), 248:1–248:11. New York, NY, USA: ACM. DOI: https://doi.org/10.1145/3290605.3300478
Biber, Douglas. 1988. Variation across speech and writing. Cambridge University Press. DOI: https://doi.org/10.1017/CBO9780511621024
Bosch, Peter & Carla Umbach. 2007. Reference determination for demonstrative pronouns. In Dagmar Bittner & Natalia Gagarina (eds.), Proceedings of the conference on intersentential pronominal reference in child and adult language, 39–51. ZAS Papers in Linguistics.
Bosch, Peter, Graham Katz & Carla Umbach. 2007. The non-subject bias of German demonstrative pronouns. In Monika Schwarz-Friesel, Manfred Consten & Mareile Knees (eds.), Anaphors in text: Cognitive, formal and applied approaches to anaphoric reference, 145–164. Amsterdam: John Benjamins. DOI: https://doi.org/10.1075/slcs.86.13bos
Bosch, Peter & Stefan Hinterwimmer. 2016. Anaphoric reference by demonstrative pronouns in German. In Anke Holler & Katja Suckow (eds.), Empirical perspectives on anaphora resolution, 193–212. Berlin/New York: De Gruyter.
Bürkner, Paul-Christian. 2017. brms: An R package for Bayesian multilevel models using Stan. Journal of Statistical Software, Articles 80(1). 1–28. DOI: https://doi.org/10.18637/jss.v080.i01
Burnett, Heather. 2019. Signalling games, sociolinguistic variation and the construction of style. Linguistics and Philosophy 42(5). 419–450. DOI: https://doi.org/10.1007/s10988-018-9254-y
Carpenter, Bob, Andrew Gelman, Matthew Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li & Allen Riddell. 2017. Stan: A probabilistic programming language. Journal of Statistical Software, Articles 76(1). 1–32. DOI: https://doi.org/10.18637/jss.v076.i01
Eckardt, Regine. 2014. The semantics of free indirect discourse. Leiden, The Netherlands: Brill. DOI: https://doi.org/10.1163/9789004266735
Eckert, Penelope. 2008. Variation and the indexical field. Journal of Sociolinguistics 12(4). 453–476. DOI: https://doi.org/10.1111/j.1467-9841.2008.00374.x
Ellert, Miriam. 2013. Information structure affects the resolution of the subject pronouns er and der in spoken German discourse. Discours 12. 8756. DOI: https://doi.org/10.4000/discours.8756
Franke, Michael & Timo B. Roettger. 2019, 13. Bayesian regression modeling (for factorial designs): A tutorial. DOI: https://doi.org/10.31234/osf.io/cdxv3
Haegeman, Liliane. 2013. The syntax of registers: Diary subject omission and the privilege of the root. Lingua 130. 88–110. SI: Syntax and cognition: core ideas and results in syntax. DOI: https://doi.org/10.1016/j.lingua.2013.01.005
Heylighen, Francis & Jean-Marc Dewaele. 2002. Variation in the contextuality of language: An empirical measure. Foundations of Science 7(3). 293–340. DOI: https://doi.org/10.1023/A:1019661126744
Hinterwimmer, Stefan. 2015. A unified account of the properties of demonstrative pronouns in German. In Patrick Grosz, Pritty Patel-Grosz & Igor Yanovich (eds.), Workshop on pronouns at the 40th conference of the North Eastern Linguistic Society, NELS 40. 61–107. GLSA Publications.
Hinterwimmer, Stefan. 2019. How to point at discourse referents: On anaphoric uses of complex demonstratives. In M. Teresa Espinal, Elena Castroviejo, Manuel Leonetti, Louise McNally & Cristina Real-Puigdollers (eds.), Proceedings of Sinn und Bedeutung 23. 487–505. DOI: https://doi.org/10.18148/sub/2019.v23i1.546
Hinterwimmer, Stefan & Peter Bosch. 2016. Reference determination for demonstrative pronouns. In Patrick Grosz & Pritty Patel-Grosz (eds.), The impact of pronominal form on interpretation, 189–220. Berlin, Boston: De Gruyter.
Hinterwimmer, Stefan & Peter Bosch. 2018. Demonstrative pronouns and propositional attitudes. In Pritty Patel-Grosz, Patrick Georg Grosz & Sarah Zobel (eds.), Pronouns in embedded contexts at the syntax-semantics interface, 105–144. Springer (Studies in Linguistics and Philosophy). DOI: https://doi.org/10.1007/978-3-319-56706-8_4
Hinterwimmer, Stefan & Umesh Patil. 2019. A comparison of anaphoric complex demonstratives and demonstrative pronouns. Talk presented at the workshop sorting out the concepts behind definiteness at the 41st DGfS. http://www.dgfs2019.uni-bremen.de/
Irvine, Judith T. 1979. Formality and informality in communicative events. The Anthropology of Language 81. 773–790. DOI: https://doi.org/10.1525/aa.1979.81.4.02a00020
Kaiser, Elsi. 2011. On the relation between coherence relations and anaphoric demonstratives in German. In Ingo Reich, Eva Horch & Dennis Pauly (eds.), Proceedings of Sinn und Bedeutung 15. 337–351. Saarbrücken: Saarland University Press.
Kaiser, Elsi & John C. Trueswell. 2008. Interpreting pronouns and demonstratives in Finnish: Evidence for a form-specific approach to reference resolution. Language and Cognitive Processes 23(5). 709–748. DOI: https://doi.org/10.1080/01690960701771220
Labov, William. 2006. The social stratification of English in New York City. 2nd edn. Cambridge University Press. DOI: https://doi.org/10.1017/CBO9780511618208
Lahiri, Shibamouli, Prasenjit Mitra & Xiaofei Lu. 2011. Informality judgment at sentence level and experiments with formality score. In Alexander Gelbukh (ed.), Computational linguistics and intelligent text processing, 446–457. Berlin, Heidelberg: Springer Berlin Heidelberg. DOI: https://doi.org/10.1007/978-3-642-19437-5_37
Nicenboim, Bruno & Shravan Vasishth. 2016. Statistical methods for linguistic research: Foundational ideas – Part II. Language and Linguistics Compass 10(11). 591–613. DOI: https://doi.org/10.1111/lnc3.12207
Ochs, Elinor. 1993. Constructing social identity: A language socialization perspective. Research on Language and Social Interaction 26(3). 287–306. DOI: https://doi.org/10.1207/s15327973rlsi2603_3
Pavlick, Ellie & Joel Tetreault. 2016. An empirical analysis of formality in online communication. Transactions of the Association for Computational Linguistics 4. 61–74. DOI: https://doi.org/10.1162/tacl_a_00083
Portele, Yvonne & Markus Bader. 2016. Accessibility and referential choice: Personal pronouns and d-pronouns in written German. Discours 18. 9188. DOI: https://doi.org/10.4000/discours.9188
R Core Team. 2018. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org
Ricks, Jacob I. 2018. The effect of language on political appeal: Results from a survey experiment in Thailand. Political Behavior. DOI: https://doi.org/10.1007/s11109-018-9487-z
Schumacher, Petra B., Jana Backhaus & Manuel Dangl. 2015. Backward- and forward-looking potential of anaphors. Frontiers in Psychology 6. 1746. DOI: https://doi.org/10.3389/fpsyg.2015.01746
Schumacher, Petra B., Leah Roberts & Juhani Järvikivi. 2017. Agentivity drives real-time pronoun resolution: Evidence from German er and der. Lingua 185. 25–41. DOI: https://doi.org/10.1016/j.lingua.2016.07.004
Schumacher, Petra B., Manuel Dangl & Elyesa Uzun. 2016. Thematic role as prominence cue during pronoun resolution in German. In Anke Holler & Katja Suckow (eds.), Empirical perspectives on anaphora resolution, 121–147. Berlin/New York: De Gruyter.
Schütze, Carson T. & Jon Sprouse. 2014. Judgment data. In Robert J. Podesva & Devyani Sharma (eds.), Research methods in linguistics. 27–50. Cambridge University Press. DOI: https://doi.org/10.1017/CBO9781139013734.004
Silverstein, Michael. 2003. Indexical order and the dialectics of sociolinguistic life. Language & Communication 23(3). 193–229. DOI: https://doi.org/10.1016/S0271-5309(03)00013-2
Smith, E. Allyn, Kathleen Hall & Benjamin Munson. 2010. Bringing semantics to sociophonetics: Social variables and secondary entailments. Laboratory Phonology 1. 121–155. DOI: https://doi.org/10.1515/labphon.2010.007
Vasishth, Shravan, Bruno Nicenboim, Mary E. Beckman, Fangfang Li & Eun Jong Kong. 2018. Bayesian data analysis in the phonetic sciences: A tutorial introduction. Journal of Phonetics 71. 147–161. DOI: https://doi.org/10.1016/j.wocn.2018.07.008
Weir, Andrew. 2017. Object drop and article drop in reduced written register. Linguistic Variation 17(2). 157–185. DOI: https://doi.org/10.1075/lv.14016.wei
Winter, Bodo & Sven Grawunder. 2012. The phonetic profile of Korean formal and informal speech registers. Journal of Phonetics 40(6). 808–815. DOI: https://doi.org/10.1016/j.wocn.2012.08.006