Most open-class words that we use are polysemous, that is, most open-class word forms are associated with several related senses. Typical examples are in (1)–(4):
|(1)||a.||The manifesto was signed by the University.|
|b.||I have a meeting with Laura at the university.|
|(2)||a.||The letter from the Council is on the table.|
|b.||The letter from the Council sounds a bit threatening.|
|(3)||a.||Brazil is a republic.|
|b.||Brazil has won five World Championships.|
|c.||Brazil is the largest country in South America.|
|(4)||a.||The school caught fire.|
|b.||The school celebrates the end of the year party tomorrow.|
In (1)–(4) the words in bold have different senses. For example, in (1) there are two different senses of the word university: in (1a) University refers to the organization or, more precisely, to the people ruling the organization, whereas in (1b) it refers to a building. In (2a) we are talking about a piece of paper, a physical object that is on the table, while in (2b) we are talking about what that paper says: its informational content. In (3) Brazil is a political institution (3a), a football team (3b), and a geographical region (3c). In (4), we have two different senses of school: as a building (4a), and as a group of people that celebrate something (4b). The different senses of these words are closely related but they are different: they denote different entities. The words in bold in (1)–(4) exemplify a kind of conventional polysemy. It is customary to distinguish between conventional and non-conventional polysemy (Falkum 2011; Carston 2015; see also Gerrig 1989): conventional senses of polysemous words are lexicalized and retrieved via sense-selection, whereas in non-conventional polysemy, senses have to be generated online via pragmatic mechanisms. It is also common to refer to the kind of polysemy we will focus on as “inherent polysemy” (Pustejovsky 1995).
Pustejovsky (1995) introduced a tripartite distinction that applies to the phenomenon of polysemy: some polysemes are idiosyncratic or accidental (window in a window of opportunity), whereas some are regular.1 Apresjan (1974: 16) characterizes regular polysemy as follows: the polysemy of a word a with the senses Ai and Aj is regular if there exists at least one word b with the related senses Bi and Bj, being semantically distinguished in exactly the same way as Ai and Aj.2 Now, according to Pustejovsky, some terms are merely regularly polysemous, while others are said to be inherently polysemous. A term is inherently polysemous, according to the Pustejovskyan approach, if the different senses are somehow “inherent” to the entity that the term denotes. For instance, the two senses of book as a physical object and as an informational content seem to somehow emerge from what a book is. Prima facie, this characterization of inherent polysemy is controversial and vague, as it does not really tell when a certain polysemy pattern displays inherent polysemy or not.
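Apresjan's criterion can be made concrete with a small sketch. The toy lexicon and its sense labels below are hypothetical, and "distinguished in exactly the same way" is approximated, as a simplifying assumption, by two words sharing the same pair of sense labels.

```python
from itertools import combinations

# Hypothetical toy lexicon: word -> set of sense labels (illustrative assumption).
LEXICON = {
    "book": {"physical_object", "information"},
    "letter": {"physical_object", "information"},
    "window": {"aperture", "opportunity"},
}

def regular_pairs(word, lexicon):
    """Return the sense pairs of `word` that at least one other word also shows."""
    result = set()
    for pair in combinations(sorted(lexicon[word]), 2):
        for other, senses in lexicon.items():
            if other != word and set(pair) <= senses:
                result.add(pair)
                break
    return result

def is_regular(word, lexicon):
    """A word counts as regularly polysemous if any of its sense pairs recurs elsewhere."""
    return bool(regular_pairs(word, lexicon))
```

On this toy lexicon, book comes out as regularly polysemous because letter shows the same alternation, whereas window (with its "opportunity" sense) does not.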
However, there is actually a distinction to be drawn within the regular polysemy camp, for some regular polysemes pass co-predication and anaphoric binding tests and others do not.3 Co-predication occurs when one polysemous nominal expression has simultaneous predications selecting for two different meanings or senses. Anaphoric binding of different senses of a polysemous noun occurs when a pronominal expression takes a sense that is different from the sense attributed to its antecedent noun. (5)–(8) are some examples of polysemous expressions that pass co-predication and anaphoric binding tests:
|(5)||a.||The school that caught fire was celebrating 4th of July when the fire started.|
|b.||The school caught fire. It was celebrating 4th of July.|
|(6)||a.||The city has 500,000 inhabitants and outlawed smoking in bars last year.|
|b.||The city has 500,000 inhabitants. It outlawed smoking in bars last year.|
|(7)||a.||The best university of the country has caught fire.|
|b.||X is the best university in the country. It has caught fire.|
|(8)||a.||The beer Susan was drinking fell out of her hands.|
|b.||Susan was drinking her beer. It fell out of her hands.|
In (5) school has two different senses: “building”, and what we can call “participants” (kids, teachers, etc.). In (6) city has the senses “place” and “council”. University in (7) means “institution” and “building”. Finally, in (8) beer has the senses “beer-drinkable” and “glass”.
Co-predication, which is the phenomenon we will mainly focus on, can involve a larger number of senses:
|(9)||Brazil is a large Portuguese-speaking republic that is very high in inequality rankings but always first in the FIFA ranking.|
|(10)||The nearest school, which starts at 9:00, fired some teachers and forbade hats in the classroom.|
In contrast, some combinations of senses that also involve nouns do not allow co-predications, or at least, they do not sound as “natural”:
|(11)||?The newspaper fired its editor and fell off of the table.|
|(12)||?The bottle that Susan was drinking fell out of her hands (see Schumacher 2013).|
Co-predication generates some puzzles that must be solved. Here we would like to focus on two main issues: (i) why there are some senses that co-predicate and others that do not, and (ii) how we interpret co-predicative sentences. The paper will focus on those groups of senses that allow co-predication in an especially robust and stable way, i.e., among others, cases like book (bill, letter, etc.); cases like school (university, newspaper, church, etc.); and cases like country (city, village, etc.). Those cases exemplify respectively the informational content/physical object alternation; the institution/building/staff/rules/representatives alternation; and the geographical region/inhabitants/rulers/political system/representatives alternation. Using these cases, we will explain how the senses involved form especially robust activation packages that allow hearers and readers to access all the different senses in interpretation. We will argue: (i) that senses which are more closely related have higher rates of co-activation; (ii) that senses will be particularly closely related when the entities they denote are linked by explanatory/realization relations; (iii) that mutual enduring co-activation facilitates co-predication; and (iv) that interpreting co-predicative senses involves selecting senses from an activation package (such that the selection of one does not inhibit the possibility of selecting another if required).
This is a psychological approach to co-predication, an approach that has been scarcely explored in the literature. The main extant accounts of co-predication come from theoretical semantics (Pustejovsky 1995; Asher 2011; Arapinis & Vieu 2015; Gotham 2017). In this tradition, co-predicative nouns are typed as “dot types”, complex types formed by two or more simple types. Thus, book is said to be of the type “information•physical_object”. An ongoing debate within the tradition concerns what kind of denotations dot types have; that is, what can the denotation of book be such that books can be both informational contents and physical objects? The very idea of inherent polysemy suggests that denotations of dot types have to be either complex entities or dual-faceted entities. This kind of view, in turn, could explain why books can have properties that are typical of physical objects and properties that are typical of informational entities.
Some authors indeed argue that books are mereological compounds or complex entities formed by two entities, a volume and a particular kind of content (Arapinis & Vieu 2015; Gotham 2017). This kind of proposal would answer our question (i) above: when co-predication is possible and when it is not. The response would be that co-predication is possible when the denotation of a polysemous word is a complex entity. However, this is not a completely satisfactory answer, because we still need to know the rationale for including some complex entities in our ontology (while we exclude many others): why is there a complex entity formed by a volume and some informational content, and not a complex entity formed by a balloon and a group of people? As we will see later on, there are some interesting responses in the literature (see especially, Arapinis & Vieu 2015). Still, there are two pressing problems that affect these kinds of views. We want to present them here briefly to better motivate our own approach.
One objection raised by Asher (2011), which has been amply discussed in the literature, concerns whether the alleged dot objects (understood here as the complex objects that dot types are said to denote) can be objects or entities that can be counted. Asher (2011) argues that we can only count informational books or physical books, not pairs formed by physical objects (tomes) and contents (texts).4 The other pressing objection against the complex entities view relates to the individuation and persistence conditions of the alleged complex entities: it is not clear when one of these complex entities comes into and goes out of existence. In particular, it seems that these alleged complex entities can survive even if only one of their constitutive parts survives. However, if this is so (which by itself is problematic), then we would have too many “survivors”.5
These kinds of metaphysical approaches towards co-predication (the mereological approach as well as Asher’s “bare particulars” approach in fn. 4) are motivated by the idea that co-predicative sentences have a single NP that refers to some entity. In our approach, such NPs are compilations of senses, each with its own denotation. Such an approach will be metaphysically less problematic. However, the main focus of this paper is not this problem about denotations (which we will nevertheless address), but rather to build an account of co-predication that can be explanatory. The account of the denotation of co-predicative terms that we offer will stem from our approach to when and how co-predication is possible.
In the following section, we present and discuss our case study: school. School is a better exemplar than book and other similar cases discussed in the literature because school has many more senses than book. Also, examples like school better challenge a simple “complex entity” account. Intuitively, a book can be taken to be a thing with two aspects, the object and the content, but it is more difficult to think about a school as one complex entity composed of nine or ten aspects. We argue that school links to a knowledge structure that stores the typical senses of the word. Such a knowledge structure includes only information concerning what a school typically is and how it is realized or implemented. That is, the knowledge structure includes just a part of the world knowledge associated with school. We show that there can be a principled reason to include some parts of world knowledge in the knowledge structure and leave some other parts out. Then, in Section 3, we argue that the senses that form part of the structure form an activation package, such that the activation of one of the senses causes the activation of the others. We will hold that such a co-activation pattern is due to the particularly close relationship found among the different senses in the structure. We will examine the evidence for and against the existence of such activation packages.
In Section 4, we return to the issue of co-predication. We argue that a sufficient condition for co-predication is that senses form activation packages. Actually, the “naturalness” of co-predicative structures correlates with senses forming part of an activation package. Interpretation of co-predicative sentences involves a process of de-compilation: hearers or readers have to pair each predicate with a different sense that forms part of a package activated by a word. In the last section, we tackle the issue of how to interpret our account within the recent discussion concerning “rich” or “thin” approaches to polysemy and lexical meaning. We take it that our account fits well within a rich semantic theory of polysemy, at least insofar as polysemy and co-predication are understood as being phenomena that a semantic theory has to account for.
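The de-compilation step just described can be pictured with a toy sketch. Everything in it is an illustrative assumption rather than part of the account itself: the sense labels follow example (13), and the table of selectional requirements is hypothetical.

```python
# Hypothetical activation package for "school"; sense labels follow example (13).
SCHOOL_PACKAGE = {"building", "process", "timetable", "rules", "staff",
                  "participants", "representatives"}

# Hypothetical selectional requirements: the sense each predicate needs.
PREDICATE_SELECTS = {
    "caught fire": "building",
    "starts at 9:00": "timetable",
    "fired some teachers": "staff",
    "forbade hats": "rules",
}

def decompile(package, predicates, selects=PREDICATE_SELECTS):
    """Pair each predicate with the sense it selects.

    Returns None if some predicate needs a sense that is not in the package.
    The package itself is never modified, so selecting one sense does not
    remove any other sense from consideration.
    """
    pairing = {}
    for predicate in predicates:
        sense = selects[predicate]
        if sense not in package:
            return None
        pairing[predicate] = sense
    return pairing
```

Applied to the predicates of (10), the sketch pairs starts at 9:00 with the timetable sense, fired some teachers with the staff sense, and forbade hats with the rules sense, all drawn from the same package.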
2 Case study: school
Many theories of polysemy propose that specific senses are parts or aspects of a rich semantic/conceptual structure (Cohen 1971; Cruse 2004). In the view that is proposed here, these aspects are organized in a complex informational structure (or knowledge structure), which contains knowledge about the denotation of a word. In the multiply polysemous word school, such knowledge includes information about the kind of entity a school is, as well as information about its physical and temporal realization or implementation, the kind of people that take part in it, and its organizational structure. Such stored information is intended to capture the prototypical knowledge that we have about schools, as well as how the different aspects of the informational structure relate to each other. The structure that we propose for school is the following (represented in Figure 1).
The structure in Figure 1 represents that a school is, categorically, an institution, and that its telos is educating. Given the particular “telos” or purpose that schools have, it is expected (i) that schooling takes some time, which can be thought of as a process; (ii) that this process is in turn temporally organized, and (iii) that schools have students and teachers as participants or “inhabitants”. All these features, aspects or pieces of knowledge, are linked to, and to some extent depend on, our conceptualization of school as a social institution, whose purpose is to educate students. A way of seeing how these pieces of information are all kept together or form a coherent, robust, structure, is to think about all of them as specifying different realizers or implementers of an abstract entity (i.e., as things that the institution requires to be actualized in the world). Institutions in general require some physical realization, be it headquarters or something more modest. Schools, given that they are educative institutions, typically require buildings for their realization. Buildings have an inside and an outside part, as well as “occupants”, which are the same as the participants of the institution. Also, institutions require an organization, which includes interfaces with society at large, its representatives.6 For instance, schools have directors, commissions, etc., as well as rules and representatives that represent the school in different environments. The telos or function of the school also imposes some requirements: schools, being for educating people, need to have students and teachers, called here “participants”. The educating process has to be structured in years, terms and daily schedules. Summarizing, the knowledge structure associated with school is organized on the basis of our understanding of what a school is and how it is actualized in the real world. 
It is possible that we are leaving out some important aspects of our core knowledge of what schools are, but we think that all the components that we have included have to be there. Note, finally, that all the inferences above (“if it is an institution, then it has x or y”) are default inferences. We are trying to capture the core and typical knowledge associated with “school”. Part of such typical knowledge includes that schools, being institutions, require some physical object that hosts them. This does not exclude that, temporarily at least, an institution can exist without having any physical instantiation.
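As a rough illustration, the knowledge structure described in prose above can be written down as a small graph. This is a sketch reconstructed from the text only, not a transcription of Figure 1; the node names and the reading of each edge as "is realized or implemented by" are assumptions.

```python
# Edges read "X is realized/implemented by Y" (default, defeasible links).
# Reconstructed from the prose description only; node names are assumptions.
SCHOOL_STRUCTURE = {
    "institution": ["building", "organization", "process", "participants"],
    "building": ["participants"],  # occupants coincide with participants
    "organization": ["staff", "rules", "representatives"],
    "process": ["timetable"],
    "participants": [],
    "staff": [],
    "rules": [],
    "representatives": [],
    "timetable": [],
}

def aspects(structure, root="institution"):
    """Collect every aspect reachable from the root via realization links."""
    seen, stack = [], [root]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.append(node)
            stack.extend(structure[node])
    return seen
```

Calling `aspects(SCHOOL_STRUCTURE)` visits every node, reflecting the idea that all the aspects hang together through their links to the institution.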
Our proposal follows the tradition inaugurated by Moravcsik (1975), which was made influential by Pustejovsky (1995). This tradition uses the Aristotelian theory of the four causes to characterize the possible kinds of objects. Namely, a kind can be specified by its material, efficient, formal, and final causes, that is, what objects of the kind are made of, what brings them into being, what they essentially are, and what purpose they have. In Pustejovsky’s approach, causes are called “qualia”, and together they provide the lexical meaning of the noun that refers to the kind. The “formal quale” provides the type, the “agentive quale” the efficient cause, the “constitutive quale” the material constitution, and the “telic quale” the purpose or function. According to Pustejovsky’s account, the lexical meaning of any noun can be specified by specifying its four qualia. However, qualia can also take null values: we do not believe, in contrast to Aristotle, that all objects, including natural objects, have a telos. Therefore, the telic quale of a natural kind term lacks a value. In such a case, the lexical meaning of a natural kind term would be specified by three qualia, instead of four.
On the other hand, a simple qualia structure formed by the usual four qualia is too simple to capture the lexical meaning of a noun such as school. This is where dot types come in. Given that school can take different denotations and accepts very different kinds of predicates, its formal quale is assumed to be of a complex type. However, there emerges the problem of what school stands for in co-predicative sentences. We propose to enlarge the qualia-style structure (when necessary), adjusting it to a word’s respective ontological kind. There is no principled reason to limit the information that tells us what a certain entity is to a given, fixed, number of (four) features. Del Pinal (2017), for instance, adds information that specifies the stereotypical appearance of the entity, and, as mentioned, Pustejovsky himself concedes that some entities are characterized by three features: if a concept is an artefact concept, it will have a function (a telic quale); if it is a natural kind concept, it will not, etc. That is, the formal quale, which specifies the class to which the denotation belongs, tells you what other qualia will form part of the meaning of the nominal.
Building on the above, our proposal is that the knowledge structure – for the time being, we refrain from using “lexical meaning” – associated with, e.g., an institutional kind includes information about its telos, its social realization, and its physical realization. The characterization of social institutions such as hospitals, universities, newspapers, banks, etc., involves knowing what they are for, how they are organized, and what kind of physical entity hosts them and makes them actual. Further specific knowledge derives from the particular telos of the institution. The knowledge structure associated with countries, for instance, is different: countries are political institutions that require a geographical region, inhabitants, a system of government, an economic system, and representatives that channel their relationships with other countries.
Qualia structures were proposed to account for the fact that nouns offer different “aspects” or “facets” (Cruse 2004) for predication in a systematic way. An account like ours, besides explaining “inherent” polysemy and co-predication (see below), is able to explain the linguistic facts that qualia structures are supposed to explain without resorting to dot objects or types. Thus, the different kinds of predicates that school combines with are traceable to the knowledge structure that we propose. The modification to the Pustejovskyan schema that we introduce consists of assigning different kinds of knowledge structures to different classes of entities. As in the original qualia structures, the aspects that we take into consideration are the aspects or features that characterize a certain kind. In the kinds we consider, such features include those that typically realize, make actual or implement the kind. The rationale for including these features is that they do characterize the kinds and are typically available for being the objects of predication. This holds for institutions and their different ways of being part of the world (as social organizations, as part of society, as physical entities, etc.), as well as for countries, informational contents, drinkables, and so on. Drinkables, for instance, require containers to be “actualized” as such drinkables: this is the reason why we would include a “container” aspect in the knowledge structure corresponding to beer, for instance (see below on container/content and content/container). Informational contents, in turn, require physical realizations, which explains why the reference to a physical object will be in the knowledge structure associated with book, letter, etc. In sum: we think that more features than the four Aristotelian causes are relevant (i) to characterize a kind, and (ii) to account for predication. 
Some of these features or aspects relate to ways of making certain (typically abstract) kinds real.7
Now, if we look at (13) we see that the aspects or features that characterize schools are the typical senses of school, that is, the denotations that school can typically have:
|(13)||a.||The school [building] is on fire.|
|b.||She went to school [process].|
|c.||School [timetable] starts at 9:00.|
|d.||The school [rules] has prohibited wearing hats in the classroom.|
|e.||I have talked to the school [director, staff] about it already.|
|f.||The school [participants] went for a visit to the Cathedral.|
|g.||The school [representatives] reached an agreement with the council.|
We take it that different aspects represented in a structure (see Figure 1) form an activation package. By this we mean (i) that all the elements in the structure are typically activated upon encountering the word school, and (ii) that any sense in the package activates all the others. We discuss the rationale and implications of this hypothesis in the next section, but basically, the idea is that (at least) senses that denote entities in realization, actualization or implementation relations form stable co-activation patterns. Co-activation will occur even before senses are distinguished as such, i.e., before they are used as senses of a polysemous expression. That is, suppose that there was a time when school was used just to refer to the institution. Still at that time the word school would activate the pieces that form the knowledge structure in Figure 1, as they are tightly related and interwoven. “School-institution” spreads the activation downstream to the different nodes/aspects, but, as noted, aspects also refer back to the institution and to each other.8
3 Activation packages
There seems to be a difference between how we process, store and represent the closely related senses of a polysemous word and the non-related – except perhaps historically – senses of a homonymous word (Pylkkänen et al. 2006; Klepousniotou et al. 2008; Frisson 2009). In homonymy, there is competition between meanings, such that when one is selected, the other meaning decays. There is also a strong bias towards the most frequent meaning (Frisson 2009). In contrast, the different senses of a polysemous term typically prime each other, are accessible even when one of them has been selected, and typically show weak frequency effects (Pylkkänen et al. 2006; Klepousniotou et al. 2008). Many psycholinguists conclude that whereas homonymous meanings are represented in different word representations, senses of polysemous expressions are stored in one single word representation (Klepousniotou 2002; Rodd et al. 2002; Beretta et al. 2005). However, this conclusion has to be taken with a grain of salt, because there seems to be no sharp distinction between polysemy and homonymy. Empirical studies indicate that, typically, closely related senses are also more interrelated in the mental lexicon than distantly related senses. Both senses of book are plausibly stored in the same word representation, but the two senses of paper in liberal paper and shredded paper behave more like homonyms, which are stored in different representations (see Klein & Murphy 2002; Foraker & Murphy 2012). The thesis that senses of polysemous expressions are stored in one single word representation would, in principle, clearly apply to closely related senses, given that closely related senses seem to co-activate each other instead of competing for activation. However, it is still an open question how many senses would be stored in each word representation.
In this paper we are not particularly concerned with polysemy representation and storage, but with activation and co-activation of senses of polysemous expressions (which, however, is what grounds claims about representation and storage). As just mentioned, psycholinguistic evidence suggests that closely related senses form co-activation patterns. Activation of one sense by another seems to be stable and enduring: it may last for 750 ms or more (MacGregor et al. 2015). It would be natural to think that the more related the senses are, the stronger the co-activation pattern they form. That is, if two, or three, or n senses are very closely related, it will usually be the case that the activation of one of them results in a high and enduring activation of the others.
We have seen that some pairs of senses of a polysemous word do not even activate each other, but rather behave like homonymous meanings. In addition, there is some evidence that senses that fall under “merely” regular patterns, such as count-to-mass polysemy (Frazier & Frisson 2005) or container-to-content polysemy (Schumacher 2013), may show some dominance effects. The evidence suggests that the dominant, more frequent, meaning (e.g., the container meaning) comes to mind more easily than the subordinate meaning (the content meaning), and it is more difficult to switch from the dominant sense to the subordinate sense than vice versa. Thus, it may be the case that, besides distantly related senses, those senses that fall under some merely regular patterns fail to form stable and strong co-activation packages/pairs. Similarly, Klepousniotou et al. (2012) observe that senses linked by metaphorical extensions show slight frequency/dominance effects, such that the activation of the dominant sense is initially higher than the activation of the subordinate sense, although it has been reported that the activation of the subordinate sense increases over time (MacGregor et al. 2015). Pairs of senses that fall under some other regular patterns, however, seem to form activation pairs: this seems to be the case of, at least, content/container (beer) and author/works of author (Brecht) polysemies (Schumacher 2013; Weiland-Breckle & Schumacher 2017; see below for discussion).
Schumacher (2013) ran an ERP experiment using simple, non-copredicative sentences in German, comparing container/content and content/container polysemes. An example of a container/content polyseme is bottle. Bottle can be used to talk about a drinkable, as in Sarah drank the whole bottle. An example of a content/container polyseme is beer: in Your beer is waiting at the bar, beer refers to a glass that contains beer. According to Schumacher’s results, beer cases show no differences in the processing profiles of content and container usages, while bottle cases impose additional computational demands (late positivities) when bottle refers to the content. These results show that the underlying operations are different in each case. In the bottle case, accessing the “drinkable” or “content” meaning seems to involve reconceptualization, whereas in the beer case, readers access the “container” meaning without difficulty. Importantly for our purposes, Schumacher (2013) notes that there is a correlation between these results and the acceptability and non-acceptability of co-predicative and anaphoric sentences. Consider the following examples:
|(14)||German (Schumacher 2013)|
|a.||‘Pete put down the beer and drank it a few minutes later.’|
|b.||‘Pete put down the beer and accidentally knocked it over a few minutes later.’|
|(15)||German (Schumacher 2013)|
|a.||‘Tim drank yet another glass that was mouthblown.’|
|b.||‘Tim drank the mouthblown and sparkling glass.’|
|c.||‘Tim drank yet another glass because it sparkled so nicely.’|
In (14a) we have a co-predicative sentence in which the word Bier is used first to refer to the container (what is put down, in stellte das Bier) and then to the content (what Pete drank). In (14b) both predicates apply to the “container” sense of Bier: the container is put down and knocked over. Both sentences are felicitous in German and English. Yet, when we move to consider <container, content> pairs, using the word Glas, things are different: switching from one sense to the other gives non-felicitous sentences such as (15a) and (15b), in contrast with (15c), where both predicates apply to the “content” sense. According to Schumacher (2013), the activation of the “content” sense of a word like bottle involves a reconceptualization triggered by a type mismatch. Schumacher’s view is that in a sentence such as Oscar drank the bottle, the meaning “bottle: container” is turned, or coerced, into the meaning “content-of-bottle”. What has been observed about coercion is that once a meaning x is coerced into another, y, it is typically difficult to recover the original meaning x (Asher 2011). According to Schumacher (2013: 11), “access to the initial interpretation via anaphoric processing is no longer possible”. In sum, in the <container, content> pair (bottle cases) there is no mutual co-activation of both senses. In <content, container> pairs (beer cases), however, the content sense (beer) activates the container sense (glass of beer) and vice versa. In this case, there is no reconceptualization, but a simpler process that Schumacher calls “sense selection” (see also Schumacher forth.). This view seems to be supported by Schumacher’s own data about simple sentences, and is what she uses to explain the different behaviour of (14) and (15). 
Co-predication and anaphoric binding of different senses is felicitous when there is mutual co-activation between the senses, and is not felicitous when switching from one sense to another involves reconceptualization or coercion, and thus when one sense is deactivated in favour of the other.
Going a step further, Schumacher (2013) proposes that the asymmetry in the behaviour between the bottle and the beer cases is due to the different ontological relationships existing between the two possible denotations (container, content): whereas bottles (containers) do not require contents, beers (contents) require containers. Putting this in the terms we have used above, we can say that a bottle, which is a kind of physical object, a container, is not realized or implemented by a content; whereas something that is categorized as a drinkable, a beer, is realized or made actual by containers. Therefore, not only do Schumacher’s results support our hypothesis, her own explanation of the results is akin to ours. That is, denotations that are linked by realization relations are represented by senses that are in mutual co-activation relations, which is what ultimately explains the acceptability of co-predicative structures.
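The contrast between beer and bottle cases can be summarized in a toy sketch. The directed activation links below simply encode the asymmetry reported above; they are an illustrative assumption, not measured data, and the function names are hypothetical.

```python
# Directed links "(word, sense A) activates sense B". The links encode the
# asymmetry described in the text and are an illustrative assumption.
ACTIVATES = {
    ("beer", "content"): {"container"},
    ("beer", "container"): {"content"},
    ("bottle", "container"): set(),  # content reached only via reconceptualization
    ("bottle", "content"): set(),    # original container sense decays after coercion
}

def mutually_coactive(word, sense_a, sense_b, links=ACTIVATES):
    """True when each of the two senses activates the other."""
    return (sense_b in links.get((word, sense_a), set())
            and sense_a in links.get((word, sense_b), set()))

def copredication_expected(word, senses, links=ACTIVATES):
    """True when every pair among the given senses is mutually co-active."""
    senses = list(senses)
    return all(mutually_coactive(word, a, b, links)
               for i, a in enumerate(senses)
               for b in senses[i + 1:])
```

On these assumed links, `copredication_expected("beer", ["content", "container"])` holds while the corresponding call for bottle does not, mirroring the pattern in (14) and (15).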
However, extant evidence is not as clear-cut as we would wish. Frisson (2015) shows that there is some processing cost in the selection of the second sense of some co-predicative sentences involving book-type polysemies. He ran two experiments: a sensicality task with 12 pairs of adjective-plus-noun constructions and an eye-movement study with 90 sentences in three different conditions (neutral condition, sense repetition and sense switch). In the first experiment, he found that switching from one sense to another was costly (compared to a repetition condition), no matter the frequency/dominance of each sense. In the subsequent eye-tracking experiment, however, he observed a longer reading time in the disambiguation area when the switch was from the less frequent sense (physical object) to the more frequent sense (informational content). The sentences in (16) exemplify the neutral condition, where there is neither a switch nor a repetition of senses; (17) exemplifies sense repetition: (17a) repeats the informational content sense of book (the dominant or more frequent sense), and (17b) the physical object sense (the subordinate or less frequent). In the sentences in (18) there is sense switching: from physical object to informational content in (18a), and from informational content to physical object in (18b):
|(16)||a.||Mary told me the book was scary and that she valued it a lot.|
|b.||Mary told me that the book was bound and she valued it a lot.|
|(17)||a.||Mary told me that the science-fiction book was scary and that she valued it a lot.|
|b.||Mary told me that the gift-wrapped book was bound and that she valued it a lot.|
|(18)||a.||Mary told me that the bound book was scary and that she valued it a lot.|
|b.||Mary told me that the scary book was bound and that she valued it a lot.|
According to Frisson’s results, in the eye-movement study, there is an observable difference between (18a) and (18b) with respect to the time that readers spend in the disambiguation region of the sentence, that is, the second time a sense is instantiated (at the adjective in bold font): readers spend longer in (18a) than in (18b), that is, when the switch is from the less frequent to the more frequent sense. In other words, there is a frequency effect, but it is exactly the opposite of those reported in homonymy processing.
Unlike Schumacher’s data, Frisson’s results challenge the idea that the pair of “book” senses <informational content, physical object> can form an activation package and thus explain co-predication, the idea we want to defend and Schumacher seems to propose. In an activation package, all the elements support the activation of all the others, so it is particularly worrisome that there seem to be inhibitory effects in the “book” case, which is the archetypical example of “inherent” polysemy (Pustejovsky 1995).
However, there are reasons to think that there is not any “inhibition” effect in (18), because the results show some peculiar features. To begin with, the effect seems to be stronger when the switch goes from the less frequent to the more frequent sense, which is the reverse of the pattern observed in homonymy and distant polysemes. Secondly, the effect seems to be too shallow to be considered inhibitory: the sense targeted in the second place, required for interpretation, is still accessible. Recovering it involves some cost, but it seems to be weaker than the cost associated with recovery after reconceptualization (as in the bottle cases of Schumacher 2013; forth.), where there is a process of sense construction (from bottle-the-container to bottle-as-drinkable content) that makes the original sense (bottle-the-container) decay.
There are also several things to note with respect to Frisson’s study and results. To begin with, Frisson’s examples have an adjective in a modifier position in the nominal phrase, followed by a simple copulative predication. An adjective appearing as a reference restrictor that highlights a physical property of the physical object may have the effect that the attention is directed at the physical object, giving rise to particular expectations or predictions concerning what is coming next (especially if, as Frisson suggests, identifying a book by some physical property is not frequent). Then, the switch to another sense is required to be almost immediate. There seems to be a difference between the bound book was scary and the book was bound and scary (or the delicious lunch took forever and lunch was delicious but took forever). It is possible that there is also a difference between the bound book is scary and the bound book that we saw yesterday is scary.
As noted, Frisson suggests that the effect is probably due to pragmatic reasons. So it is possible that the sense is activated, but that the expectations of the hearer make other kinds of information more salient (say, the hearer may be expecting to know more about the physical object just identified: where it has been put, how it looks, etc.). On the other hand, if there is actually a difference between the bound book is scary and the bound book that we saw yesterday is scary, such a difference might show that it matters how abrupt the transition from one sense to the other is. This may also be related to some initial pragmatic/expectation effect that may go down as processing ensues. Adding a relative clause such as that we saw yesterday, which does not refer to any intrinsic physical property of the book, may have the effect of changing the expectations/predictions. But, above all, if the sentence with the relative clause gives fewer problems in interpretation than the simple copulative one, this would show that the informational content sense was actually activated by the physical object sense. To summarize, the results Frisson reports are very interesting, but the interpretation, as he himself acknowledges, is not clear, especially regarding the difference between (18a) and (18b).
4 Activation packages and co-predication
Let us now get back to co-predication. Co-predication seems possible when all the senses involved in interpretation are active at the moment the reader or hearer has to retrieve them. Let us repeat the examples (5)–(12) from Section 1 below:
|(5)||The school that caught fire was celebrating 4th of July when the fire started.|
|(6)||The city has 500,000 inhabitants and outlawed smoking in bars last year.|
|(7)||The best university of the country has caught fire.|
|(8)||The beer Susan was drinking fell out of her hands.|
|(9)||Brazil is a large Portuguese-speaking republic that is very high in inequality rankings but always first in the FIFA ranking.|
|(10)||The nearest school, which starts at 9:00, fired some teachers and forbade hats in the classroom.|
|(11)||?The newspaper fired its editor and fell off the table.|
|(12)||?The bottle Susan was drinking fell out of her hands.|
We propose that the difference between (5)–(10) and (11)–(12), as well as many other examples of bad co-predications, lies in the level of activation that a certain sense has when the reader or hearer tries to access it. We have explained how, according to Schumacher (2013), the problem with (12) is that once “bottle-container” is reconceptualised as “bottle-content”, the sense “bottle-container” cannot be accessed because its activation goes down. The newspaper case in (11) seems to be different: the problem in this case is that “newspaper-institution” fails to activate “newspaper-printed copy” as required. Plausibly, the reason why co-predication does not work in (11) and (12) is that the two senses involved fail to enter into a co-activation relation. In contrast, (5)–(10) all involve senses that belong to activation packages.
Let’s take as an example sentence (5). When (5) is read, the word school activates (at least) the whole informational structure (Figure 1), and, with it, all the different senses. The predicate caught fire selects the aspect “building” of school. However, this does not mean that the other senses decay. They are all still active – or active enough – when the reader encounters the next predicate: was celebrating 4th of July. Thus, readers and hearers find no problem in selecting another sense that would comply with the selectional preferences of the predicate, namely, the sense “participants”, which targets teachers and students. There is, in principle, no difference between co-predication and anaphoric bindings. In both cases what is required is that readers or hearers can select the sense that they need in order to comply with the selectional preferences of the surrounding linguistic material. Therefore, our initial answer to the question (i): what makes co-predication easy for some polysemes and not for others, is that co-predication will be easy and natural if the senses form an activation package. This, we think, occurs when the senses are particularly closely related by means of realization or implementation relations (as in the case of school).
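The processing story just told can be made concrete with a toy sketch. It is only an illustration under strong simplifying assumptions: the threshold, the activation values, the sense labels and the semantic-type labels are all hypothetical, and the sketch collapses the two distinct failure modes of (11) and (12) into a single drop in activation. It is not a model proposed by Schumacher or Frisson.

```python
from dataclasses import dataclass, field

# All numbers and labels below are hypothetical illustrations,
# not values taken from the experimental literature.
THRESHOLD = 0.5  # minimum activation for a sense to be retrievable


@dataclass
class Word:
    form: str
    senses: dict  # sense label -> semantic type a predicate can select for
    # unordered pairs of senses that co-activate each other
    package: set = field(default_factory=set)


def activations_after(word, selected):
    """Activation of each sense once `selected` has been chosen.

    Senses packaged with the selected sense stay fully active;
    senses outside the package are assumed to fall below threshold.
    """
    levels = {}
    for sense in word.senses:
        linked = sense == selected or frozenset((sense, selected)) in word.package
        levels[sense] = 1.0 if linked else 0.2
    return levels


def copredicate(word, required_types):
    """Pick a sense for each predicate in turn; return None on failure."""
    levels = {s: 1.0 for s in word.senses}  # simplification: all senses start active
    chosen = []
    for required in required_types:
        candidates = [s for s, t in word.senses.items()
                      if t == required and levels[s] >= THRESHOLD]
        if not candidates:
            return None  # the needed sense is not active enough
        chosen.append(candidates[0])
        levels = activations_after(word, candidates[0])
    return chosen


school = Word(
    "school",
    {"building": "physical", "participants": "animate", "institution": "organization"},
    package={frozenset(("building", "participants")),
             frozenset(("building", "institution")),
             frozenset(("participants", "institution"))},
)
newspaper = Word("newspaper", {"institution": "organization", "copy": "physical"})

# (5): "caught fire" selects a physical sense, "was celebrating" an animate one.
school_result = copredicate(school, ["physical", "animate"])
# (11): "fired its editor" selects an organization, "fell off the table" a physical sense.
newspaper_result = copredicate(newspaper, ["organization", "physical"])
```

On these assumptions, `school_result` is the pair of senses "building" and "participants", while `newspaper_result` is `None`, because the copy sense is not packaged with the institution sense and is therefore no longer retrievable when the second predicate arrives.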
We have seen that Schumacher (2013) also suggests that co-activation of several senses will be especially robust when such senses denote entities that are in some kind of ontological dependency relations (what we have called “realization” or “implementation relations”). The most detailed model of this general account can be found in Arapinis (2013) and Arapinis & Vieu (2015). The particular question Arapinis & Vieu (2015) want to answer is different from ours. The solution to the co-predication puzzle they favour is a mereological interpretation of the “dot object” account presented in the introductory section. Arapinis & Vieu (2015) address the question of what glues together the aspects of a dot object so that aspects can actually be considered parts or constituents of a complex entity. Their response is that we have dot objects, i.e., complex entities formed by other entities that are aspects of the complex, when (a) the aspects constitute the entity, and (b) the aspects are in some coincidence relation that explains and ensures the relation of constitution between the aspects and the complex entity. In the case of institutions, for instance, the complex entity is said to be agentially co-constituted by the agents, temporally co-constituted by the norms or rules, and (optionally) physically co-constituted by the building. For the institution to be a dot object co-constituted by these three sub-entities there has to be an agential coincidence relation between them, such that agents have to be the ones that execute the rules of the organization, and the building has to be the place where the agents are when they follow those rules.
Our approach to co-predication can be seen as a way of psychologizing Arapinis & Vieu’s proposal. In our view, co-predication and anaphoric binding are particularly facilitated when senses denote entities that are linked in (roughly) the way Arapinis & Vieu propose. We do not want to commit to the specifics of their proposal, which may be problematic from a metaphysical point of view – e.g., how rules can temporally constitute an institution, or what kind of entity the institution is after all. However, we think that their approach is on the right track insofar as it is read as an attempt at finding what makes co-predication possible in typical cases. It seems that a sufficient condition for good co-predications is that the relevant senses refer to entities that are in certain metaphysical non-causal, explanatory, relations. In other words, they are relations that explain, e.g., that a school-institution is located in a building, that it has a particular organization, etc. We have called such explanatory relations “realization, actualization or implementation relations”, trying to capture the idea that, typically, the entities denoted by the senses contribute to making the kind real. Thus, a school-institution is located in a building because being so located contributes to bringing the school into existence; it involves a process because its telos requires some temporal articulation, etc. Senses will form activation packages when metaphysical links ground particularly close relationships between such senses. Overall, co-predication will be possible when senses form activation packages, and senses will form activation packages when the entities they denote are linked by explanatory, realization relations.
By establishing only a sufficient condition, this account does not exclude co-predication among senses that are not linked by realization relations. It may be the case that the widely discussed example of newspaper is a case in point (Pustejovsky 1995; Arapinis & Vieu 2015; Chatzikyriakidis & Luo 2015), although we think that even this example is amenable to the kind of explanation we have proposed. Sentences (19a) and (19b) are usually taken to be bad, while (19c) and (19d) are regarded as better or even good (Copestake & Briscoe 1995; Arapinis 2013):
|(19)||a.||?The newspaper fired its editor and fell off the table.|
|b.||?That newspaper is owned by a trust and is covered with coffee.|
|c.||John used to work for the newspaper that you are reading.|
|d.||The newspaper has been attacked by the opposition and publicly burned by demonstrators.|
There seem to be two different knowledge structures associated with newspaper: newspaper-institution (of the school kind) and newspaper-copy (of the book kind). Usually, activation in one of them does not imply activation of the other (hence the awkwardness of (19a) and (19b)). Actually, pairs of senses related to, on the one hand, newspaper as institution and, on the other, newspaper as copy, are not taken to be cases of “inherent polysemy” by Arapinis & Vieu (2015).9 Yet some such pairs are able to co-predicate, as we see in (19c) and (19d). A suggestive possibility is that co-predication works when the copy is conceived as an exemplar of a particular type of newspaper and the institution is thought of as the producer of that particular product, which makes these co-predications akin to author-works co-predications, such as “Brecht was a communist writer, but is still performed in theatres all over the world”.
As mentioned above, in author-works polysemies there seems to be a strong co-activation of both senses (see Weiland-Breckle & Schumacher 2017). This fact should not be surprising, as for someone to be an author she has to have produced works – which means that thinking of someone as an author makes one think about her works, and vice versa: thinking about the works makes one think about their author. So, it is the conceptualization of Brecht as an author that makes his works salient. However, this means that only when Brecht is thought of as an author will the works-of-Brecht sense be activated. The same apparently happens with newspaper in (19): in those cases, the institution is thought of as a producer, and the copy is thought of as a product-type of that producer. In the end, this view would make these “newspaper” cases actually very similar to other cases we have touched upon, as the relationship between producer-conceived-as-such and product-conceived-as-such is certainly intimate. Perhaps it can even be said that for someone to be an author and for an institution to be a producer, they both need products, i.e., without products they would not be authors or producers, such that products make authors or producers real.10
5 Interpreting co-predication
So far, we have tried to account for when co-predication is typically possible. Now we will sketch a view as to how we interpret co-predicative sentences and, in particular, how we assign truth-conditions to them. According to the view developed here, knowledge structures do not have denotations as such, but only offer possibilities for denotation, i.e., a variety of possible denotations from which the speaker has to select. That is, the denotation potential of a word type is not explained in terms other than the information stored in the knowledge structure associated with such a word-type. In this view, thus, a word is associated with a number of denotations, and a sentence with a number of contents that determine different truth-conditions. Usually, given that much of the selection of denotations is supposed to be intralinguistic, the number of different contents that a sentence can have appears to be smaller than the number of denotational possibilities associated with a single word: e.g., in I have talked to the school, as talk has selectional restrictions/preferences for animacy in both of its arguments, there are some denotations of school that are ruled out.
According to this view, it is possible to argue that sentences have contents that determine truth-conditions. This is important, since influential authors such as Chomsky (2000) and Pietroski (2005; forth.) make abundant use of examples that involve polysemy to attack truth-conditional semantics. According to them, nominal polysemy as exemplified by book, school, Brazil, etc. shows that the meaning of a word cannot be a denotation (i.e., an entity in the world). Such words seem to refer to different entities on different occasions, allegedly depending only on the speaker’s intentions. In addition, and even worse than that, they apparently refer to many entities at the same time in co-predicative constructions. Therefore, it is difficult to argue that, in such cases, nominal terms provide a denotation that can be used in composing a truth-conditional meaning.
In contrast, the view presented here allows us to hold that sentential meanings determine what contents, and thus what truth-conditions, an utterance of a sentence may have. It is true that only individual utterances have some determinate, specific, truth-conditions, but, typically, they inherit their truth-conditional content from the sentence-type they are tokens of: they realize or select one of the possible contents that the sentence provides. Alternatively, it can be said that sentence meanings determine disjunctive truth-conditions. They specify a definite number of situations in which the sentence will be true. Speakers and hearers have to select among the different possibilities, so there is a role that pragmatics has to play, but the role that pragmatics plays is that of selecting among alternatives provided by a kind of knowledge that can be called “semantic” (see below for discussion).
Co-predication creates a puzzle for this kind of account, in any case. The account has it that word meanings are knowledge structures that offer possibilities for denotation by having aspects or parts that can be selected. However, what can we say about sentences such as (9)?
|(9)||Brazil is a large Portuguese-speaking republic that is very high in inequality rankings but always first in the FIFA ranking.|
In (9), Brazil is used to stand for many parts of the knowledge structure simultaneously. An option that we think should be taken seriously is to hold that co-predicational structures are shorthand for more complicated structures. Thus, (9) can be taken to be shorthand for (9’):
|(9’)||Brazil [place] is a large piece of land & Brazil [people] is Portuguese-speaking & Brazil [government] is a republic & Brazil [economic system] is very high in inequality & Brazil [football team] is always first in the FIFA rankings.|
Actually, we take it that this explicitation of truth-conditions is the best option for anybody who wants to hold that at least linguistic utterances have representational contents. Co-predication is not a specific problem for defenders of truth-conditional semantics: it is also a problem for those who, like Pietroski (2005; forth.), following Strawson (1950), want to maintain that only individual speakers refer, i.e., that reference is an individual’s act. In this case, we can say either that an individual using (9) fails to refer, or that she is ultimately expressing something like the paraphrase (9’) by some sort of previous compilation of the different senses. That is, a speaker seems to intend to refer to the different aspects that form the knowledge structure associated with Brazil, since she predicates properties that correspond to those different aspects or parts of the structure. However, the way she refers to those different parts is by using a single, compilatory, term that binds (in the psychological sense) them all. At the hearer’s end, the word form Brazil activates the package formed by all the different senses or parts of the knowledge structure. Thus, it is easy for her to establish a correspondence between the predicates and the entities such predicates ascribe properties to; senses are there, active, for her to retrieve them as she goes on processing the sentence. But then note that this kind of response is also available to someone who wants to hold that knowledge structures are word meanings, such that it is actually word meanings that provide possibilities for denotation (we take up the issue about meaning in the next section).11
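The conjunctive paraphrase in (9’) can also be written schematically. The notation is only an illustrative assumption: $b_{a}$ stands for the entity that the aspect $a$ of the knowledge structure associated with Brazil makes available as a denotation, and the predicate names are informal labels for the predicates in (9).

```latex
\[
\text{(9) is true iff}\quad
\mathrm{Large}(b_{\mathrm{place}})
\;\land\; \mathrm{PortugueseSpeaking}(b_{\mathrm{people}})
\;\land\; \mathrm{Republic}(b_{\mathrm{government}})
\;\land\; \mathrm{HighInequality}(b_{\mathrm{economic\ system}})
\;\land\; \mathrm{FirstInFIFA}(b_{\mathrm{football\ team}})
\]
```

On this rendering the single NP contributes five different denotations, one per conjunct, which is the sense in which the term can be called compilatory.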
Explicating the truth-conditions of a co-predicative (or anaphoric) structure in this way may not be a trivial matter. We assume that some examples will be more complicated than others. The sentence (9) is structurally simple and can be “paraphrased” by the procedure of stacking conjunctions. Initially, it may seem more difficult to provide paraphrases for a sentence such as (20):
|(20)||There is no school painted in red that is good enough (for our kids).|
However, one possible explicitation of the truth-conditions of (20) is given in (20’):
|(20’)||There is no school [building [that hosts a school-institution]] that is painted in red such that that school [institution] is good enough.|
The problem is how it can be that the particular school in the final clause refers to an institution, while it also apparently has to refer to a building. Our proposal is that the hearer or reader of (20) can easily retrieve the sense “institution”. The relative clause in (20) does not have the effect of determining that the reference of the subject of the clause is a building, precisely because school offers different possibilities of denotation, all of which are active at the moment of interpreting the relative clause. This means that when the speaker has to find a meaning for school in the relative clause she can select the institution sense, give school the semantic value: [that institution: institution located in building painted in red], and thus get the truth-conditions in (20’).
6 Knowledge structures and lexical meaning
The points just developed lead to the issue of how to conceive the kind of knowledge structures that are associated with polysemous nouns. In the Pustejovskyan tradition, such knowledge structures would be regarded as lexical meanings. Lexical meanings in this tradition are parts of world knowledge that have a relevant role in explaining systematic patterns observed in (semantic) composition: see also Jackendoff’s conceptual semantics program (e.g., Jackendoff 2002). However, a different tradition assumes that the task of lexical meaning is to explain issues such as argument realization, lexical aspect, and some other structural features of language (Pietroski forth.), having little to say about, e.g., the different ways in which a noun can be modified by an adjective. According to this latter tradition, the knowledge structures that we propose are just conceptual structures associated with lexical items. This kind of view, which we will call “thin semantics”, is also endorsed by authors working on pragmatics. In Relevance Theory, for instance, a knowledge structure such as Figure 1 would be the encyclopaedic entry associated with school, whose lexical meaning could be an atomic concept or some kind of underspecific representation, which, in turn, could be a bare node, a core meaning, or a simple pointer (see Carston 2002; 2013, for discussion of these possibilities).
Now, it seems that a strong reason to adopt a thin semantics approach, at least for some researchers (Chomsky 2000; 2016), is that semantics cannot be in the business of explaining truth-conditions precisely because some sentences, such as co-predicative sentences, necessarily lack truth-conditions. According to our view, however, co-predicative sentences would have truth-conditions, provided that we reject the assumption that each NP can only have one denotation in a given sentence. The “paraphrasing” move considered above allows us to maintain that a single NP can have different denotations in a certain sentence. However, even if the Chomskyan rationale can be questioned, it is still possible to deny that knowledge structures such as the one represented in Figure 1 above embody semantic knowledge, for even if such knowledge structures play a role in assigning truth-conditions, it can be maintained that they fail to qualify as lexical meanings. Given that they incorporate a lot of world knowledge, these knowledge structures should be regarded rather as encyclopaedic entries. According to this view, the truth-conditions that we propose draw not from lexical meaning but from the conceptual knowledge that we associate with words.
As far as we can see, it is not easy to decide between this kind of view and the approach where a knowledge structure such as the one in Figure 1 captures the lexical meaning of school. Our own preferences go towards regarding Figure 1 as a representation of the lexical meaning of the word. We believe that, unless we have independent arguments against its viability, the truth-conditional approach towards semantics should be preserved (say, as a progressive research program). Besides, we also take a conservative approach towards polysemy itself, and regard it as a semantic phenomenon, i.e., a phenomenon that has to do with the meaning of words. The kind of conceptual semantics that we tend to favour, on the other hand, seems to be overall a promising enterprise: it can explain polysemy (including co-predication), but also certain compositional mechanisms, such as coercion (Zarcone 2014), lexical aspect and patterns of argument alternations (Pinker 2007), and the so-called “Travis cases” (Vicente 2012; 2015).
Our approach towards polysemy and co-predication fits most naturally within a rich meaning underspecification account of word meaning. What characterizes underspecification theories is that the interpretation of the specific sense of a word is a two-step process, in which there is some kind of meaning that has to be accessed before selecting the specific meaning of the word. Thin semantics approaches propose that such a meaning is either an abstract representation or core-meaning that applies to all the typical uses of the polysemous word (Ruhl 1989), or something even thinner, such as an index or instruction (Pietroski forth.). Rich meaning approaches, in contrast, construe that kind of meaning as a rich, organized structure where senses are explicitly represented. Our account clearly falls under the rich meanings tradition, as we hypothesize a knowledge structure that offers different possibilities of meaning or sense selection. Such a knowledge structure can be said to be the underspecific meaning of the word, while the aspects that form the structure are the specific senses that the word can have. A knowledge structure such as Figure 1, on the other hand, seems to be enough by itself to explain sense-selection. That is, there is no need to complement it with any other kind of semantic representation, be it a core-meaning, a node or an index. Moreover, it is not clear to us what role such kinds of representations could have in semantics if we allow ourselves rich meaning representations.
Rich meaning approaches can be criticized on the grounds that they do not honour the semantic knowledge vs. world knowledge distinction (though see Jackendoff 2002 for some considerations against the distinction). We acknowledge that we may have problems in this respect. However, let us close with a few words on this issue, because we think that not honouring the semantic/world knowledge distinction is a problem common to all extant explanations of polysemy. For instance, core-meaning approaches maintain that the semantic representation of a polysemous word is some kind of summary of all the different senses of the word. However, such a representation, being a core extracted from the different uses/senses of a word, is constructed after speakers have encountered such different uses/senses. Although the information it contains can be thin, the information it summarizes goes beyond a truly minimal semantic knowledge. Namely, it summarizes or captures world knowledge, albeit in a minimal way.
In turn, those theories that do not commit to a core meaning, but that consider that meaning is something like a pointer (Carston 2002) or an index or instruction for how to access the concept (Pietroski forth.) draw the distinction between semantic and world knowledge in a way that would not satisfy most defenders of such a distinction. The distinction is also usually applied to the conceptual realm: it is also traditional to think that there is a minimum that guarantees that someone is a competent user of a concept, such that mastering the concept is mastering that minimum. Therefore, these approaches would differentiate between semantic and conceptual knowledge, but, as they are, they would fail to differentiate between strictly conceptual knowledge and world knowledge.
On the other hand, our approach allows us to draw a distinction between information that forms part of a certain knowledge structure and information related to, but not contained within, such a knowledge structure. The information located in the knowledge structure tries to capture our basic knowledge about what a certain entity is, by specifying not only what class of entity it is, but also how it is made real in our world. This kind of basic knowledge is recurrently used to characterize entities that belong to kinds, and is projected onto regular linguistic uses in a way that other information is not. It is true that these knowledge structures contain a good amount of world knowledge, but we think that such knowledge has to be incorporated in one way or another (i.e., as lexical meaning or as structured conceptual representations), into any explanation of “inherent” polysemy and co-predication.
In this paper, we have proposed a rich semantics account of a particular sort of systematic or regular polysemy. The characteristic feature of this kind of polysemy is that it is possible to express more than one sense in a sentence without using the polysemous expression twice. Some authors believe that this is because the polysemous term denotes a complex entity, such that it can be said that the term is “inherently” polysemous. We have left this ontological approach to the side and have proposed a psychological account based on the notion of an activation package. According to such an account, polysemous terms that allow for co-predications are associated with knowledge structures that, in line with Pustejovsky’s qualia structures, can be seen as explaining what a certain entity is. We have used the example of school as an illustration of our approach. The information that we have put in the knowledge structure of school is all related to how the institution is made actual (in a society such as ours). The different aspects in the structure provide possibilities of denotation and form activation packages that facilitate co-predication and anaphoric reference to different senses.