In a previous post, I described how researchers might go about tackling the question of how humans understand ambiguous language. The basic idea was to first identify potential sources of disambiguating information, then ask whether humans actually use this information to understand ambiguous language.
But what constitutes a “potential” source of disambiguating information? The short answer is that such a source should have certain regularities (e.g. differences in this information-source co-vary with differences in interpretation) which could, in principle, be used to disambiguate meaning.
Of course, the kinds of regularities we expect to find depend on the kind of ambiguity. In this post, I’ll explore regularities that might help comprehenders contend with lexical ambiguity in particular.
Homophones (lexical ambiguity)
Homophones are words that sound the same but mean different things. Famously, the word bank could refer to a place to deposit your check, or to the land along the side of a river. Words with multiple meanings are quite common; yet despite evidence that people do activate both meanings of a given word during comprehension (Duffy et al, 1988), we often don’t seem to notice that many words could be interpreted another way. What information do comprehenders use to determine which “sense” of a word was intended?
An obvious source of information is the topic under discussion. If a speaker is talking about their financial troubles, it’s unlikely they intend “bank” to mean the side of a river. This is hard to operationalize and quantify, but in general, it’s clear that discourse context plays a role. What other forms of “context” might make a difference?
Acoustic cues: thyme vs. time
One particularly surprising––and fascinating––finding was that many homophone pairs vary in their acoustic duration. That is, speakers produce measurably distinct signals when referring to different senses of the same word, e.g. time vs. thyme.
This finding comes from Gahl (2008), who analyzed a corpus of naturally-occurring conversational speech called Switchboard (Godfrey et al, 1992). She found that for many homophone pairs, the more frequent member of the pair undergoes a kind of compression. The waveform for the word “time” is literally shorter than the waveform for the word “thyme”. In this sense, as Gahl (2008) notes in her title, time and thyme are not entirely homophonous––there are subtle, but reliable, acoustic cues that distinguish them. Critically, these frequency-based differences hold across many speakers, even after controlling for a variety of other variables, such as the local speech rate pre- and post-homophone, the contextual predictability of one sense or the other, and the syntactic category of the word.
It’s unclear (as far as I know) whether these acoustic cues are detectable by human listeners, but presumably they’re large enough to be of use to a machine. One thing I really enjoy about Gahl’s analysis is that, on the one hand, it’s conducted on naturally-occurring speech, meaning that these differences occur “in the wild”, so to speak; and on the other hand, she managed to control a host of other variables known to co-vary with the dependent variable of interest (acoustic duration), and still found an effect.
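The logic of “controlling for other variables and still finding a frequency effect” can be sketched as an ordinary least squares regression. Everything below is simulated for illustration––the variable names echo Gahl’s controls, but the numbers are invented, and OLS is a stand-in for her actual statistical models:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Toy predictors: log word frequency, plus two controls of the kind Gahl
# included (local speech rate, contextual predictability).
log_freq = rng.normal(0, 1, n)
speech_rate = rng.normal(0, 1, n)
predictability = rng.normal(0, 1, n)

# Simulated durations (ms): higher-frequency words are shorter, and speech
# rate also matters. The frequency coefficient is the effect of interest.
duration = 300 - 15 * log_freq + 10 * speech_rate + rng.normal(0, 5, n)

# Ordinary least squares with an intercept column.
X = np.column_stack([np.ones(n), log_freq, speech_rate, predictability])
beta, *_ = np.linalg.lstsq(X, duration, rcond=None)

# beta[1] estimates the frequency effect on duration, holding the controls
# constant; a reliably negative value mirrors the "compression" finding.
```

The point of the sketch is just the structure of the argument: if the frequency coefficient survives with the controls in the model, the duration difference isn’t reducible to those confounds.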
Syntactic features: Nouns vs. Verbs
Another source of potentially disambiguating information is the grammatical class of the word. If members of a homophone pair belong to different parts of speech––e.g. Nouns vs. Verbs––then comprehenders should be able to quickly discriminate (or predict) which interpretation was intended. For example, if the speaker has just used a determiner like the or a, then the next word is likely a Noun (e.g. park as in a place to walk, vs. the act of parking one’s car).
Dautriche et al (2018) asked this question across four languages: English, Dutch, German, and French. For each language, they computed the proportion of homophone pairs in which the members came from different grammatical classes vs. the same class. In all languages, a high proportion (60-90%) of homophone pairs spanned grammatical classes. Because this could just be a coincidence––maybe any two words are likely to come from different grammatical classes, not just homophones––the authors compared these values to the distribution one would expect by chance, using something called a permutation test.
Simply put, the authors shuffled the dataset 1000 times, and for each shuffle, recalculated the proportion of homophones occurring across vs. within syntactic categories. In every language, the real proportion of across-category homophones was considerably higher than what one would expect by chance.
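The shuffling logic of a permutation test like this can be sketched in a few lines of Python. The homophone pairs and part-of-speech tags below are invented for illustration; the real analysis operated over full lexicons:

```python
import random

def across_category_rate(pairs):
    """Proportion of pairs whose two members differ in part of speech."""
    return sum(pos1 != pos2 for pos1, pos2 in pairs) / len(pairs)

# Hypothetical homophone pairs, each tagged with the POS of its two members.
observed_pairs = [("N", "V"), ("N", "V"), ("N", "N"), ("N", "V"), ("A", "V")]
observed = across_category_rate(observed_pairs)

# Null distribution: shuffle the POS labels across pair slots 1000 times,
# breaking any real association between homophony and grammatical class.
labels = [pos for pair in observed_pairs for pos in pair]
null_rates = []
for _ in range(1000):
    random.shuffle(labels)
    shuffled = list(zip(labels[0::2], labels[1::2]))
    null_rates.append(across_category_rate(shuffled))

# One-tailed p-value: how often does chance match or exceed the observed rate?
p = sum(rate >= observed for rate in null_rates) / len(null_rates)
```

If the observed across-category rate sits far in the upper tail of the shuffled distribution, the pattern is unlikely to be a coincidence.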
Furthermore, experimental results reported in the same paper (Dautriche et al, 2018) indicate that these features are both detectable and useful to language learners, suggesting that they provide valuable disambiguating information.
Dautriche et al (2018) also asked whether homophones are more semantically dissimilar than one would expect by chance. Semantics, or “meaning”, was operationalized by analyzing the linguistic contexts in which a word occurs––the basic premise of such an approach is that words that have similar meanings will occur in similar contexts (Firth, 1957). “Contexts”––the words that typically co-occur with a word––can be transformed into “vectors”: strings of real numbers that represent a word as a point in high-dimensional space. The semantic similarity of two words can then be approximated by computing the cosine distance between these vectors.
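As a concrete sketch of that last step, cosine similarity between two co-occurrence vectors can be computed as below. The vectors here are toy, hand-made counts over three hypothetical context words, not real corpus data:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy co-occurrence counts over the contexts ["money", "river", "deposit"]
# for the two senses of "bank".
bank_financial = [10, 1, 8]
bank_river = [1, 12, 0]

similarity = cosine_similarity(bank_financial, bank_river)
distance = 1 - similarity  # cosine distance: higher means more dissimilar
```

Because the two vectors mostly occupy different contexts, their cosine similarity is low and their cosine distance correspondingly high, which is the operational sense in which the two meanings of “bank” are dissimilar.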
In this case, the authors did not find that heterographic homophones (i.e. those spelled differently, like thyme and time) were more dissimilar in meaning than one would expect by chance. This is also consistent with an analysis I’ve conducted on English homophones (using word embeddings trained on both Wikipedia and Google News).
Despite this, the authors did find evidence of a bias against homophones being particularly similar in meaning. That is, homophones may not be more dissimilar than the average word pair, but it’s rare for them to be much more similar. The results of this follow-up analysis were not statistically significant, but the evidence was suggestive in favor of their interpretation.
The Takeaway, and Open Questions
There appear to be reliable cues––some of which are detectable by humans––to disambiguate different meanings of an ambiguous word. This makes sense (how else would we be able to communicate?), and offers a tentative partial solution to the question of how comprehenders contend with rampant lexical ambiguity.
Of course, there are many other kinds of ambiguity, which may present distinct challenges. Are acoustic features helpful for syntactic ambiguity? What about pragmatic ambiguity? Research suggests that they could be (Hellbernd & Sammler, 2016; Schafer et al, 2005), but doesn’t necessarily address whether such features are present in naturally-occurring conversation. And cases such as pragmatic ambiguity––e.g. indirect requests––may also require a kind of social intelligence to disambiguate, e.g. using one’s mentalizing capacity to adopt a speaker’s perspective and infer their intent (Trott & Bergen, 2018). More work is certainly needed to explore the space of which cues are available, whether humans successfully sample and deploy such cues for disambiguation, and whether they could be used to augment machine language comprehension.
More broadly, the question of why still looms large. That is, why is language ambiguous at all? It’s true that there are often cues that serve to disambiguate, but an unambiguous language system wouldn’t require such disambiguation in the first place. In other words, our “talent for disambiguation” may prevent ambiguity from being selected against, but why is ambiguity selected for? In fact, is ambiguity selected for at all, or is it a kind of evolutionary byproduct––something that slips through the cracks of language change, sometimes with useful (or costly) consequences down the line (e.g. the ability to construct puns and ambiguity-related humor)?
One account of why ambiguity is selected for is that it increases the efficiency of language by allowing speakers to recycle short, easy-to-produce words (Piantadosi et al, 2012). But as Wasow (2015) has noted, this account needs to address a couple of questions. First, is it really “cognitively cheaper” to store fewer signals and have to routinely disambiguate the meaning of those signals? And second, if shorter words are preferred, why does language recycle those words before exploiting the entire space of monosyllabic words––that is, why do we reuse words before we adopt short new ones, like gub and rit? Can a word become “over-saturated” with meanings?
Moving forward, any account of why language is ambiguous needs to contend with these questions. Ultimately, we want an account that explains how ambiguous signals are generated in the first place, as well as why they seem to stick around––and when / where we should expect to see more or less ambiguity in a language.
Conwell, E., & Morgan, J. L. (2012). Is it a noun or is it a verb? Resolving the ambicategoricality problem. Language Learning and Development, 8(2), 87-112.
Conwell, E. (2015). Neural responses to category ambiguous words. Neuropsychologia, 69, 85-92.
Dautriche, I., Fibla, L., Fievet, A. C., & Christophe, A. (2018). Learning homophones in context: Easy cases are favored in the lexicon of natural languages. Cognitive Psychology, 104, 83-105.
Duffy, S. A., Morris, R. K., & Rayner, K. (1988). Lexical ambiguity and fixation times in reading. Journal of Memory and Language, 27(4), 429-446.
Firth, J. R. (1957). Papers in Linguistics. Oxford: Oxford University Press.
Gahl, S. (2008). Time and thyme are not homophones: The effect of lemma frequency on word durations in spontaneous speech. Language, 84(3), 474-496.
Godfrey, J. J., Holliman, E. C., & McDaniel, J. (1992). SWITCHBOARD: Telephone speech corpus for research and development. In Proceedings of ICASSP (pp. 517-520). IEEE.
Hellbernd, N., & Sammler, D. (2016). Prosody conveys speaker’s intentions: Acoustic cues for speech act perception. Journal of Memory and Language, 88, 70-86.
Piantadosi, S. T., Tily, H., & Gibson, E. (2012). The communicative function of ambiguity in language. Cognition, 122(3), 280-291.
Schafer, A. J., Speer, S. R., Warren, P., & White, S. D. (2005). Prosodic influences on the production and comprehension of syntactic ambiguity in a game-based conversation task. Approaches to studying world-situated language use, 209-225.
Trott, S., & Bergen, B. (2018). Individual Differences in Mentalizing Capacity Predict Indirect Request Comprehension. Discourse Processes, 1-33.
Wasow, T. (2015). Ambiguity Avoidance is Overrated. Ambiguity: Language and Communication, 29.
 Of course, there are individual differences in this tendency to notice lexical ambiguity––as I’ve mentioned before, I hypothesize that this relates to how adept a person is at crafting and understanding puns. As for what other cognitive capacities (frame-shifting, working memory, etc.) might contribute to this variability, I’m not aware of any research (as of yet) that sheds light on this question.
 Incidentally, I’ve since replicated their analysis using similar (but not identical) methods, and found qualitatively similar results, at least in English.