Human language is full of ambiguity. Most people are familiar with homophones––words that sound the same but have different meanings––such as bank (e.g. the bank of a river vs. a place to deposit your money). But ambiguity cuts across multiple levels of language, from inflectional morphemes (–s can mark a plural noun, a 3rd-person singular present-tense verb, or a possessive) to syntactic structures (e.g. he saw the man with the telescope).
Intuitively, it seems like the prevalence of this ambiguity ought to make communication difficult: the more interpretations any given word (or syntactic unit, utterance, etc.) has, the higher the probability that a listener will misinterpret the intended meaning. Indeed, some (Chomsky, 2002) have cited ambiguity as possible evidence that language may not have evolved for communication at all.
Others, however, argue that ambiguity actually serves a communicative function––in other words, that it’s a feature, not a bug. Below, I’ll summarize some of their theoretical arguments, then describe a recent empirical study that sheds light on which parts of a lexicon are ambiguous, and why.
Ambiguity in Language: Theoretical Foundations
The most well-known argument was initially put forth by Zipf (1949), and goes like this: languages evolve to meet the demands of their speakers, but speakers and listeners have competing demands. According to Zipf, both speakers and listeners want to “minimize their effort”. For speakers, he argued, an optimal language would consist of a single word, e.g. ba: to express any meaning, all a speaker need say is “ba”. For listeners, an optimal language would map every meaning to a distinct form, eliminating the need to infer which meaning the speaker intended. Together, these competing demands––which Zipf terms unification and diversification, respectively––produce a language with “some, but not total, ambiguity” (Piantadosi et al, 2012, p. 3).
One obvious limitation to this argument, as pointed out in Wasow et al (2005) and Piantadosi et al (2012), is that a totally ambiguous language does not truly minimize a speaker’s effort––after all, if a listener misinterprets what the speaker meant, the speaker has to spend additional effort clarifying what they meant.
Thus, Piantadosi et al (2012) posit a slightly different set of trade-offs: rather than unification and diversification, language evolves to satisfy the competing demands of clarity (signals in which the intended meaning has a high probability of being correctly derived) and ease (signals which are easy to produce and process). “Easy” signals include those which are short, frequent, and phonotactically well-formed––but since there is a limited number of these signals in any given language, ease is sometimes sacrificed in the pursuit of clarity. Here, Piantadosi et al (2012) cite the NATO phonetic alphabet (“alpha” for a, “bravo” for b, etc.) as an example: in order to avoid confusion, monosyllabic letter names are replaced with longer, more distinctive words. Similarly, clarity is sometimes sacrificed for ease, as in the case of referential pronouns (e.g. “he”), which are ambiguous but short and easy to produce.
Piantadosi et al (2012) then argue that these competing pressures produce communication systems that are optimized for efficiency:
First, apparently ambiguous signals (i.e. those in which clarity is sacrificed) are almost always unambiguous in context. For example, in the case of “run”, which can be either a noun or a verb, the meaning is often disambiguated by the preceding word, e.g. “a run” vs. “we run”. Thus, language permits ambiguity because the intended meaning is usually made clear by the surrounding context of use, so there is no need to encode additional information in the signal itself.
Second, ambiguity means that particularly easy signals (e.g. short, frequent, and well-formed ones) can be repurposed for multiple meanings. This benefits speakers and listeners alike, in that their lexicon will contain words that are easier to produce and process.
The theory outlined by Piantadosi et al (2012) does seem intuitive, and others (Levinson, 2000) have made similar arguments. But is it accurate? Fortunately, like all good theories, it makes specific, testable predictions: if one benefit of ambiguity is the recycling of “easy” linguistic units, then linguistic units that are easier to produce and process should have more meanings associated with them. In other words, these expressions should act as “attractors”, acquiring multiple meanings because of their convenience and ease.
Based on psycholinguistic evidence, we know there are several variables that strongly affect ease of lexical processing and production: frequency (more frequent words are easier), length (shorter words are easier), and phonotactic well-formedness (how much a word conforms to the phonotactic rules of a language). These variables, then, are the predictors.
Piantadosi et al (2012) operationalized ambiguity as the number of meanings that a given word (or syllable) has. In the first analysis, this was measured as the number of homophones: words with distinct, unrelated meanings that share a form. In the second analysis, it was measured as the number of senses of a word; this included homophones, but also polysemous relationships (e.g. run as in “the train runs between Boston and New York” vs. “John runs to the store”). Finally, in the third analysis, the authors asked whether certain syllables appeared in more words; following the same logic, “easier” syllables should be recycled more often, across more distinct words.
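To make the logic of these analyses concrete, here is a small Python sketch of the kind of relationship being tested. Everything in it––the mini-lexicon, the frequencies, the sense counts, and the use of a rank correlation––is invented for illustration; it is not the authors’ data or their actual statistical method.

```python
# A toy sketch (stdlib only) of the relationship Piantadosi et al (2012)
# tested: do shorter, more frequent words carry more meanings?
# All words, frequencies, and sense counts below are invented.
import math

lexicon = {
    # word: (hypothetical frequency per million, hypothetical sense count)
    "run": (800, 8),
    "set": (600, 7),
    "bank": (300, 4),
    "table": (250, 3),
    "telescope": (20, 1),
    "phonotactics": (1, 1),
}

def ranks(xs):
    """1-based ranks, averaging over ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

lengths = [len(w) for w in lexicon]
freqs = [f for f, _ in lexicon.values()]
senses = [s for _, s in lexicon.values()]

rho_length = spearman(lengths, senses)  # theory predicts negative
rho_freq = spearman(freqs, senses)      # theory predicts positive
print(f"length vs. senses:    rho = {rho_length:.2f}")
print(f"frequency vs. senses: rho = {rho_freq:.2f}")
```

In this fabricated lexicon, length correlates negatively and frequency positively with the number of senses––exactly the pattern the “attractor” prediction would lead us to expect in real data.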
Across English, Dutch, and German, various operationalizations of ease systematically and significantly predicted the ambiguity of a word (or syllable). That is, words which were shorter and/or more frequent tended to attract more meanings.
This finding is consistent with the predictions of the theory outlined above. The languages surveyed all contain lexical ambiguity, but lexical ambiguity is most concentrated in regions of the lexicon that theoretically should be easier to produce and process.
At first glance, the prevalence of ambiguity in human language seems suboptimal. But Piantadosi et al (2012) suggest that ambiguity is a feature, not a bug: it makes language more “efficient” by recycling easier words in place of longer, more difficult-to-produce ones. And because the meaning of any given word should be clear in context, this lexical ambiguity doesn’t impose undue costs on either speakers or listeners.
Of course, there are a few limitations to this theory. First, it is primarily concerned with lexical ambiguity. Even if the theory explains why lexicons contain ambiguity, it is unclear whether it extends to cases of syntactic or pragmatic ambiguity––both of which might exist for different reasons, and exhibit different patterns in language use.
Second, the theory assumes that disambiguation is not inordinately expensive for listeners. This assumption is not exclusive to Piantadosi et al (2012); as pointed out in the paper, Levinson (2000) argues that languages will typically minimize the effort involved in articulation, and rely more on listener inference, as inference is “cognitively cheap”. But even if this assumption is generally true, it suggests an avenue for further exploration: if certain kinds of ambiguity do require expensive inferences to resolve, one would expect a language to minimize those kinds of ambiguity, or to have well-developed mechanisms for repairing misinterpretations.
A related assumption is that ambiguous words will generally be clear in context. In order for disambiguation to be “cheap”, and for lexical ambiguity not to result in costly miscommunications, a language must have contextual cues (linguistic, situational, etc.) available to make the intended meaning clear. Piantadosi et al (2012) argue that if context is informative at all about meaning, then a word will necessarily be less ambiguous when coupled with a context than without it––so a well-designed communication system will allow context to provide meaning, and won’t build redundancies into the meanings of particular words. As the authors point out, this argument is difficult to test: “It is unclear how one might test the first, information-theoretic argument, since it is a mathematical demonstration that ambiguity should exist; it does not make predictions about language other than the presence of ambiguity” (Piantadosi et al, 2012, p. 9).
This last assumption is intuitive, and perhaps seems trivially true––of course words are less ambiguous in context––but again, it opens the door for interesting questions. The most interesting of these questions to me is: given that context is informative about meaning, which contextual cues are used for which kinds of ambiguity?
Ambiguity lives at the intersection of questions about language structure and evolution, constraints on language processing and production, and even what it means to mean something.
Chomsky, N. (2002). An interview on minimalism. In N. Chomsky, On Nature and Language (pp. 92–161).
Levinson, S. (2000). Presumptive meanings: The theory of generalized conversational implicature. MIT Press.
Piantadosi, S. T., Tily, H., & Gibson, E. (2012). The communicative function of ambiguity in language. Cognition, 122(3), 280–291. https://doi.org/10.1016/j.cognition.2011.10.004
Wasow, T., Perfors, A., & Beaver, D. (2005). The puzzle of ambiguity. In Morphology and the Web of Grammar: Essays in Memory of Steven G. Lapointe (pp. 1–18).
Zipf, G. (1949). Human behavior and the principle of least effort. New York: Addison-Wesley.
 “The use of language for communication might turn out to be a kind of epiphenomenon…If you want to make sure that we never misunderstand one another, for that purpose language is not well-designed, because you have such properties as ambiguity” (Chomsky, 2002, p. 107), as cited in Piantadosi et al (2012).
 Zipf is probably most famous for Zipf’s law, a finding relating the frequency of words to their frequency rank. Specifically, a word’s frequency is inversely proportional to its rank. In practice, this means that the most frequent word occurs approximately twice as often as the second most frequent word, three times as often as the third, and so on.
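The arithmetic of the idealized law can be sketched in a few lines of Python; the constant C below (the frequency of the top-ranked word) is an arbitrary made-up value.

```python
# Idealized Zipf's law: frequency is inversely proportional to rank,
# f(r) = C / r, so the top word is twice as frequent as the second,
# three times as frequent as the third, and so on.
C = 60000.0  # hypothetical frequency of the most frequent word

freqs = [C / rank for rank in range(1, 11)]

# Under the idealized law, rank * frequency is constant (= C).
assert all(abs(rank * f - C) < 1e-9 for rank, f in enumerate(freqs, start=1))

print(f"1st / 2nd most frequent: {freqs[0] / freqs[1]:.1f}")  # 2.0
print(f"1st / 3rd most frequent: {freqs[0] / freqs[2]:.1f}")  # 3.0
```

Real word-frequency distributions only approximate this power law, but the rank-frequency trade-off above captures its basic shape.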
 In the field of Conversation Analysis, this process is called interactive repair, and is a widely-used mechanism for resolving “troubles” in communication.
 They formalize this argument using information theory: assuming that context––which they define to include all linguistic and extra-linguistic information readily available to the listener––is both “known and informative”, the Shannon entropy of a signal (M), no matter how ambiguous, will necessarily be less when it is conditioned on context (C) than when heard in isolation. In other words, H[M] > H[M|C].
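As a quick numerical sanity check of that inequality, here is a small Python sketch built around the “run” example: the meaning (noun vs. verb) conditioned on the preceding word (“a” vs. “we”). The joint probabilities are invented for illustration, not taken from any corpus.

```python
# Toy check of H[M] > H[M|C]: the meaning of "run" (noun vs. verb)
# conditioned on the preceding word ("a" vs. "we").
# The joint probabilities below are invented for illustration.
import math

# P(context, meaning): "a run" is almost always a noun, "we run" a verb.
joint = {
    ("a", "noun"): 0.45, ("a", "verb"): 0.05,
    ("we", "noun"): 0.05, ("we", "verb"): 0.45,
}

def entropy(dist):
    """Shannon entropy (in bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Marginal distribution over meanings, for H[M].
p_m = {}
for (_, m), p in joint.items():
    p_m[m] = p_m.get(m, 0.0) + p
H_M = entropy(p_m)

# Conditional entropy H[M|C] = sum over contexts c of P(c) * H[M | C=c].
p_c = {}
for (c, _), p in joint.items():
    p_c[c] = p_c.get(c, 0.0) + p
H_M_given_C = 0.0
for c, pc in p_c.items():
    cond = {m: joint[(c, m)] / pc for (c2, m) in joint if c2 == c}
    H_M_given_C += pc * entropy(cond)

print(f"H[M]   = {H_M:.3f} bits")
print(f"H[M|C] = {H_M_given_C:.3f} bits")  # strictly smaller
```

With these made-up numbers, the unconditioned meaning carries a full bit of uncertainty, while conditioning on the preceding word removes most of it––a miniature instance of the claim that informative context necessarily reduces ambiguity.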
 Phonotactic probability did not significantly predict the number of homophones in English, though it did in both German and Dutch.