Although the relationship between sound and meaning in language is mostly arbitrary, there exist pockets of so-called systematicity: clusters in which particular forms recur with particular meanings.
Language is mostly arbitrary, but there are patterns of systematicity both within and across languages. As discussed previously, arbitrariness and systematicity seem to play distinct roles in improving both the learnability and communicative utility of a language.
So how can we, as researchers, quantify the degree of arbitrariness and systematicity in a language? And how can we discover these trends automatically?
Previously, we established that arbitrariness is an essential part of language. It allows for greater communicative utility, and probably learnability as well – two of the main transmission biases that were hypothesized to affect the evolution of a language.
But then how do we account for the fact that there is non-arbitrariness in language?
Pretty much since its inception, one of the core principles of linguistics has been that language is arbitrary (De Saussure, 1916; Hockett, 1960). That is, there’s no apparent relationship between a sign and what it signifies; nothing inherent about the word “dog” suggests that it must refer to the DOG concept.