Biased language

Bias is real – and often harmful. It’s been shown to manifest in hiring decisions, in the training of machine learning algorithms, and most recently, in language itself. Three computer scientists analyzed the co-occurrence patterns of words in naturally-occurring texts (obtained from Google News), and found that these patterns seem to reflect implicit human biases.

Measuring bias in humans

There are, of course, many ways that a researcher could measure bias. For example, you could simply ask participants: “What do you think about X?”, or “Rate X on a scale from 1 to 10,” or even more directly, “Are you a racist?”

The problem with this approach is that most people don’t like to think of themselves as biased, at least when it comes to how they judge other people. And yet, bias clearly exists, even if it’s unconscious or hidden. One method for measuring such hidden biases is the Implicit Association Test (IAT), in which subjects must sort concepts (usually words, like “insect” or “flower”) into different categories (e.g. pleasant vs. unpleasant). Early studies (Greenwald et al, 1998) showed that subjects were faster at sorting words into categories when those categories somehow “matched” the sentiments about the word, and slower when the categories didn’t match. For example, it’s easier to sort “flower” into pleasant, and “insect” into unpleasant, than the other way around – unless you’re an entomologist with terrible hay fever.

After establishing that the test could measure relatively uncontroversial implicit biases, the authors demonstrated that it could also pick up on more insidious instances of prejudice. Participants were asked to sort pictures of faces into different groups (e.g. positive or negative); the faces varied by race (e.g. white vs. African-American), and participants were instructed to sort the different races into certain groups. In one condition, the instructions were to sort African-American faces into the positive group, and white faces into the negative group; in the other, the instructions were reversed.

The authors found that overall, participants were slower to sort African-American faces into the positive group than white faces (and faster to sort African-American faces into the negative group than white faces). In self-assessments, these participants did not identify as racist – but they still showed an effect of race in their sorting times.

The IAT has since been replicated and extended to many domains, with some work focusing on alleviating the effect of race via narrative (Johnson et al, 2013) or even immersive virtual reality (Peck et al, 2013) – but perhaps the most important takeaway from the original study is that it measured bias that participants were not even aware they had (hence implicit). Racism and sexism do not always have to be explicit or conscious, but these implicit biases can still manifest in important outcomes such as hiring decisions (Bertrand et al, 2005) and negotiations at a car retailer (Ayres, 1991).

So where does language come in?

The role of language

Another way that bias can manifest is in language. Previous posts on this topic have focused on how gender biases emerge subtly (or not so subtly) in the descriptions students provide of their professors – but bias can be even more implicit and insidious.

Recently, several researchers (Caliskan et al, 2017) used word embeddings, trained on Google News corpora, to test whether the biases observed in the IAT could be replicated in naturally-occurring text.

What are word embeddings?

Most broadly, a word embedding is a way of representing the meaning of a word as a vector of real numbers. There are all sorts of ways of generating these vectors, but a particularly common method is basing the vector on the linguistic contexts that a word occurs in. This allows researchers to compare the meaning of two words using their vectors, instead of relying on hand-coded dictionary definitions.

The basic assumption is that words with more similar meanings will occur in more similar contexts. For example, “apple” and “orange” might occur in similar contexts (e.g. after the words “grow”, “eat”, “peel”, etc.). Based on the similarity of their contexts, the vectors will be very similar – thus, they have similar “meanings”.
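To make the "similar contexts" idea concrete, here is a minimal sketch that counts the words appearing near a target word. The toy corpus, the function name context_counts, and the window size are all invented for illustration; real embeddings are trained on vastly larger corpora and with more sophisticated methods than raw counts.

```python
from collections import Counter

# Toy corpus, invented purely for illustration.
corpus = [
    "we grow apple trees",
    "we grow orange trees",
    "i eat an apple",
    "i eat an orange",
    "i peel the apple",
    "i peel the orange",
    "i drive the car",
]


def context_counts(target, sentences, window=2):
    """Count the words that appear within `window` positions of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            if token != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


apple = context_counts("apple", corpus)
orange = context_counts("orange", corpus)
car = context_counts("car", corpus)

# Context words each pair of targets has in common.
shared_with_orange = set(apple) & set(orange)
shared_with_car = set(apple) & set(car)
```

In this toy corpus, “apple” and “orange” end up with identical context counts, while “apple” and “car” share only the word “the”. Treating each table of counts as a vector gives a very crude version of a context-based embedding.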

A variety of cool applications have used word embeddings with great success (some discussed in more detail here), but the key takeaway is that the cosine distance between two word vectors is meant to be an approximation of how similar/different those words are in meaning.
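As a rough illustration of that comparison, the sketch below computes cosine similarity (and the corresponding distance) for a few made-up vectors. The three-dimensional values are hypothetical; real embeddings typically have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Cosine distance: 0 means same direction, larger means less similar."""
    return 1.0 - cosine_similarity(u, v)


# Hypothetical vectors, chosen only to illustrate the comparison.
apple = [0.9, 0.8, 0.1]
orange = [0.8, 0.9, 0.2]
car = [0.1, 0.2, 0.9]

fruit_distance = cosine_distance(apple, orange)
mixed_distance = cosine_distance(apple, car)
```

With these invented values, the distance between “apple” and “orange” comes out much smaller than the distance between “apple” and “car”, which is the pattern we would hope to see in real embeddings.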

Bias in word embeddings

You might be wondering exactly how word embeddings relate to implicit biases.

Caliskan et al (2017) developed the Word Embedding Association Test (WEAT), essentially a way of measuring how strongly two sets of target words (e.g. flower names vs. insect names) are associated with two sets of attribute words (e.g. pleasant vs. unpleasant words), based on the similarity of their word embeddings. Their hypothesis was that if implicit bias shows up in naturally-occurring text, WEAT scores should predict (or otherwise approximate) the implicit biases found in humans using the IAT (described above).
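For the curious, here is a minimal sketch of a WEAT-style effect size. It compares two sets of target words (X and Y) by how much more similar each word is, on average, to one set of attribute words (A) than to another (B). The two-dimensional vectors are invented, the function names are my own, and using the sample standard deviation is an assumption; this is not the authors' code.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def association(w, A, B):
    """How much closer w is, on average, to attribute set A than to B."""
    return mean([cosine(w, a) for a in A]) - mean([cosine(w, b) for b in B])


def weat_effect_size(X, Y, A, B):
    """Difference in mean association of X and Y, scaled by the spread over X and Y."""
    x_assoc = [association(x, A, B) for x in X]
    y_assoc = [association(y, A, B) for y in Y]
    # Assumption: sample standard deviation over all target words.
    return (mean(x_assoc) - mean(y_assoc)) / stdev(x_assoc + y_assoc)


# Hypothetical toy vectors, not taken from any trained model.
flowers = [[1.0, 0.1], [0.9, 0.2]]     # target set X
insects = [[0.1, 1.0], [0.2, 0.9]]     # target set Y
pleasant = [[1.0, 0.0], [0.9, 0.1]]    # attribute set A
unpleasant = [[0.0, 1.0], [0.1, 0.9]]  # attribute set B

effect = weat_effect_size(flowers, insects, pleasant, unpleasant)
reverse_effect = weat_effect_size(insects, flowers, pleasant, unpleasant)
```

A positive effect size here means the “flower” vectors sit closer to the “pleasant” vectors than the “insect” vectors do; swapping the two target sets flips the sign.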

Previous work (Bolukbasi et al, 2016) had already established the existence of bias in word embeddings. As the title of their 2016 paper indicates, word embeddings trained on natural corpora came to “learn” biases, such as associating the word “man” with “computer programmer”, and the word “woman” with “homemaker”. But the contribution of Caliskan et al (2017) was to demonstrate that the associations in these word embeddings approximate measurements of implicit biases in real people! This suggests that the biases observed in humans manifest not only in lab-based experiments like the IAT, but also in the way that people use language.

Why does this matter?

There are at least two ominous implications of this finding:

The first implication is theoretical. An algorithm trained on naturally-occurring text acquired biases similar to those observed in actual humans. This means that implicit biases can be learned simply through exposure to seemingly innocuous things like language. Even before we factor in the importance of institutionalized bias – and especially explicit racism or sexism – it appears that a machine, at least, can come to exhibit human-like biases through statistical learning.

This also has implications for industrial applications of Artificial Intelligence. There’s been considerable press recently about how a lot of machine learning algorithms are biased, because they are trained using biased data; this affects speech and accent recognition, face recognition, recidivism predictions and sentencing, and much, much more. The finding discussed above means that a machine using word embeddings trained on Google News corpora will probably learn biases similar to those that people have. Depending on the application, this might have significant ramifications for society – especially as such algorithms become more and more prevalent and central to our decision-making.

Fortunately, there is work being done on “de-biasing” machine learning algorithms (Bolukbasi et al, 2016). But as one might expect, such work is difficult, particularly since the biases these algorithms are learning are entrenched in our society – right down to the very words we use. This raises a pretty obvious point: in addition to de-biasing our algorithms, we should also be working on de-biasing ourselves and our society.


Greenwald, A. G., McGhee, D. E., & Schwartz, J. L. (1998). Measuring individual differences in implicit cognition: the implicit association test. Journal of personality and social psychology, 74(6), 1464.

Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183-186.

Johnson, D. R., Jasper, D. M., Griffin, S., & Huffman, B. L. (2013). Reading narrative fiction reduces Arab-Muslim prejudice and offers a safe haven from intergroup anxiety. Social cognition, 31(5), 578-598.

Peck, T. C., Seinfeld, S., Aglioti, S. M., & Slater, M. (2013). Putting yourself in the skin of a black avatar reduces implicit racial bias. Consciousness and Cognition, 22(3), 779–787.

Bertrand, M., Chugh, D., & Mullainathan, S. (2005). Implicit discrimination. American Economic Review, 94-98.

Ayres, I. (1991). Fair driving: Gender and race discrimination in retail car negotiations. Harvard Law Review, 817-872.

Bolukbasi, T., Chang, K. W., Zou, J. Y., Saligrama, V., & Kalai, A. T. (2016). Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems (pp. 4349-4357).
