Human language is full of ambiguity. Most people are familiar with homophones––words that sound the same, but have different meanings––such as bank (e.g. the bank of a river vs. a place to deposit your money). But ambiguity cuts across multiple levels of language, from inflectional morphemes (–s can mark a plural noun, a 3rd-person singular … Continue reading Why is language ambiguous? (Ambiguity in language, pt. 1)
When you return from a trip, others invariably ask what the best parts of the trip were. This is a reasonable question, I think. While it’s obviously impossible to convey the full breadth of your subjective experience, it should be possible to distill several key “highlights”, which can come in the form of either concrete events … Continue reading Japan: General Reflections
This post is a departure from my usual post format. Instead of walking through a theoretical topic or recent academic paper, this is intended to be a soft introduction to using Latent Semantic Analysis (LSA) to categorize documents. It's essentially an extension to the existing tutorial in sklearn, found here. I'll be using nlp_utilities for the walkthrough. … Continue reading [Tutorial] Document classification with Latent Semantic Analysis (LSA)
Although the relationship between sound and meaning in language is mostly arbitrary, there exist pockets of so-called systematicity: clusters in which particular forms recur with particular meanings. One example of systematicity is the existence of phonaesthemes. Phonaesthemes are recurring patterns of sound and meaning that occur below the morphemic level, which is traditionally considered the … Continue reading Discovering phonaesthemes (Arbitrariness in language, pt. 4)
The advent of digital media has made artistic content more widely accessible than ever before. For the most part, we can find any song, film, or TV show within minutes. Paradoxically, however, this can have a paralyzing effect: the digital media landscape is massive and ever-changing, and finding the content we want requires approaching this … Continue reading Recommender systems and sampling
Popular culture often depicts intelligent machines as coldly rational––capable of making “objective” decisions that humans can’t. More recently, however, there’s been increased attention to the presence of bias in supposedly objective systems, from image recognition to models of human language. Often, these biases instantiate actual human prejudices, as described in Cathy O’Neil’s Weapons of Math … Continue reading What we talk about when we talk about bias in A.I.
(Note: This work was conducted with Robert Loughnan of the UCSD Cognitive Science Department.) The role of the news media is ostensibly to inform. In order to do this, however, the media must present information in a relatively unbiased way. If citizens obtain information about the world primarily through the media, and the media presents … Continue reading Bias in the News