Gaze and Interaction

Recently, while taking care of my parents’ dog (see Figure 1 below), I observed an interesting phenomenon: whenever Sally (the dog) was thirsty, she would walk over to her empty water bowl, peer into it, then stare pointedly back at me, as if to say: what gives?

[Figure 1: Sally, my parents’ dog]

I refilled the water bowl and she lapped it up eagerly. It struck me that this was Sally’s way of “requesting” water. No language was involved, but she successfully communicated her desire (“more water”) through an action[1], and I interpreted this action correctly and acted accordingly (filling the water bowl).

So how does this work?

The importance of nonverbal communication

Let’s back up a bit.

Language is obviously central to human communication. But in the latter half of the 20th century, researchers interested in communication also began studying “nonverbal” communication. This is a very broad category of behavior, including gesture (Ekman & Friesen, 1969), how people position themselves relative to one another (Sommer, 1959; Hall, 1962; Kendon, 1990), and even eye gaze (Butterworth & Jarrett, 1991; Johnson et al., 1998).

Research on nonverbal communication is interesting because it offers a “backdoor” into cognition and communication. We’re embodied beings living in space, so understanding how we use space matters for understanding our minds (Kirsh, 1995). And everyone gestures, whether or not they’re aware of it: imagine trying to give someone directions without gesturing, or explaining how to tie a tie without miming the action. Gesture and body movement also facilitate the learning and memory of motor actions (Kirsh, 2011), and even non-iconic gestures, such as beat gestures, are an important part of both the production and comprehension of speech (Holle et al., 2010).

And finally, if we’re interested in understanding how humans communicate, we need to understand the “complete system” of communication. Many animals communicate primarily through gestures, and it’s even possible that humans were gesturing long before they could talk (McNeill, 2012).

So if we want to understand how people make and understand requests, we need to look at all the actions that go into and constitute a request.

What is a “Request”?

Humans frequently make requests through language. This has been extensively documented, from how requests are made across cultures (Blum-Kulka et al., 1989), to the types of intentions people convey in language (Searle, 1975a), to how people make requests indirectly (Searle, 1975b; Gibbs, 1981; Curl & Drew, 2008; Pinker et al., 2008), as discussed previously on this blog[2].

But requests aren’t restricted to language. And, as we saw above, they aren’t restricted to humans either: dogs make requests of humans (see above), and orangutans make requests of each other (Rossano & Liebal, 2014).

More broadly, then, a request is some action produced by an agent that is intended to elicit a helpful action from another agent. This could involve the exchange of material goods (food, water, etc.), information (“who won the game last night?”), or some sort of service (turning on the AC). Like any other part of communication, successfully communicating a request requires packaging it in an action that the requester thinks the recipient will understand.

I don’t share a language with Sally. This means she has to perform another action she thinks I’ll understand: directed gaze.

Mobilizing a response

Eye gaze is an important part of interaction. In conversation, eye gaze helps indicate that the listener is listening (Goodwin, 1980). It also helps direct the recipient’s attention to particular information in the environment, which is why researchers are so interested in characterizing the point at which infants begin to follow gaze (Johnson et al., 1998; Senju & Csibra, 2008; Butterworth & Jarrett, 1991).

Gaze also helps indicate the speaker’s intentions (such as making a request). Stivers & Rossano (2010) analyzed 336 naturally occurring requests for information in Italian and English, and found that 61% of these requests were produced with the speaker gazing at the recipient. Requests also tended to have other features (70% used interrogative syntax, 80% rising intonation), though no single feature was present in every request. The authors conclude that gaze functions as a form of response mobilization: looking at the addressee of a request for information makes it clear that you expect a response, and thus “mobilizes” that response.

This idea of “mobilizing response” could be especially important for indirect requests, particularly ones with declarative syntax, e.g. “my back is so sore” (Rossi, 2014). This could simply be a complaint, but uttered in particular situations and to particular addressees (such as a significant other), it could also be a request for a back massage. One piece of information that might help a listener disambiguate between the two is the speaker’s gaze: are they mobilizing a response or not?

But “mobilizing response” doesn’t have to involve looking at the addressee. It makes sense to look at them when you want them to answer a question, but if you’re more interested in getting them to do something (e.g. closing the window), you might instead direct their gaze to the locus of action (the window). In this way, combining speech (“it’s getting cold in here!”) with directed gaze (towards the window) could convey your intention.

Kelly (2001) set out to answer this question experimentally: does directed gaze help children correctly interpret the intention behind indirect speech? In a nutshell, the answer is yes. Children between 3 and 5 were presented with potential requests (e.g. “don’t forget, it’s raining”), which were either delivered in isolation, without gaze or gestural information, or paired with gaze and gesture (e.g. looking and pointing at a raincoat). The 4- to 5-year-olds[3] produced many more “intended actions” (e.g. grabbing the raincoat) when the speech was accompanied by nonverbal information. In other words, gaze (and pointing) helped the children arrive at the correct interpretation[4].

The Takeaway

So what’s the point? Why should we care about gaze or how people “mobilize responses” during interaction? I’d argue that there are at least two reasons:

First, gaze is an important part of interaction and communication. As mentioned above, if we want to understand how humans (or other animals) communicate, studying behaviors like gesture and gaze is essential. And even if you’re not a scientist interested in this question, understanding the ways people communicate can help you make sense of strange or awkward interactions: often, misfires in the communication system are what give rise to that sense of awkwardness.

The second reason is more practical. Imagine you’re trying to build a robot that interacts with a human. Part of those interactions will presumably involve understanding human language. But since gaze and gesture are so important to human interaction, the robot will also need to understand how particular human gaze behaviors should guide its own behavior. And in fact, researchers in human-robot interaction are investigating this very question (Mumm & Mutlu, 2011; Admoni et al., 2014).
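To make this concrete, here’s a minimal sketch (in Python) of how a robot might combine an utterance with the speaker’s gaze target to decide whether it’s being asked to do something. Everything here is made up for illustration: the GazeEvent and Utterance classes, the choose_action heuristic, and the action names don’t come from any of the papers above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class GazeEvent:
    """What a (hypothetical) perception module reports about the speaker's gaze."""
    target: str          # e.g. "robot", "window", "water_bowl"
    duration_s: float    # how long the gaze was held, in seconds

@dataclass
class Utterance:
    text: str
    is_interrogative: bool   # rough proxy for question-like syntax

# Hypothetical mapping from gazed-at objects to helpful actions.
ACTIONS_BY_TARGET = {
    "window": "close_window",
    "water_bowl": "refill_water_bowl",
    "charging_station": "plug_in_robot",
}

def choose_action(utterance: Utterance, gaze: Optional[GazeEvent]) -> Optional[str]:
    """Crude heuristic loosely inspired by 'response mobilization':
    gaze at the addressee signals that a response is expected;
    gaze at an object points to the locus of the requested action."""
    if gaze is None or gaze.duration_s < 1.0:
        # No sustained gaze: treat the utterance as a comment, not a request.
        return None
    if gaze.target == "robot":
        # Speaker is looking at us: answer if it's a question, else ask what they need.
        return "answer_question" if utterance.is_interrogative else "request_clarification"
    # Speaker is looking at an object: try the helpful action associated with it.
    return ACTIONS_BY_TARGET.get(gaze.target)

# Example: "It's getting cold in here!" plus a sustained look at the window.
print(choose_action(Utterance("It's getting cold in here!", is_interrogative=False),
                    GazeEvent(target="window", duration_s=1.8)))  # -> close_window
```

Real HRI systems are of course far more sophisticated, but even this toy version captures the basic intuition: the same utterance gets interpreted differently depending on where the speaker is looking.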

One theoretically interesting question that remains is: do humans treat robot “gaze” the same way as they treat human gaze? Or is there something different about a robot’s gaze? In other words, let’s say your pet robot needs to be recharged. Should it – like Sally – approach its charging station and then look at you expectantly? My gut says “yes”, but we’ll see what the HRI researchers come up with.


References

Ekman, P., & Friesen, W. V. (1969). The repertoire of nonverbal behavior: Categories, origins, usage, and coding. Semiotica, 1(1), 49-98.

Sommer, R. (1959). Studies in personal space. Sociometry, 22(3), 247-260.

Hall, E. T. (1962). Proxemics: The study of man’s spatial relations.

Kendon, A. (1990). Spatial organization in social encounters: The F-formation system. Conducting interaction: Patterns of behavior in focused encounters.

Butterworth, G., & Jarrett, N. (1991). What minds have in common is space: Spatial mechanisms serving joint visual attention in infancy. British Journal of Developmental Psychology, 9(1), 55–72. http://doi.org/10.1111/j.2044-835X.1991.tb00862.x

Johnson, S., Slaughter, V., & Carey, S. (1998). Whose gaze will infants follow? The elicitation of gaze‐following in 12‐month‐olds. Developmental Science, 1(2), 233-238.

Kirsh, D. (1995). The intelligent use of space. Artificial intelligence, 73(1-2), 31-68.

Kirsh, D. (2011). How marking in dance constitutes thinking with the body.

Holle, H., Obleser, J., Rueschemeyer, S. A., & Gunter, T. C. (2010). Integration of iconic gestures and speech in left superior temporal areas boosts speech comprehension under adverse listening conditions. Neuroimage, 49(1), 875-884.

McNeill, D. (2012). How language began: Gesture and speech in human evolution. Cambridge University Press.

Blum-Kulka, S., House, J., & Kasper, G. (1989). Cross-cultural pragmatics: Requests and apologies (Vol. 31). Ablex Pub.

Searle, J. R. (1975a). A taxonomy of illocutionary acts.

Searle, J. R. (1975b). Indirect speech acts. In P. Cole & J. L. Morgan (Eds.), Syntax and Semantics, Vol. 3: Speech Acts (pp. 59-82). Academic Press.

Gibbs, R. W. (1981). Your wish is my command: Convention and context in interpreting indirect requests. Journal of Verbal Learning and Verbal Behavior, 20(4), 431-444.

Curl, T. S., & Drew, P. (2008). Contingency and action: A comparison of two forms of requesting. Research on language and social interaction, 41(2), 129-153.

Pinker, S., Nowak, M. A., & Lee, J. J. (2008). The logic of indirect speech. Proceedings of the National Academy of Sciences, 105(3), 833-838.

Rossano, F., & Liebal, K. (2014). ‘Requests’ and ‘offers’ in orangutans and human infants. In Requesting in social interaction (pp. 333-362).

Goodwin, C. (1980). Restarts, pauses, and the achievement of a state of mutual gaze at turn-beginning. Sociological Inquiry, 50(3-4), 272-302.

Senju, A., & Csibra, G. (2008). Gaze following in human infants depends on communicative signals. Current Biology, 18(9), 668-671.

Stivers, T., & Rossano, F. (2010). Mobilizing Response. Research on Language & Social Interaction, 43(1), 3–31. http://doi.org/10.1080/08351810903471258

Rossano, F. (2010). Questioning and responding in Italian. Journal of Pragmatics, 42(10), 2756–2771. http://doi.org/10.1016/j.pragma.2010.04.010

Rossi, G. (2014). The request system in Italian interaction.

Kelly, S. D. (2001). Broadening the units of analysis in communication: speech and nonverbal behaviours in pragmatic comprehension. Journal of Child Language, 28(2), 325–349. http://doi.org/10.1017/S0305000901004664

Mumm, J., & Mutlu, B. (2011). Human-robot proxemics: physical and psychological distancing in human-robot interaction. In Proceedings of the 6th international conference on Human-robot interaction (pp. 331-338). ACM.

Admoni, H., Dragan, A., Srinivasa, S. S., & Scassellati, B. (2014, March). Deliberate delays during robot-to-human handovers improve compliance with gaze communication. In Proceedings of the 2014 ACM/IEEE international conference on Human-robot interaction (pp. 49-56). ACM.

 


Footnotes

[1] There’s not really space to go into it here, but this situation is also interesting when you consider the cognitive abilities that go into a request. To make a request, the requester presumably has some expectation that the recipient will: a) understand the action correctly; and b) be able to fulfill the request.

[2] The list goes on, but I’ve got to stop somewhere.

[3] There was a difference for 3- to 4-year-olds as well, but they produced far fewer of the “intended actions” across all conditions.

[4] Kelly argues for a kind of “nonverbal bootstrapping” of pragmatic information. That is, just as there are theories about prosodic cues helping children learn the structure of language (syntax), it’s possible that nonverbal information like gaze and gesture can help children learn to make the correct pragmatic inferences from utterances.
