Facial Behaviors



Watch clip: Readable expressions 

The human face is the most complex and versatile face of any species. For humans, it is a rich instrument serving many different functions. It acts as a window displaying one's motivational state, which makes one's behavior more predictable and understandable to others and improves communication. The face can supplement verbal communication: a quick facial display can reveal the speaker's attitude toward the information being conveyed. Alternatively, it can complement verbal communication, such as a lift of the eyebrows lending additional emphasis to a stressed word. Facial gestures can communicate information on their own, such as a facial shrug to express "I don't know" in answer to another's query. The face also serves a regulatory function, modulating the pace of verbal exchange by providing turn-taking cues. Finally, the face serves biological functions as well -- closing one's eyes to protect them from a threatening stimulus, and on a longer time scale to sleep.


Kismet doesn't engage in adult-level discourse, but its face serves many of these functions at a simpler, pre-linguistic level. Consequently, the robot's facial behavior is fairly complex. The schematic above shows the facial motor control system. Kismet's face currently supports four different functions, and must do so in a timely, coherent, and appropriate manner:

It reflects the state of the robot's emotion system. We call these emotive expressions.
It conveys social cues during social interactions with people. We call these expressive facial displays.
It participates in behavioral responses, such as closing its eyes to protect them from a dangerous stimulus.
It synchronizes with the robot's speech.

The face system must be quite versatile, as the manner in which these four functions are manifested changes dynamically with motivational state and environmental factors.

However, people seem to be the most captivated by Kismet's emotive facial expressions. Consequently, we will focus on Kismet's expressive abilities here.

Emotive Facial Expressions



Watch clip: Examples of expressions 

When designing the expressive abilities of a robot face, it is important to consider both the believability and readability of the facial behavior. Believability refers to how life-like the behavior appears. Readability refers to how well the observer can correctly interpret the intended expression. Kismet's face is always in motion, which greatly enhances its life-like quality. Great attention has been paid not only to how the face is configured to express a particular "emotional" state, but also to the transitions between these states.

[Interactive affect space diagram: mousing over points showed Kismet's corresponding expressions for accepting, tired, soothed, content, joy, calm, sorrow, stern, disgust, alert, unhappy, fear, surprise, and anger.]
Kismet's facial expressions are generated using an interpolation-based technique over a three-dimensional space. The three dimensions correspond to arousal, valence, and stance. These same three attributes are used to affectively assess the myriad environmental and internal factors that contribute to Kismet's "emotional" state. We call the space defined by the [A, V, S] trio the affect space. The current affective state occupies a single point in this space at a time; as the robot's affective state changes, this point moves about within the space. Note that this space maps not only to emotional states (e.g., anger, fear, sadness) but also to levels of arousal (e.g., excitement and fatigue). A range of expressions generated with this technique is shown above. The procedure runs in real time, which is critical for social interaction.
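The idea of the affect space can be sketched in a few lines of code. This is a minimal illustration, not Kismet's implementation: the coordinates assigned to each "emotion" below are hypothetical placeholders, and in the real system the regions are tuned by hand rather than reduced to single points.

```python
# Sketch: the current affective state as a single point in the
# three-dimensional [arousal, valence, stance] affect space.
from dataclasses import dataclass


@dataclass
class AffectPoint:
    arousal: float   # low (-1) to high (+1)
    valence: float   # negative (-1) to positive (+1)
    stance: float    # closed (-1) to open (+1)


# Hypothetical locations for a few labeled "emotions"; the actual
# placement in Kismet's affect space is tuned by its designers.
EMOTION_REGIONS = {
    "anger":    AffectPoint(+0.8, -0.8, -0.5),
    "fear":     AffectPoint(+0.8, -0.6, +0.6),
    "sorrow":   AffectPoint(-0.5, -0.7,  0.0),
    "joy":      AffectPoint(+0.6, +0.8, +0.4),
    "calm":     AffectPoint(-0.2, +0.2,  0.0),
    "surprise": AffectPoint(+0.9,  0.0, +0.8),
}


def closest_emotion(p: AffectPoint) -> str:
    """Return the labeled region nearest the current affect point."""
    def sq_dist(q: AffectPoint) -> float:
        return ((p.arousal - q.arousal) ** 2 +
                (p.valence - q.valence) ** 2 +
                (p.stance - q.stance) ** 2)
    return min(EMOTION_REGIONS, key=lambda name: sq_dist(EMOTION_REGIONS[name]))
```

As the affect point drifts through the space, the nearest labeled region changes, which is one simple way to see how a continuous space can still yield discrete emotion labels.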




There are nine basis (or prototype) postures that collectively span this space of emotive expressions. Although some of these postures adjust specific facial features more strongly than others, each prototype influences most if not all of the facial features to some degree. For instance, the valence prototypes have the strongest influence on lip curvature, but can also adjust the positions of the ears, eyelids, eyebrows, and jaw. The basis set of facial postures has been designed so that a specific location in affect space specifies the relative contributions of the prototype postures, producing a net facial expression that faithfully corresponds to the active "emotion". With this scheme, Kismet displays expressions that intuitively map to the emotions of anger, disgust, fear, happiness, sorrow, and surprise. Different levels of arousal can be expressed as well, from interest, to calm, to weariness.
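The blending described above can be sketched as a weighted mix of prototype postures. This is an assumed formulation (inverse-distance weighting over a reduced set of prototypes, with illustrative feature values); the source does not specify Kismet's actual blend function or calibration.

```python
# Sketch: each basis posture maps facial features to target positions,
# and the net expression is a distance-weighted blend of the postures
# nearest the current affect point. Values are illustrative only.
import math

PROTOTYPES = {
    # (arousal, valence, stance): {feature: target position}
    ( 0.0,  1.0,  0.0): {"lip_curve": +1.0, "brow":  0.0, "ears": +0.5},  # high valence
    ( 0.0, -1.0,  0.0): {"lip_curve": -1.0, "brow": -0.5, "ears": -0.5},  # low valence
    ( 1.0,  0.0,  0.0): {"lip_curve":  0.0, "brow": +0.8, "ears": +1.0},  # high arousal
    (-1.0,  0.0,  0.0): {"lip_curve":  0.0, "brow": -0.8, "ears": -1.0},  # low arousal
}


def blend_posture(point, prototypes=PROTOTYPES, eps=1e-6):
    """Weight each basis posture by inverse distance to the current
    affect point, then mix the feature targets accordingly."""
    weights = {loc: 1.0 / (math.dist(point, loc) + eps) for loc in prototypes}
    total = sum(weights.values())
    blended = {}
    for loc, posture in prototypes.items():
        w = weights[loc] / total
        for feature, value in posture.items():
            blended[feature] = blended.get(feature, 0.0) + w * value
    return blended
```

Because every prototype contributes some weight at every point, each influences most of the facial features to some degree, while the nearest prototype dominates -- matching the behavior described above.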

There are several advantages to generating the robot's facial expression from this affect space. First, this technique allows the robot's facial expression to reflect the nuance of the underlying assessment. Hence, even though there is a discrete number of "emotions", the expressive behavior spans a continuous space. Second, it lends clarity to the facial expression, since the robot can only be in a single affective state at a time (by our choice), and hence can only express a single state at a time. Third, the robot's internal dynamics are designed to promote smooth trajectories through affect space. This gives the observer a lot of information as to how the robot's affective state is changing, which makes the robot's facial behavior more interesting. Furthermore, by having the face mirror this trajectory, the observer has immediate feedback as to how their behavior is influencing the robot's internal state.
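The smooth trajectories mentioned above could be realized in many ways; one minimal sketch is first-order easing, where the displayed affect point moves a fraction of the way toward the target state on each control cycle. This is an assumed dynamic, not a description of Kismet's actual internal dynamics.

```python
# Sketch: ease the displayed [A, V, S] point toward the target affect
# state so the face transitions smoothly instead of jumping.
def smooth_step(current, target, rate=0.2):
    """Move each coordinate a fixed fraction of the remaining
    distance toward the target per control cycle."""
    return tuple(c + rate * (t - c) for c, t in zip(current, target))


point = (0.0, 0.0, 0.0)      # start at a calm, neutral state
target = (0.8, -0.8, -0.5)   # hypothetical high-arousal, negative-valence target

for _ in range(20):
    point = smooth_step(point, target)
```

Because the point traces a continuous path, the face passes through intermediate expressions along the way, which is what lets an observer read how the robot's affective state is changing.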

Video:  Facial expressions
In this video clip, Kismet displays a series of facial expressions.
   Quicktime (15 fps) -- (10.7 Meg)
   Quicktime (15 fps) -- (14.1 Meg)
   Author: C. Breazeal
   Length: approximately 45 seconds

(image courtesy of P. Menzel)


Other topics
Kismet's hardware
Visual attention
Ocular-motor control
Low-level features
Expressive speech
Affective intent in speech
Homeostatic regulation mechanisms
The behavior system
Emotions




    contact information: cynthia@ai.mit.edu