Introduction
Current information processing and visualization systems are capable of offering
advanced and intuitive means of receiving input from and communicating output
to their users. As a result, Man-Machine Interaction (MMI) systems that utilize
multimodal information about their users' current emotional state are presently
at the forefront of interest of the computer vision and artificial intelligence
communities. Such interfaces give less technology-aware individuals, as well as
people with disabilities, the opportunity to use computers more efficiently and,
thus, to overcome related fears and preconceptions. Moreover, most
emotion-related facial and body gestures are considered universal, in the sense
that they are recognized across different cultures. Therefore, the introduction of
an “emotional dictionary” that includes descriptions and perceived meanings of
facial expressions and body gestures, so as to help infer the likely emotional state
of a specific user, can enhance the affective nature of MMI applications (Picard,
2000).
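As a simple illustration, such an emotional dictionary can be thought of as a
lookup table from observed facial or gestural cues to their commonly perceived
emotional meanings. The Python sketch below is hypothetical; the cue names and
associated states are illustrative assumptions, not entries defined in this chapter.

    # A hypothetical "emotional dictionary": observed cues mapped to their
    # commonly perceived emotional meanings. All entries are illustrative.
    EMOTIONAL_DICTIONARY = {
        "raised_lip_corners": ["joy"],
        "lowered_brows": ["anger", "concentration"],
        "raised_inner_brows": ["sadness", "fear"],
        "hands_over_ears": ["irritation", "rejection"],
    }

    def likely_emotions(observed_cues):
        """Collect candidate emotional states for a set of observed cues."""
        candidates = []
        for cue in observed_cues:
            candidates.extend(EMOTIONAL_DICTIONARY.get(cue, []))
        return candidates

    # Example: two observed cues yield a list of candidate states.
    print(likely_emotions(["lowered_brows", "hands_over_ears"]))
    # -> ['anger', 'concentration', 'irritation', 'rejection']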
Despite the progress in related research, our intuition of what a human
expression or emotion actually represents is still based on trying to mimic the way
the human mind works when recognizing such an emotion. This means that, even
though image or video input is necessary for this task, the process cannot
produce robust results without taking into account features such as speech, hand
gestures, or body pose. These features can convey messages in a much more
expressive and definite manner than wording, which can be misleading or
ambiguous. While a lot of effort has been invested in examining these aspects of
human expression individually, recent research (Cowie et al., 2001) has shown
that even this approach can benefit from taking multimodal information into
account. Consider a situation where the user sits in front of a camera-equipped
computer and responds verbally to written or spoken messages from the
computer: speech analysis can indicate periods of silence on the part of the user,
thus informing the visual analysis module that it can use data from the mouth
region, which is essentially uninformative while the user speaks (see the sketch
below). Hand gestures and body pose provide another powerful means of
communication. Sometimes a simple hand action, such as placing the hands over
the ears, can convey that a person has had enough of what they are hearing more
expressively than any spoken phrase.
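To make the speech-gated use of mouth-region data concrete, the sketch below
drops mouth features whenever the speech module reports activity. This is a
minimal illustration under assumed data structures (FrameFeatures and its
speaking flag are inventions for this example); the chapter does not prescribe
this implementation.

    from dataclasses import dataclass

    @dataclass
    class FrameFeatures:
        speaking: bool        # voice-activity flag from the speech analysis module
        mouth_features: list  # mouth-region measurements (e.g., lip-corner positions)
        eye_features: list    # eye- and brow-region measurements

    def select_facial_features(frame: FrameFeatures) -> list:
        """Gate the mouth region on speech activity: while the user speaks,
        mouth deformation reflects articulation rather than emotion, so only
        eye/brow features are kept; during silence both regions are used."""
        if frame.speaking:
            return frame.eye_features
        return frame.eye_features + frame.mouth_features

    # Example: during speech, only the eye/brow features survive the gate.
    frame = FrameFeatures(speaking=True, mouth_features=[0.4, 0.1], eye_features=[0.7])
    print(select_facial_features(frame))  # -> [0.7]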
In this chapter, we present a systematic approach to analyzing emotional cues
from user facial expressions and hand gestures. In the section “Affective
analysis in MMI,” we provide an overview of affective analysis of facial
expressions and gestures, supported by psychological studies describing
emotions as discrete points or areas of an “emotional space.” The sections
“Facial expression analysis” and “Gesture analysis” provide algorithms and
experimental results.