Introduction
Current information processing and visualization systems are capable of offering
advanced and intuitive means of receiving input from and communicating output
to their users. As a result, Man-Machine Interaction (MMI) systems that utilize
multimodal information about their users' current emotional state are presently
at the forefront of interest of the computer vision and artificial intelligence
communities. Such interfaces give less technology-aware individuals, as well as
people with disabilities, the opportunity to use computers more efficiently and,
thus, to overcome related fears and preconceptions. Moreover, most
emotion-related facial and body gestures are considered universal, in the sense
that they are recognized across different cultures. Therefore, the introduction of
an “emotional dictionary” that includes descriptions and perceived meanings of
facial expressions and body gestures, so as to help infer the likely emotional state
of a specific user, can enhance the affective nature of MMI applications (Picard,
2000).
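As a simple illustration, such an emotional dictionary can be thought of as a
lookup table from observed facial or gestural cues to their commonly perceived
emotional meanings. The Python sketch below is hypothetical; the cue names and
associated states are illustrative assumptions, not entries defined in this chapter.

    # A hypothetical "emotional dictionary": observed cues mapped to their
    # commonly perceived emotional meanings. All entries are illustrative.
    EMOTIONAL_DICTIONARY = {
        "raised_lip_corners": ["joy"],
        "lowered_brows": ["anger", "concentration"],
        "raised_inner_brows": ["sadness", "fear"],
        "hands_over_ears": ["irritation", "rejection"],
    }

    def likely_emotions(observed_cues):
        """Collect candidate emotional states for a set of observed cues."""
        candidates = []
        for cue in observed_cues:
            candidates.extend(EMOTIONAL_DICTIONARY.get(cue, []))
        return candidates

    # Example: two observed cues yield a list of candidate states.
    print(likely_emotions(["lowered_brows", "hands_over_ears"]))
    # -> ['anger', 'concentration', 'irritation', 'rejection']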
Despite the progress in related research, our intuition of what a human
expression or emotion actually represents is still based on trying to mimic the way
the human mind works when recognizing such an emotion. This means that, even
though image or video input is necessary for this task, the process cannot
produce robust results without taking into account features such as speech, hand
gestures, or body pose. These features can convey messages in a much more
expressive and definite manner than wording, which can be misleading or
ambiguous. While a lot of effort has been invested in examining these aspects of
human expression individually, recent research (Cowie et al., 2001) has shown
that even this approach can benefit from taking multimodal information into
account. Consider a situation where the user sits in front of a camera-equipped
computer and responds verbally to written or spoken messages from the
computer: speech analysis can indicate periods of silence on the part of the user,
thus informing the visual analysis module that it can use data from the mouth
region, which is essentially uninformative while the user speaks (see the sketch
below). Hand gestures and body pose provide another powerful means of
communication. Sometimes a simple hand action, such as placing the hands over
the ears, can convey that a person has had enough of what they are hearing more
expressively than any spoken phrase.
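To make the speech-gated use of mouth-region data concrete, the sketch below
drops mouth features whenever the speech module reports activity. This is a
minimal illustration under assumed data structures (FrameFeatures and its
speaking flag are inventions for this example); the chapter does not prescribe
this implementation.

    from dataclasses import dataclass

    @dataclass
    class FrameFeatures:
        speaking: bool        # voice-activity flag from the speech analysis module
        mouth_features: list  # mouth-region measurements (e.g., lip-corner positions)
        eye_features: list    # eye- and brow-region measurements

    def select_facial_features(frame: FrameFeatures) -> list:
        """Gate the mouth region on speech activity: while the user speaks,
        mouth deformation reflects articulation rather than emotion, so only
        eye/brow features are kept; during silence both regions are used."""
        if frame.speaking:
            return frame.eye_features
        return frame.eye_features + frame.mouth_features

    # Example: during speech, only the eye/brow features survive the gate.
    frame = FrameFeatures(speaking=True, mouth_features=[0.4, 0.1], eye_features=[0.7])
    print(select_facial_features(frame))  # -> [0.7]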
In this chapter, we present a systematic approach to analyzing emotional cues
from user facial expressions and hand gestures. In the section “Affective
analysis in MMI,” we provide an overview of affective analysis of facial
expressions and gestures, supported by psychological studies describing
emotions as discrete points or areas of an “emotional space.” The sections
“Facial expression analysis” and “Gesture analysis” provide algorithms and
experimental results.