Figure 11. Typical animation frames. Temporal order: from left to right;
from top to bottom.
The whole animation procedure consists of three steps. First, we extract audio
features from the input speech. Then, we use the trained neural networks to map
the audio features to the visual features (i.e., MUPs). Finally, we use the
estimated MUPs to animate a personalized 3D face model in iFACE. Figure 11
shows a typical animation sequence for the sentence in Figure 10.
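The three-step pipeline can be summarized with the sketch below. It is only an illustrative outline: the function names, frame length, number of MUPs, and the placeholder feature extractor and linear mapping are assumptions standing in for the actual audio analysis, the trained neural networks, and the iFACE interface described in the text.

```python
# Hedged sketch of the three-step animation pipeline described above.
# All names and parameters here are illustrative assumptions, not the
# actual iFACE API; simple placeholders replace the trained networks.
import numpy as np

FRAME_MS = 10    # assumed audio analysis frame length (ms)
NUM_MUPS = 7     # assumed number of Motion Unit Parameters per frame


def extract_audio_features(speech: np.ndarray, sample_rate: int) -> np.ndarray:
    """Step 1: split speech into frames and compute one feature vector per
    frame; a crude energy/variance feature stands in for real features."""
    frame_len = int(sample_rate * FRAME_MS / 1000)
    n_frames = len(speech) // frame_len
    frames = speech[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.stack([frames.mean(axis=1), frames.std(axis=1)], axis=1)


def map_features_to_mups(features: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Step 2: map each audio feature vector to visual features (MUPs);
    a single linear layer stands in for the trained neural networks."""
    return features @ weights


def animate_face(mups: np.ndarray) -> None:
    """Step 3: hand the estimated MUPs to the face model (iFACE in the
    text); here we just print the parameters for the first few frames."""
    for t, frame_mups in enumerate(mups[:5]):
        print(f"frame {t}: MUPs = {np.round(frame_mups, 3)}")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    speech = rng.standard_normal(16000)      # one second of placeholder audio
    feats = extract_audio_features(speech, sample_rate=16000)
    weights = rng.standard_normal((feats.shape[1], NUM_MUPS))
    animate_face(map_features_to_mups(feats, weights))
```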
Our real-time speech-driven animation can be used in real-time two-way
communication scenarios such as videophones and virtual environments (Leung
et al., 2000). In contrast, existing off-line speech-driven animation, e.g.,
Brand (1999), is suited to one-way communication scenarios such as
broadcasting and advertising. Our approach maps both vowels and consonants,
so it is more accurate than real-time approaches that map only vowels
(Morishima & Harashima, 1991; Goto, Kshirsagar & Thalmann, 2001). Compared
to real-time approaches that use a single neural network for all audio features
(Massaro et al., 1999; Lavagetto, 1995), our local ANN mapping (i.e., one neural
network for each audio feature cluster, sketched below) is more efficient
because each ANN is much simpler and can therefore be trained with much less
effort on a given set of training data. More generally, speech-driven animation
can be used in speech and language education (Cole et al., 1999), as a speech
understanding aid in noisy environments and for hard-of-hearing people, and
as a rehabilitation tool for facial motion disorders.
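The following is a minimal sketch of the local mapping idea: audio features are grouped into clusters, and a separate small mapping is used for each cluster, with the nearest cluster selected per frame. The cluster centers, per-cluster weights, and all dimensions below are random stand-ins for quantities that would come from training; a linear map replaces each trained ANN purely for illustration.

```python
# Hedged sketch of "local ANN mapping": one small per-cluster mapping is
# chosen per frame instead of one large global network. All values and
# names are illustrative assumptions, not the trained models from the text.
import numpy as np

NUM_CLUSTERS = 4   # assumed number of audio feature clusters
FEAT_DIM = 2       # assumed audio feature dimensionality
NUM_MUPS = 7       # assumed number of MUPs per frame

rng = np.random.default_rng(1)
cluster_centers = rng.standard_normal((NUM_CLUSTERS, FEAT_DIM))
# One simple linear "network" per cluster (placeholder for each trained ANN).
local_weights = rng.standard_normal((NUM_CLUSTERS, FEAT_DIM, NUM_MUPS))


def map_with_local_nets(feature: np.ndarray) -> np.ndarray:
    """Pick the nearest audio-feature cluster, then apply that cluster's
    small mapping to estimate the MUPs for this frame."""
    k = int(np.argmin(np.linalg.norm(cluster_centers - feature, axis=1)))
    return feature @ local_weights[k]


if __name__ == "__main__":
    frame_feature = rng.standard_normal(FEAT_DIM)
    print(np.round(map_with_local_nets(frame_feature), 3))
```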
Human Emotion Perception Study
The synthetic talking face can be evaluated through human perception studies.
Here, we describe our experiments, which compare the influence of the synthetic
talking face on human emotion perception with that of the real face. We
conducted similar experiments for 2D MU-based speech-driven animation (Hong, Wen &