To constrain the optical flow data generated from the analysis of consecutive frames,
Tang and Huang (1994) project the head model wireframe vertices onto the
images and search for the 2D motion vectors only around these vertices. The
model they animate is very simple and the 2D motion vectors are directly
translated into 2D vertex motion. No 3D action is generated.
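The idea of restricting flow estimation to the projected vertices can be sketched as follows; this is a minimal illustration using OpenCV's projection and pyramidal Lucas-Kanade routines, where the pose variables, camera matrix and function name are our own assumptions rather than Tang and Huang's implementation:

```python
import numpy as np
import cv2

def track_projected_vertices(prev_gray, next_gray, vertices_3d,
                             rvec, tvec, camera_matrix):
    """Project 3D wireframe vertices into the image and estimate a
    2D motion vector only at each projection.

    prev_gray, next_gray: consecutive 8-bit grayscale frames.
    vertices_3d: (N, 3) float array of head-model wireframe vertices.
    rvec, tvec, camera_matrix: pinhole camera pose and intrinsics.
    """
    # Project the model vertices onto the previous frame (pinhole model,
    # no lens distortion assumed).
    pts_2d, _ = cv2.projectPoints(vertices_3d, rvec, tvec,
                                  camera_matrix, None)
    prev_pts = pts_2d.astype(np.float32)          # shape (N, 1, 2)

    # Sparse pyramidal Lucas-Kanade flow, evaluated only around the
    # vertex projections instead of over the whole image.
    next_pts, status, _ = cv2.calcOpticalFlowPyrLK(
        prev_gray, next_gray, prev_pts, None,
        winSize=(21, 21), maxLevel=3)

    # 2D motion vectors, directly usable as 2D vertex motion.
    motion_2d = (next_pts - prev_pts).reshape(-1, 2)
    return motion_2d, status.ravel().astype(bool)
```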
Almost the same procedure is used by Sarris and Strintzis (2001, 2002) in their
video-telephony system for the hearing impaired. The rigid head motion (pose)
is obtained by fitting the projection of a 3D wireframe onto the image being
analyzed. Then, non-rigid face movements (expressions) are estimated thanks
to a feature-based approach adapted from the Kanade-Lucas-Tomasi (KLT)
algorithm. The KLT algorithm is based on minimizing the sum of squared
intensity differences between a past and a current feature window, which is
performed using a Newton-Raphson minimization method. The features to track
are some of the projected wireframe points, namely the MPEG-4 FDPs. To derive
MPEG-4 FAPs from this system, they augment the KLT algorithm with the
information about the degrees of freedom of motion (one or several directions)
that the applicable FAPs allow at each studied FDP.
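For reference, the quantity that the KLT tracker minimizes and its Newton-Raphson update can be made explicit; the notation below is the standard one from the optical-flow literature, not the paper's own. Given a feature window $W$, a past frame $I$ and a current frame $J$, the tracker seeks the displacement $\mathbf{d}$ minimizing

$$\epsilon(\mathbf{d}) = \sum_{\mathbf{x} \in W} \bigl[ J(\mathbf{x} + \mathbf{d}) - I(\mathbf{x}) \bigr]^2 .$$

Linearizing $J(\mathbf{x} + \mathbf{d}) \approx J(\mathbf{x}) + \mathbf{g}(\mathbf{x})^{\top} \mathbf{d}$, with $\mathbf{g} = \nabla J$, and setting $\partial \epsilon / \partial \mathbf{d} = 0$ gives the $2 \times 2$ linear system

$$Z \mathbf{d} = \mathbf{e}, \qquad Z = \sum_{\mathbf{x} \in W} \mathbf{g}(\mathbf{x})\, \mathbf{g}(\mathbf{x})^{\top}, \qquad \mathbf{e} = \sum_{\mathbf{x} \in W} \bigl[ I(\mathbf{x}) - J(\mathbf{x}) \bigr]\, \mathbf{g}(\mathbf{x}),$$

which is solved and re-applied until $\mathbf{d}$ converges; this repeated solve is the Newton-Raphson iteration mentioned above.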
Ahlberg (2002) also presents a wireframe-fitting technique to obtain the rigid
head motion. He uses the new parameterized variant of the CANDIDE face
model, named CANDIDE-3, which is MPEG-4 compliant. The image analysis
techniques include PCA on eigentextures, which permits the analysis of more
specific features that control the model's deformation parameters.
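A minimal sketch of the eigentexture idea follows, assuming shape-normalized face textures flattened into vectors; the function names and data layout are illustrative assumptions, not Ahlberg's implementation. A small PCA reconstruction residual indicates that candidate model parameters explain the observed face well:

```python
import numpy as np

def build_eigentexture_basis(textures, n_components):
    """textures: (n_samples, n_pixels) shape-normalized training textures.

    Returns the mean texture and the leading PCA basis (eigentextures).
    """
    mean = textures.mean(axis=0)
    centered = textures - mean
    # SVD of the centered data; the rows of vt are the eigentextures.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def texture_residual(texture, mean, basis):
    """Distance between a candidate texture and its PCA reconstruction;
    a smaller residual means a better fit of the model parameters."""
    coeffs = basis @ (texture - mean)
    reconstruction = mean + basis.T @ coeffs
    return np.linalg.norm(texture - reconstruction)
```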
More detailed feature-point tracking is developed by Chou et al. (2001). They
track the projected points belonging to the mouth, eyes and nostrils. Their
feature models are also based on the physical vertex distribution of MPEG-4's
FDPs, and they are able to obtain the combination of FAPs that regenerates the
expression and motion of the analyzed face. Their complete system also deals
with audio input, analyzing it and complementing the animation data for the lips.
The main goal of their approach is to achieve real-time analysis so that these
techniques can be employed in teleconferencing applications. They do not
directly obtain the pose parameters needed to reproduce the head pose
synthetically, but they experiment with extending their analysis to head poses
other than a frontal view, by roughly estimating the head pose from the image
analysis and rectifying the original input image.
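To make the FDP-to-FAP step concrete, the following sketch converts a tracked feature displacement into an MPEG-4 FAP amplitude. FAP values are expressed in FAPUs, the distance-based FAPUs being defined as key neutral-face distances divided by 1024; the single-axis mapping and the helper below are illustrative assumptions, not Chou et al.'s implementation:

```python
def displacement_to_fap(displacement_px, reference_distance_px):
    """Convert a 1D feature-point displacement (pixels) into FAPU counts.

    reference_distance_px: the neutral-face distance defining the relevant
    FAPU, e.g. the mouth-nose separation for vertical lip FAPs.
    """
    fapu = reference_distance_px / 1024.0   # MPEG-4 distance-based FAPU
    return displacement_px / fapu

# Example: a lip midpoint that moves 6 px on a face whose mouth-nose
# separation is 40 px corresponds to 6 / (40 / 1024) = 153.6 FAPU
# (the sign convention of the target FAP is an assumption here).
```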
The MIRALab research team at the University of Geneva (Switzerland) has
developed a complete system to animate avatars in a realistic way, in order to
use them for telecommunications. In Goto et al. (2001), they review the entire
process to generate customized realistic animation. The goal of their system is
to clone face behavior. The first step in the overall process is to physically adapt
a generic head mesh model (already prepared for animation) to the shape
of the person to be represented. In essence, they follow the same procedure that