To constrain the optical flow data generated from the analysis of consecutive frames,
Tang and Huang (1994) project the head model wireframe vertices onto the
images and search for the 2D motion vectors only around these vertices. The
model they animate is very simple and the 2D motion vectors are directly
translated into 2D vertex motion. No 3D action is generated.
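The idea of restricting flow estimation to the projected vertices can be sketched as follows; this is a minimal illustration using OpenCV's projection and pyramidal Lucas-Kanade routines, where the pose variables, camera matrix and function name are our own assumptions rather than Tang and Huang's implementation:

```python
import numpy as np
import cv2

def track_projected_vertices(prev_gray, next_gray, vertices_3d,
                             rvec, tvec, camera_matrix):
    """Project 3D wireframe vertices into the image and estimate a
    2D motion vector only at each projection.

    prev_gray, next_gray: consecutive 8-bit grayscale frames.
    vertices_3d: (N, 3) float array of head-model wireframe vertices.
    rvec, tvec, camera_matrix: pinhole camera pose and intrinsics.
    """
    # Project the model vertices onto the previous frame (pinhole model,
    # no lens distortion assumed).
    pts_2d, _ = cv2.projectPoints(vertices_3d, rvec, tvec,
                                  camera_matrix, None)
    prev_pts = pts_2d.astype(np.float32)          # shape (N, 1, 2)

    # Sparse pyramidal Lucas-Kanade flow, evaluated only around the
    # vertex projections instead of over the whole image.
    next_pts, status, _ = cv2.calcOpticalFlowPyrLK(
        prev_gray, next_gray, prev_pts, None,
        winSize=(21, 21), maxLevel=3)

    # 2D motion vectors, directly usable as 2D vertex motion.
    motion_2d = (next_pts - prev_pts).reshape(-1, 2)
    return motion_2d, status.ravel().astype(bool)
```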
Almost the same procedure is used by Sarris and Strintzis (2001, 2002) in their
video-telephony system for the hearing impaired. The rigid head motion (pose)
is obtained by fitting the projection of a 3D wireframe onto the image being
analyzed. Then, non-rigid face movements (expressions) are estimated thanks
to a feature-based approach adapted from the Kanade-Lucas-Tomasi (KLT)
algorithm. The KLT algorithm is based on minimizing the sum of squared
intensity differences between a past and a current feature window, which is
performed using a Newton-Raphson minimization method. The features to track
are some of the projected wireframe points, namely the MPEG-4 FDPs. To derive
MPEG-4 FAPs from this system, they augment the KLT algorithm with the
information about the degrees of freedom of motion (one or several directions)
that the applicable FAPs allow at each studied FDP.
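For reference, the quantity that the KLT tracker minimizes and its Newton-Raphson update can be made explicit; the notation below is the standard one from the optical-flow literature, not the paper's own. Given a feature window $W$, a past frame $I$ and a current frame $J$, the tracker seeks the displacement $\mathbf{d}$ minimizing

$$\epsilon(\mathbf{d}) = \sum_{\mathbf{x} \in W} \bigl[ J(\mathbf{x} + \mathbf{d}) - I(\mathbf{x}) \bigr]^2 .$$

Linearizing $J(\mathbf{x} + \mathbf{d}) \approx J(\mathbf{x}) + \mathbf{g}(\mathbf{x})^{\top} \mathbf{d}$, with $\mathbf{g} = \nabla J$, and setting $\partial \epsilon / \partial \mathbf{d} = 0$ gives the $2 \times 2$ linear system

$$Z \mathbf{d} = \mathbf{e}, \qquad Z = \sum_{\mathbf{x} \in W} \mathbf{g}(\mathbf{x})\, \mathbf{g}(\mathbf{x})^{\top}, \qquad \mathbf{e} = \sum_{\mathbf{x} \in W} \bigl[ I(\mathbf{x}) - J(\mathbf{x}) \bigr]\, \mathbf{g}(\mathbf{x}),$$

which is solved and re-applied until $\mathbf{d}$ converges; this repeated solve is the Newton-Raphson iteration mentioned above.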
Ahlberg (2002) also presents a wireframe-fitting technique to obtain the rigid
head motion. He uses the new parameterized variant of the CANDIDE face
model, named CANDIDE-3, which is MPEG-4 compliant. The image analysis
techniques include PCA on eigentextures, which permits the analysis of more
specific features that control the model's deformation parameters.
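A minimal sketch of the eigentexture idea follows, assuming shape-normalized face textures flattened into vectors; the function names and data layout are illustrative assumptions, not Ahlberg's implementation. A small PCA reconstruction residual indicates that candidate model parameters explain the observed face well:

```python
import numpy as np

def build_eigentexture_basis(textures, n_components):
    """textures: (n_samples, n_pixels) shape-normalized training textures.

    Returns the mean texture and the leading PCA basis (eigentextures).
    """
    mean = textures.mean(axis=0)
    centered = textures - mean
    # SVD of the centered data; the rows of vt are the eigentextures.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def texture_residual(texture, mean, basis):
    """Distance between a candidate texture and its PCA reconstruction;
    a smaller residual means a better fit of the model parameters."""
    coeffs = basis @ (texture - mean)
    reconstruction = mean + basis.T @ coeffs
    return np.linalg.norm(texture - reconstruction)
```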
More detailed feature-point tracking is developed by Chou et al. (2001). They
track the projected points belonging to the mouth, eyes and nostrils. Their
feature models are also based on the physical vertex distribution of MPEG-4's
FDPs, and they are able to obtain the combination of FAPs that regenerates the
expression and motion of the analyzed face. Their complete system also deals
with audio input, analyzing it and complementing the animation data for the lips.
The main goal of their approach is to achieve real-time analysis so that these
techniques can be employed in teleconferencing applications. They do not
directly obtain the pose parameters needed to reproduce the head pose
synthetically, but they experiment with extending their analysis to head poses
other than a frontal view, by roughly estimating the head pose from the image
analysis and rectifying the original input image.
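To make the FDP-to-FAP step concrete, the following sketch converts a tracked feature displacement into an MPEG-4 FAP amplitude. FAP values are expressed in FAPUs, the distance-based FAPUs being defined as key neutral-face distances divided by 1024; the single-axis mapping and the helper below are illustrative assumptions, not Chou et al.'s implementation:

```python
def displacement_to_fap(displacement_px, reference_distance_px):
    """Convert a 1D feature-point displacement (pixels) into FAPU counts.

    reference_distance_px: the neutral-face distance defining the relevant
    FAPU, e.g. the mouth-nose separation for vertical lip FAPs.
    """
    fapu = reference_distance_px / 1024.0   # MPEG-4 distance-based FAPU
    return displacement_px / fapu

# Example: a lip midpoint that moves 6 px on a face whose mouth-nose
# separation is 40 px corresponds to 6 / (40 / 1024) = 153.6 FAPU
# (the sign convention of the target FAP is an assumption here).
```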
The MIRALab research team at the University of Geneva (Switzerland) has
developed a complete system to animate avatars in a realistic way, in order to
use them for telecommunications. In Goto et al. (2001), they review the entire
process to generate customized realistic animation. The goal of their system is
to clone face behavior. The first step in the overall process is to physically adapt
a generic head mesh model (already prepared for animation) to the shape
of the person to be represented. In essence, they follow the same procedure that