Morishima presents in his work. Goto et al. do this by using just a frontal and a side-view picture of the individual, whereas Morishima also includes other views to recover texture on self-occlusions. Models are animated using MPEG-4 Facial Animation Parameters (FAPs) to ensure compatibility with other telecom systems. Animation parameters are extracted from frontal-view video of the speaker's face and then synthesized, either on the cloned head model or on a different one. Speech
processing is also utilized to generate more accurate mouth shapes. An interesting post-processing step is added: if the analysis results do not reflect anatomically coherent motion, they are rejected and the system searches a probability database for the most probable motion that resolves the incoherence. In Goto, Escher and Magnenat-Thalmann (1999), the authors give a more detailed explanation of the image processing involved. Feature motion models for
eyes, eyebrows, and mouth allow them to extract image parameters in the form
of 2D point displacements. These displacements represent the change of the
feature from the neutral position to the instant of the analysis and are easily
converted into FAPs. Although the system opens the way toward face cloning, the current level of animation analysis only permits instantaneous motion replication with limited precision. We consider that face cloning is not guaranteed even if realistic animation is.
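The displacement-to-FAP conversion described above can be illustrated with a minimal sketch. MPEG-4 expresses each FAP in Facial Animation Parameter Units (FAPUs), which are face-specific distances (e.g., the mouth-nose separation) divided by 1024. The function names, the sign convention, and the pixel distances below are illustrative assumptions, not the authors' implementation.

```python
# Sketch: converting a tracked 2D feature-point displacement into an
# MPEG-4 FAP value. All names and sample distances are illustrative.

def fapu(neutral_distance_px: float) -> float:
    """One FAPU for this face: MPEG-4 divides the defining
    facial distance by 1024."""
    return neutral_distance_px / 1024.0

def displacement_to_fap(neutral_xy, current_xy, fapu_px):
    """Vertical displacement of one tracked feature point, in FAPUs.
    Assumes the image y-axis has been flipped to match the MPEG-4
    sign convention (upward motion positive)."""
    dy = neutral_xy[1] - current_xy[1]
    return dy / fapu_px

# Example: a lower-lip point tracked 8 px below its neutral position,
# on a face whose mouth-nose separation is 64 px.
mns = fapu(64.0)                                        # 0.0625 px per FAPU
fap = displacement_to_fap((120, 200), (120, 208), mns)  # -128.0 FAPUs
```

The same per-point conversion applies to the eye, eyebrow, and mouth feature models, each feature contributing the displacement of its tracked points relative to the neutral face.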
Also aiming at telecom applications, Andrés del Valle and Dugelay (2002) have developed a system that takes advantage of robust face-feature analysis techniques, as well as the synthesis of a realistic clone of the individual being analyzed. We can consider their approach a hybrid between the methods discussed in this section and those that will be presented in the next one. They use a Kalman filter to recover the head's global position and orientation. The data predicted by the filter allow them to synthesize a highly realistic 3D model of the speaker with the same scale, position, and orientation as the individual being recorded. These data are also useful to complement and adapt feature analysis
Figure 8. In the approach proposed by Andrés del Valle and Dugelay, the avatar not only reproduces the non-rigid motion that action feature-based analysis permits, but also synthesizes the rigid motion, thanks to the use of Kalman filtering during pose prediction. Images courtesy of the Image Group at the Institut Eurécom.
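The pose-prediction step can be sketched as a constant-velocity Kalman filter run independently on each head-pose parameter (three translations, three rotation angles). The class below is a minimal illustration with assumed noise values; it does not reproduce Andrés del Valle and Dugelay's actual filter design.

```python
# Minimal sketch of Kalman-filtered head-pose prediction, assuming a
# constant-velocity model per pose parameter. Noise values q and r are
# illustrative, not taken from the paper.

class ScalarKalman:
    """State = (value, velocity); constant-velocity motion model."""
    def __init__(self, q=1e-3, r=1e-1):
        self.x = [0.0, 0.0]                  # value, velocity
        self.P = [[1.0, 0.0], [0.0, 1.0]]    # state covariance
        self.q, self.r = q, r                # process / measurement noise

    def predict(self, dt=1.0):
        x, v = self.x
        self.x = [x + v * dt, v]
        p = self.P
        # P = F P F^T + Q, with F = [[1, dt], [0, 1]]
        p00 = p[0][0] + dt * (p[1][0] + p[0][1]) + dt * dt * p[1][1] + self.q
        p01 = p[0][1] + dt * p[1][1]
        p10 = p[1][0] + dt * p[1][1]
        p11 = p[1][1] + self.q
        self.P = [[p00, p01], [p10, p11]]
        return self.x[0]    # predicted value drives the clone's pose

    def update(self, z):
        # Measurement H = [1, 0]: we observe the value, not the velocity.
        y = z - self.x[0]
        s = self.P[0][0] + self.r
        k0, k1 = self.P[0][0] / s, self.P[1][0] / s
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        p = self.P
        self.P = [[(1 - k0) * p[0][0], (1 - k0) * p[0][1]],
                  [p[1][0] - k1 * p[0][0], p[1][1] - k1 * p[0][1]]]

# Usage: filter one pose angle, e.g. head yaw in degrees.
yaw = ScalarKalman()
for z in [0.0, 1.0, 2.1, 2.9, 4.0]:    # noisy yaw measurements
    predicted = yaw.predict()           # synthesize the clone at this pose
    yaw.update(z)                       # then correct with the analysis result
```

The predict-then-update ordering mirrors the system's use of the filter: the predicted pose is available before the frame is fully analyzed, so the clone can be rendered at the right scale, position, and orientation, and the rendering in turn supports the feature analysis of that frame.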