of facial motions which may facilitate semantic analysis, psychologists have
proposed the Facial Action Coding System (FACS) [Ekman and Friesen, 1977].
FACS is based on anatomical studies of facial muscular activity, and it enumerates
all the Action Units (AUs) of a face that cause facial movements. Currently,
FACS is widely used as the underlying visual representation for facial motion
analysis, coding, and animation. The Action Units, however, lack quantitative
definitions and temporal descriptions. Computer scientists therefore usually
need to devise their own definitions in their computational models of AUs [Tao
and Huang, 1999]. Because natural non-rigid facial motion is highly complex,
these models usually require extensive manual adjustment to achieve
realistic results.
Recently, there have been considerable advances in motion capture technol-
ogy. It is now possible to collect large amounts of real human motion data.
For example, the Motion Analysis™ system [MotionAnalysis, 2002] uses
multiple high-speed cameras to track the 3D movement of reflective markers. The
motion data can be used in movies, video games, industrial measurement, and
research on movement analysis. Because motion capture data are increasingly
available, researchers have begun to apply machine learning techniques to learn motion
models from the data. Such models capture the characteristics of
real human motion. One example is the linear subspace models of facial mo-
tion learned in [Kshirsagar et al., 2001, Hong et al., 2001b, Reveret and Essa,
2001]. In these models, an arbitrary face deformation can be approximated by a
linear combination of the learned basis.
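Such a linear subspace model can be learned with standard PCA. The following is a minimal sketch in NumPy; the toy data, dimensions, and variable names are illustrative assumptions, not the authors' actual pipeline:

```python
# Sketch: learning a linear deformation basis from motion capture data
# via PCA, in the spirit of the subspace models described above.
# The random "motion capture" data below is a stand-in for real data.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 frames, 30 markers tracked in 3D, flattened to
# 90-dimensional deformation vectors (one per frame).
n_frames, n_markers = 200, 30
frames = rng.normal(size=(n_frames, 3 * n_markers))

# Center the data and run PCA via SVD of the centered data matrix.
mean_shape = frames.mean(axis=0)
centered = frames - mean_shape
U, s, Vt = np.linalg.svd(centered, full_matrices=False)

# Keep the first k principal components as the deformation basis.
k = 7
basis = Vt[:k]                   # shape (k, 90), rows are orthonormal

# An arbitrary deformation is then approximated by a linear
# combination of the basis vectors.
coeffs = centered @ basis.T      # project each frame onto the basis
reconstructed = mean_shape + coeffs @ basis

# Relative reconstruction error; it shrinks as k grows.
err = np.linalg.norm(frames - reconstructed) / np.linalg.norm(frames)
print(f"relative reconstruction error with {k} components: {err:.3f}")
```

On real facial motion data, a small k typically explains most of the variance, which is what makes the basis a compact representation of face deformation.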
In this topic, we present our 3D facial deformation models derived from
motion capture data. Principal component analysis (PCA) [Jolliffe, 1986] is
applied to extract a few basis vectors whose linear combinations explain the major
variations in the motion capture data. We call these basis vectors Motion Units (MUs), in a
similar spirit to AUs. Compared to AUs, MUs are derived automatically from
motion capture data, which avoids the labor-intensive manual work of de-
signing AUs. Moreover, MUs have smaller reconstruction error than AUs when
linear combinations are used to approximate arbitrary facial shapes. Based on
MUs, we have developed a 3D non-rigid face tracking system. The subspace
spanned by the MUs is used to constrain noisy image motion estimates, such
as optical flow. As a result, the estimated non-rigid motion is more robust. We
demonstrate the efficacy of the tracking system in model-based very low bit-rate
face video coding. Linear combinations of MUs can also be used to deform a
3D face surface for face animation. In the iFACE system, we have developed text-
driven and speech-driven face animation. Both use MUs
as the underlying representation of face deformation. One particular type of
animation is real-time speech-driven face animation, which is useful for real-
time two-way communication such as teleconferencing. We have used MUs
as the visual representation to learn an audio-to-visual mapping. The mapping
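The subspace constraint used in the tracking system can be illustrated by projecting a noisy per-marker motion estimate onto the span of the MUs. This is a hedged sketch: the orthonormal basis below is a random stand-in for a PCA-learned one, and the noise model is illustrative:

```python
# Sketch: constraining a noisy motion estimate (e.g. from optical
# flow) to the subspace spanned by the Motion Units.
import numpy as np

rng = np.random.default_rng(1)
dim, k = 90, 7                    # 30 markers x 3D, 7 Motion Units

# Random orthonormal rows standing in for learned MUs.
Q, _ = np.linalg.qr(rng.normal(size=(dim, k)))
mus = Q.T                         # shape (k, dim), mus @ mus.T = I

# The true deformation lies in the MU subspace; the observed motion
# adds noise in all directions (mimicking noisy optical flow).
true_coeffs = rng.normal(size=k)
true_motion = true_coeffs @ mus
noisy_motion = true_motion + 0.3 * rng.normal(size=dim)

# Least-squares projection onto the subspace; with an orthonormal
# basis this reduces to a single matrix-vector product.
fitted_coeffs = mus @ noisy_motion
constrained = fitted_coeffs @ mus

# The constrained estimate discards the noise component that lies
# outside the MU subspace, so its error is typically much smaller.
err_noisy = np.linalg.norm(noisy_motion - true_motion)
err_constrained = np.linalg.norm(constrained - true_motion)
print(err_noisy, err_constrained)
```

Because noise spread over all 90 dimensions is collapsed onto only 7, the projected estimate retains roughly k/dim of the noise energy, which is why the subspace constraint makes the tracking more robust.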