land, 1997]. In these approaches, the movements of the model's vertices are first estimated using optical flow. Because the optical-flow results are usually noisy, a facial deformation model is then imposed to constrain the noisy 2D image motion, and the motion parameters are computed with a least-squares estimator.
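This model-constrained least-squares step can be sketched with synthetic data. The basis matrix `B`, the number of parameters, and the noise level below are illustrative assumptions, not values from the cited papers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: N model vertices, k deformation parameters.
# Each column of B gives the 2D motion of every vertex for one parameter.
N, k = 50, 3
B = rng.standard_normal((2 * N, k))
p_true = np.array([0.5, -1.0, 0.25])     # "ground-truth" motion parameters

# Noisy optical-flow measurements of the 2D vertex displacements.
flow = B @ p_true + 0.01 * rng.standard_normal(2 * N)

# The deformation model constrains the noisy flow: the least-squares
# estimator recovers the parameters p minimizing ||B p - flow||^2.
p_est, *_ = np.linalg.lstsq(B, flow, rcond=None)
```

Because the deformation model has far fewer parameters than there are flow measurements, the per-pixel noise largely averages out in the estimate.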
However, FACS was originally proposed for psychological studies and does not provide quantitative information about facial deformations. To utilize FACS, researchers need to design their model parameters manually to obtain the AUs, a process that is usually labor-intensive. Li et al. [Li et al., 1993] use a parametric geometric face model, called Candide, which contains a set of parameters for controlling facial shape. Tao and Huang [Tao and Huang, 1999] use a piecewise Bezier volume deformable face model, which is deformed by changing the coordinates of the control vertices of the Bezier volumes. In [Essa and Pentland, 1997], Essa and Pentland extended a mesh face model, developed by Platt and Badler [Platt and Badler, 1981], into a topologically invariant physics-based model by adding anatomy-based “muscles,” which are defined by FACS.
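The common thread in these parametric models is a deformation that is linear in the control parameters. A minimal sketch in the spirit of Candide follows; the dimensions, the basis matrices, and the split into static "shape units" and dynamic, FACS-inspired "action units" are illustrative assumptions, not the actual Candide data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative linear face model:
#   vertices = mean_shape + S @ sigma + A @ alpha
# S holds static shape units (identity), A holds dynamic action units
# (expression); all matrices here are random stand-ins.
n_vertices, n_shape, n_action = 100, 4, 6
mean_shape = rng.standard_normal(3 * n_vertices)
S = rng.standard_normal((3 * n_vertices, n_shape))
A = rng.standard_normal((3 * n_vertices, n_action))

def deform(sigma, alpha):
    """Return 3D vertex positions for given shape/action parameters."""
    return (mean_shape + S @ sigma + A @ alpha).reshape(n_vertices, 3)

neutral = deform(np.zeros(n_shape), np.zeros(n_action))
# Activating one hypothetical action unit moves the affected vertices.
expressive = deform(np.zeros(n_shape), np.array([0, 0, 0.8, 0, 0, 0]))
```

The appeal of such a model for tracking is that the face's state is summarized by a handful of parameters (`sigma`, `alpha`) rather than thousands of vertex coordinates.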
1.3 Statistical models
1.3.1 Active Shape Model (ASM) and Active Appearance Model (AAM)
The Active Shape Model (ASM) [Cootes et al., 1995] and the Active Appearance Model (AAM) [Cootes et al., 1998] utilize variations of both contour and appearance to model facial motion. Both are analysis-by-synthesis approaches: they achieve robust performance by using a high-level model to constrain solutions to be valid examples of the object being tracked. The appearance of the object is explained by the high-level model as a compact set of model parameters, namely the eigen-features of the object. ASM models the shape variation of a set of landmark points and the texture variation in the local areas around those points, while AAM models the contour and the appearance inside the contour of the object. Both require manually labelled training data, which is labor-intensive to produce. The training data need to be labelled carefully so that the correspondences between landmarks across training samples are physically correct. To handle various lighting conditions, the texture part of the training data should cover a sufficiently broad range of lighting conditions.
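The core mechanism, constraining a candidate shape to the span of the eigen-features, can be sketched as follows. The toy training set, the number of retained modes, and the ±3-standard-deviation clamp are illustrative assumptions (the clamp is a common choice in ASM-style models, not a value from the cited papers):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy training set: 40 shapes, each 20 landmark points flattened to 40-D,
# standing in for carefully labelled, corresponded training data.
# The per-coordinate scaling gives the set a few dominant variation modes.
shapes = rng.standard_normal((40, 40)) * np.linspace(3, 0.1, 40)

mean = shapes.mean(axis=0)
# Eigen-features via PCA (SVD of the centred data matrix).
U, s, Vt = np.linalg.svd(shapes - mean, full_matrices=False)
P = Vt[:5]                              # keep the 5 strongest modes
std = s[:5] / np.sqrt(len(shapes) - 1)  # per-mode standard deviation

def constrain(x):
    """Project a candidate shape onto the model's modes and clamp each
    coefficient to +/-3 std. dev., yielding a plausible shape."""
    b = np.clip(P @ (x - mean), -3 * std, 3 * std)
    return mean + P.T @ b

plausible = constrain(shapes[0] + rng.standard_normal(40))
```

During search, each iteration alternates between a local image-driven update of the landmarks and this projection back into the learned shape space, which is what keeps the tracker on valid object shapes.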
1.3.2 3D model learned from motion capture data
Researchers have recently proposed training facial motion subspace models from real facial motion data [Basu et al., 1998, Hong et al., 2001b, Kshirsagar et al., 2001, Reveret and Essa, 2001]; such models capture the real motion characteristics of facial features better than manually defined models. The approaches