are related intuitively to the motion of some facial features under the influence
of expressions. The expression description is obtained by analyzing the spatial
distribution of the motion direction field, derived from optical flow computed at
points of high gradient magnitude in the face image. This technique gives fairly
good results, although the optical-flow computation requires very stable lighting
conditions and very smooth head motion during the analysis, and it is
computationally heavy. From the initial research (Yacoob & Davis, 1994) to the
most recently published results on the system's performance (Black & Yacoob,
1997), refinements in the tuning of the processing have made it more robust to
head rotations.
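The flow computation described above can be sketched with a minimal Lucas-Kanade estimator restricted to high-gradient pixels; the synthetic frames, window size, and gradient threshold below are illustrative assumptions, not details of the cited systems.

```python
import numpy as np

def lucas_kanade_at_gradients(prev, curr, win=5, grad_thresh=0.1):
    """Estimate optical flow only at pixels whose spatial gradient
    magnitude exceeds grad_thresh (the high-gradient points mentioned
    in the text). Returns a list of (y, x, vx, vy) tuples."""
    Iy, Ix = np.gradient(prev)        # spatial gradients
    It = curr - prev                  # temporal gradient
    mag = np.hypot(Ix, Iy)
    half = win // 2
    flows = []
    ys, xs = np.where(mag > grad_thresh)
    for y, x in zip(ys, xs):
        if (y < half or x < half or
                y >= prev.shape[0] - half or x >= prev.shape[1] - half):
            continue                  # window would fall off the image
        wIx = Ix[y-half:y+half+1, x-half:x+half+1].ravel()
        wIy = Iy[y-half:y+half+1, x-half:x+half+1].ravel()
        wIt = It[y-half:y+half+1, x-half:x+half+1].ravel()
        A = np.stack([wIx, wIy], axis=1)
        ATA = A.T @ A
        if np.linalg.cond(ATA) > 1e6: # aperture problem: skip
            continue
        vx, vy = np.linalg.solve(ATA, -A.T @ wIt)
        flows.append((y, x, vx, vy))
    return flows

# Toy example: a bright blob shifted one pixel right between frames.
prev = np.zeros((20, 20)); prev[8:12, 8:12] = 1.0
curr = np.zeros((20, 20)); curr[8:12, 9:13] = 1.0
flows = lucas_kanade_at_gradients(prev, curr)
vx_mean = np.mean([f[2] for f in flows])
```

The restriction to high-gradient points matters in practice: in flat image regions the 2×2 system is ill-conditioned and the flow estimate is meaningless, which is one reason the technique demands stable lighting.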
Huang and Huang (1997) introduce a system developed in two parts: facial
feature extraction (for the training-learning of expressions) and facial expression
recognition. The system applies a point distribution model and a gray-level model
to find the facial features. Then, the position variations are described by ten
Action Parameters (APs). During the training phase, given 90 different expres-
sions, the system classifies the principal components of the APs into six different
clusters. In the recognition phase, given a facial image sequence, it identifies the
facial expressions by extracting the ten APs, analyzes the principal components,
and finally calculates the AP profile correlation for a higher recognition rate. To
perform the image analysis, deformable models of the face features are fitted
onto the images. The system is trained only on frontal-view faces. It appears
more robust to illumination conditions than the previous approach, but the
authors do not discuss their image-processing techniques, which makes this point
hard to evaluate.
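The training/recognition scheme above can be sketched in a few lines: project ten-dimensional AP vectors with PCA and assign a new sample to the nearest of six expression clusters. The synthetic data, class labels, number of retained components, and the nearest-centroid rule (standing in for the paper's AP profile correlation) are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the 90 training expressions: each sample is a
# vector of ten Action Parameters (APs) in one of six expression classes.
n_classes, n_per_class, n_aps = 6, 15, 10
centers = rng.normal(scale=3.0, size=(n_classes, n_aps))
X = np.vstack([c + rng.normal(scale=0.5, size=(n_per_class, n_aps))
               for c in centers])
y = np.repeat(np.arange(n_classes), n_per_class)

# PCA via SVD on the mean-centred AP matrix; 5 components is an
# assumed choice, not taken from the paper.
mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
components = Vt[:5]
Z = (X - mean) @ components.T

# One centroid per expression class in the reduced space.
centroids = np.stack([Z[y == k].mean(axis=0) for k in range(n_classes)])

def classify(ap_vector):
    """Project a new ten-AP vector and return the nearest cluster index."""
    z = (ap_vector - mean) @ components.T
    return int(np.argmin(np.linalg.norm(centroids - z, axis=1)))

# A sample drawn near class 2's centre should land in cluster 2.
probe = centers[2] + rng.normal(scale=0.5, size=n_aps)
pred = classify(probe)
```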
Pantic and Rothkrantz (2000) describe another approach, which is the core of the
Integrated System for Facial Expression Recognition (ISFER). The system finds
the contour of the features with several methods suited to each feature: snakes,
binarization, deformable models, etc., making it more efficient under
uncontrolled conditions: irregular lighting, glasses, facial hair, etc. A
neural-network architecture of fuzzy classifiers is designed to analyze complex
mouth movements. The article, however, does not present a robust solution for
non-frontal view positions.
To some extent, all of the systems discussed base their descriptions of facial
actions on the Facial Action Coding System (FACS) proposed by Ekman and
Friesen (1978). The importance granted to FACS is such that two research
teams, one at the University of California, San Diego (UCSD) and the Salk
Institute, and another at the University of Pittsburgh and Carnegie Mellon
University (CMU), were challenged to develop prototype systems for automatic
recognition of spontaneous facial expressions.
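To make the role of FACS concrete, a recognizer built on it maps detected Action Units (AUs) to expression prototypes. The sketch below uses AU combinations commonly cited as prototypical (e.g., cheek raiser AU6 plus lip corner puller AU12 for happiness); the prototype table is illustrative, not an authoritative or exhaustive FACS coding, and the Jaccard matcher is an assumed stand-in for a real classifier.

```python
# Commonly cited prototypical AU combinations (illustrative subset).
PROTOTYPES = {
    "happiness": {6, 12},        # cheek raiser + lip corner puller
    "sadness":   {1, 4, 15},     # inner brow raiser + brow lowerer + lip corner depressor
    "surprise":  {1, 2, 5, 26},  # brow raisers + upper lid raiser + jaw drop
    "anger":     {4, 5, 7, 23},  # brow lowerer + lid actions + lip tightener
}

def match_expression(active_aus):
    """Return the prototype whose AU set best overlaps the detected AUs
    (Jaccard similarity); a simple stand-in for a FACS-based classifier."""
    def jaccard(a, b):
        return len(a & b) / len(a | b)
    return max(PROTOTYPES,
               key=lambda name: jaccard(PROTOTYPES[name], set(active_aus)))

label = match_expression({6, 12, 25})  # a smile plus parted lips (AU25)
```

Coding in terms of AUs rather than raw pixel motion is what lets the systems above share a common vocabulary of face actions.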
The system developed by the UCSD team, described in Bartlett et al. (2001),
analyzes face features after having determined the pose of the individual in front
of the camera, although tests of their expression analysis system are only