are related intuitively to the motion of some facial features under the influence
of expressions. The expression description is obtained by analyzing the spatial
distribution of the motion direction field, derived from optical flow computed at
points of high gradient magnitude in the face image. This technique gives fairly
good results, although the optical-flow computation requires very stable lighting
conditions and very smooth head motion during the analysis, and it is
computationally heavy. From the initial research (Yacoob & Davis, 1994) to the
most recently published results on the system's performance (Black & Yacoob,
1997), refinements in the tuning of the processing have made it more robust to
head rotations.
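The flow computation described above can be sketched with a minimal Lucas-Kanade estimator restricted to high-gradient pixels; the synthetic frames, window size, and gradient threshold below are illustrative assumptions, not details of the cited systems.

```python
import numpy as np

def lucas_kanade_at_gradients(prev, curr, win=5, grad_thresh=0.1):
    """Estimate optical flow only at pixels whose spatial gradient
    magnitude exceeds grad_thresh (the high-gradient points mentioned
    in the text). Returns a list of (y, x, vx, vy) tuples."""
    Iy, Ix = np.gradient(prev)        # spatial gradients
    It = curr - prev                  # temporal gradient
    mag = np.hypot(Ix, Iy)
    half = win // 2
    flows = []
    ys, xs = np.where(mag > grad_thresh)
    for y, x in zip(ys, xs):
        if (y < half or x < half or
                y >= prev.shape[0] - half or x >= prev.shape[1] - half):
            continue                  # window would fall off the image
        wIx = Ix[y-half:y+half+1, x-half:x+half+1].ravel()
        wIy = Iy[y-half:y+half+1, x-half:x+half+1].ravel()
        wIt = It[y-half:y+half+1, x-half:x+half+1].ravel()
        A = np.stack([wIx, wIy], axis=1)
        ATA = A.T @ A
        if np.linalg.cond(ATA) > 1e6: # aperture problem: skip
            continue
        vx, vy = np.linalg.solve(ATA, -A.T @ wIt)
        flows.append((y, x, vx, vy))
    return flows

# Toy example: a bright blob shifted one pixel right between frames.
prev = np.zeros((20, 20)); prev[8:12, 8:12] = 1.0
curr = np.zeros((20, 20)); curr[8:12, 9:13] = 1.0
flows = lucas_kanade_at_gradients(prev, curr)
vx_mean = np.mean([f[2] for f in flows])
```

The restriction to high-gradient points matters in practice: in flat image regions the 2×2 system is ill-conditioned and the flow estimate is meaningless, which is one reason the technique demands stable lighting.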
Huang and Huang (1997) introduce a system developed in two parts: facial
feature extraction (for the training-learning of expressions) and facial expression
recognition. The system applies a point distribution model and a gray-level model
to find the facial features. Then, the position variations are described by ten
Action Parameters (APs). During the training phase, given 90 different expres-
sions, the system classifies the principal components of the APs into six different
clusters. In the recognition phase, given a facial image sequence, it identifies the
facial expressions by extracting the ten APs, analyzes the principal components,
and finally calculates the AP profile correlation for a higher recognition rate. To
perform the image analysis, deformable models of the face features are fitted
onto the images. The system is trained only on frontal-view faces. It appears
more robust to illumination conditions than the previous approach, but the
authors do not discuss their image-processing techniques, which makes this point
hard to evaluate.
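The training/recognition scheme above can be sketched in a few lines: project ten-dimensional AP vectors with PCA and assign a new sample to the nearest of six expression clusters. The synthetic data, class labels, number of retained components, and the nearest-centroid rule (standing in for the paper's AP profile correlation) are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the 90 training expressions: each sample is a
# vector of ten Action Parameters (APs) in one of six expression classes.
n_classes, n_per_class, n_aps = 6, 15, 10
centers = rng.normal(scale=3.0, size=(n_classes, n_aps))
X = np.vstack([c + rng.normal(scale=0.5, size=(n_per_class, n_aps))
               for c in centers])
y = np.repeat(np.arange(n_classes), n_per_class)

# PCA via SVD on the mean-centred AP matrix; 5 components is an
# assumed choice, not taken from the paper.
mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
components = Vt[:5]
Z = (X - mean) @ components.T

# One centroid per expression class in the reduced space.
centroids = np.stack([Z[y == k].mean(axis=0) for k in range(n_classes)])

def classify(ap_vector):
    """Project a new ten-AP vector and return the nearest cluster index."""
    z = (ap_vector - mean) @ components.T
    return int(np.argmin(np.linalg.norm(centroids - z, axis=1)))

# A sample drawn near class 2's centre should land in cluster 2.
probe = centers[2] + rng.normal(scale=0.5, size=n_aps)
pred = classify(probe)
```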
Pantic and Rothkrantz (2000) describe another approach, which is the core of the
Integrated System for Facial Expression Recognition (ISFER). The system finds
the contour of the features with several methods suited to each feature: snakes,
binarization, deformable models, etc., making it more efficient under
uncontrolled conditions: irregular lighting, glasses, facial hair, etc. A
neural-network architecture of fuzzy classifiers is designed to analyze complex
mouth movements. The article, however, does not present a robust solution for
non-frontal view positions.
To some extent, all of the systems discussed base their descriptions of facial
actions on the Facial Action Coding System (FACS) proposed by Ekman and
Friesen (1978). The importance granted to FACS is such that two research
teams, one at the University of California, San Diego (UCSD) and the Salk
Institute, and another at the University of Pittsburgh and Carnegie Mellon
University (CMU), were challenged to develop prototype systems for automatic
recognition of spontaneous facial expressions.
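To make the role of FACS concrete, a recognizer built on it maps detected Action Units (AUs) to expression prototypes. The sketch below uses AU combinations commonly cited as prototypical (e.g., cheek raiser AU6 plus lip corner puller AU12 for happiness); the prototype table is illustrative, not an authoritative or exhaustive FACS coding, and the Jaccard matcher is an assumed stand-in for a real classifier.

```python
# Commonly cited prototypical AU combinations (illustrative subset).
PROTOTYPES = {
    "happiness": {6, 12},        # cheek raiser + lip corner puller
    "sadness":   {1, 4, 15},     # inner brow raiser + brow lowerer + lip corner depressor
    "surprise":  {1, 2, 5, 26},  # brow raisers + upper lid raiser + jaw drop
    "anger":     {4, 5, 7, 23},  # brow lowerer + lid actions + lip tightener
}

def match_expression(active_aus):
    """Return the prototype whose AU set best overlaps the detected AUs
    (Jaccard similarity); a simple stand-in for a FACS-based classifier."""
    def jaccard(a, b):
        return len(a & b) / len(a | b)
    return max(PROTOTYPES,
               key=lambda name: jaccard(PROTOTYPES[name], set(active_aus)))

label = match_expression({6, 12, 25})  # a smile plus parted lips (AU25)
```

Coding in terms of AUs rather than raw pixel motion is what lets the systems above share a common vocabulary of face actions.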
The system developed by the UCSD team, described in Bartlett et al. (2001),
analyzes face features after having determined the pose of the individual in front
of the camera, although tests of their expression analysis system are only