Facial feature representation in dynamic image sequences is critical to facial expression recognition. Generally, two sorts of features can be extracted: geometric features and appearance-based features. Geometric features, often extracted from the shape and locations of facial components, are concatenated into feature vectors that represent the face geometry. A typical geometric feature extraction procedure can be stated as follows: automatically detect the approximate locations of the facial feature points in the initial frame, then manually adjust the points, and finally track the changes of all points through the subsequent frames (a minimal tracking sketch is given at the end of this paragraph). Most studies have focused on detecting and tracking the motion of facial components such as the lips, eyes, brows, and cheeks by building a geometric model. For example, Tian et al. [4] proposed multi-state models to extract geometric facial features for detecting and tracking the changes of facial components in near-frontal face images. Kobayashi et al. [5] proposed a geometric face model described by 30 facial feature points for this purpose. Appearance features represent texture changes of the facial skin, such as wrinkles and furrows. Some techniques, such as Gabor
wavelet representation [6], optical flow [7], independent component analysis (ICA) [8],
and local feature analysis (LFA) [9], are widely used to extract the facial appearance
features. For example, Kotsia et al. [10] proposed a grid-tracking and deformation system that uses deformation models to track a grid over consecutive video frames. Donato et al. [11] compared the above techniques for analyzing facial actions of the upper and lower face in image sequences. Feng et al. [12] used local binary patterns (LBP) computed on small facial regions to describe facial features (a minimal LBP sketch also follows this paragraph). However, the major limitation of geometric features is that they may be sensitive to shape and resolution variations, whereas appearance features may contain redundant information.
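The geometric tracking procedure outlined at the start of this paragraph can be illustrated with a short sketch. Below is a minimal example, assuming OpenCV: the video path and the initial feature points are hypothetical placeholders, the automatic detector and the manual adjustment step are omitted, and pyramidal Lucas-Kanade optical flow stands in for the trackers used in the cited works.

import cv2
import numpy as np

cap = cv2.VideoCapture("sequence.avi")   # hypothetical input sequence
ok, first = cap.read()
if not ok:
    raise SystemExit("could not read the initial frame")
prev_gray = cv2.cvtColor(first, cv2.COLOR_BGR2GRAY)

# Placeholder facial feature points (x, y); in practice these come from an
# automatic detector followed by manual adjustment, as described above.
points = np.array([[120.0, 150.0], [180.0, 150.0], [150.0, 200.0]],
                  dtype=np.float32).reshape(-1, 1, 2)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Track the points into the next frame with pyramidal Lucas-Kanade flow.
    points_next, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, gray, points, None, winSize=(21, 21), maxLevel=3)
    # Per-point displacements serve as simple geometric motion features.
    displacement = (points_next - points).reshape(-1, 2)
    points, prev_gray = points_next, gray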
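The LBP-based appearance description of Feng et al. [12] can be sketched in a similar spirit, assuming scikit-image; the 4x4 region grid and the LBP parameters (P = 8 neighbors, radius R = 1) are illustrative assumptions rather than the settings of [12].

import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histograms(face, grid=(4, 4), P=8, R=1):
    # Compute a uniform-LBP code image, then concatenate normalized
    # histograms over a grid of small facial regions.
    lbp = local_binary_pattern(face, P, R, method="uniform")
    n_bins = P + 2          # P+1 uniform patterns plus one non-uniform bin
    h, w = face.shape
    rows, cols = grid
    feats = []
    for i in range(rows):
        for j in range(cols):
            region = lbp[i * h // rows:(i + 1) * h // rows,
                         j * w // cols:(j + 1) * w // cols]
            hist, _ = np.histogram(region, bins=n_bins, range=(0, n_bins))
            feats.append(hist / max(hist.sum(), 1))
    return np.concatenate(feats)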
Some studies combine geometric and appearance features in automatic facial expression recognition systems to overcome the limitations of each type of feature. For example, Lanitis et al. [13] used active appearance models (AAM) to interpret face images. Yesin et al. [14] proposed a method that extracts the positions of the eyes, eyebrows, and mouth to determine the cheek and forehead regions, applies optical flow to these regions, and finally feeds the resulting vertical optical flow values to a discrete Hopfield network to recognize expressions. Recent studies [15,16] have shown that the combination of geometric and appearance-based features can achieve excellent performance in face recognition, with robustness to problems caused by pose variation and partial occlusion. However, these methods are based only on static images, rather than dynamic image sequences.
Therefore, we limit our attention to extending these methods to dynamic image sequences. To this end, we propose a framework for detecting facial interest points based on the active shape model (ASM) [17] and then extracting spatiotemporal features from the region components centered at these facial interest points in dynamic image sequences. Moreover, to reduce the feature dimensionality and select more discriminative features, the AdaBoost method [18] is used to build robust learning models and to boost our component-based approach (a feature-selection sketch follows below).
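A minimal sketch of AdaBoost-based feature selection is given below, assuming scikit-learn; the feature matrix, the labels, and the use of depth-1 decision stumps are illustrative assumptions rather than the exact configuration of [18]. Because each stump splits on a single feature, the features chosen by the boosted stumps form a small discriminative subset.

import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X = np.random.rand(200, 500)             # placeholder spatiotemporal features
y = np.random.randint(0, 6, size=200)    # placeholder labels (6 expressions)

# Boost depth-1 trees (stumps); each stump tests exactly one feature.
boost = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                           n_estimators=100)
boost.fit(X, y)

# The split features of the fitted stumps index the selected dimensions.
selected = sorted({stump.tree_.feature[0]
                   for stump in boost.estimators_
                   if stump.tree_.feature[0] >= 0})

Restricting subsequent training to the selected indices reduces the dimensionality of the descriptor while retaining its more discriminative components.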
Classifier design is another important issue in facial expression recognition. Most facial expression recognition approaches use only one classifier. Some studies [19,20] have shown that combining the outputs of several classifiers can lead to improved classification performance, because each classifier makes errors on a different subset of the samples (a minimal voting sketch follows below).
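A minimal sketch of such a combination, using majority voting in scikit-learn, is shown below; the three base classifiers and the placeholder data are illustrative assumptions, not the combination schemes of [19,20].

import numpy as np
from sklearn.ensemble import VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X = np.random.rand(120, 40)              # placeholder feature vectors
y = np.random.randint(0, 6, size=120)    # placeholder labels (6 expressions)

# Each base classifier casts one vote; the majority label wins.
ensemble = VotingClassifier(
    estimators=[("svm", SVC()),
                ("knn", KNeighborsClassifier()),
                ("tree", DecisionTreeClassifier())],
    voting="hard")
ensemble.fit(X, y)
predictions = ensemble.predict(X[:5])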