face. To provide a visual representation of the scalar field, an automatic labeling scheme is
applied: warm colors (red, yellow) are associated with high $\mathrm{DSF}(F_{\mathrm{ref}}, F_t)$
values and correspond to facial regions affected by strong deformations, whereas cold colors are
associated with regions that remain stable from one frame to the next. This dense deformation
field thus summarizes the temporal changes of the facial surface when a particular facial
expression is conveyed.
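As a rough illustration of such a labeling scheme, the sketch below maps scalar deformation values to colors running from blue (stable) to red (strongly deformed). The function names, the linear normalization, and the blue-to-red hue ramp are assumptions made for this example; the text does not specify the actual color map used.

```python
import colorsys


def dsf_to_rgb(value, vmin, vmax):
    """Map a scalar deformation value to an RGB triple with components in [0, 1].

    Assumed convention: the lowest value maps to blue (cold),
    the highest to red (warm), passing through green and yellow.
    """
    if vmax <= vmin:
        t = 0.0  # constant field: treat every point as stable
    else:
        t = (value - vmin) / (vmax - vmin)
        t = min(1.0, max(0.0, t))
    hue = (1.0 - t) * (2.0 / 3.0)  # hue 2/3 is blue, hue 0 is red
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)


def colorize(field):
    """Color every value of a scalar field relative to its own min and max."""
    lo, hi = min(field), max(field)
    return [dsf_to_rgb(v, lo, hi) for v in field]


if __name__ == "__main__":
    for value, rgb in zip([0.0, 0.5, 1.0], colorize([0.0, 0.5, 1.0])):
        print(value, rgb)
```

Normalizing per field, as done here, makes colors comparable only within one frame pair; a fixed global range would be needed to compare colors across frames.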
According to this representation, the deformation of each subsequence is captured by the
mean DSF $V_\alpha^k$ defined in Equation 5.10. Because the dimensionality of the feature vector
is high, we use an LDA-based transformation to map the present feature space to an optimal one
that is relatively insensitive to different subjects while preserving the discriminative expression
information. LDA defines the within-class matrix $S_w$ and the between-class matrix $S_b$. It
transforms an $n$-dimensional feature into an optimized $d$-dimensional feature, where $d < n$.
In our experiments, the discriminative classes are the 6 expressions; thus the reduced dimension
$d$ is 5.
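To make the two LDA matrices concrete, the following minimal sketch computes the within-class and between-class scatter matrices for a small labeled set in plain Python. It uses the standard textbook definitions of $S_w$ and $S_b$, which are assumed here because the text does not spell them out; all function names are hypothetical. The last helper reflects the fact that $S_b$ has rank at most (number of classes minus 1), which is why 6 expression classes lead to $d = 5$.

```python
def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def outer(u, v):
    """Outer product of two vectors as a list of rows."""
    return [[a * b for b in v] for a in u]


def add_inplace(acc, m, scale=1.0):
    """Add scale * m to the accumulator matrix acc."""
    for i, row in enumerate(m):
        for j, x in enumerate(row):
            acc[i][j] += scale * x


def scatter_matrices(samples, labels):
    """Return (S_w, S_b) for feature vectors grouped by class label."""
    dim = len(samples[0])
    global_mean = mean_vector(samples)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    s_w = [[0.0] * dim for _ in range(dim)]
    s_b = [[0.0] * dim for _ in range(dim)]
    for rows in by_class.values():
        mu = mean_vector(rows)
        # Within-class scatter: spread of samples around their class mean.
        for x in rows:
            d = [a - b for a, b in zip(x, mu)]
            add_inplace(s_w, outer(d, d))
        # Between-class scatter: spread of class means around the global mean.
        d = [a - b for a, b in zip(mu, global_mean)]
        add_inplace(s_b, outer(d, d), scale=len(rows))
    return s_w, s_b


def max_lda_dimension(num_classes, num_features):
    """Upper bound on the number of useful LDA directions."""
    return min(num_classes - 1, num_features)
```

The projection itself (solving the generalized eigenproblem for $S_b$ and $S_w$) is omitted; in practice a numerical library would handle it.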
For the classification, we used the multiclass version of the random forest algorithm. The
random forest algorithm was proposed by Breiman (2001) and is defined as a meta-learner
comprising many individual trees. It was designed to operate quickly over large data sets and,
more importantly, to be diverse by using random samples to build each tree in the forest. A
tree achieves highly nonlinear mappings by splitting the original problem into smaller ones
that are solvable with simple predictors. Each node in the tree consists of a test whose result
directs a data sample toward the left or the right child. During training, the tests are chosen to
group the training data into clusters where simple models achieve good predictions. Such models
are stored at the leaves and are computed from the annotated data that reached each leaf at
training time. Once trained, the random forest classifies a new expression from an input feature
vector by passing it down each of the trees in the forest. Each tree gives a classification
decision by voting for one class. The forest then chooses the class with the most votes (over
all the trees in the forest).
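The prediction step described above can be sketched as follows. This is an illustration only: the `Node` and `Leaf` classes, the axis-aligned threshold test, and the expression labels are assumptions made for the example, and training (choosing the tests from random samples) is not shown.

```python
from collections import Counter


class Leaf:
    """Terminal node storing the class predicted for samples that reach it."""

    def __init__(self, label):
        self.label = label


class Node:
    """Internal node: a test on one feature sends the sample left or right."""

    def __init__(self, feature, threshold, left, right):
        self.feature = feature
        self.threshold = threshold
        self.left = left
        self.right = right


def predict_tree(tree, x):
    """Pass a feature vector down one tree and return the leaf's class."""
    node = tree
    while isinstance(node, Node):
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def predict_forest(trees, x):
    """Let every tree vote and return the class with the most votes.

    Ties go to the class that received its first vote earliest.
    """
    votes = Counter(predict_tree(t, x) for t in trees)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hand-built toy forest with illustrative labels.
    forest = [
        Node(0, 0.5, Leaf("happy"), Leaf("surprise")),
        Node(1, 0.0, Leaf("surprise"), Leaf("happy")),
        Leaf("happy"),
    ]
    print(predict_forest(forest, [0.2, -1.0]))
```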
Experimental Results
Deformations following facial expressions across 3D video sequences are characterized by
subtle variations induced mainly by the motion of facial points. These subtle changes are
important for effective expression recognition, but they are also difficult to analyze
because of the face movements. To handle this problem, as described in the previous section,
we propose a curve-based parametrization of the face that represents the facial surface by a
set of radial curves. According to this representation, the problem of comparing two facial
surfaces, a reference facial surface and a target one, is reduced to the computation of the
DSFs between them.
To make it possible to start the recognition process from any frame of a given video, we
considered subsequences of n frames. We chose the first n frames as the first subsequence.
Then, we chose n consecutive frames starting from the second frame as the second
subsequence. This process was repeated, shifting the starting index by one frame each time,
until the end of the sequence was reached.
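The subsequence extraction described above amounts to a sliding window with a step of one frame. A minimal sketch, assuming the video is available as a Python list of frames (the function name is hypothetical):

```python
def sliding_subsequences(frames, n):
    """Return every run of n consecutive frames, shifting the start by one frame.

    A sequence shorter than n frames yields no subsequence.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return [frames[i:i + n] for i in range(len(frames) - n + 1)]


if __name__ == "__main__":
    # Frames are stood in for by their indices.
    for window in sliding_subsequences(list(range(5)), 3):
        print(window)
```

A sequence of N frames therefore yields N - n + 1 subsequences when N is at least n.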
Following the experimental protocol proposed in Sun and Yin (2008), a large set of
subsequences was extracted from the original expression sequences using a sliding window.