face recognition. In (Mayo and Zhang, 2009), SIFT keypoints detected on multiple 2D depth
images have been used to perform 3D face recognition. SIFT descriptors computed on a
sampling grid of points in 2D depth images have been used in (Ohbuchi and Furuya, 2004)
for 3D object retrieval by visual similarity. Finally, SIFT descriptors have also been used in
(Zheng et al., 2009) to perform 2D expression recognition from non-frontal face images.
Building on these studies, in the following we discuss an approach that uses local descriptors of the face to perform person-independent 3D facial expression recognition. This approach was originally proposed in (Berretti et al., 2010e), (Berretti et al., 2010a), and subsequently developed into a completely automatic solution that exploits the local characteristics of the face around a set of automatically detected facial keypoints (Berretti et al., 2011a). In this solution, some facial landmarks are first identified, and SIFT descriptors computed at these landmarks are then combined into a feature vector representing the face. A feature selection approach is applied to these vectors in order to extract the subset of most relevant features, and the selected features are finally classified using SVMs. As the experimental evaluation shows, this approach achieves state-of-the-art results on the BU-3DFE database while relying on only a few automatically detected keypoints and without using neutral scans as a reference.
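The overall chain (SIFT descriptors at detected landmarks, feature selection, SVM classification) can be sketched in Python as follows. This is only a minimal illustration, not the authors' implementation: the keypoint scale, the number of selected features, and the SVM kernel are assumptions.

import cv2
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC

def face_descriptor(depth_image, keypoints_xy, scale=8.0):
    # Compute one 128-dimensional SIFT descriptor per facial keypoint
    # and concatenate them into a single feature vector for the face.
    sift = cv2.SIFT_create()
    kps = [cv2.KeyPoint(float(x), float(y), scale) for (x, y) in keypoints_xy]
    _, desc = sift.compute(depth_image, kps)
    return desc.flatten()

def train(X, y, n_features=300):
    # X: one concatenated SIFT vector per training scan; y: expression labels.
    # Keep the most discriminative components, then train an SVM on them.
    selector = SelectKBest(f_classif, k=n_features).fit(X, y)
    clf = SVC(kernel='rbf').fit(selector.transform(X), y)
    return selector, clf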
In the rest of this section, we first briefly present a solution for the automatic identification of facial keypoints. We then address the adaptation of SIFT descriptors to this task, the feature selection approach used to reduce the set of SIFT features, and the SVM-based classification of the selected features. Finally, we summarize the results obtained with this approach.
Automatic Identification of Facial Keypoints
The BU-3DFE database is the standard benchmark for comparing 3D facial expression recognition algorithms (see Section 5.2). However, the fact that this database provides a set of manually identified landmarks, together with the inherent difficulty of automatically detecting the majority of these landmarks, has oriented research towards semi-automatic solutions for 3D facial expression recognition, as illustrated in Section 5.4.2. In semi-automatic solutions, the position of facial landmarks is assumed to be known in order to achieve high facial expression recognition rates (see Section 5.4.1), but this hinders the applicability of these solutions to the general case in which manual annotation of the landmarks in 3D is not available or even possible. To overcome this limitation, a completely automatic solution for identifying fiducial points of the face is proposed in Berretti et al. (2011a); it is briefly reviewed in the following paragraphs.
As a first pre-processing step, the 3D face scans are transformed into depth images, where the gray value of each pixel represents the depth of the corresponding point on the 3D surface. As an example, Figure 5.15 shows the depth images derived from the 3D face scans of the same subject under three different facial expressions.
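A minimal sketch of this conversion is given below, assuming an orthographic projection along the z axis; the resolution and image size are illustrative values, not those of the original work.

import numpy as np

def depth_image(points, res_mm=1.0, size=(160, 160)):
    # points: (N, 3) array of x, y, z coordinates in millimetres.
    # Shift depths so the background stays 0 and all surface depths are positive.
    z = points[:, 2] - points[:, 2].min() + 1.0
    img = np.zeros(size, dtype=np.float32)
    cols = ((points[:, 0] - points[:, 0].min()) / res_mm).astype(int)
    rows = ((points[:, 1] - points[:, 1].min()) / res_mm).astype(int)
    keep = (rows >= 0) & (rows < size[0]) & (cols >= 0) & (cols < size[1])
    # Each pixel keeps the largest depth value, i.e. the visible surface point.
    np.maximum.at(img, (rows[keep], cols[keep]), z[keep])
    valid = img > 0
    v = img[valid]
    img[valid] = (v - v.min()) / (v.max() - v.min() + 1e-9) * 255
    return img.astype(np.uint8)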
On the depth images, the pixel with the maximum gray value is taken as an initial estimate of the tip of the nose. This point is used to crop a rectangular region of the face: following anthropometric statistical measures (Farkas, 1994), the cropped region extends 50 mm to the left and 50 mm to the right of the nose tip, and 70 mm above and 50 mm below it. The cropped region of the face is used in all the subsequent processing steps.
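The nose-tip estimate and the anthropometric crop can be sketched as follows, assuming a depth image with a known resolution in millimetres per pixel; the res_mm parameter is hypothetical.

import numpy as np

def crop_face(depth_img, res_mm=1.0):
    # Initial nose-tip estimate: the pixel with the maximum gray value,
    # i.e. the surface point closest to the camera.
    row, col = np.unravel_index(np.argmax(depth_img), depth_img.shape)
    # Anthropometric crop (Farkas, 1994): 50 mm left and right of the
    # nose tip, 70 mm above and 50 mm below it (row index grows downwards).
    dx = int(50 / res_mm)
    up, down = int(70 / res_mm), int(50 / res_mm)
    return depth_img[max(row - up, 0):row + down,
                     max(col - dx, 0):col + dx]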