Image Processing Reference
In-Depth Information
of the FER works used principal component analysis (PCA), which is well known for dimension reduction and was used in many earlier works. In Padgett and Cottrell [ 3 ], PCA was used to recognize facial action units (FAUs) from facial expression images. In Donato et al. [ 5 ] as well as Ekman and Friesen [ 6 ], PCA was used for FER with the facial action coding system.
Very recently, independent component analysis (ICA) has been extensively utilized for FER based on local face image features [ 5 , 10 - 21 ]. In Bartlett et al. [ 14 ], the authors used ICA to extract local features and then classified several facial expressions. In Chao-Fa and Shin [ 15 ], ICA was used to recognize the FAUs. Besides ICA, local binary patterns (LBP) have lately been used for FER [ 22 - 24 ]. The main properties of LBP features are their tolerance to illumination changes and their computational simplicity. Later, LBP was extended by incorporating the gradient information of each face pixel, yielding the local directional pattern (LDP) for representing local face features [ 25 ]. Like LBP, LDP features are tolerant to illumination changes, but they are more robust than LBP because they consider the gradient information at each pixel [ 25 ].
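The illumination tolerance of LBP noted above follows from its construction: each pixel is encoded only by whether its neighbours are brighter or darker than it, so adding a constant to the whole image leaves the codes unchanged. Below is a minimal sketch of the basic 8-neighbour operator (function names are illustrative, not from the chapter):

```python
import numpy as np

def lbp_code(patch):
    """Basic 8-neighbour LBP code for a 3x3 patch: each neighbour
    at least as bright as the centre contributes one bit."""
    center = patch[1, 1]
    # clockwise neighbour order starting at the top-left corner
    neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                  patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    code = 0
    for bit, n in enumerate(neighbours):
        if n >= center:
            code |= 1 << bit
    return code

def lbp_image(img):
    """LBP code for every interior pixel of a grayscale image."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = lbp_code(img[i:i + 3, j:j + 3])
    return out
```

Because the code depends only on brightness comparisons, `lbp_code(img)` and `lbp_code(img + c)` are identical for any constant `c`, which is exactly the illumination tolerance described above.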
Thus, LDP can be considered a robust approach and hence can be adopted for FER. To make LDP facial expression features more robust, linear discriminant analysis (LDA) can be applied, as LDA is a strong method for obtaining good discrimination among face images of different expressions in a linear feature space. The hidden Markov model (HMM) is considered a robust tool to model and decode time-sequential events [ 21 , 26 - 28 ]. Hence, the HMM seems an appropriate choice to train on and recognize the features of different facial expressions for FER.
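To illustrate how an HMM scores a time-sequential event, the sketch below implements the standard forward algorithm for a discrete HMM; at recognition time, each expression's model would score the observed symbol sequence and the highest-likelihood model would win. This is a generic sketch, not the chapter's implementation:

```python
import numpy as np

def forward_likelihood(pi, A, B, obs):
    """Likelihood P(obs | model) of a discrete observation sequence
    under an HMM, computed with the forward algorithm.
    pi:  (N,)   initial state probabilities
    A:   (N, N) transition matrix, A[i, j] = P(state j | state i)
    B:   (N, M) emission matrix, B[i, k] = P(symbol k | state i)
    obs: sequence of symbol indices
    """
    alpha = pi * B[:, obs[0]]          # initialise with the first symbol
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # propagate and absorb next symbol
    return alpha.sum()
```

Classification then reduces to `argmax` over the per-expression models of `forward_likelihood(...)` on the quantized feature sequence. (A practical system would work in log space to avoid underflow on long sequences.)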
For capturing face images, RGB cameras are the most widely used, but faces captured by an RGB camera carry no depth for the near and far parts of the face in a facial expression video. Depth information can contribute to extracting more efficient features that describe an expression more strongly. Hence, depth videos should allow more efficient person-independent FER.
In this chapter, a novel FER approach is proposed using LDP, PCA, LDA, and HMM. Local LDP features are first extracted from the facial expression images and further refined by PCA and LDA. These robust features are then converted into discrete symbols using vector quantization, and the symbols are used to model discrete HMMs of the different expressions. To evaluate the proposed approach, comparison studies were conducted with PCA, PCA-LDA, ICA, and ICA-LDA as feature extractors in combination with the HMM. The experimental results show that the proposed method is superior to these conventional approaches.
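The chapter does not specify which algorithm builds the vector quantization codebook; a common choice is plain k-means, sketched below with illustrative names. Each refined feature vector is mapped to the index of its nearest codeword, and those indices are the discrete symbols fed to the HMMs:

```python
import numpy as np

def build_codebook(features, n_symbols, n_iter=20, seed=0):
    """Learn a VQ codebook with plain k-means (one common choice;
    the chapter does not prescribe a specific algorithm)."""
    rng = np.random.default_rng(seed)
    # initialise codewords with randomly chosen feature vectors
    codebook = features[rng.choice(len(features), n_symbols,
                                   replace=False)].astype(float)
    for _ in range(n_iter):
        # assign each feature vector to its nearest codeword
        d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for k in range(n_symbols):
            if np.any(labels == k):
                codebook[k] = features[labels == k].mean(axis=0)
    return codebook

def quantize(features, codebook):
    """Map feature vectors to discrete symbol indices for the HMM."""
    d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return d.argmin(axis=1)
```

A sequence of per-frame feature vectors thus becomes a sequence of integers in `[0, n_symbols)`, which is the observation alphabet of the discrete HMMs.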
2 Depth Image Preprocessing
The images of different expressions are captured by a depth camera [ 29 ] that generates RGB and distance (i.e., depth) information simultaneously for the objects in its view. The depth video represents the range of every pixel in the scene as a gray-level intensity (i.e., pixels at longer range take darker values and pixels at shorter range brighter values, or vice versa). Figure 1 shows the basic steps of the proposed FER system.
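The range-to-intensity mapping described above can be sketched as a simple normalization (a minimal illustration; the actual camera driver in [ 29 ] may scale depth differently):

```python
import numpy as np

def depth_to_gray(depth, near_is_bright=True):
    """Map a raw depth map to an 8-bit gray image, with nearer
    pixels brighter (or the reverse), as described in the text."""
    d = depth.astype(float)
    # normalise the depth range to [0, 1]; guard against a flat map
    d = (d - d.min()) / max(d.max() - d.min(), 1e-12)
    if near_is_bright:
        d = 1.0 - d   # small depth (near) -> high intensity
    return (d * 255).astype(np.uint8)
```

The `near_is_bright` flag captures the "or vice versa" in the text: either convention works, as long as it is applied consistently before feature extraction.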