The system described by Maio and Maltoni [149] consists of three stages. The
first stage approximately locates all elliptical objects in a directional image using a
generalized Hough transform [13]. The second stage refines the localization by locally optimizing each ellipse's position and size. Finally, the
third stage checks whether the objects found are faces by comparing vertical and
horizontal projections of directions to a face model.
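The exact parameterization used in [149] is not reproduced here, but the Hough-voting idea can be sketched compactly. The following fragment is a hedged illustration with circles standing in for the full ellipse model: each strong edge pixel votes for candidate centers along its gradient direction. The function name, radii, and threshold are illustrative assumptions.

```python
import numpy as np

def hough_circle_centers(gray, radii=(20, 30, 40), grad_thresh=30.0):
    """Vote for circle centers along local gradient directions."""
    gy, gx = np.gradient(gray.astype(float))     # image gradients
    mag = np.hypot(gx, gy)                       # gradient magnitude
    acc = np.zeros(gray.shape + (len(radii),))   # accumulator per radius
    ys, xs = np.nonzero(mag > grad_thresh)       # strong edge pixels
    for y, x in zip(ys, xs):
        dy, dx = gy[y, x] / mag[y, x], gx[y, x] / mag[y, x]
        for k, r in enumerate(radii):
            for sign in (1, -1):                 # center may lie on either side
                cy, cx = int(y + sign * r * dy), int(x + sign * r * dx)
                if 0 <= cy < gray.shape[0] and 0 <= cx < gray.shape[1]:
                    acc[cy, cx, k] += 1          # cast a vote
    return acc  # local maxima are candidate (center, radius) hypotheses
```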
Many face localization techniques rely on skin color. Terrillon et al. [225] pro-
vide a comparative study on the utility of different chrominance spaces for this task.
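As a minimal sketch of skin-color segmentation in a chrominance space, the fragment below converts RGB pixels to the CbCr plane and applies a fixed threshold box. The conversion follows the standard BT.601 definition; the threshold values are a common illustrative choice, not the ones evaluated in [225].

```python
import numpy as np

def skin_mask_cbcr(rgb):
    """Threshold-box skin classifier in the CbCr chrominance plane."""
    rgb = rgb.astype(float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # BT.601 chrominance (the luminance channel is deliberately discarded)
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # illustrative skin box; not the thresholds studied in [225]
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)
```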
Motion information is also useful for face detection. Differences between consec-
utive frames are likely to be large at the boundaries of moving objects. Spatio-
temporal contour filters can also be applied to extract object boundaries. Another
example of the use of motion is the approach taken by Lee et al. [136]. They com-
pute the optical flow, segment moving face regions using a line-clustering algorithm,
and use ellipse fitting to complete the extraction of the face region.
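The simplest motion cue, frame differencing, can be sketched in a few lines; the threshold below is an assumption, and the full pipeline of Lee et al. (optical flow, line clustering, ellipse fitting) is considerably more involved.

```python
import numpy as np

def motion_boundary_mask(prev_gray, curr_gray, thresh=25):
    """Mark pixels whose interframe difference is large; such pixels
    concentrate at the boundaries of moving objects."""
    diff = np.abs(curr_gray.astype(int) - prev_gray.astype(int))
    return diff > thresh
```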
Color and motion features are strong cues for the presence of a face. However, these low-level features are not always available. Furthermore, each low-level feature is inherently ambiguous, since a variety of non-face objects, potentially present in the analyzed images, can trigger it as well. Thus, it may be necessary to use higher-level features.
An example of a face localization method that employs the relative positioning
of facial features is the one proposed by Jeng et al. [110]. They initially try to es-
tablish possible eye locations in binarized images. For each possible eye pair, the
algorithm goes on to search for a nose, a mouth, and eyebrows. The system de-
scribed by Yow and Cipolla [248] employs a probabilistic model of typical facial
feature constellations to localize faces in a bottom-up manner.
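As a rough illustration of constellation-based verification, the sketch below checks whether nose and mouth candidates sit at plausible positions relative to an eye pair. The proportions and tolerance are invented for illustration and are not the models of Jeng et al. [110] or Yow and Cipolla [248].

```python
import math

def plausible_face(eye_l, eye_r, nose, mouth, tol=0.25):
    """Crude geometric check of a facial-feature constellation.

    Points are (x, y) with y growing downward; the expected drops of
    nose and mouth below the eye midpoint are invented proportions.
    """
    mid = ((eye_l[0] + eye_r[0]) / 2, (eye_l[1] + eye_r[1]) / 2)
    d = math.dist(eye_l, eye_r)  # inter-eye distance

    def near(pt, drop):
        return (abs(pt[0] - mid[0]) < tol * d and
                abs(pt[1] - (mid[1] + drop * d)) < tol * d)

    return near(nose, 0.6) and near(mouth, 1.0)
```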
Unlike the face models described above, active shape models depict the actual
physical appearance of features. Once released within close proximity to a face, an
active shape model will interact with local image features (edges, brightness) and
gradually deform to take the shape of the face by minimizing an energy function.
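The energy being minimized can be written explicitly. In the snake formulation of Kass et al. [120], introduced next, a contour v(s) = (x(s), y(s)) minimizes

\[
E_{\text{snake}} = \int_0^1 \left[ \frac{1}{2}\left( \alpha(s)\,\lVert v_s(s)\rVert^2 + \beta(s)\,\lVert v_{ss}(s)\rVert^2 \right) + E_{\text{image}}(v(s)) \right] ds ,
\]

where the first two terms penalize stretching and bending of the contour and \(E_{\text{image}}\), for example the negative gradient magnitude, pulls the contour toward edges. (The original formulation also allows an external constraint term, omitted here.)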
Several methods use snakes, first introduced by Kass et al. [120]. Cootes et
al. [45] recently proposed the use of a generic flexible model which they called ac-
tive appearance model. It contains a statistical model of the shape and gray-level ap-
pearance of the object of interest which can generalize to almost any valid example.
During a training phase, the relationship between model parameter displacements and the residual errors they induce between a training image and a synthesized example is learned.
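A minimal sketch of that training-phase regression is given below, assuming residual images and known parameter displacements are stacked as matrix rows and that a plain least-squares fit suffices; the actual procedure in [45] regresses over many systematically perturbed training examples.

```python
import numpy as np

def learn_update_matrix(residuals, displacements):
    """Least-squares fit of the linear map residual -> parameter update.

    residuals:     (n_examples, n_pixels) difference images
    displacements: (n_examples, n_params) known parameter offsets
    """
    R, *_ = np.linalg.lstsq(residuals, displacements, rcond=None)
    return R  # at search time: params -= residual @ R
```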
In contrast to feature-based methods, image-based approaches handle face de-
tection as a pattern recognition problem. They analyze an image window that has
been normalized to a standard size and then classify the presence or absence of a
face. Linear subspace methods apply linear statistical tools such as principal component analysis (PCA), linear discriminant analysis (LDA), and factor analysis (FA) to model facial images. For example, Moghaddam and Pentland [160] proposed a face
detection technique based on a distance measure from an eigenface model.
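A hedged sketch of such a distance measure: fit an eigenface basis with PCA and score each normalized window by its reconstruction error, the "distance from face space". The full method of [160] additionally models the distance within the subspace, which is omitted here.

```python
import numpy as np

def fit_eigenfaces(train, k=20):
    """PCA on vectorized face windows (one window per row of `train`)."""
    mean = train.mean(axis=0)
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:k]               # mean face and top-k eigenfaces

def distance_from_face_space(x, mean, basis):
    """Reconstruction error of a window w.r.t. the eigenface subspace."""
    coeffs = (x - mean) @ basis.T     # project onto the basis
    recon = mean + coeffs @ basis     # reconstruct from k components
    return np.linalg.norm(x - recon)  # small value => face-like window
```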