Face Localization - Hierarchical Neural Networks for Image Interpretation


10. Face Localization

One of the major tasks in human-computer interface applications, such as face recognition and video-telephony, is the exact localization of a face in an image. In this chapter, I use the Neural Abstraction Pyramid architecture to solve this problem, even in the presence of complex backgrounds, difficult lighting, and noise.

The network is trained using a database of gray-scale still images to reproduce manually determined eye coordinates. It is able to generate reliable and accurate eye coordinates for unknown images by iteratively refining an initial solution.

The performance of the proposed approach is evaluated against a large test set. It is also shown that a moving face can be tracked. The fast network update allows for real-time operation.

10.1 Introduction to Face Localization

To make the interface between humans and computers more pleasant, computers must adapt to their users. One prerequisite for adaptation is that the computer perceives the user. An important step for many human-computer interface applications, like face recognition, lip reading, reading of the user's emotional state, and video-telephony, is the localization of the user's face in a captured image. This is a task humans perform very well, without perceived effort, while current computer vision systems have difficulties.

An extensive body of literature exists for face detection and localization problems. Recently, Hjelmas and Low [99] published a survey on automatic face detection methods. They distinguish between feature-based and image-based methods. Feature-based methods are further classified into three groups: those relying on low-level features, such as edges, motion, and skin color, those searching for higher-level features, such as a pair of eyes, and those using active shape models, such as snakes or deformable templates.

An example of a feature-based method that uses edges is the approach taken by Govindaraju [83], where edges are extracted, labeled, and matched against a predefined face model. A similar system is described by Jesorsky et al. [111]. It consists of an edge extraction stage, a coarse localization that uses a face model, a fine localization that relies on an eye model, as well as a multi-layer perceptron for the exact localization of the pupils.
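
The general idea of edge-based localization (extract edges, then match them against a predefined model) can be illustrated with a minimal sketch. It is not the method of Govindaraju [83] or of Jesorsky et al. [111]. The Sobel operator, the fixed threshold, the pixel-agreement score, and the function names `sobel_edges` and `match_edge_model` are all assumptions chosen for illustration, and the "face model" is simply the edge map of a bright rectangle.

```python
import numpy as np

def sobel_edges(image, threshold):
    """Binary edge map from the Sobel gradient magnitude (assumed edge detector)."""
    padded = np.pad(image.astype(float), 1, mode="edge")
    # Horizontal and vertical Sobel responses built from shifted views.
    gx = (padded[:-2, 2:] + 2 * padded[1:-1, 2:] + padded[2:, 2:]
          - padded[:-2, :-2] - 2 * padded[1:-1, :-2] - padded[2:, :-2])
    gy = (padded[2:, :-2] + 2 * padded[2:, 1:-1] + padded[2:, 2:]
          - padded[:-2, :-2] - 2 * padded[:-2, 1:-1] - padded[:-2, 2:])
    return np.hypot(gx, gy) > threshold

def match_edge_model(edge_map, model):
    """Slide a binary edge model over the edge map.

    Returns the top-left (row, col) of the best window and its score,
    where the score counts pixels on which window and model agree.
    """
    H, W = edge_map.shape
    h, w = model.shape
    best_score, best_pos = -1, (0, 0)
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            score = int(np.sum(edge_map[r:r + h, c:c + w] == model))
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score

# Synthetic stand-in for a face: a bright rectangle on a dark background.
image = np.zeros((40, 40))
image[10:20, 15:27] = 255

# Hypothetical model: edge map of the same rectangle with a 2-pixel margin.
patch = np.zeros((14, 16))
patch[2:12, 2:14] = 255
model = sobel_edges(patch, threshold=100)

pos, score = match_edge_model(sobel_edges(image, threshold=100), model)
print("best window at", pos, "with score", score)
```

The exhaustive search is a coarse localization step only. A system like the one described above would follow it with finer stages, for example an eye model and a learned pupil locator, which this sketch does not attempt.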
