the following chapters that use static or dynamic facial data for face analysis, recognition, and
expression recognition.
1.1 Challenges and Taxonomy of Techniques
Capturing and processing human geometry is at the core of several applications. To work on
3D faces, one must first be able to recover their shapes. The literature describes several
acquisition techniques, some dedicated to specific objects and others general purpose. Usually
accompanied by geometric modeling tools and post-processing of 3D entities (3D point clouds, 3D
meshes, volumes, etc.), these techniques provide complete solutions for full 3D object
reconstruction. The acquisition quality is mainly linked to the accuracy of recovering the
z-coordinate (called depth information). It is characterized by the fidelity of the
reconstruction, in other words, by data quality, the density of the 3D face models, the
preservation of details (regions showing changes in shape), etc. Other important criteria are
the acquisition time, the ease of use, and the sensor's cost. In what follows, we report the
main extrinsic and intrinsic factors that can influence the modeling process.
Extrinsic factors. They are related to the environmental conditions of the acquisition and to
the face itself. Human faces are globally similar in the position of their main features
(eyes, mouth, nose, etc.), but can vary considerably in their details because of (i) variability
due to facial deformations (caused by expressions and mouth opening), subject aging
(wrinkles), etc., and (ii) person-specific details such as skin color, scar tissue, face
asymmetry, etc. The environmental factors refer to lighting conditions (controlled or ambient)
and changes in head pose.
Intrinsic factors. They include the sensor's cost, its intrusiveness, the manner in which it is
used (cooperative or not), its spatial and/or temporal resolution, its measurement accuracy, and
the acquisition time, which determines whether moving faces can be captured or only faces in a
static state.
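As a rough illustration of two of the quality criteria named in this section (depth accuracy and model density), the sketch below computes a root-mean-square error on the z-coordinate and a points-per-area density. It is a minimal sketch under stated assumptions, not a method taken from this chapter: the function names `depth_rmse` and `point_density`, the choice of RMSE as the accuracy measure, and the assumption that measured and reference depths are already in point-to-point correspondence are all hypothetical.

```python
import math


def depth_rmse(z_measured, z_reference):
    """Root-mean-square error between measured and reference depth values.

    Assumption: both sequences are in the same unit and already in
    point-to-point correspondence (index i refers to the same surface point).
    """
    if len(z_measured) != len(z_reference) or len(z_measured) == 0:
        raise ValueError("depth sequences must be non-empty and of equal length")
    squared_errors = [(m - r) ** 2 for m, r in zip(z_measured, z_reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def point_density(num_points, surface_area):
    """Number of 3D points per unit of scanned surface area (hypothetical metric)."""
    if surface_area <= 0:
        raise ValueError("surface_area must be positive")
    return num_points / surface_area


if __name__ == "__main__":
    # Illustrative values only; they are not measurements from any real sensor.
    measured = [10.0, 12.5, 11.0, 9.5]
    reference = [10.2, 12.4, 11.3, 9.5]
    print("depth RMSE:", depth_rmse(measured, reference))
    print("density:", point_density(20000, 200.0))
```

A lower depth error and a higher density both point toward a more faithful reconstruction, but how they trade off against acquisition time and sensor cost depends on the application.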
These challenges arise when acquiring static faces as well as when dealing with faces
in action. Different applications have different requirements. For instance, in the computer
graphics community, the results of performance capture should exhibit a great deal of spatial
fidelity and temporal accuracy to be an authentic reproduction of a real actor's performance.
Facial recognition systems, on the other hand, require the accurate capture of person-specific
details. The movie industry, for instance, may afford a 3D modeling pipeline system with
special purpose hardware and highly specialized sensors that require manual calibration.
When deploying a 3D acquisition system for facial recognition at airports and in train stations,
however, cost, intrusiveness, and the need for user cooperation, among others, are important
factors to consider. In ambient intelligence applications where a user-specific interface is
required, facial expression recognition from 3D sequences is emerging as a research trend in
place of 2D-based techniques, which are sensitive to lighting changes and pose variations. Here,
too, sensor cost and the sensor's capability to capture facial dynamics are important issues.
Figure 1.1 shows a new 3D face modeling-guided taxonomy of existing reconstruction approaches.
This taxonomy proposes two categories: the first targets 3D static face modeling, while
the approaches in the second try to capture facial shapes in action (i.e., in the
3D+t domain). At the level below, one finds different approaches based on concepts presented