The human face provides important visual cues for effective face-to-face
human-human communication. In human-computer interaction (HCI) and distant
human-human interaction, a computer can use face processing techniques to
estimate users' state information, based on face cues extracted from a video
sensor. Such state information is useful for the computer to proactively
initiate appropriate actions. On the other hand, graphics-based face animation
provides an effective solution for delivering and displaying multimedia
information related to the human face. Therefore, advances in computational
models of faces would make human-computer interaction more effective. Examples
of applications that may benefit from face processing techniques include
visual telecommunication [Aizawa and Huang, 1995, Morishima, 1998], virtual
environments [Leung et al., 2000], and talking head representation of agents
[Waters et al., 1996, Pandzic et al., 1999].
Recently, security-related issues have become major concerns in both research
and application domains. Video surveillance has become increasingly critical
to ensuring security. Intelligent video surveillance, which uses automatic
visual analysis techniques, can relieve human operators of labor-intensive
monitoring tasks [Hampapur et al., 2003]. It would also enhance the system's
capabilities for the prevention and investigation of suspicious behaviors.
One important group of automatic visual analysis techniques is face
processing techniques, such as face detection, tracking, and recognition.
2. Research Topics Overview
2.1 3D face processing framework overview
In the field of face processing, there are two research directions: analysis and
synthesis. Research issues and their applications are illustrated in Figure 1.1.
For analysis, the face first needs to be located in the input video. Then, the
face image can be used to identify who the person is. The face motion in the
video can also be tracked. The estimated motion parameters can be used for user
monitoring or emotion recognition. In addition, the face motion can be used as
visual features in audio-visual speech recognition, which achieves a higher
recognition rate
than audio-only recognition in noisy environments. Face motion analysis and
synthesis is an important part of the framework. Here, face motion includes
both rigid and non-rigid motion. Our main focus is non-rigid motion, such as
the motion caused by speech or expressions, which is more complex and
challenging. Unless otherwise stated, we use “facial deformation model” or
“facial motion model” to refer to a non-rigid motion model.
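The distinction between rigid and non-rigid face motion can be illustrated with a minimal sketch. It assumes, purely for illustration, that non-rigid motion is a weighted sum of per-vertex displacement fields added to a neutral shape, and that rigid motion is a rotation about the vertical axis followed by a translation. The function names (`deform`, `rigid`), the toy two-vertex "mouth", and the linear form of the deformation are hypothetical and are not taken from the framework described in this chapter.

```python
import math

def deform(neutral, basis, weights):
    """Non-rigid motion (assumed linear): neutral shape plus a weighted sum
    of per-vertex displacement fields."""
    out = []
    for i, (x, y, z) in enumerate(neutral):
        dx = sum(w * b[i][0] for w, b in zip(weights, basis))
        dy = sum(w * b[i][1] for w, b in zip(weights, basis))
        dz = sum(w * b[i][2] for w, b in zip(weights, basis))
        out.append((x + dx, y + dy, z + dz))
    return out

def rigid(points, yaw, translation):
    """Rigid motion: rotate about the y axis by `yaw` radians, then translate."""
    c, s = math.cos(yaw), math.sin(yaw)
    tx, ty, tz = translation
    return [(c * x + s * z + tx, y + ty, -s * x + c * z + tz)
            for x, y, z in points]

# Toy data (hypothetical): two mouth-corner vertices and one displacement
# field that moves both corners downward.
neutral = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
open_mouth = [(0.0, -0.5, 0.0), (0.0, -0.5, 0.0)]

# Apply the non-rigid deformation at half strength, then the head pose.
shape = deform(neutral, [open_mouth], [0.5])
posed = rigid(shape, math.pi / 2, (0.0, 0.0, 2.0))
```

In this sketch the deformation weights play the role of non-rigid motion parameters, while the yaw angle and translation are the rigid (head pose) parameters. Under this reading, analysis estimates both sets of parameters from video and synthesis applies them to a face model.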
The other research direction is synthesis. First, the geometry of the neutral
face is modeled from measurements of faces, such as 3D range scanner data or
images. Then, the 3D face model is deformed according to a facial deformation model