dard [MPEG4, 1997]. In these approaches, the human face geometry is charac-
terized with a 3D mesh model. The facial motion is parameterized as rotation
and translation for rigid motion, and action unit or facial muscle weights for
non-rigid facial motions. These parameters, together with the video background,
can be transmitted over a channel at a very low bit rate, and the video can be
reconstructed by synthesizing the facial area from the transmitted parameters.
However, no completely model-based coder is available yet, because it is
difficult to extract the facial geometry and motion parameters from video
automatically and robustly. Furthermore, many approaches do not transmit the
residual of the model-based coding, so the difference between the original
video and the reconstructed video can be arbitrarily large.
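The parameterization described above (rotation and translation for rigid motion, a small set of weights for non-rigid motion) can be illustrated with a minimal sketch. It assumes a linear deformation model in which each weight scales a fixed per-vertex displacement field; that linear form, the function name `apply_face_motion` and all argument names are illustrative assumptions, not details given in the text.

```python
def apply_face_motion(neutral, deformations, weights, rotation, translation):
    """Move a 3D face mesh with non-rigid weights, then a rigid transform.

    neutral      -- list of (x, y, z) vertices of the neutral face mesh
    deformations -- list of K displacement fields, each a list of (dx, dy, dz)
                    per vertex (assumed linear model, hypothetical)
    weights      -- list of K non-rigid motion weights
    rotation     -- 3x3 rotation matrix as nested lists
    translation  -- (tx, ty, tz)
    """
    moved = []
    for i, vertex in enumerate(neutral):
        # Non-rigid part: neutral position plus weighted displacements.
        p = [
            vertex[a] + sum(w * d[i][a] for w, d in zip(weights, deformations))
            for a in range(3)
        ]
        # Rigid part: rotate, then translate.
        q = tuple(
            sum(rotation[r][c] * p[c] for c in range(3)) + translation[r]
            for r in range(3)
        )
        moved.append(q)
    return moved


if __name__ == "__main__":
    neutral = [(1, 0, 0), (0, 1, 0)]
    deformations = [[(0, 0.5, 0), (0, 0, 0)]]   # one displacement field
    rot_z_90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
    print(apply_face_motion(neutral, deformations, [2.0], rot_z_90, (0, 0, 1)))
```

Under this assumption, only the weights, the rotation and the translation change from frame to frame, which is why so few numbers need to be transmitted once the mesh is known at the receiver.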
Eisert et al. [Eisert et al., 2000] propose a hybrid coding technique using a
model-based 3D facial motion tracking algorithm. In this approach, the
model-based and waveform-based coding results are compared and the better
result is used, so that the two coding schemes complement each other. In this
topic, we propose a model-based face video coder in a similar spirit. The
proposed very low bit rate face coding method is efficient and robust because
of our 3D face tracking.
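The idea of comparing model-based and waveform-based results and keeping the better one can be sketched as a per-block decision. The criterion used here, the sum of squared differences (SSD) against the original block, is an assumption made for illustration; the text does not state which measure or granularity the hybrid coder actually uses, and the names `ssd` and `choose_modes` are hypothetical.

```python
def ssd(block_a, block_b):
    """Sum of squared differences between two flattened pixel blocks."""
    return sum((a - b) ** 2 for a, b in zip(block_a, block_b))


def choose_modes(original_blocks, model_blocks, waveform_blocks):
    """Pick, for each block, the coding result closer to the original.

    Assumption: distortion alone decides the mode. A real coder would
    typically also account for the bits each mode costs.
    """
    modes = []
    for orig, model, wave in zip(original_blocks, model_blocks, waveform_blocks):
        modes.append("model" if ssd(orig, model) <= ssd(orig, wave) else "waveform")
    return modes


if __name__ == "__main__":
    original = [[10, 10], [20, 20]]
    model = [[10, 11], [0, 0]]
    waveform = [[12, 12], [20, 21]]
    print(choose_modes(original, model, waveform))
```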
We first locate the face in the video frame. Next, the generic facial geometric
model is adapted to the face, and the facial texture for the model is extracted
from the first frame of the video. The facial motion is then tracked and
synthesized. The residual error in the face area and the video background are
then coded with a state-of-the-art waveform-based coder. Finally, the facial
motion parameters, the coded residual error and the video background are
transmitted at a very low bit rate. Experiments show that our method can
achieve a better PSNR around the facial area than the state-of-the-art
waveform-based video coder at about the same low bit rate. Moreover, our
proposed face video coder gives better subjective visual quality.
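For reference, PSNR restricted to a region can be computed as 10 log10(peak^2 / MSE), with the mean squared error taken only over the pixels of that region. The sketch below assumes 8-bit pixels (peak value 255) and a binary mask marking the facial area; how the facial area is delimited in the experiments is not stated in the text, so the mask and the name `masked_psnr` are illustrative assumptions.

```python
import math


def masked_psnr(original, decoded, mask, peak=255.0):
    """PSNR in dB over the pixels where mask is truthy.

    original, decoded, mask -- 2D lists (rows of pixels) of equal size.
    peak -- maximum pixel value (255 assumes 8-bit video).
    """
    squared_error = 0.0
    count = 0
    for row_o, row_d, row_m in zip(original, decoded, mask):
        for o, d, m in zip(row_o, row_d, row_m):
            if m:
                squared_error += (o - d) ** 2
                count += 1
    if count == 0:
        raise ValueError("mask selects no pixels")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical inside the region
    return 10.0 * math.log10(peak * peak / mse)


if __name__ == "__main__":
    original = [[100, 50], [200, 0]]
    decoded = [[110, 50], [190, 99]]
    face_mask = [[1, 0], [1, 0]]  # left column only
    print(round(masked_psnr(original, decoded, face_mask), 2))
```

Pixels outside the mask do not affect the value, so a large background error leaves the face-area PSNR unchanged.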
1.2 Model-based face video coder
The face video is first sent to a face tracker that extracts the face motion
parameters. A face synthesizer then synthesizes the face appearance from the
motion parameters. Once the synthesized face is obtained, the residual error
is calculated by subtracting it from the original frame. The video frame is
thus divided into a foreground residual and a background region. Since we
assume no prior knowledge about the background and the foreground residual, we
code them with a state-of-the-art waveform-based coder. The advantage of this
approach is that facial motion details not captured by the geometric motion
parameters are not lost when the video is coded at a low bit rate.
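The split into foreground residual and background, and the corresponding reconstruction at the receiver, can be sketched as follows. The sketch works on single-channel frames stored as 2D lists and assumes a binary face mask; the mask, the function names `split_frame` and `reconstruct`, and the omission of the waveform coder itself are simplifications made for illustration.

```python
def split_frame(original, synthesized, face_mask):
    """Return (foreground_residual, background) for one frame.

    Inside the face mask the residual is original minus synthesized
    (a signed value); outside the mask the original pixel is kept as
    background. Unused positions are filled with 0.
    """
    residual = [
        [o - s if m else 0 for o, s, m in zip(row_o, row_s, row_m)]
        for row_o, row_s, row_m in zip(original, synthesized, face_mask)
    ]
    background = [
        [0 if m else o for o, m in zip(row_o, row_m)]
        for row_o, row_m in zip(original, face_mask)
    ]
    return residual, background


def reconstruct(synthesized, residual, background, face_mask):
    """Rebuild the frame: synthesized face plus residual, else background."""
    return [
        [s + r if m else b for s, r, b, m in zip(row_s, row_r, row_b, row_m)]
        for row_s, row_r, row_b, row_m in zip(
            synthesized, residual, background, face_mask
        )
    ]


if __name__ == "__main__":
    original = [[120, 80], [60, 200]]
    synthesized = [[118, 0], [65, 0]]   # only face pixels are synthesized
    face_mask = [[1, 0], [1, 0]]
    residual, background = split_frame(original, synthesized, face_mask)
    print(residual, background)
    print(reconstruct(synthesized, residual, background, face_mask))
```

If the residual and background were transmitted without loss, the reconstruction would equal the original frame. With a lossy waveform coder at a low bit rate, the reconstruction error is bounded by the coding error of the residual and background rather than by the accuracy of the face model.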
The chosen waveform-based video coder is the JVT reference software JM 4.2,
obtained from [H26L, 2002]. It is a joint effort between ISO MPEG and ITU
H.26x after the successful H.264 development, and represents the state-of-the-