2002). Semantic coding aims to describe video using high-level descriptions. It has
been developed mainly for faces, where parameter sets such as action units or facial
animation parameters (FAPs) are used to animate a face. Semantic coding is expected
to be far more efficient than other video coders, since the motion
and deformation of most objects are limited compared with the possible variations
in the array of pixels required to display the object. The maximum
entropy of a video sequence, and hence the required bit rate, is determined by the
number of possible variations in the video representation. If we consider changing the
facial expression from neutral to joy, a semantic coder with a face model just needs
to transmit the command 'smile', and the decoder would know how to deform the
corresponding face model to make it smile. Techniques that track the
local motion of important facial features (eyes opening and closing, mouth opening
and closing) have also been proposed. They make it possible to determine not only the
spatial location of the tracked facial feature but also its shape (Antoszczyszyn et al. 1998). To
estimate facial animation parameters from monocular video sequences, Eisert and
Girod (1997) proposed a model-based approach that uses a 3D
model specifying the shape and texture of a head. The surface of this model
is built from triangular B-splines to simplify the modeling of facial expressions.
Bojkovic and Milovanovic (2001) review coding methods for reducing the bit rate of
FAPs, which make it possible to transmit multiple talking heads over band-limited
channels, and discuss the relationships between natural and synthetic audio-video
coding from the point of view of integrating face animation with natural video.
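The bit-rate argument above can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the function names, the frame size, and the idea of a fixed "command vocabulary" are hypothetical and do not come from any of the cited coders. It uses the fact that a source with N equally likely states has a maximum entropy of log2(N) bits.

```python
import math


def max_entropy_bits(num_states: int) -> float:
    """Maximum entropy, in bits, of a source with num_states possible states.

    The maximum is reached when all states are equally likely.
    """
    if num_states < 1:
        raise ValueError("num_states must be at least 1")
    return math.log2(num_states)


def raw_frame_bits(width: int, height: int, bits_per_pixel: int = 8) -> int:
    """Upper bound on the bits needed for one uncompressed frame.

    A frame has (2 ** bits_per_pixel) ** (width * height) possible pixel
    arrays, so log2 of that count is width * height * bits_per_pixel.
    """
    if width < 1 or height < 1 or bits_per_pixel < 1:
        raise ValueError("width, height and bits_per_pixel must be positive")
    return width * height * bits_per_pixel


def semantic_command_bits(vocabulary_size: int) -> int:
    """Bits for a fixed-length code selecting one command from a vocabulary.

    The vocabulary (e.g. 'smile', 'frown', ...) is a hypothetical stand-in
    for the high-level descriptions a semantic coder would transmit.
    """
    if vocabulary_size < 1:
        raise ValueError("vocabulary_size must be at least 1")
    # Smallest k with 2 ** k >= vocabulary_size, computed with integers only.
    return (vocabulary_size - 1).bit_length()


if __name__ == "__main__":
    pixel_bits = raw_frame_bits(176, 144)   # one 8-bit grayscale QCIF frame
    command_bits = semantic_command_bits(64)  # assumed 64-command vocabulary
    print(f"raw frame upper bound: {pixel_bits} bits")
    print(f"semantic command:      {command_bits} bits")
```

With these assumed numbers, one 176 x 144 frame at 8 bits per pixel is bounded by 202,752 bits, while selecting one of 64 commands needs 6 bits. Real coders sit between these extremes, because pixel data is highly redundant and a semantic coder must also transmit parameters such as intensity and timing.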
Generalized perceptual coding may cover all of the above video coding methods, since
human perception is already considered as a factor in the existing coding
framework. Jayant et al. (1993) gave a formal definition of perceptual coding
as “a coding algorithm that is based on the criterion minimizing
the perceived error.” In perceptual coding, the perceptual model is therefore the kernel;
it aims to simulate the processes of the human visual system (HVS). In practice, we still
know little about the HVS, and only some of its properties can be modeled approximately. These
known properties may be summarized as luminance-to-contrast conversion, channel
decomposition, the contrast sensitivity function (CSF), and masking. Most HVS-based
models simulate the above processes and try to sequence them in the way they occur
in the HVS. One well-known multichannel model for image quality assessment,
the Visible Difference Predictor (VDP), was proposed by
Daly (1992). Jayant et al. (1993) also gave an overview of perceptual coders
for image and video, including Just Noticeable Distortion (JND) based image coders
(Safranek and Johnston 1989; Chou and Li 1995) and perceptually weighted
quantization coding (Watson 1993). Recent research on JND-based coders can be
found in Yang et al. (2003) and Ma et al. (2011); these coders are usually built on
traditional JPEG and MPEG coders and report better performance.
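To illustrate the "perceived error" criterion and the role of a JND threshold, here is a minimal Python sketch. It is a toy model under loud assumptions: the function `supra_threshold_mse`, the per-pixel threshold list, and the rule that errors at or below the threshold count as zero are illustrative choices, not the method of any coder cited above.

```python
from typing import Sequence


def mse(original: Sequence[float], decoded: Sequence[float]) -> float:
    """Plain mean squared error between two equally long pixel sequences."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)


def supra_threshold_mse(
    original: Sequence[float],
    decoded: Sequence[float],
    jnd: Sequence[float],
) -> float:
    """Mean squared error counting only the part of each error above its JND.

    Assumption: an error whose magnitude does not exceed the local JND
    threshold is treated as invisible and contributes nothing.
    """
    if not (len(original) == len(decoded) == len(jnd)) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for o, d, threshold in zip(original, decoded, jnd):
        excess = max(0.0, abs(o - d) - threshold)  # visible part of the error
        total += excess ** 2
    return total / len(original)


if __name__ == "__main__":
    original = [100, 100, 100, 100]
    decoded = [102, 98, 100, 110]
    jnd = [3, 3, 3, 3]  # hypothetical uniform threshold, in gray levels
    print("MSE:", mse(original, decoded))
    print("supra-threshold MSE:", supra_threshold_mse(original, decoded, jnd))
```

In this example the two small errors of 2 gray levels fall below the assumed threshold of 3 and are ignored, so only the error of 10 contributes. A coder driven by such a measure could spend fewer bits where distortion stays below the threshold, which is the general idea behind JND-based coding.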
Moreover, texture analysis and synthesis coding, which combines perceptual
coding and segmentation coding, has attracted the interest of researchers (Ndjiki-Nya
et al. 2003; Zhang et al. 2010). Besides the coding algorithms, quality assessment
is an important issue for perceptual coding. Obviously, the objective PSNR is not