Scene Video Coding - Advanced Video Coding Systems


2002). Semantic coding aims to describe video using high-level descriptions. It has mainly been developed for faces, where parameter sets such as action units or facial animation parameters (FAPs) are used to animate a face. Semantic coding is therefore expected to be far more efficient than other video coders, since the motion and deformation of most objects are limited compared with the possible variations in the array of pixels required to display the object. The maximum entropy of a video sequence, and hence the required bit rate, is determined by the number of possible variations in the video representation. For example, to change a facial expression from neutral to joy, a semantic coder with a face model only needs to transmit the command 'smile', and the decoder knows how to deform the corresponding face model to make it smile. Techniques have also been proposed that track the local motion of important facial features (eyes opening and closing, mouth opening and closing), making it possible to determine not only the spatial location of the tracked feature but also its shape (Antoszczyszyn et al. 1998). To estimate facial animation parameters from monocular video sequences, Eisert and Girod (1997) proposed a model-based approach that uses a 3D model specifying the shape and texture of a head. The surface of this model is built from triangular B-splines to simplify the modeling of facial expressions. Bojkovic and Milovanovic (2001) review coding methods for reducing the bit rate of FAPs, which make it possible to transmit multiple talking heads over band-limited channels, and discuss the relationship between natural and synthetic audio-video coding from the point of view of integrating face animation with natural video.
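The entropy argument and the 'smile' example can be made concrete with a minimal sketch. It assumes independent, uniformly distributed values, so the maximum entropy is simply the number of values times log2 of the number of levels. The frame size (176x144, 8-bit), the parameter count (68 values at 256 levels each), the control-point names, and the offsets in `EXPRESSIONS` are all illustrative assumptions, not values taken from the text.

```python
import math


def max_entropy_bits(num_values: int, levels_per_value: int) -> float:
    """Upper bound on entropy (bits) of num_values independent values,
    each of which can take levels_per_value distinct levels."""
    return num_values * math.log2(levels_per_value)


# Pixel-domain representation: one 176x144 frame of 8-bit luminance (assumed size).
pixel_bits = max_entropy_bits(176 * 144, 256)

# Semantic representation: a hypothetical set of 68 animation parameters,
# each quantized to 256 levels (assumed numbers, for illustration only).
param_bits = max_entropy_bits(68, 256)

# Hypothetical decoder-side table: command -> (dx, dy) offsets of model control points.
EXPRESSIONS = {
    "smile": {
        "mouth_corner_left": (-2.0, 1.5),
        "mouth_corner_right": (2.0, 1.5),
    },
}


def apply_command(model: dict, command: str) -> dict:
    """Return a new face model with control points displaced as the command dictates.
    Control points not named by the command are left where they are."""
    offsets = EXPRESSIONS[command]
    deformed = {}
    for name, (x, y) in model.items():
        dx, dy = offsets.get(name, (0.0, 0.0))
        deformed[name] = (x + dx, y + dy)
    return deformed


# The encoder sends only the command string; the decoder owns the face model.
neutral = {
    "mouth_corner_left": (-10.0, 0.0),
    "mouth_corner_right": (10.0, 0.0),
    "nose_tip": (0.0, 5.0),
}
smiling = apply_command(neutral, "smile")
```

Under these assumptions the parameter representation has an upper bound of a few hundred bits per frame, against roughly two hundred thousand for the pixel array. A real semantic coder would entropy-code the parameters and drive a full 3D model rather than a dictionary of 2D points.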

Generalized perceptual coding may cover all of the above video coding methods, since human perception is already considered as a factor in the existing coding framework. Jayant et al. (1993) gave a formal definition of perceptual coding as “a coding algorithm that is based on the criterion minimizing the perceived error.” In perceptual coding, perception modeling is therefore the kernel; it aims to simulate the processes of the human visual system (HVS). We still know little about the HVS, and only some of its properties can be modeled approximately. These known properties may be summarized as luminance-to-contrast conversion, channel decomposition, the contrast sensitivity function (CSF), and masking. Most HVS-based models simulate these processes and try to sequence them in the order in which they occur in the HVS. One well-known multichannel model for image quality assessment, the Visible Differences Predictor (VDP), was proposed by Daly (1992). Jayant et al. (1993) also gave an overview of perceptual coders for image and video, including Just Noticeable Distortion (JND) based image coders (Safranek and Johnston 1989; Chou and Li 1995) and perceptually weighted quantization coding (Watson 1993). More recent research on JND-based coders can be found in Yang et al. (2003) and Ma et al. (2011); these coders are usually implemented on top of traditional JPEG and MPEG coders and achieve better performance.
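As a rough illustration of the "perceived error" criterion, the sketch below treats any coding error that falls within a per-sample JND threshold as invisible and counts only the part that exceeds it. The function names and the constant threshold of 3 gray levels are assumptions made for this example; JND-based coders derive spatially varying thresholds from HVS properties such as those listed above, and this is not the formulation of any of the cited papers.

```python
def squared_error(original, decoded):
    """Plain sum of squared differences, with no perceptual weighting."""
    return float(sum((o - d) ** 2 for o, d in zip(original, decoded)))


def perceived_error(original, decoded, jnd):
    """Sum of squared supra-threshold errors.

    A difference whose magnitude does not exceed the sample's JND threshold
    contributes zero; otherwise only the excess over the threshold is counted.
    """
    total = 0.0
    for o, d, t in zip(original, decoded, jnd):
        excess = max(abs(o - d) - t, 0.0)
        total += excess * excess
    return total


# Toy 1-D "image": four samples, with a hypothetical constant JND of 3 gray levels.
original = [100, 100, 100, 100]
decoded = [102, 97, 100, 110]
jnd = [3, 3, 3, 3]

plain = squared_error(original, decoded)
perceived = perceived_error(original, decoded, jnd)
```

In this toy case the first three samples are within threshold and contribute nothing, so only the last sample drives the perceived error. A coder that minimizes such a measure can spend fewer bits on errors that are modeled as invisible.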

Moreover, texture analysis and synthesis coding, which lies at the intersection of perceptual coding and segmentation coding, has attracted the interest of researchers (Ndjiki-Nya et al. 2003; Zhang et al. 2010). Besides the coding algorithms, quality assessment is an important issue for perceptual coding. Obviously, the objective PSNR is not
