Scene Video Coding - Advanced Video Coding Systems

segmentations may be identified as some objects for further operation, e.g., coding
and tracking (Musmann et al. 1989; Salembier et al. 1997), which is called
object-based coding.

The fundamental difference between block-based and object-based coding schemes is
that object-based coding concentrates on analyzing and synthesizing the objects in
an image (Salembier et al. 1997). This brings several advantages over block-oriented
schemes, e.g., adaptation to local image characteristics and object motion
compensation instead of blockwise motion compensation. By transmitting object
information, the artifacts typical of block-based coding, such as blocking and
ringing, can be avoided, and high-quality compression at low bit rates can be
achieved. A number of remarkable research activities on object-oriented image and
video data compression have been reported in Hötter (1990), Hötter (1994),
Gerken (1994), and Ostermann (1994). A typical object-based video coder aims at
efficiently coding specific objects, e.g., a human face. The video sequences
concerned usually consist of a stationary background and a dynamic foreground, as
in telephony video, also called head-and-shoulder video. Head-and-shoulder video
compression is widely used in video conferencing and video telephony, which require
low-bit-rate compression to represent the dynamic scene. To compress audiovisual
information more efficiently at relatively low bit rates, MPEG started to develop
the international standard MPEG-4. In short, object-based coding is one of the most
important milestones in the development of video coding. It marks the point at which
content-aware coding became a core technology in the model-based coding framework,
as important as statistical prediction/transform coding methods, and it inspired
the subsequent knowledge-based, semantic-based, and perceptual coding approaches.
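
The contrast between object motion compensation and blockwise motion compensation can be illustrated with a minimal Python sketch. It is not taken from any of the cited coders: the function names, the binary mask representation, and the purely translational motion are assumptions made for illustration, and the cost of transmitting the object shape is ignored.

```python
def predict_object_frame(prev_frame, mask, motion, background):
    """Predict the next frame by moving one masked object with one motion vector.

    Assumption: the object moves by a pure translation (dy, dx) over a
    stationary background that the decoder already knows.
    """
    height, width = len(prev_frame), len(prev_frame[0])
    dy, dx = motion
    # Start from the stationary background, then paste the displaced object.
    predicted = [row[:] for row in background]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width:
                    predicted[ny][nx] = prev_frame[y][x]
    return predicted


def block_vector_count(height, width, block_size):
    """Number of motion vectors a blockwise scheme sends for one frame."""
    rows = -(-height // block_size)  # ceiling division
    cols = -(-width // block_size)
    return rows * cols


if __name__ == "__main__":
    background = [[0] * 6 for _ in range(4)]
    # A 2x2 object located at rows 1-2, columns 1-2.
    mask = [[1 <= y <= 2 and 1 <= x <= 2 for x in range(6)] for y in range(4)]
    prev_frame = [[9 if mask[y][x] else 0 for x in range(6)] for y in range(4)]
    for row in predict_object_frame(prev_frame, mask, (0, 2), background):
        print(row)
    print("block vectors for 2x2 blocks:", block_vector_count(4, 6, 2))
```

In this toy setting the object-based predictor needs a single motion vector for the whole object, whereas the blockwise scheme sends one vector per block regardless of where the object boundaries lie.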

In a sense, knowledge-based video coding is an extension of object-based video
coding: it encodes an object using prior knowledge of known object classes, e.g.,
faces and bodies. Generally speaking, the coder tries to recognize objects such as
faces in a video sequence using an image/video analysis system, and then codes them
efficiently using the corresponding prior knowledge. Take face coding as an example.
First, the coder automatically detects human faces and identifies their locations in
the video. As soon as it recognizes a face, the coder switches from the generic
object-based mode to the knowledge-based mode, i.e., face mode. Then, by
incorporating a face model into the coding system, the coder only needs to encode
the analysis parameters of the face, thereby saving bit rate significantly. The
decoder reconstructs the face from the coded parameters using the same face model.
The key is therefore how to model the human face effectively. Many algorithms have
been proposed for efficient and accurate face modeling (Aizawa et al. 1989; Tekalp
and Ostermann 2000).
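
The face-coding steps described above (detect, switch to face mode, send only model parameters, synthesize at the decoder with the same model) can be sketched as follows. This is a toy illustration under stated assumptions, not the method of any cited work: `ToyFaceModel`, `CodedFrame`, `encode_frame`, and `decode_frame` are hypothetical names, the "detector" is a bounding box of non-zero samples, and the "face model" has a single brightness parameter.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Frame = List[List[int]]
Region = Tuple[int, int, int, int]  # top, left, bottom, right (inclusive)


@dataclass(frozen=True)
class CodedFrame:
    mode: str        # "face" or "generic"
    payload: tuple   # model parameters in face mode, raw samples otherwise


class ToyFaceModel:
    """Hypothetical stand-in for a face model shared by encoder and decoder."""

    def __init__(self, height: int, width: int) -> None:
        self.height, self.width = height, width

    def detect(self, frame: Frame) -> Optional[Region]:
        # Toy detector: bounding box of all non-zero samples.
        points = [(y, x) for y, row in enumerate(frame)
                  for x, value in enumerate(row) if value]
        if not points:
            return None
        ys = [y for y, _ in points]
        xs = [x for _, x in points]
        return min(ys), min(xs), max(ys), max(xs)

    def analyze(self, frame: Frame, region: Region) -> Tuple[int, ...]:
        # Toy analysis: one parameter, the mean brightness inside the region.
        top, left, bottom, right = region
        samples = [frame[y][x] for y in range(top, bottom + 1)
                   for x in range(left, right + 1)]
        return (round(sum(samples) / len(samples)),)

    def synthesize(self, region: Region, params: Tuple[int, ...]) -> Frame:
        # Toy synthesis: fill the region with the coded brightness.
        top, left, bottom, right = region
        frame = [[0] * self.width for _ in range(self.height)]
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                frame[y][x] = params[0]
        return frame


def encode_frame(frame: Frame, model: ToyFaceModel) -> CodedFrame:
    region = model.detect(frame)
    if region is None:
        # No face found: stay in the generic mode (raw samples as a placeholder).
        return CodedFrame("generic", tuple(tuple(row) for row in frame))
    # Face found: switch to face mode and send only the analysis parameters.
    return CodedFrame("face", (region, model.analyze(frame, region)))


def decode_frame(coded: CodedFrame, model: ToyFaceModel) -> Frame:
    if coded.mode == "face":
        region, params = coded.payload
        return model.synthesize(region, params)
    return [list(row) for row in coded.payload]
```

The design point the sketch tries to capture is that both sides hold the same model, so the face-mode payload is a handful of parameters rather than the samples themselves; a real face model would of course need far richer parameters than one brightness value.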

Unlike conventional video coding algorithms, which encode each frame of a video
sequence using the current image signal together with a predicted image signal,
semantic video coding describes a video sequence using model objects with behavior
that represent real objects and their behavior (Wang et al.
