Video Coding Basic Principle - Advanced Video Coding Systems

The differential signal is also called the residual signal, and at the receiver side the original signal can be reconstructed by adding the residual to the prediction. Compared to the original signal, the residual signal has lower correlation. Prediction coding is an efficient tool for reducing the spatial, temporal, and set redundancy existing within or among video signals, and many prediction coding tools have been proposed. The earliest use of prediction coding is pixel-based DPCM (differential pulse code modulation), where the difference between two neighboring pixels is quantized and transmitted (Cutler 1950). For video prediction, Harrison (1952) proposed the first representative intra-prediction method, which takes a linear combination of reconstructed pixels as the prediction of the current pixel. A modified algorithm, named LOCO-I (Weinberger et al. 2000), has been adopted in the JPEG-LS image compression standard. Afterwards, AC/DC intra prediction in the transform domain (Grgić et al. 1997) and directional intra prediction in the spatial domain (Bjontegaard 1998) were proposed, and the latter has become the prevalent prediction method in the video coding field. Many popular video coding standards adopt directional intra prediction, e.g., AVC/H.264, HEVC/H.265, and AVS.
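The pixel-based DPCM idea described above can be illustrated with a minimal sketch. It assumes a one-dimensional row of pixel values, a previous-sample predictor, a uniform quantizer with step size `step`, and an initial prediction of 0. The function names and these parameter choices are illustrative assumptions, not taken from the cited works. The encoder predicts from the reconstructed value rather than the original one, so that encoder and decoder stay in step.

```python
def dpcm_encode(samples, step=4):
    """Quantize the difference between each sample and the previous
    reconstructed sample (hypothetical helper for illustration)."""
    indices = []
    prediction = 0  # assumed initial predictor value
    for sample in samples:
        residual = sample - prediction
        index = round(residual / step)  # uniform quantizer
        indices.append(index)
        # Track what the decoder will reconstruct, not the original sample.
        prediction = prediction + index * step
    return indices


def dpcm_decode(indices, step=4):
    """Rebuild the signal by adding each dequantized residual to the prediction."""
    reconstructed = []
    prediction = 0  # must match the encoder's initial value
    for index in indices:
        value = prediction + index * step
        reconstructed.append(value)
        prediction = value
    return reconstructed


row = [100, 102, 105, 105, 110, 120, 119]
indices = dpcm_encode(row, step=4)
recon = dpcm_decode(indices, step=4)
print(indices)
print(recon)
```

With `step=1` the scheme is lossless for integer input. With a larger step, the reconstruction error of each sample stays within half a step, because the quantization error does not accumulate when the prediction is formed from reconstructed values.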

Later, the unit of prediction was extended from pixel to picture. In Seyler (1962), a picture-based difference coding method was proposed, where only the difference between two pictures is transmitted, and the data redundancy was reduced significantly. Rocca (1969) first proposed block-based motion estimation, in the form of an arbitrary-shaped block-based motion-compensated technique. The basic idea of Rocca's method is to model the scene as a set of constant-brightness zones represented by arbitrary-shaped blocks. These zones move from one frame to the next, tracked by motion vectors, and difference values are transmitted for picture reconstruction. Besides these methods, motion-compensated prediction was further improved by exploiting the long-term statistical dependencies in the coded video sequence instead of using only the immediately preceding frame for prediction. Wiegand et al. (1997) proposed a long-term memory scheme that used up to 50 previously decoded frames to determine the best motion vector. In addition, Puri et al. (1990) first proposed the B picture concept, which interpolates any skipped frame by taking into account the movement between the two “end” frames, i.e., the forward and backward frames. It can achieve a higher compression ratio by more effectively exploiting the correlation between the reference pictures and the current B picture, especially when coping with occlusion and uncovering problems caused by zooming, nonlinear motion, and so on. The B picture is further generalized by linearly combining motion-compensated signals regardless of the reference picture selection, which is referred to as multihypothesis motion-compensated prediction (Flierl and Girod 2003).
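To make the notion of a motion vector and its residual concrete, the following sketch runs a full-search block-matching step on a tiny synthetic frame pair. It uses square blocks and the sum of absolute differences (SAD) as the matching cost. Both are common simplifications assumed here, not Rocca's arbitrary-shaped method, and all function names and parameters are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block at (bx, by)
    and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def motion_search(cur, ref, bx, by, size, search_range):
    """Full search over a square window; returns (motion vector, cost)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, size)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + size > width or by + dy + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


def residual_block(cur, ref, bx, by, mv, size):
    """Difference between the current block and its motion-compensated prediction."""
    dx, dy = mv
    return [[cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
             for x in range(size)] for y in range(size)]


# Synthetic 8x8 frames: a bright 2x2 object moves right by 1 and down by 2.
ref = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
for y in range(2):
    for x in range(2):
        ref[2 + y][3 + x] = 200
        cur[4 + y][4 + x] = 200

mv, cost = motion_search(cur, ref, bx=4, by=4, size=2, search_range=2)
residual = residual_block(cur, ref, 4, 4, mv, 2)
print(mv, cost, residual)
```

In this sketch the motion vector points from the current block back to its best match in the reference frame, so an object that moved right and down yields a negative displacement. Only the vector and the residual would need to be transmitted, and the decoder rebuilds the block by adding the residual to the displaced reference block.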

Besides reducing the redundancies within image and video data, the compression performance is further improved by reducing the set redundancies among similar images and videos. Karadimitriou et al. first proposed the set redundancy concept and a series of compression methods for sets of similar images, e.g., the Min-Max differential (MMD) method (Karadimitriou and Tyler 1997) and the centroid method (Karadimitriou and Tyler 1998). The centroid method generates one central image by averaging the pixel values in the same position among all the images, then the
