Introduction - Advanced Video Coding Systems

Based on these redundancies, many efficient coding techniques have been proposed to compress data by reducing them. The main coding tools include prediction, transformation, quantization, and entropy coding. Predictive coding exploits correlations of the video signal in both the spatial and temporal domains to predict the signal currently being coded from already coded signals, which efficiently reduces the quantity of information to be coded. To reduce spatial redundancy, intraprediction methods use neighboring coded signals to predict the current ones, and only the prediction residuals are coded into the bitstream. For example, the DC differential coding method in JPEG (Wallace 1992), which encodes the DC difference between neighboring blocks instead of encoding the DC value directly, takes advantage of the similarity of average block brightness. The well-known LOCO-I algorithm (Weinberger et al. 2000), adopted in JPEG-LS, removes much more redundancy by shortening the prediction distance: it predicts the current pixel from a combination of neighboring reconstructed pixels. In video coding, directional intraprediction methods are widely used to predict the signals in the current block along highly correlated directions, e.g., intraprediction in AVC/H.264 (Wiegand et al. 2003).
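
The DC differential idea described above can be illustrated with a minimal sketch. It assumes a list of already quantized DC values, one per block in coding order, and an initial prediction of zero; the function names and sample values are hypothetical, and the entropy coding that JPEG applies to the differences is left out.

```python
def dc_differences(dc_values, initial_prediction=0):
    """Replace each block's DC value by its difference from the previous block's DC."""
    diffs = []
    prediction = initial_prediction
    for dc in dc_values:
        diffs.append(dc - prediction)
        prediction = dc  # the decoder can rebuild this value, so both sides stay in sync
    return diffs


def reconstruct_dc(diffs, initial_prediction=0):
    """Invert dc_differences by accumulating the differences."""
    values = []
    prediction = initial_prediction
    for diff in diffs:
        prediction += diff
        values.append(prediction)
    return values


# Hypothetical DC values of five neighboring blocks with similar brightness.
dc = [1016, 1020, 1018, 1025, 1023]
diffs = dc_differences(dc)        # [1016, 4, -2, 7, -2]
restored = reconstruct_dc(diffs)  # equal to dc
```

Apart from the first block, the values to be coded shrink from around 1000 to single digits, which is the redundancy reduction the text refers to.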

Temporal redundancies in videos are usually removed by exploiting the similarity between neighboring frames and constructing a prediction for the current frame. The prediction may be formed from one or more previous or future frames and is improved by block-based motion estimation, which searches neighboring frames within a given range for the best prediction block in order to handle relative motion between objects and the camera. In some situations, e.g., surveillance videos, the background is usually stable and only the foreground changes quickly. Therefore, a long-term reference is selected, or generated by background modeling, from frames decoded several seconds earlier to reduce the background redundancy (Wiegand et al. 1999), while the recently decoded reference frames (called short-term references) reduce the foreground redundancy.
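
Block-based motion estimation as described above can be sketched as an exhaustive (full) search. The sketch assumes grayscale frames stored as lists of rows and uses the sum of absolute differences (SAD) as the matching cost; practical encoders typically use faster search patterns and rate-aware costs. All names and the synthetic frames are hypothetical.

```python
def sad(cur, ref, cy, cx, ry, rx, size):
    """Sum of absolute differences between a block of cur and a block of ref."""
    return sum(
        abs(cur[cy + i][cx + j] - ref[ry + i][rx + j])
        for i in range(size)
        for j in range(size)
    )


def full_search(cur, ref, by, bx, size, search_range):
    """Return (dy, dx, cost) of the best match for the block of cur at (by, bx)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            ry, rx = by + dy, bx + dx
            if ry < 0 or rx < 0 or ry + size > height or rx + size > width:
                continue  # candidate block lies partly outside the reference frame
            cost = sad(cur, ref, by, bx, ry, rx, size)
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best


# Synthetic 8x8 frames: the current frame shows the reference content
# displaced by one row and two columns.
ref = [[10 * y + x for x in range(8)] for y in range(8)]
cur = [[10 * (y + 1) + (x - 2) for x in range(8)] for y in range(8)]

best = full_search(cur, ref, by=2, bx=2, size=4, search_range=2)  # (1, -2, 0)
```

The returned offset plays the role of the motion vector: the encoder would transmit it together with the residual between the current block and the matched reference block.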

Besides prediction within individual images or videos, set prediction techniques have also been proposed to further reduce the redundancies among images and videos. For a set of similar images, a representative image (e.g., an average image) is selected or generated from them, and the images are then compressed after subtracting it (Musatenko and Kurashov 1998; Karadimitriou and Tyler 1998). Another class of approaches organizes the similar images into a video sequence according to their correlation and then compresses the sequence like a video (Chen et al. 2004; Zou et al. 2013). Extending this to videos, several near-duplicate videos can also be jointly compressed, with the current frame choosing its reference frame from the same video or from the other coded similar videos (Wang et al. 2014). After prediction for an individual image or video, or for a set of them, the prediction residuals (created by subtracting the prediction from the actual current signals) usually have smaller values and are generally concentrated around zero.
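
The average-image form of set prediction, and the small residuals it leaves, can be illustrated with a toy sketch. It assumes equally sized grayscale images stored as lists of rows; the function names and pixel values are hypothetical, and this is not the method of any of the cited papers. In a real scheme the representative image would also have to be coded.

```python
def average_image(images):
    """Pixel-wise mean of equally sized images, rounded to the nearest integer."""
    n = len(images)
    height, width = len(images[0]), len(images[0][0])
    return [
        [(sum(img[y][x] for img in images) + n // 2) // n for x in range(width)]
        for y in range(height)
    ]


def residual(image, prediction):
    """Prediction residual: actual signal minus prediction."""
    return [[p - q for p, q in zip(row, pred_row)]
            for row, pred_row in zip(image, prediction)]


def reconstruct(res, prediction):
    """Add the prediction back to the residual to recover the image."""
    return [[r + q for r, q in zip(row, pred_row)]
            for row, pred_row in zip(res, prediction)]


# Three hypothetical 2x3 images of the same scene with slight differences.
images = [
    [[100, 102, 104], [98, 101, 103]],
    [[101, 103, 105], [99, 100, 104]],
    [[99, 101, 103], [97, 102, 102]],
]
avg = average_image(images)
residuals = [residual(img, avg) for img in images]
restored = [reconstruct(res, avg) for res in residuals]
```

Here every residual value lies between -1 and 1 while the original pixels are around 100, which matches the observation that residuals are small and concentrated around zero.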
