MPEG video compression - The MPEG


MPEG-4 also has the ability to downsample prediction error or residual macroblocks which contain little detail. A 16

x 16 macroblock is downsampled to 8 x 8 and flagged. The decoder will identify the flag and interpolate back

to 16 x 16.
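
The downsample-and-interpolate round trip can be sketched as follows. This is only an illustration: the text does not specify the filters, so the 2 x 2 averaging and the pixel replication used here are assumptions, and the function names are hypothetical.

```python
def downsample_2x(block):
    """Halve each dimension by averaging 2 x 2 neighbourhoods (assumed filter)."""
    n = len(block) // 2
    return [[(block[2 * r][2 * c] + block[2 * r][2 * c + 1]
              + block[2 * r + 1][2 * c] + block[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(n)]
            for r in range(n)]


def upsample_2x(block):
    """Double each dimension by pixel replication (stand-in for real interpolation)."""
    n = len(block) * 2
    return [[block[r // 2][c // 2] for c in range(n)] for r in range(n)]


# A residual macroblock with no detail survives the round trip exactly.
flat = [[3] * 16 for _ in range(16)]
small = downsample_2x(flat)      # 8 x 8, sent along with the flag
restored = upsample_2x(small)    # 16 x 16 again at the decoder
```

A block containing fine detail would not be restored exactly, which is why the tool is only applied to residual macroblocks that contain little detail.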

In vector prediction, each macroblock may have either one vector or four, as the coder decides. Consequently the

prediction of a current vector may have to be done from either macroblock or DCT block vectors. In the case of

predicting one vector for an entire macroblock, or the top-left DCT block vector, the process shown in Figure

5.51(b) is used. Three earlier vectors, which may be macroblock or DCT block vectors, as available, are used as

the input to the prediction process. In the diagram the large squares show the macroblock vectors to be selected

and the small squares show the DCT block vectors to be selected. The three vectors are passed to a median filter

which outputs the vector in the centre of the range unchanged.

A median filter is used because the same process can be performed in the decoder with no additional data

transmission. The median vector is used as a prediction, and comparison with the actual vector enables a residual

to be computed and coded for transmission. At the decoder the same prediction can be made and the received

residual is added to re-create the original vector.
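
The symmetry between coder and decoder can be sketched in a few lines. The sketch assumes the median is taken separately on the horizontal and vertical components of the three candidate vectors. The function names and sample vectors are hypothetical, and the selection of the three candidates from Figure 5.51(b) is not modelled.

```python
def median3(a, b, c):
    """Return the middle value of three numbers."""
    return sorted((a, b, c))[1]


def predict_vector(v1, v2, v3):
    """Component-wise median of three (x, y) vectors (assumed form of the filter)."""
    return (median3(v1[0], v2[0], v3[0]), median3(v1[1], v2[1], v3[1]))


def encode_vector(actual, neighbours):
    """Coder side: residual = actual vector minus the median prediction."""
    px, py = predict_vector(*neighbours)
    return (actual[0] - px, actual[1] - py)


def decode_vector(residual, neighbours):
    """Decoder side: the same prediction plus the received residual."""
    px, py = predict_vector(*neighbours)
    return (residual[0] + px, residual[1] + py)


neighbours = [(4, -2), (6, 0), (5, 7)]   # three previously coded vectors
actual = (5, 1)
residual = encode_vector(actual, neighbours)
decoded = decode_vector(residual, neighbours)
```

Because both sides derive the prediction from vectors that have already been decoded, only the residual needs to be transmitted.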

The remaining parts of Figure 5.51(b) show how the remaining three DCT block vectors are predicted from

adjacent DCT block vectors. If the relevant block is only macroblock coded, that vector will be substituted.

5.22 Shape coding

Shape coding is the process of compressing alpha or keying data. Most objects are opaque and so a binary alpha

signal is adequate for the base-level shape system. For each texture pixel, an alpha bit exists forming a binary

alpha map which is effectively a two-dimensional mask through which the object can be seen. At the decoder

binary alpha data are converted to 0 or 255 levels in an eight-bit keying system.
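
As a rough illustration, the conversion from binary alpha to an eight-bit key, and the use of that key as a mask, might look like this. The linear keying formula and the function names are assumptions made for the sketch and are not taken from the standard.

```python
def expand_binary_alpha(alpha_bits):
    """Map each alpha bit to an eight-bit key level: 0 -> 0, 1 -> 255."""
    return [[255 if bit else 0 for bit in row] for row in alpha_bits]


def composite(foreground, background, key):
    """Linear key (assumed): key 255 shows the object, key 0 shows the background."""
    return [[(f * k + b * (255 - k)) // 255
             for f, b, k in zip(f_row, b_row, k_row)]
            for f_row, b_row, k_row in zip(foreground, background, key)]


mask = [[0, 1],
        [1, 0]]
key = expand_binary_alpha(mask)
picture = composite([[10, 20], [30, 40]],        # object texture
                    [[100, 100], [100, 100]],    # background
                    key)
```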

Optionally the object can be faded in the compositing process by sending a constant alpha value at each VOP. As

a further option, variable transparency can be supported by sending alpha texture data. This is coded using the

usual MPEG spatial coding tools, such as the DCT and scanning.

Binary data such as alpha data do not respond well to DCT coding, and another compression technique has been

developed for binary alpha blocks (bab). This is known as context-based coding and it effectively codes the

location of the boundary between alpha one bits and alpha zero bits. Clearly, once the boundary is located, the

values of all remaining alpha bits are obvious.

The babs are raster scanned into a serial bitstream. Context coding works by attempting to predict the state of the

current bit from a set of bits which have already been decoded. Figure 5.53(a) shows the set of bits, known as a

context, used in an intra-coded VOP. There are ten bits in the context and so there can be 1024 different contexts.

Extensive analysis of real shapes shows that for each context there is a certain probability that the current bit will

be zero. This probability exists as a standardized look-up table in the encoder and decoder.
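
The way ten already-decoded bits select one of 1024 table entries can be sketched as below. Figure 5.53(a) is not reproduced here, so the template offsets are an assumption. Treating bits outside the block as zero is also an assumption, and the probability table is a placeholder, not the standardized one.

```python
# Assumed template: (dx, dy) offsets of ten previously decoded bits relative
# to the current bit, all above or to the left of it in raster-scan order.
INTRA_TEMPLATE = [
    (-1, -2), (0, -2), (1, -2),
    (-2, -1), (-1, -1), (0, -1), (1, -1), (2, -1),
    (-2, 0), (-1, 0),
]

# Placeholder only: the real table holds standardized probabilities.
PROB_ZERO = [0.5] * 1024


def context_index(bab, x, y):
    """Pack the ten template bits into an integer in the range 0..1023."""
    index = 0
    for dx, dy in INTRA_TEMPLATE:
        xx, yy = x + dx, y + dy
        inside = 0 <= yy < len(bab) and 0 <= xx < len(bab[0])
        bit = bab[yy][xx] if inside else 0   # assumption: outside bits read as 0
        index = (index << 1) | bit
    return index


bab = [[1] * 4 for _ in range(4)]    # a tiny all-opaque block for illustration
ctx = context_index(bab, 2, 2)
p_zero = PROB_ZERO[ctx]              # probability that the current bit is zero
```

Because every template position precedes the current bit in the scan, the decoder can form the same context from bits it has already recovered and look up the same probability.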

Figure 5.53: (a) The context is a set of bits used for prediction of the current bab bit. (b) The encoder compares the
