MPEG-4 can also downsample prediction error (residual) macroblocks which contain little detail. A 16 x 16
macroblock is downsampled to 8 x 8 and flagged. The decoder identifies the flag and interpolates the block back
to 16 x 16.
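The round trip can be sketched as follows. This is only an illustration: the 2 x 2 averaging used for downsampling and the pixel replication used in place of interpolation are assumptions chosen for brevity, not the filters defined by the standard, and the function names are hypothetical.

```python
def downsample_2x(block):
    """Average each 2 x 2 group of samples, halving both dimensions.

    Assumption: simple averaging; the standard defines its own filter.
    """
    n = len(block) // 2
    return [
        [
            (block[2 * r][2 * c] + block[2 * r][2 * c + 1]
             + block[2 * r + 1][2 * c] + block[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(n)
        ]
        for r in range(n)
    ]


def upsample_2x(small):
    """Restore the original dimensions by repeating each sample 2 x 2.

    Assumption: replication stands in for the decoder's interpolator.
    """
    n = len(small) * 2
    return [[small[r // 2][c // 2] for c in range(n)] for r in range(n)]


# A flat, low-detail 16 x 16 residual survives the round trip unchanged.
residual = [[3] * 16 for _ in range(16)]
reduced = downsample_2x(residual)      # 8 x 8, sent with a flag
restored = upsample_2x(reduced)        # 16 x 16 at the decoder
```

A residual with little detail loses almost nothing in this process, which is why the coder reserves it for such blocks.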
In vector prediction, each macroblock may have only one or four vectors as the coder decides. Consequently the
prediction of a current vector may have to be done from either macroblock or DCT block vectors. In the case of
predicting one vector for an entire macroblock, or the top-left DCT block vector, the process shown in Figure
5.51(b) is used. Three earlier vectors, which may be macroblock or DCT block vectors, as available, are used as
the input to the prediction process. In the diagram the large squares show the macroblock vectors to be selected
and the small squares show the DCT block vectors to be selected. The three vectors are passed to a median filter
which outputs the vector in the centre of the range unchanged.
A median filter is used because the same process can be performed in the decoder with no additional data
transmission. The median vector is used as a prediction, and comparison with the actual vector enables a residual
to be computed and coded for transmission. At the decoder the same prediction can be made and the received
residual is added to re-create the original vector.
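The prediction and residual steps can be sketched as below. The sketch assumes that the median is taken separately on the horizontal and vertical components of the three candidate vectors; the function names and the sample vectors are hypothetical.

```python
def median3(a, b, c):
    """Return the middle value of three numbers."""
    return sorted((a, b, c))[1]


def predict_vector(v1, v2, v3):
    """Component-wise median of three earlier vectors (an assumption)."""
    return (median3(v1[0], v2[0], v3[0]), median3(v1[1], v2[1], v3[1]))


def encode_vector(actual, neighbours):
    """Encoder: transmit only the difference from the prediction."""
    px, py = predict_vector(*neighbours)
    return (actual[0] - px, actual[1] - py)


def decode_vector(residual, neighbours):
    """Decoder: repeat the same prediction and add the residual."""
    px, py = predict_vector(*neighbours)
    return (residual[0] + px, residual[1] + py)


neighbours = [(2, -1), (5, 0), (3, 4)]   # three earlier vectors
actual = (4, 1)                          # vector found by the encoder
residual = encode_vector(actual, neighbours)
recovered = decode_vector(residual, neighbours)
```

Because both ends derive the prediction from vectors that have already been decoded, only the residual needs to be sent.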
The other parts of Figure 5.51(b) show how the remaining three DCT block vectors are predicted from
adjacent DCT block vectors. If a relevant adjacent block has only a macroblock vector, that vector is substituted.
5.22 Shape coding
Shape coding is the process of compressing alpha or keying data. Most objects are opaque and so a binary alpha
signal is adequate for the base-level shape system. For each texture pixel there is an alpha bit, and together these
form a binary alpha map which is effectively a two-dimensional mask through which the object can be seen. At the
decoder, binary alpha data are converted to levels of 0 or 255 in an eight-bit keying system.
Optionally the object can be faded in the compositing process by sending a constant alpha value at each VOP. As
a further option, variable transparency can be supported by sending alpha texture data. This is coded using the
usual MPEG spatial coding tools such as DCT and scanning.
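The conversion of binary alpha to eight-bit key levels, and its use in compositing, can be sketched as follows. The function names are hypothetical, and treating the constant alpha value as a replacement for the opaque level of 255 is an assumption made for illustration.

```python
def binary_alpha_to_key(alpha_bits, constant_alpha=255):
    """Map each alpha bit to an eight-bit key level.

    A zero bit becomes 0; a one bit becomes constant_alpha, which is
    255 for an opaque object (assumption: a lower value fades it).
    """
    return [[constant_alpha if bit else 0 for bit in row] for row in alpha_bits]


def composite(foreground, background, key):
    """Mix object and background pixel by pixel according to the key."""
    return [
        [(f * k + b * (255 - k)) // 255 for f, b, k in zip(frow, brow, krow)]
        for frow, brow, krow in zip(foreground, background, key)
    ]


mask = [[0, 1], [1, 1]]                      # binary alpha map
key = binary_alpha_to_key(mask)              # opaque object
faded_key = binary_alpha_to_key(mask, 128)   # object faded to about half
picture = composite([[200, 200], [200, 200]], [[100, 100], [100, 100]], key)
```

Where the key is 0 the background shows through, and where it is 255 only the object is seen.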
Binary data such as alpha data do not respond well to DCT coding, and another compression technique has been
developed for binary alpha blocks (babs). This is known as context-based coding and it effectively codes the
location of the boundary between alpha one bits and alpha zero bits. Once the boundary is located, the
values of all remaining alpha bits are obvious.
The babs are raster scanned into a serial bitstream. Context coding works by attempting to predict the state of the
current bit from a set of bits which have already been decoded. Figure 5.53(a) shows the set of bits, known as a
context, used in an intra-coded VOP. There are ten bits in the context and so there can be 1024 different contexts.
Extensive analysis of real shapes shows that for each context there is a certain probability that the current bit will
be zero. This probability exists as a standardized look-up table in the encoder and decoder.
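Forming a context number and using it to index the probability table can be sketched as below. The template offsets are an assumption made for illustration (the actual set of bits is the one defined in Figure 5.53(a)), bits outside the block are assumed to be zero, and the table is a placeholder rather than the standardized one.

```python
# Assumed template: (dx, dy) offsets from the current bit. Every entry
# lies to the left on the same row or on an earlier row, so it has
# already been decoded in raster order.
TEMPLATE = [
    (-1, 0), (-2, 0),
    (2, -1), (1, -1), (0, -1), (-1, -1), (-2, -1),
    (1, -2), (0, -2), (-1, -2),
]

# Placeholder table: one probability of "current bit is zero" for each
# of the 1024 contexts. The real values are fixed by the standard.
P_ZERO = [0.5] * (1 << len(TEMPLATE))


def context_number(bab, x, y):
    """Pack the ten template bits around (x, y) into a number 0..1023."""
    ctx = 0
    for k, (dx, dy) in enumerate(TEMPLATE):
        xx, yy = x + dx, y + dy
        inside = 0 <= yy < len(bab) and 0 <= xx < len(bab[0])
        bit = bab[yy][xx] if inside else 0   # assumption: outside is zero
        ctx |= bit << k
    return ctx


def probability_of_zero(bab, x, y):
    """Look up the probability that the bit at (x, y) is zero."""
    return P_ZERO[context_number(bab, x, y)]


all_ones = [[1] * 5 for _ in range(5)]
centre_context = context_number(all_ones, 2, 2)   # all ten bits set
corner_context = context_number(all_ones, 0, 0)   # no earlier bits exist
```

Since encoder and decoder hold the same table and see the same earlier bits, both arrive at the same probability without any side information.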
Figure 5.53: (a) The context is a set of bits used for prediction of the current bab bit. (b) The encoder compares the
 