signifying negative values. Then the remaining quantizer labels are encoded in reverse scan
order. After this, the total number of 0s in the scan between the beginning of the scan and the
last nonzero label is encoded. This will be a number between 0 and 16 − N, where N is the
number of nonzero labels. Then the run of
zeros before each label starting with the last nonzero label is encoded until we run out of zeros
or coefficients. The number of bits used to code each zero run will depend on the number of
zeros remaining to be assigned.
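The reverse-scan run-length idea can be sketched as follows. This is an illustrative Python fragment under our own naming, not the standard's actual code tables: in particular, it ignores the rule that the bit width of each run depends on the zeros still left to assign.

```python
def zero_runs_reverse(labels):
    """Collect the total zero count and the run of zeros before each
    nonzero label, working from the last nonzero label back toward
    the start of the scan (illustrative sketch, not the H.264 tables).
    Assumes at least one nonzero label is present."""
    # Trim the zeros that follow the last nonzero label; they are
    # never coded explicitly.
    last = max(i for i, v in enumerate(labels) if v != 0)
    scan = labels[:last + 1]
    total_zeros = scan.count(0)
    runs, run = [], 0
    for v in reversed(scan):
        if v == 0:
            run += 1            # extend the current run of zeros
        else:
            runs.append(run)    # run of zeros preceding this label
            run = 0
    return total_zeros, runs
```

For example, the scan [3, 0, 0, −1, 0, 2, 0, 0] has three zeros before its last nonzero label, and the runs preceding the labels 2, −1, and 3 (in reverse order) are 0, 1, and 2.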
In the second technique, which provides higher compression, all values are first converted
to binary strings. This binarization is performed, depending on the data type, using unary
codes, truncated unary codes, exponential Golomb codes, and fixed-length codes, plus five
specific binary trees for encoding macroblock and sub-macroblock types. The binary string is
encoded in one of two ways. Redundant strings are encoded using a context-adaptive binary
arithmetic code. Binary strings that are random, such as the suffixes of the exponential Golomb
codes, bypass the arithmetic coder. The arithmetic coder has 399 contexts available to it, with
325 of these contexts used for encoding the quantizer labels. These numbers include contexts
for both frame and field slices. In a pure frame or field slice only 277 of the 399 context
models are used. These context models are simply Cum_Count tables for use with a binary
arithmetic coder. The H.264 standard recommends a multiplication-free implementation of
binary arithmetic coding.
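The binarizations named above can be sketched in a few lines of Python. This is a minimal illustration of the code constructions themselves; the standard's binarization tables, context selection, and the five macroblock-type trees are considerably more involved.

```python
def unary(n):
    # Unary code: n ones terminated by a single zero.
    return "1" * n + "0"

def truncated_unary(n, cmax):
    # Truncated unary: identical to unary except that the
    # terminating zero is dropped when n reaches the maximum
    # value cmax, since no larger value can follow.
    return "1" * n if n == cmax else "1" * n + "0"

def exp_golomb(n):
    # Zeroth-order exponential Golomb code: write n + 1 in
    # binary, then prefix it with (length - 1) zeros.
    b = bin(n + 1)[2:]
    return "0" * (len(b) - 1) + b
```

Note that the bits of the exponential Golomb codeword after the leading 1 are close to uniformly distributed; these are the suffix bits that, as described above, bypass the arithmetic coder.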
The H.264 standard is substantially more flexible than previous standards, with a much
broader range of applications. In terms of performance, it claims a 50% reduction in bit rate
over previous standards for equivalent perceptual quality [262].
19.12 MPEG-4 Part 2
The MPEG-4 standard provides a more abstract approach to the coding of multimedia. The
standard views a multimedia “scene” as a collection of objects. These objects can be visual,
such as a still background or a talking head, or aural, such as speech, music, background noise,
and so on. Each of these objects can be coded independently using different techniques to
generate separate elementary bitstreams. These bitstreams are multiplexed along with a scene
description. A language called the Binary Format for Scenes (BIFS) based on the Virtual
Reality Modeling Language (VRML) has been developed by MPEG for scene descriptions.
The decoder can use the scene description and additional input from the user to combine or
compose the objects to reconstruct the original scene or create a variation on it. The protocol for
managing the elementary streams and their multiplexed version, called the Delivery Multimedia
Integration Framework (DMIF), is an important part of MPEG-4. However, as our focus in
this book is on compression, we will not discuss the protocol (for details, see the standard
[221]).
A block diagram for the basic video coding algorithm is shown in Figure 19.18. Although
shape coding occupies a very small portion of the diagram, it is a major part of the algorithm.
The different objects that make up the scene are coded and sent to the multiplexer. The informa-
tion about the presence of these objects is also provided to the motion-compensated predictor,
which can use object-based motion-compensation algorithms to improve the compression ef-
ficiency. What is left after the prediction can be transmitted using a DCT-based coder. The