placed in a frame store. Decoding continues in this way until the end of the slice when an absolute DC coefficient
will once again be sent. Once all the slices have been decoded, an entire picture will be resident in the frame store.
The amount of data needed to decode the picture is variable and the decoder just keeps going until the last
macroblock is found. It will obtain data from the input buffer. In a constant bit rate transmission system, the decoder
will remove more data to decode an I picture than has been received in one picture period, leaving the buffer
emptier than it began. Subsequent P and B pictures need much less data and allow the buffer to fill again.
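The buffer behaviour described above can be illustrated with a minimal sketch. The picture sizes, channel rate and starting level below are hypothetical figures chosen only to make the arithmetic easy to follow, and the function name simulate_buffer is an assumption, not part of any MPEG specification.

```python
def simulate_buffer(picture_sizes, bits_per_period, start_level):
    """Return the buffer occupancy (in bits) after each picture is decoded.

    In every picture period the channel delivers bits_per_period bits at a
    constant rate, then the decoder removes all the bits of one picture.
    """
    level = start_level
    history = []
    for size in picture_sizes:
        level += bits_per_period          # constant bit rate arrival
        if size > level:
            raise ValueError("buffer underflow: picture not fully received")
        level -= size                     # decoder removes one picture's data
        history.append(level)
    return history


# Hypothetical sizes in bits: a large I picture, then a P and two B pictures.
sizes = [400_000, 200_000, 100_000, 100_000]
levels = simulate_buffer(sizes, bits_per_period=200_000, start_level=250_000)
print(levels)  # [50000, 50000, 150000, 250000]
```

With these assumed figures the I picture takes twice as much data as arrives in one picture period, so the buffer drops from 250 000 to 50 000 bits, and the smaller pictures that follow let it recover to its starting level.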
Figure 5.42: A bi-directional MPEG-2 decoder. See text for details.
The picture will be output when the time stamp (see Chapter 6) sent with the picture matches the state of the
decoder's time count.
Following the I picture may be another I picture or a P picture. Assuming a P picture, this will be predictively coded
from the I picture. The P picture will be divided into slices as before. The first vector in a slice is absolute, but
subsequent vectors are sent differentially. However, the DC coefficients are not differential.
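As a rough illustration of the differential vector scheme, the sketch below rebuilds the vectors of one slice from the transmitted values. It assumes that every macroblock carries a vector and ignores the variable-length coding and range scaling of the real bitstream; the function name decode_slice_vectors and the sample values are hypothetical.

```python
def decode_slice_vectors(coded):
    """Rebuild motion vectors for one slice.

    The predictor is reset to zero at the start of the slice, so the first
    transmitted value is effectively absolute. Each later value is a
    difference that is added to the previously decoded vector.
    """
    vectors = []
    pred_x, pred_y = 0, 0                 # reset at the start of every slice
    for dx, dy in coded:
        pred_x += dx
        pred_y += dy
        vectors.append((pred_x, pred_y))
    return vectors


# Hypothetical transmitted values (units assumed to be half-pixels).
coded = [(6, -2), (1, 0), (0, 1), (-2, 0)]
print(decode_slice_vectors(coded))  # [(6, -2), (7, -2), (7, -1), (5, -1)]
```

Because neighbouring macroblocks often move in a similar way, the differences tend to be small numbers, which is what makes the differential form economical.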
Each macroblock may contain a forward vector. The decoder uses this to shift pixels from the I picture into the
correct position for the predicted P picture. The vectors have half-pixel resolution and where a half-pixel shift is
required, an interpolator will be used.
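The pixel shifting and half-pixel interpolation can be sketched as follows. This is a simplified model under stated assumptions: vectors are in half-pixel units, the interpolator is a rounded average of neighbouring pixels, and the reference area is assumed to lie entirely inside the picture, so no edge handling is shown. The function name predict_block is hypothetical.

```python
def predict_block(ref, top, left, vec_x, vec_y, size):
    """Fetch a size x size block from ref, displaced by a half-pixel vector.

    ref is a list of rows of pixel values. (top, left) is the position of
    the block being predicted; (vec_x, vec_y) is the vector in half-pixels.
    """
    # Split each component into whole pixels and a half-pixel flag.
    # Floor division keeps this correct for negative vectors.
    ix, hx = vec_x // 2, vec_x % 2
    iy, hy = vec_y // 2, vec_y % 2
    block = []
    for r in range(size):
        row = []
        for c in range(size):
            y = top + iy + r
            x = left + ix + c
            # Average the one, two or four pixels around the target point.
            total = (ref[y][x] + ref[y][x + hx]
                     + ref[y + hy][x] + ref[y + hy][x + hx])
            row.append((total + 2) // 4)  # rounded average
        block.append(row)
    return block


ref = [[0, 10, 20, 30],
       [40, 50, 60, 70],
       [80, 90, 100, 110],
       [120, 130, 140, 150]]
print(predict_block(ref, 1, 1, 0, 0, 2))   # [[50, 60], [90, 100]]
print(predict_block(ref, 1, 1, 1, 0, 2))   # [[55, 65], [95, 105]]
```

When both half-pixel flags are zero the four terms are the same pixel and the block is simply copied; a half-pixel shift in one axis averages two neighbours, and a shift in both axes averages four.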
The DCT data is sent much as for an I picture. It will require inverse quantizing, but not inverse weighting because
P and B coefficients are flat-weighted. When decoded, the data represents an error-cancelling picture which is added
pixel-by-pixel to the motion-predicted picture to produce the output picture.
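The final addition step might look like the sketch below. Clipping the sum to the 0 to 255 range is an assumption made here for 8-bit video, and the function name reconstruct and the sample values are hypothetical.

```python
def reconstruct(predicted, error):
    """Add the decoded error-cancelling picture to the prediction.

    Both arguments are lists of rows of equal size. The error values are
    signed; the result is clipped to the 8-bit range (an assumption).
    """
    return [
        [max(0, min(255, p + e)) for p, e in zip(pred_row, err_row)]
        for pred_row, err_row in zip(predicted, error)
    ]


predicted = [[100, 200], [5, 250]]
error = [[-3, 4], [-10, 10]]
print(reconstruct(predicted, error))  # [[97, 204], [0, 255]]
```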
If bidirectional coding is being used, the P picture may be stored until one or more B pictures have been decoded.
The B pictures are sent essentially as a P picture might be, except that the vectors can be forward, backward or
bidirectional. The decoder must take pixels from the I picture, the P picture, or both, and shift them according to the
vectors to make a predicted picture. The DCT data decodes to produce an error-cancelling image as before.
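The three prediction choices for a B picture can be sketched as below. The sketch assumes that the two candidate blocks have already been shifted according to their vectors, and that the bidirectional case is a rounded average of the two. The function name predict_b_block and the mode strings are hypothetical.

```python
def predict_b_block(mode, from_past=None, from_future=None):
    """Form the prediction for one block of a B picture.

    from_past is the motion-shifted block from the earlier picture and
    from_future is the motion-shifted block from the later picture.
    """
    if mode == "forward":
        return [row[:] for row in from_past]
    if mode == "backward":
        return [row[:] for row in from_future]
    if mode == "bidirectional":
        # Rounded average of the two shifted blocks.
        return [
            [(a + b + 1) // 2 for a, b in zip(past_row, future_row)]
            for past_row, future_row in zip(from_past, from_future)
        ]
    raise ValueError("unknown prediction mode: " + mode)


past = [[100, 102]]
future = [[110, 105]]
print(predict_b_block("bidirectional", past, future))  # [[105, 104]]
```

The error-cancelling data is then added to whichever prediction was formed, in the same way as for a P picture.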
In an interlaced system, the prediction mechanism may alternatively obtain pixel data from the previous field or the
field before that. Vectors may relate to macroblocks or to 16 × 8 pixel areas. DCT blocks after decoding may
represent frame lines or field lines. This adds up to a lot of different possibilities for a decoder handling an
interlaced input.
5.19 MPEG-4
As was seen in Chapter 1, MPEG-4 advances the coding art in a number of ways. Whereas MPEG-1 and MPEG-2
were directed only to coding the video pictures which resulted after shooting natural scenes or from computer
synthesis, MPEG-4 also moves further back in the process of how those scenes were created. For example, the
rotation of a detailed three-dimensional object before a video camera produces huge changes in the video from
picture to picture which MPEG-2 would find difficult to code. Instead, if the three-dimensional object is re-created at
the decoder, rotation can be portrayed by transmitting a trivially small amount of vector data.
If the above object is synthetic, effectively the synthesis or rendering process is completed in the decoder.
However, a suitable if complex image processor at the encoder could identify such objects in natural scenes.
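The saving offered by re-creating the object at the decoder can be illustrated with a small sketch. It is not MPEG-4 syntax: it simply assumes the decoder already holds the object as a list of three-dimensional points, so that a rotation needs only one angle to be sent per picture. The function name rotate_about_z and the sample points are hypothetical.

```python
import math


def rotate_about_z(points, angle_deg):
    """Rotate a list of (x, y, z) points about the z axis."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


# The decoder's copy of the object: here just four corner points.
corners = [(1.0, 0.0, 0.5), (0.0, 1.0, 0.5), (-1.0, 0.0, 0.5), (0.0, -1.0, 0.5)]

# One transmitted number (the angle) moves every point of the object.
rotated = rotate_about_z(corners, 90.0)
print([tuple(round(v, 6) for v in p) for p in rotated])
```

However many points the object has, the data sent for the rotation stays the same size, whereas the pixel-level changes in the resulting pictures could be large.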
An MPEG-4 object is defined as a part of a scene which can be independently accessed or manipulated. An object
is an entity that exists over a certain time span. The pictures of conventional imaging become object planes in
MPEG-4. Where an object intersects an object plane, it can be described by the coding system using intra-coding,
forward prediction or bidirectional prediction.
Figure 5.43 shows that MPEG-4 has four object types. A video object is an arbitrarily shaped planar pixel array
describing the appearance or texture of part of a scene. A still texture object or sprite is a planar video object in
which there is no change with respect to time. A mesh object describes a two- or three-dimensional shape as a set
of points. The shape and its position can change with respect to time. Using computer graphics techniques, texture
can be mapped onto meshes, a process known as warping, to produce rendered images.
 