7.3 Structure of the Video Elementary Stream
The smallest unit of the video stream is a block of 8 x 8 pixels. Each
block is subjected to a separate Discrete Cosine Transform (DCT) during
encoding. In the 4:2:0 format, four luminance blocks, one C_B block and
one C_R block together form one macroblock. Each macroblock can be
quantized to a different degree, i.e. compressed to a greater or lesser
extent. To this end, the video encoder can select different scaling
factors by which each DCT coefficient is additionally divided. These
quantizer scaling factors are the actual “set screws” for the data rate of
the video PES stream. The quantization table itself cannot be exchanged
from macroblock to macroblock. Each macroblock can be either frame encoded
or field encoded. The encoder decides this on the basis of necessity and
opportunity: field encoding becomes necessary when there is motion between
the first and second field, and the available data rate determines whether
there is an opportunity to use it.
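The relationship between blocks, macroblocks and the quantizer scaling factor can be illustrated with a minimal sketch. It is not the exact arithmetic of the standard: the function names, the flat quantization table and the test coefficients are assumptions made for illustration, and truncating division stands in for the integer rounding rules a real encoder would follow.

```python
BLOCK = 8  # edge length of a block in pixels


def split_macroblock(y16, cb8, cr8):
    """Return the six 8 x 8 blocks of a 4:2:0 macroblock: Y0..Y3, C_B, C_R.

    y16 is a 16 x 16 luminance area, cb8 and cr8 are 8 x 8 chrominance blocks
    (all given as lists of rows).
    """
    blocks = [[row[bx:bx + BLOCK] for row in y16[by:by + BLOCK]]
              for by in (0, BLOCK) for bx in (0, BLOCK)]
    return blocks + [cb8, cr8]


def quantize(coeffs, table, scale):
    """Divide each DCT coefficient by its table entry and by the scale factor.

    Simplified: int() truncates toward zero; a real encoder uses the exact
    integer arithmetic defined by the standard.
    """
    return [[int(c / (q * scale)) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]


def count_nonzero(block):
    """Number of coefficients that survive quantization."""
    return sum(1 for row in block for v in row if v != 0)


# Hypothetical data: a flat table and coefficients that fall off with frequency.
table = [[16] * BLOCK for _ in range(BLOCK)]
coeffs = [[160 // (1 + r + c) for c in range(BLOCK)] for r in range(BLOCK)]

fine = quantize(coeffs, table, 1)    # small scaling factor: many coefficients remain
coarse = quantize(coeffs, table, 4)  # large scaling factor: most become zero
print(count_nonzero(fine), count_nonzero(coarse))
```

The table stays the same in both calls; only the scaling factor changes, which is what allows the data rate to be adjusted from macroblock to macroblock.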
Together, a certain number of macroblocks in a row form a slice
(Fig. 7.28). Each slice starts with a header which is used for
resynchronization, e.g. in the case of bit errors. At the level of the
video stream, error concealment mainly takes place at slice level: when
bit errors occur, the MPEG decoder copies the corresponding slice of the
preceding frame into the current frame. The decoder can resynchronize at
the beginning of the next slice. The shorter the slices, the smaller the
picture area disturbed by a bit error.
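Resynchronization and slice-level concealment can be sketched as follows. The sketch assumes the MPEG-2 video start code convention (prefix 0x000001 followed by a code byte, with 0x01 to 0xAF denoting a slice). The function names and the representation of a frame as a list of slices are hypothetical.

```python
START_PREFIX = b"\x00\x00\x01"  # start code prefix


def next_slice_start(data, pos=0):
    """Return the offset of the next slice start code at or after pos, or -1."""
    while True:
        pos = data.find(START_PREFIX, pos)
        if pos < 0 or pos + 3 >= len(data):
            return -1
        if 0x01 <= data[pos + 3] <= 0xAF:  # code byte in the slice range
            return pos
        pos += 1  # some other start code (e.g. picture header): keep searching


def conceal(current, previous, damaged):
    """Replace damaged slices with the same slices of the preceding frame.

    current and previous are lists of slices; damaged is a set of slice indices.
    """
    return [previous[i] if i in damaged else s for i, s in enumerate(current)]


# Hypothetical frames with four slices each; slice 2 of the current frame is corrupt.
previous = ["p0", "p1", "p2", "p3"]
current = ["c0", "c1", "??", "c3"]
print(conceal(current, previous, {2}))
```

Only the damaged slice is replaced, so the visible disturbance is limited to the area that slice covers.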
Many slices together then form a frame (picture). A frame, too, starts
with a header, the picture header. There are different types of frames,
called I (intraframe) frames, P (predicted) frames and B (bidirectionally
predicted) frames. Because of the bidirectional differential coding, the
frames are not transmitted in their original order. The headers, and
especially the PES headers, therefore carry time stamps (the decoding time
stamp, DTS, alongside the presentation time stamp, PTS) so that the
original order can be restored.
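The reordering can be illustrated with a short sketch. It assumes that each B frame is transmitted after the following reference frame (I or P) on which it depends, and it uses a plain display index as a hypothetical stand-in for the time stamps carried in the headers.

```python
def to_coded_order(display):
    """Reorder frames from display order into transmission order.

    display is a list of (frame_type, display_index) tuples. B frames are held
    back until the next I or P frame has been sent.
    """
    coded, pending_b = [], []
    for frame in display:
        if frame[0] == "B":
            pending_b.append(frame)
        else:
            coded.append(frame)      # reference frame goes first
            coded.extend(pending_b)  # then the B frames that precede it in display order
            pending_b = []
    # Sketch only: trailing B frames would really wait for the next reference frame.
    coded.extend(pending_b)
    return coded


def to_display_order(coded):
    """Restore the original order by sorting on the (stand-in) time stamp."""
    return sorted(coded, key=lambda frame: frame[1])


display = [("I", 0), ("B", 1), ("B", 2), ("P", 3), ("B", 4), ("B", 5), ("P", 6)]
coded = to_coded_order(display)
print([t + str(i) for t, i in coded])
```

Sorting the received frames by their time stamp recovers the sequence in which the pictures were originally captured.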
Together, a certain number of frames, following a pattern of I, P and B
frame coding predetermined by the encoder, form a group of pictures (GOP).
Each GOP has a GOP header. In broadcasting, relatively short GOPs are
used, as a rule about 12 frames long, i.e. about half a second at 25
frames per second. The MPEG decoder can only lock to the signal and begin
to reproduce pictures when it receives the start of a GOP, i.e. the first
I frame. Longer GOPs can be chosen for mass storage media such as the DVD
since it is easy to position the read head on the first I frame.
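The figures quoted for the GOP length can be checked with a few lines of arithmetic, and the search for the start of a GOP can be sketched in the same way. The function names are hypothetical. The start code 0x000001B8 is the group start code of MPEG-2 video, and the byte string in the example is made-up test data rather than a real stream.

```python
GROUP_START_CODE = b"\x00\x00\x01\xb8"  # start code preceding the GOP header


def gop_duration(frames_per_gop, frame_rate):
    """Duration of one GOP in seconds."""
    return frames_per_gop / frame_rate


def find_gop_start(data, pos=0):
    """Return the offset of the next GOP header at or after pos, or -1."""
    return data.find(GROUP_START_CODE, pos)


# A 12-frame GOP at 25 frames per second lasts just under half a second.
# A decoder that has just missed a GOP start waits almost this long before
# the next one arrives.
print(gop_duration(12, 25))

# Made-up byte string: some other start code, filler, then a GOP start code.
stream = b"\x00\x00\x01\xb3xx\x00\x00\x01\xb8yy"
print(find_gop_start(stream))
```

A longer GOP lengthens this waiting time accordingly, which is why short GOPs are preferred where viewers switch into a running stream.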