Figure 5.28 shows a variety of possible GOP (group of pictures in MPEG-1 and MPEG-2) or GOV (group of video
object planes in MPEG-4) structures. The simplest is the III . . . sequence in which every picture (or object in
MPEG-4) is intra-coded. Each picture can be fully decoded without reference to any other picture or object, and so
editing is straightforward. However, this approach requires about two-and-one-half times the bit rate of a full
bidirectional system.
Figure 5.28: Various possible GOP structures used with MPEG. See text for details.
Bidirectional coding is most useful for final delivery of post-produced material, either by broadcast or on
prerecorded media, as there is then no editing requirement. As a compromise, the IBIB . . . structure can be used,
which has some of the bit rate advantage of bidirectional coding but without too much latency. It is possible to edit
an IBIB stream by performing some processing. If the video following a B picture is to be removed, that B picture
can no longer be decoded, because bidirectional decoding needs the I pictures on either side of it. The solution is
to decode the B picture first, while both I pictures are still available, and then re-encode it with forward
prediction only from the previous I picture. The subsequent I picture can then be replaced by an edit process. Some
quality loss is inevitable in this process, but this is acceptable in applications such as ENG (electronic news
gathering) and industrial video.
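The IBIB edit described above can be sketched on picture types alone. This is a minimal illustration, not real bitstream handling: the function name cut_after_b is hypothetical, pictures are represented only by their type letters in display order, and the re-encoded picture is labelled 'P' purely as shorthand for "forward prediction only from the previous I picture".

```python
def cut_after_b(pictures, cut_index):
    """Return the pictures kept when the stream is cut after pictures[cut_index].

    pictures is a list of picture-type letters in display order, for
    example ['I', 'B', 'I', 'B']. Only the types are modelled here; no
    actual decoding or encoding takes place.
    """
    kept = list(pictures[:cut_index + 1])
    if kept and kept[-1] == 'B':
        # The B picture has lost the I picture that followed it, so it is
        # assumed to have been decoded while both I pictures were still
        # available and re-encoded with forward prediction only.
        kept[-1] = 'P'
    return kept


stream = list("IBIBIBIB")
edited = cut_after_b(stream, 3)
print("".join(edited))  # IBIP
```

If the cut falls after an I picture, the kept pictures are returned unchanged, because every remaining B picture still has an I picture on both sides.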
5.11 Intra-coding
Intra-coding or spatial compression in MPEG is used in I pictures on actual picture data and in P and B pictures on
prediction error data. MPEG-1 and MPEG-2 use the discrete cosine transform described in section 3.13. In still
pictures, MPEG-4 may also use the wavelet transform described in section 3.14.
Entering the spatial frequency domain has two main advantages. It allows dominant spatial frequencies which
occur in real images to be efficiently coded, and it allows noise shaping to be used. As the eye is not uniformly
sensitive to noise at all spatial frequencies, dividing the information up into frequency bands allows a different noise
level to be produced in each.
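As a rough illustration of noise shaping, the sketch below quantizes each coefficient of an 8 x 8 block with its own step size, so that the quantizing error (the noise) is bounded differently at each spatial frequency. The step_size rule, its base and slope parameters, and the sample coefficient values are all assumptions made for illustration; they are not the weighting used by any MPEG standard.

```python
def step_size(u, v, base=2.0, slope=1.0):
    """Hypothetical quantizer step for the coefficient at frequency (u, v).

    The step is assumed to grow with spatial frequency, so more noise is
    allowed where it is assumed to be less visible.
    """
    return base + slope * (u + v)


def quantize_block(coeffs):
    """Quantize and reconstruct an 8 x 8 block of transform coefficients."""
    return [[round(coeffs[v][u] / step_size(u, v)) * step_size(u, v)
             for u in range(8)] for v in range(8)]


# Made-up coefficients that fall away at higher frequencies.
coeffs = [[100.0 / (1 + u + v) for u in range(8)] for v in range(8)]
recon = quantize_block(coeffs)

for u, v in [(0, 0), (7, 7)]:
    error = abs(recon[v][u] - coeffs[v][u])
    print((u, v), "step", step_size(u, v), "error", round(error, 3))
```

With this rule the error in any coefficient can never exceed half its step size, so the permitted noise level rises from at most 1.0 at the DC term to at most 8.0 at the highest frequency.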
The DCT works on blocks and in MPEG these are 8 x 8 pixels. Section 5.7 showed how the macroblocks of the
motion compensation structure are designed so they can be broken down into 8 x 8 DCT blocks. In a 4:2:0
macroblock there will be six DCT blocks whereas in a 4:2:2 macroblock there will be eight.
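The block counts above follow from simple arithmetic, sketched below. The sketch assumes the usual 16 x 16 luminance macroblock and expresses each format as horizontal and vertical colour subsampling factors; the function and table names are illustrative only.

```python
BLOCK = 8        # DCT block is 8 x 8 pixels
MACROBLOCK = 16  # assumed luminance macroblock size, 16 x 16 pixels

# (horizontal, vertical) subsampling factors of each colour difference signal
CHROMA_SUBSAMPLING = {"4:2:0": (2, 2), "4:2:2": (2, 1)}


def dct_blocks_per_macroblock(fmt):
    """Count the 8 x 8 DCT blocks in one macroblock of the given format."""
    h, v = CHROMA_SUBSAMPLING[fmt]
    luma_blocks = (MACROBLOCK // BLOCK) ** 2
    chroma_width = MACROBLOCK // h
    chroma_height = MACROBLOCK // v
    blocks_per_chroma = (chroma_width // BLOCK) * (chroma_height // BLOCK)
    # Four luminance blocks plus the blocks of the two colour difference signals.
    return luma_blocks + 2 * blocks_per_chroma


print(dct_blocks_per_macroblock("4:2:0"))  # 6
print(dct_blocks_per_macroblock("4:2:2"))  # 8
```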
Figure 5.29 shows the table of basis functions or wave table for an 8 x 8 DCT. Adding these two-dimensional
waveforms together in different proportions will give any original 8 x 8 pixel block. The coefficients of the DCT
simply control the proportion of each wave which is added in the inverse transform. The top-left wave has no
modulation at all because it conveys the DC component of the block. This coefficient will be a unipolar (positive
only) value in the case of luminance and will typically be the largest value in the block as the spectrum of typical
video signals is dominated by the DC component.
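The role of the wave table can be made concrete with a small pure-Python sketch. It builds the 64 basis functions of an orthonormal 8 x 8 DCT, obtains the coefficients of a block, and rebuilds the block by adding the basis functions in the proportions given by those coefficients. The function names and the test block are illustrative assumptions, and the direct summation is chosen for clarity rather than speed.

```python
import math

N = 8


def scale(k):
    """Normalizing factor that makes the transform orthonormal."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def basis(u, v):
    """Basis function (wave) for horizontal frequency u and vertical frequency v."""
    return [[scale(u) * scale(v)
             * math.cos((2 * x + 1) * u * math.pi / (2 * N))
             * math.cos((2 * y + 1) * v * math.pi / (2 * N))
             for x in range(N)] for y in range(N)]


def forward_dct(block):
    """Each coefficient measures how much of one basis function is in the block."""
    coeffs = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            wave = basis(u, v)
            coeffs[v][u] = sum(block[y][x] * wave[y][x]
                               for y in range(N) for x in range(N))
    return coeffs


def inverse_dct(coeffs):
    """Add the basis functions together in the proportions set by the coefficients."""
    block = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            wave = basis(u, v)
            for y in range(N):
                for x in range(N):
                    block[y][x] += coeffs[v][u] * wave[y][x]
    return block


# Illustrative luminance block: a gentle diagonal ramp.
block = [[60 + 2 * x + 3 * y for x in range(N)] for y in range(N)]
coeffs = forward_dct(block)
rebuilt = inverse_dct(coeffs)

print("DC coefficient:", round(coeffs[0][0], 3))
print("largest rebuild error:",
      max(abs(rebuilt[y][x] - block[y][x]) for y in range(N) for x in range(N)))
```

The basis function for u = 0, v = 0 has the same value at every pixel, which is why its coefficient carries the DC component. With this normalization the DC coefficient equals the sum of the 64 pixel values divided by eight.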
 