MPEG video compression - The MPEG

Figure 5.28 shows a variety of possible GOP (group of pictures in MPEG-1 and MPEG-2) or GOV (group of video

object planes in MPEG-4) structures. The simplest is the III . . . sequence in which every picture (or object in

MPEG-4) is intra-coded. These can be fully decoded without reference to any other picture or object and so editing

is straightforward. However, this approach requires about two-and-one-half times the bit rate of a full bidirectional

system.

Figure 5.28: Various possible GOP structures used with MPEG. See text for details.

Bidirectional coding is most useful for final delivery of post-produced material either by broadcast or on prerecorded

media as there is then no editing requirement. As a compromise the IBIB . . . structure can be used which has

some of the bit rate advantage of bidirectional coding but without too much latency. It is possible to edit an IBIB

stream by performing some processing. If the video following a B picture is to be removed, that B picture

can no longer be decoded, because bidirectional decoding needs the I pictures on either side of it. The solution is to

decode the B picture first, and then re-encode it with forward prediction only from the previous I picture. The

subsequent I picture can then be replaced by an edit process. Some quality loss is inevitable in this process but

this is acceptable in applications such as ENG and industrial video.
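
The dependency problem behind this edit can be sketched in a few lines of Python. This is only an illustration, assuming an I/B-only stream held in display order; the function names and the "Bf" label (a B picture re-encoded with forward prediction only) are hypothetical and are not MPEG syntax.

```python
def needed_refs(seq, i):
    """Indices of the pictures that picture i is predicted from.

    seq is in display order and uses "I", "B" (bidirectional) and
    "Bf" (hypothetical label: B re-encoded with forward prediction only).
    """
    if seq[i] == "I":
        return []  # intra-coded: decodable on its own
    earlier = [j for j in range(i) if seq[j] == "I"]
    later = [j for j in range(i + 1, len(seq)) if seq[j] == "I"]
    if not earlier:
        raise ValueError("no previous I picture to predict from")
    if seq[i] == "Bf":
        return [earlier[-1]]  # forward prediction only
    if not later:
        raise ValueError("B picture has lost its following I picture")
    return [earlier[-1], later[0]]


def cut_after(seq, k):
    """Keep pictures 0..k; a trailing B picture becomes forward-only."""
    kept = list(seq[: k + 1])
    if kept[-1] == "B":
        kept[-1] = "Bf"  # stands in for decode, then re-encode from the previous I
    return kept


stream = list("IBIBI")
edited = cut_after(stream, 3)
print(edited)                  # ['I', 'B', 'I', 'Bf']
print(needed_refs(edited, 3))  # [2]
```

Simply truncating the list after the B picture would leave needed_refs raising an error for it, which mirrors the undecodable picture described in the text.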

5.11 Intra-coding

Intra-coding or spatial compression in MPEG is used in I pictures on actual picture data and in P and B pictures on

prediction error data. MPEG-1 and MPEG-2 use the discrete cosine transform described in section 3.13. In still

pictures, MPEG-4 may also use the wavelet transform described in section 3.14.

Entering the spatial frequency domain has two main advantages. It allows dominant spatial frequencies which

occur in real images to be efficiently coded, and it allows noise shaping to be used. As the eye is not uniformly

sensitive to noise at all spatial frequencies, dividing the information up into frequency bands allows a different noise

level to be produced in each.
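
One way to picture noise shaping is to quantize each frequency coefficient with its own step size, so that the rounding error, which is the noise, is allowed to grow with spatial frequency. The sketch below assumes an 8 x 8 array of coefficients indexed as coeffs[v][u]; the step-size rule and its base and slope values are invented for illustration and are not taken from any MPEG weighting table.

```python
def step_size(u, v, base=8.0, slope=4.0):
    """Hypothetical quantizer step: coarser at higher spatial frequencies."""
    return base + slope * (u + v)


def quantize(coeffs):
    """Round each coefficient to a whole number of steps."""
    return [[round(coeffs[v][u] / step_size(u, v)) for u in range(8)]
            for v in range(8)]


def dequantize(levels):
    """Scale the integer levels back to coefficient values."""
    return [[levels[v][u] * step_size(u, v) for u in range(8)]
            for v in range(8)]


def worst_case_error(u, v):
    """Rounding to the nearest step is never wrong by more than half a step."""
    return step_size(u, v) / 2


print(worst_case_error(0, 0))  # 4.0
print(worst_case_error(7, 7))  # 32.0
```

With these assumed values the noise permitted in the highest-frequency band is eight times that permitted at DC, which is the kind of frequency-dependent noise level the paragraph describes.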

The DCT works on blocks and in MPEG these are 8 x 8 pixels. Section 5.7 showed how the macroblocks of the

motion compensation structure are designed so they can be broken down into 8 x 8 DCT blocks. In a 4:2:0

macroblock there will be six DCT blocks whereas in a 4:2:2 macroblock there will be eight.
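
The block counts follow from simple arithmetic, sketched below. The sketch assumes the usual 16 x 16 luminance macroblock, with each colour difference component occupying 8 x 8 samples per macroblock in 4:2:0 and 8 samples wide by 16 high in 4:2:2; the function name is hypothetical.

```python
DCT_SIZE = 8          # DCT blocks are 8 x 8 pixels
MACROBLOCK_SIZE = 16  # assumed luminance macroblock is 16 x 16

# Assumed (width, height) of each colour difference component per macroblock.
CHROMA_SIZE = {
    "4:2:0": (8, 8),   # subsampled horizontally and vertically
    "4:2:2": (8, 16),  # subsampled horizontally only
}


def dct_blocks_per_macroblock(chroma_format):
    """Count the 8 x 8 DCT blocks in one macroblock."""
    luma_blocks = (MACROBLOCK_SIZE // DCT_SIZE) ** 2
    width, height = CHROMA_SIZE[chroma_format]
    blocks_per_component = (width // DCT_SIZE) * (height // DCT_SIZE)
    return luma_blocks + 2 * blocks_per_component  # two colour difference components


print(dct_blocks_per_macroblock("4:2:0"))  # 6
print(dct_blocks_per_macroblock("4:2:2"))  # 8
```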

Figure 5.29 shows the table of basis functions or wave table for an 8 x 8 DCT. Adding these two-dimensional

waveforms together in different proportions will give any original 8 x 8 pixel block. The coefficients of the DCT

simply control the proportion of each wave which is added in the inverse transform. The top-left wave has no

modulation at all because it conveys the DC component of the block. This coefficient will be a unipolar (positive

only) value in the case of luminance and will typically be the largest value in the block as the spectrum of typical

video signals is dominated by the DC component.
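
The idea of a wave table whose entries are summed in proportions set by the coefficients can be sketched in plain Python. This is a minimal, unoptimized illustration that assumes the orthonormal DCT-II scaling; the names WAVE_TABLE, forward_dct and inverse_dct are hypothetical, and practical coders use fast algorithms rather than this direct summation.

```python
import math

N = 8  # block size


def scale(k):
    """Orthonormal scale factor (assumed normalization)."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def basis(u, v):
    """Two-dimensional wave for horizontal frequency u and vertical frequency v."""
    return [[scale(u) * scale(v)
             * math.cos((2 * x + 1) * u * math.pi / (2 * N))
             * math.cos((2 * y + 1) * v * math.pi / (2 * N))
             for x in range(N)]
            for y in range(N)]


# The wave table: one 8 x 8 wave per coefficient position.
WAVE_TABLE = {(u, v): basis(u, v) for u in range(N) for v in range(N)}


def forward_dct(block):
    """Each coefficient measures how much of one wave is present in the block."""
    return [[sum(block[y][x] * WAVE_TABLE[(u, v)][y][x]
                 for y in range(N) for x in range(N))
             for u in range(N)]
            for v in range(N)]


def inverse_dct(coeffs):
    """Add the waves together in the proportions given by the coefficients."""
    out = [[0.0] * N for _ in range(N)]
    for (u, v), wave in WAVE_TABLE.items():
        weight = coeffs[v][u]
        for y in range(N):
            for x in range(N):
                out[y][x] += weight * wave[y][x]
    return out


# A smooth luminance ramp as test data (values chosen arbitrarily).
block = [[16 + 2 * x + 3 * y for x in range(N)] for y in range(N)]
coeffs = forward_dct(block)
rebuilt = inverse_dct(coeffs)
```

In this sketch the top-left wave, basis(0, 0), is a constant, so coeffs[0][0] is proportional to the block average and cannot be negative for non-negative luminance samples. For the ramp above it is also the largest coefficient in magnitude, and rebuilt matches block to within floating-point rounding.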
