The H.264 Codec - A Practical Guide to Video and Audio Compression


The de-blocking filter removes this artifact at the expense of some detail in the final image. Because it is now defined as part of the standard, the predictive coding process at the encoder extracts some residual information that is passed on to the de-blocking filter as hints to assist in de-blocking the image. That way the de-blocking process is not just a blind defocusing approach but an informed noise-reduction technique.

13.9.3 Motion Estimation Improvements

The improved quarter-pixel resolution deployed in MPEG-4 part 2 yielded some slight improvements in quality but these were not as good as they could be. H.264 improves the motion estimation by changing the way that sub-pixels are interpolated.

Up to this point, interpolation has been a straightforward linear proportion between two end-point values. H.264 introduces a much more accurate interpolation of the sub-pixel value but falls back to the simpler interpolation when that produces output that compresses more efficiently or accurately.
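The difference between the two interpolation styles can be sketched in a few lines. This is a simplified one-dimensional illustration, not text from the standard: it assumes 8-bit luma samples and uses the six-tap weights (1, -5, 20, 20, -5, 1) with a divide by 32 that H.264 applies at half-sample positions, then averages neighbouring values for a quarter-sample position. The function names and the sample row are made up for the example.

```python
def clip8(value):
    """Clamp a value to the 8-bit sample range."""
    return max(0, min(255, value))


def half_pel_6tap(samples, i):
    """Half-sample value between samples[i] and samples[i + 1].

    Applies the six-tap weights with rounding; needs two extra samples
    on each side of the pair being interpolated.
    """
    taps = (1, -5, 20, 20, -5, 1)
    window = samples[i - 2 : i + 4]
    total = sum(t * s for t, s in zip(taps, window))
    return clip8((total + 16) >> 5)


def linear_midpoint(a, b):
    """Plain two-point linear interpolation with rounding."""
    return (a + b + 1) >> 1


def quarter_pel(samples, i):
    """Quarter-sample value just after samples[i]: the average of the
    integer sample and its half-sample neighbour."""
    return linear_midpoint(samples[i], half_pel_6tap(samples, i))


row = [10, 10, 50, 200, 220, 220]  # hypothetical samples across an edge
print(half_pel_6tap(row, 2))       # 128
print(linear_midpoint(50, 200))    # 125
```

The six-tap result differs from the straight average because it takes the slope of the surrounding samples into account, which is the sense in which the newer interpolation is more accurate.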

Motion estimation is applied down to the 4 × 4 sub-macroblock level. A motion vector is described for the 16 × 16 macroblock as a whole and some additional motion vectors are described for smaller 4 × 4 parts of it. Those component parts within the macroblock need not be square; they can be organized as rectangular shapes such as 16 × 8 or 8 × 4 within the block.
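To make the partitioning concrete, the sketch below counts how many motion vectors a single macroblock would carry for a given split. The partition sizes are the ones commonly listed for H.264 rather than anything stated in the text, and the sketch assumes one motion vector per partition (a single reference list). The table and function names are hypothetical.

```python
# Motion vectors produced by each way of splitting a 16x16 macroblock.
MACROBLOCK_PARTITIONS = {"16x16": 1, "16x8": 2, "8x16": 2, "8x8": 4}

# Motion vectors produced by each way of splitting one 8x8 quadrant.
SUB_PARTITIONS = {"8x8": 1, "8x4": 2, "4x8": 2, "4x4": 4}


def motion_vector_count(mb_mode, sub_modes=()):
    """Return the number of motion vectors for one macroblock.

    mb_mode   -- how the macroblock itself is split
    sub_modes -- for an "8x8" split, the mode of each of the four quadrants
    """
    if mb_mode != "8x8":
        return MACROBLOCK_PARTITIONS[mb_mode]
    if len(sub_modes) != 4:
        raise ValueError("an 8x8 split needs a mode for each of its four quadrants")
    return sum(SUB_PARTITIONS[mode] for mode in sub_modes)


print(motion_vector_count("16x16"))               # 1
print(motion_vector_count("8x8", ["4x4"] * 4))    # 16
```

Under these assumptions a macroblock carries between 1 and 16 motion vectors, so finer partitions buy better prediction at the price of more vector data to encode.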

13.10 Issues with GOP Structures, Frames, and Slices

The H.264 codec introduces a large number of new and complex mechanisms for referring between frames. This makes it very hard to close a group of pictures (GOP) unless the encoder forces an initial I-frame and a closing P-frame. The encoder must also prohibit any bi-directional references that target frames outside of the GOP.

13.10.1 Temporal Prediction Improvements

The I-frame, P-frame, and B-frame model used in MPEG-1, MPEG-2, and MPEG-4 part 2 is modified considerably in H.264.

The references are now made at the slice level rather than at the frame level. A slice can draw its references from as many as 16 other frames rather than from a single frame, as was the case with the earlier codecs.

As far as the decoder is concerned, this is only a trivial addition to the decoding complexity. At the encoding end of the process, much more information must be cached and for a longer time. In addition, the motion-compensation search must now take place within several frames where previously it only had to search one. This is a significant computational increase and is one reason why the H.264 encoder is said to be slow compared with older codecs. When viewed in the context of the massive improvements in bit-rate efficiency, this increase in computational load is less costly than it first appears.
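A rough sketch shows why the encoder's workload grows with the number of reference frames. It assumes an exhaustive block-matching search scored by the sum of absolute differences (SAD), which is a common textbook approach and not necessarily what any particular H.264 encoder does. Frames are plain lists of rows, and all names are hypothetical.

```python
def sad(block, frame, top, left):
    """Sum of absolute differences between block and the same-sized
    area of frame whose top-left corner is (top, left)."""
    return sum(
        abs(block[y][x] - frame[top + y][left + x])
        for y in range(len(block))
        for x in range(len(block[0]))
    )


def search_references(block, top, left, references, radius):
    """Exhaustively search every reference frame around (top, left).

    Returns (best_cost, ref_index, dy, dx, evaluations).
    """
    height, width = len(block), len(block[0])
    best = None
    evaluations = 0
    for ref_index, frame in enumerate(references):
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                y, x = top + dy, left + dx
                # Skip candidates that fall outside the reference frame.
                if y < 0 or x < 0:
                    continue
                if y + height > len(frame) or x + width > len(frame[0]):
                    continue
                evaluations += 1
                cost = sad(block, frame, y, x)
                if best is None or cost < best[0]:
                    best = (cost, ref_index, dy, dx)
    if best is None:
        raise ValueError("no valid candidate position in any reference frame")
    return best + (evaluations,)
```

In this sketch the number of SAD evaluations is proportional to the number of reference frames, so searching 16 references costs roughly 16 times as much as searching one. The decoder does none of this searching; it only fetches the block that the chosen reference index and vector point to.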
