Compression Formats for HD Recording and Production - High-Quality Visual Experience


superior performance in high resolution scenarios [9]. Additionally, the possibility of adaptively selecting the transform kernel (4x4 or 8x8) on a macroblock basis was also added. The employed transforms consist of two matrix multiplications, which can be executed using integer arithmetic, and a scaling operation which in principle requires floating-point operations. For example, the 4x4 transform is defined as:

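\[
Y = \left( C \, X \, C^{T} \right) \otimes E =
\left(
\begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & d & -d & -1 \\
1 & -1 & -1 & 1 \\
d & -1 & 1 & -d
\end{pmatrix}
\begin{pmatrix}
x_{00} & x_{01} & x_{02} & x_{03} \\
x_{10} & x_{11} & x_{12} & x_{13} \\
x_{20} & x_{21} & x_{22} & x_{23} \\
x_{30} & x_{31} & x_{32} & x_{33}
\end{pmatrix}
\begin{pmatrix}
1 & 1 & 1 & d \\
1 & d & -1 & -1 \\
1 & -d & -1 & 1 \\
1 & -1 & 1 & -d
\end{pmatrix}
\right)
\otimes
\begin{pmatrix}
a^{2} & ab & a^{2} & ab \\
ab & b^{2} & ab & b^{2} \\
a^{2} & ab & a^{2} & ab \\
ab & b^{2} & ab & b^{2}
\end{pmatrix},
\qquad a = \tfrac{1}{2}, \quad b = \sqrt{\tfrac{2}{5}}, \quad d = \tfrac{1}{2}
\]

where \(\otimes\) denotes element-wise multiplication.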

However, by absorbing this scaling into the quantization process, floating-point arithmetic is avoided and the transform process becomes an integer-only operation. In this way, an efficient implementation is obtained and the drift problem that occurred in previous video coding standards, as a result of differences in accuracy and rounding of the floating-point DCT in the encoder and decoder, is avoided.
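The split into an integer core and a separate scaling step can be illustrated with a short Python sketch. It is not taken from the standard or from any codec implementation. It assumes the commonly cited variant of the 4x4 transform in which the second and fourth rows of the core matrix are multiplied by 2, so that all entries are integers, with the scaling factors halved to compensate, and it assumes a = 1/2 and b = sqrt(2/5). The names `CF`, `EF`, `core_transform` and `scaled_transform` are illustrative only.

```python
import math

# Integer core matrix (assumed variant: rows 2 and 4 scaled by 2).
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

A = 0.5                    # assumed value of a
B = math.sqrt(2.0 / 5.0)   # assumed value of b

# Per-row/column scale factors; EF[i][j] = S[i] * S[j] is the scaling matrix.
S = [A, B / 2, A, B / 2]
EF = [[S[i] * S[j] for j in range(4)] for i in range(4)]


def matmul(p, q):
    """Plain 4x4 matrix product."""
    return [[sum(p[i][k] * q[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def core_transform(block):
    """Integer-only part W = CF * X * CF^T (additions and small multiplies)."""
    return matmul(matmul(CF, block), transpose(CF))


def scaled_transform(block):
    """Core transform followed by element-wise scaling with EF.

    This floating-point step is the one that is folded into quantization.
    """
    w = core_transform(block)
    return [[w[i][j] * EF[i][j] for j in range(4)] for i in range(4)]


# Usage: a flat block has only a DC coefficient after the core transform.
flat = [[10] * 4 for _ in range(4)]
w = core_transform(flat)
print(w[0][0])  # DC term: 16 * 10 = 160
```

Because the encoder and the decoder both operate on the integer values `w`, they obtain bit-identical results, which is why no drift can arise from the transform stage.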

The standard enforces a link between the prediction mode and the size of the transform kernel used: if prediction on 4x4 blocks is used, the 8x8 transform kernel cannot be employed, as it would cross the boundaries of the 4x4 blocks used in the prediction, causing high-frequency transform coefficients to appear which are expensive to code. For more information concerning the transform part of the standard, the reader is referred to [12].

In the next stage, the transform coefficients are quantized. In the initial version of the standard, only uniform scalar quantization was supported. FRExt later introduced support for frequency-dependent quantization and rounding, by means of custom quantization and rounding matrices. The quantization strength is determined by the quantization step size, which can be defined for each macroblock using the quantization parameter QP, which lies in the range [0,51]. The relation between QP and the quantization step size is logarithmic: the quantization step size doubles for each increase of QP by 6, which corresponds to an increase of roughly 12% per unit of QP. As mentioned earlier, the quantization and the scaling part of the DCT are combined in a single integer-valued operation.
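The following Python sketch illustrates the QP-to-step-size relation and how the transform scaling can be merged with quantization into one integer multiply-and-shift. It is a sketch under stated assumptions rather than the normative procedure: the base step sizes for QP 0 to 5 are the commonly tabulated values, the multipliers are derived numerically here and are not guaranteed to match the tables in the standard, and round-to-nearest is assumed for the rounding offset. All function names are hypothetical.

```python
import math

# Commonly tabulated step sizes for QP = 0..5 (assumption, see lead-in).
BASE_QSTEP = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]

A = 0.5                    # assumed transform scale factor a
B = math.sqrt(2.0 / 5.0)   # assumed transform scale factor b


def qstep(qp):
    """Quantization step size: doubles for every increase of QP by 6."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must lie in [0, 51]")
    return BASE_QSTEP[qp % 6] * (1 << (qp // 6))


def post_scale(i, j):
    """Scaling factor of coefficient (i, j) for the integer core transform."""
    si = A if i % 2 == 0 else B / 2
    sj = A if j % 2 == 0 else B / 2
    return si * sj


def multiplier(qp, i, j):
    """Integer multiplier approximating post_scale / qstep times 2**qbits."""
    qbits = 15 + qp // 6
    mf = round(post_scale(i, j) / qstep(qp) * (1 << qbits))
    return mf, qbits


def quantize_core(w, qp, i, j):
    """Quantize an integer core-transform coefficient w at position (i, j).

    Only integer multiply, add and shift are used; the rounding offset of
    one half is an illustrative choice.
    """
    mf, qbits = multiplier(qp, i, j)
    offset = (1 << qbits) >> 1
    sign = -1 if w < 0 else 1
    return sign * ((abs(w) * mf + offset) >> qbits)


# Usage: step sizes at the ends of the QP range and one quantized value.
print(qstep(0), qstep(6), qstep(51))
print(quantize_core(160, 22, 0, 0))
```

Since the step size doubles every 6 QP units while the shift `qbits` grows by one, the multiplier depends only on `qp % 6` and the coefficient position, so a small lookup table would suffice in practice.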

The symbols produced by the encoding process are entropy coded using either context-based adaptive variable length coding (CAVLC) or context-based adaptive binary arithmetic coding (CABAC [13]). While CABAC exhibits a higher computational complexity, it also provides 10 to 15% bit-rate savings compared to CAVLC [13, 14].

Similar to prior video coding standards, H.264/AVC also supports efficient coding of interlaced material. Each interlaced frame can be coded in two ways: the two fields can be coded separately (field coding), or the interlaced frame, i.e. the collection of two successive fields, can be coded in the same way as a progressive frame (frame coding). In H.264/AVC, this decision can be made adaptively, on a slice basis (PAFF, picture adaptive frame/field coding). If field
