superior performance in high resolution scenarios [9]. Additionally, the possibility
of adaptively selecting the transform kernel (4x4 or 8x8) on a macroblock basis
was also added. The employed transform consists of two matrix multiplications,
which can be executed using integer arithmetic, and a scaling operation, which in
principle requires floating-point operations. For example, the 4x4 transform is
defined as:
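\[
Y = \left( C X C^{T} \right) \otimes E =
\left(
\begin{bmatrix}
1 & 1 & 1 & 1 \\
1 & d & -d & -1 \\
1 & -1 & -1 & 1 \\
d & -1 & 1 & -d
\end{bmatrix}
\begin{bmatrix}
x_{1} & x_{2} & x_{3} & x_{4} \\
x_{5} & x_{6} & x_{7} & x_{8} \\
x_{9} & x_{10} & x_{11} & x_{12} \\
x_{13} & x_{14} & x_{15} & x_{16}
\end{bmatrix}
\begin{bmatrix}
1 & 1 & 1 & d \\
1 & d & -1 & -1 \\
1 & -d & -1 & 1 \\
1 & -1 & 1 & -d
\end{bmatrix}
\right)
\otimes
\begin{bmatrix}
a^{2} & ab & a^{2} & ab \\
ab & b^{2} & ab & b^{2} \\
a^{2} & ab & a^{2} & ab \\
ab & b^{2} & ab & b^{2}
\end{bmatrix},
\qquad
a = \frac{1}{2}, \quad b = \sqrt{\frac{2}{5}}, \quad d = \frac{1}{2}
\]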
However, by absorbing this scaling into the quantization process, floating-point
arithmetic is avoided and the transform process becomes an integer-only
operation. In this way, an efficient implementation is obtained and the drift
problem that occurred in previous video coding standards, caused by differences
in accuracy and rounding of the floating-point DCT in the encoder and decoder,
is avoided.
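The split between an integer core and a separate scaling step can be illustrated with a short Python sketch. It evaluates the 4x4 transform once in the form given in the text (core matrix containing d, scaling matrix E) and once with an integer core. The integer matrix `C_INT` and the compensating matrix `E_INT` are assumptions not spelled out in the text: they follow from doubling the second and fourth rows of the core matrix and halving the corresponding scaling factors. The helper names and the sample block `X` are hypothetical.

```python
import math

# Constants from the transform definition in the text.
a, b, d = 0.5, math.sqrt(2.0 / 5.0), 0.5


def matmul(p, q):
    """Plain 4x4 matrix product."""
    return [[sum(p[i][k] * q[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def scale(m, e):
    """Element-wise product, i.e. the scaling step of the transform."""
    return [[m[i][j] * e[i][j] for j in range(4)] for i in range(4)]


# Core matrix C and scaling matrix E as written in the text.
C = [[1, 1, 1, 1],
     [1, d, -d, -1],
     [1, -1, -1, 1],
     [d, -1, 1, -d]]
E = [[a * a, a * b, a * a, a * b],
     [a * b, b * b, a * b, b * b],
     [a * a, a * b, a * a, a * b],
     [a * b, b * b, a * b, b * b]]

# Assumption: integer core obtained by doubling rows 2 and 4 of C.
C_INT = [[1, 1, 1, 1],
         [2, 1, -1, -2],
         [1, -1, -1, 1],
         [1, -2, 2, -1]]
# Each doubled row or column is compensated by a factor 1/2 in the scaling.
GAIN = [1.0, 0.5, 1.0, 0.5]
E_INT = [[E[i][j] * GAIN[i] * GAIN[j] for j in range(4)] for i in range(4)]


def forward_transform(x):
    """Transform in the form given in the text: (C X C^T) scaled by E."""
    return scale(matmul(matmul(C, x), transpose(C)), E)


def forward_transform_integer_core(x):
    """Integer-only core W, plus the scaled result for comparison."""
    w = matmul(matmul(C_INT, x), transpose(C_INT))
    return w, scale(w, E_INT)


# Hypothetical 4x4 block of integer samples.
X = [[5, 11, 8, 10],
     [9, 8, 4, 12],
     [1, 10, 11, 4],
     [19, 6, 15, 7]]

Y = forward_transform(X)
W, Y_INT = forward_transform_integer_core(X)
```

In this sketch `W` contains only integers, and `Y_INT` agrees with `Y` up to floating-point rounding. An encoder built along these lines would keep `W` and fold the factors of `E_INT` into the quantizer instead of applying them as a separate floating-point step.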
The standard enforces a link between the prediction mode and the size of the
transform kernel used: if prediction on 4x4 blocks is used, the 8x8 transform
kernel cannot be employed, as it would cross the boundaries of the 4x4 blocks
used in the prediction, producing high-frequency transform coefficients that
are expensive to code. For more information concerning the transform part of the
standard, the reader is referred to [12].
In the next stage, the transform coefficients are quantized. In the initial version
of the standard, only uniform scalar quantization was supported. FRExt later
introduced support for frequency-dependent quantization and rounding, by means
of custom quantization and rounding matrices. The quantization strength is
determined by the quantization step size, which can be defined for each
macroblock using the quantization parameter QP, which lies in the range [0, 51].
The relation between QP and the quantization step size is logarithmic: the
quantization step size approximately doubles for each increase of QP by 6. As
mentioned earlier, the quantization and the scaling part of the DCT are combined
in a single integer-valued operation.
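The QP-to-step-size relation and the merging of scaling and quantization can be sketched as follows. The six base step sizes are an assumption taken from the values commonly tabulated for H.264/AVC (the text does not list them); every further increase of QP by 6 doubles the step. The function `quantize`, its 15-bit precision, and its rounding offset of one third are hypothetical illustrations of the idea. The multiplier it derives is not the table used by the standard.

```python
# Assumption: base step sizes for QP = 0..5 as commonly tabulated.
QSTEP_BASE = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]


def qstep(qp):
    """Quantization step size for a given QP in [0, 51]."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must lie in [0, 51]")
    # The step doubles for every increase of QP by 6.
    return QSTEP_BASE[qp % 6] * (1 << (qp // 6))


def quantize(w, scale_factor, qp, precision_bits=15):
    """Quantize one integer core coefficient w with integer arithmetic only.

    scale_factor is the transform scaling entry for this position
    (for example a*a = 0.25 for the DC coefficient). It is folded together
    with 1/qstep into a single integer multiplier, so the per-coefficient
    work is one multiply, one add and one shift.
    """
    shift = precision_bits + qp // 6
    multiplier = round(scale_factor / QSTEP_BASE[qp % 6] * (1 << precision_bits))
    offset = (1 << shift) // 3  # illustrative rounding offset
    magnitude = (abs(w) * multiplier + offset) >> shift
    return -magnitude if w < 0 else magnitude


dc_level = quantize(140, 0.25, 4)  # core DC value 140, scaling a*a, QP = 4
```

In a real encoder the multiplier would be precomputed per position and per QP modulo 6 rather than derived with floating-point arithmetic on every call. The sketch computes it inline only to show where the scaling factor and the step size enter.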
The symbols produced by the encoding process are entropy coded using either
context-based adaptive variable length coding (CAVLC) or context-based adaptive
binary arithmetic coding (CABAC [13]). While CABAC has a higher computational
complexity, it also provides bit-rate savings of 10 to 15% compared to
CAVLC [13, 14].
Similar to prior video coding standards, H.264/AVC also supports efficient
coding of interlaced material. Each interlaced frame can be coded in two ways.
The subsequent fields can be coded separately (field coding), or the interlaced
frame, i.e. the collection of two successive fields, can be coded in the same way
as a progressive frame (frame coding). In H.264/AVC, this decision can be made
adaptively, on a slice basis (PAFF, picture-adaptive frame/field coding). If field