where X and Y represent the input and output data, which are generally both matrices
of size M × N; C and R denote the transform matrices applied to the column and row
vectors of X, and are N × N and M × M transform matrices, respectively.
To reconstruct the original input data, an inverse transform is applied. For the
2-D case, the inverse transform is formulated as follows:

$$\hat{X} = \hat{C} \cdot Y \cdot \hat{R}, \qquad (5.3)$$

When both $\hat{C} \cdot C$ and $R \cdot \hat{R}$ equal the identity matrix, which is the
most common scenario, $\hat{X}$ equals X, as the transform and inverse transform can
losslessly reconstruct the input data.
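
The forward and inverse 2-D transforms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the forward transform is taken to have the form Y = C · X · R, the matrices are orthonormal DCT-II matrices (so the inverse matrices are simply their transposes), and the 2 × 4 sample block and all function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Return the n x n orthonormal DCT-II matrix (row k = k-th basis vector)."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def forward_2d(x, c, r):
    """Assumed forward transform: Y = C . X . R."""
    return matmul(matmul(c, x), r)


def inverse_2d(y, c_hat, r_hat):
    """Inverse transform in the form of Eq. 5.3: X_hat = C_hat . Y . R_hat."""
    return matmul(matmul(c_hat, y), r_hat)


# Hypothetical block with 2 rows and 4 columns of sample values.
X = [[52.0, 55.0, 61.0, 66.0],
     [63.0, 59.0, 55.0, 90.0]]

C = dct_matrix(2)               # multiplies from the left: transforms each column
R = transpose(dct_matrix(4))    # multiplies from the right: transforms each row
C_hat = transpose(C)            # C_hat . C is the identity because C is orthonormal
R_hat = transpose(R)            # R . R_hat is the identity because R is orthonormal

Y = forward_2d(X, C, R)
X_hat = inverse_2d(Y, C_hat, R_hat)

max_error = max(abs(a - b)
                for row_a, row_b in zip(X, X_hat)
                for a, b in zip(row_a, row_b))
print("largest reconstruction error:", max_error)
```

With orthonormal matrices the reconstruction error is limited to floating-point rounding, which corresponds to the lossless case described above.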
By applying the transform, i.e., the mathematical operation of Eq. 5.1 or Eq. 5.2,
the input vector or block, which generally consists of samples that are correlated
with their neighbors, is reorganized into another vector or block whose entries
are commonly called transform coefficients. In the context of data coding, the
transform is designed to produce transform coefficients that are nearly uncorrelated
with each other. For jointly Gaussian random variables, which are often used to
model natural image signals, uncorrelated random variables are also independent.
Therefore, an efficient transform, one that largely de-correlates the input
source, allows entropy coding to be applied efficiently.
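
A small worked example may make the de-correlation idea concrete. The two-sample source and the symbols $\sigma^2$ (common variance) and $\rho$ (correlation coefficient) are assumptions introduced here for illustration. Take two zero-mean samples $x_1, x_2$ with $\operatorname{Var}(x_1) = \operatorname{Var}(x_2) = \sigma^2$ and $\operatorname{Cov}(x_1, x_2) = \rho\sigma^2$, and transform them by a normalized sum and difference:

```latex
\begin{aligned}
y_1 &= \tfrac{1}{\sqrt{2}}\,(x_1 + x_2), \qquad
y_2 = \tfrac{1}{\sqrt{2}}\,(x_1 - x_2), \\
\operatorname{Cov}(y_1, y_2) &= \tfrac{1}{2}\bigl(\operatorname{Var}(x_1) - \operatorname{Var}(x_2)\bigr) = 0, \\
\operatorname{Var}(y_1) &= \sigma^2 (1 + \rho), \qquad
\operatorname{Var}(y_2) = \sigma^2 (1 - \rho).
\end{aligned}
```

The two coefficients are uncorrelated for any $\rho$, and when $\rho$ is close to 1 almost all of the variance is concentrated in $y_1$, which is the kind of structure an entropy coder can exploit.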
Eq. 5.2 shows that, to achieve uncorrelated transform coefficients, the
transform design, i.e., C and R, is tied to the statistical distribution of the input
source X. That is, differently distributed sources may call for different transform
designs. It is well known that the optimum transform for a given input data source
is the Karhunen-Loève transform (KLT) (Kolmogoroff 1931; Loeve 1978), whose
transform matrix consists of the eigenvectors of the covariance matrix of the source.
In video coding, the Discrete Cosine Transform (DCT) (Ahmed et al. 1974) has
become the most popular transform design because of its low complexity and
high de-correlation efficiency. Regarding complexity, because the transform matrix
is fixed and symmetrical, the DCT can be implemented with fast, low-complexity
algorithms. Regarding de-correlation efficiency, it has been both theoretically proved
and experimentally confirmed that the DCT approximates the optimum KLT under the
first-order Markov conditions that approximate natural imagery sources (Clarke 1981).
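
The near-optimality of the DCT for first-order Markov sources can be checked numerically. The sketch below is only an illustration, not a proof: it assumes a unit-variance source whose covariance is ρ^|i−j|, picks N = 8 and ρ = 0.95 arbitrarily, and measures how much of the covariance energy remains off the diagonal after an orthonormal DCT-II. All function names are hypothetical. The KLT would leave exactly zero off the diagonal.

```python
import math


def dct_matrix(n):
    """Return the n x n orthonormal DCT-II matrix (row k = k-th basis vector)."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def markov1_covariance(n, rho):
    """Covariance of a unit-variance first-order Markov source: rho^|i-j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def offdiag_energy_fraction(m):
    """Share of the squared entries of m that lie off the main diagonal."""
    total = sum(v * v for row in m for v in row)
    diag = sum(m[i][i] ** 2 for i in range(len(m)))
    return (total - diag) / total


N, RHO = 8, 0.95                      # illustrative choices, not from the text
cov_x = markov1_covariance(N, RHO)    # covariance of the input samples
D = dct_matrix(N)
cov_y = matmul(matmul(D, cov_x), transpose(D))   # covariance of y = D x

before = offdiag_energy_fraction(cov_x)
after = offdiag_energy_fraction(cov_y)
print("off-diagonal energy before the DCT:", before)
print("off-diagonal energy after the DCT: ", after)
```

For these parameters most of the covariance energy starts off the diagonal, and only a small fraction remains there after the DCT, so the coefficients are close to uncorrelated even though the DCT matrix does not depend on the source.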
For practical implementation, the DCT is further approximated using a very limited
set of transform sizes and integer arithmetic. Furthermore, researchers have devoted
continuous effort in recent years to exploring more flexible transform designs.
As image sources are generally 2-D and uncertain, a fixed-size square
DCT design cannot efficiently capture the correlation of all image sources. The
limitations of the current DCT design leave room for further improvement of transform
efficiency. Several novel transform designs have been proposed in the literature, and
some have been successfully applied in the latest video coding standards, such as
AVS2 and High-Efficiency Video Coding (HEVC). Sect. 5.2 discusses the detailed
technical design of the transform, especially for AVS2.
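
One simple way to picture an integer approximation of the DCT is to scale the orthonormal matrix by a power of two and round each entry. The sketch below does exactly that and reports how far the result is from orthonormal. It is a hypothetical illustration only: the scale factor 2^7 and the function names are assumptions, and standardized integer transforms are tuned by their designers and need not equal this plain rounding.

```python
import math


def dct_matrix(n):
    """Return the n x n orthonormal DCT-II matrix (row k = k-th basis vector)."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def integer_dct_matrix(n, shift):
    """Scale the orthonormal DCT-II matrix by 2**shift and round to integers."""
    scale = 1 << shift
    return [[round(scale * v) for v in row] for row in dct_matrix(n)]


def gram_deviation(t, shift):
    """Largest deviation of (T . T^T) / 4**shift from the identity matrix."""
    norm = float(1 << (2 * shift))
    n = len(t)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            dot = sum(t[i][k] * t[j][k] for k in range(n))  # exact integer sum
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(dot / norm - target))
    return worst


SHIFT = 7                        # scale factor 2**7 = 128, an illustrative choice
T = integer_dct_matrix(4, SHIFT)
deviation = gram_deviation(T, SHIFT)

for row in T:
    print(row)
print("largest deviation from orthonormality:", deviation)
```

For this 4-point case the rows stay mutually orthogonal and only the row norms drift slightly, by about 1 percent, which hints at why integer designs have to balance precision against exact invertibility.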