Digital Signal Processing Reference
This is also the reason why FFT sizes are almost always
powers of 2 (2, 4, 8, 16, 32, 64, ...). This is sometimes called a “radix-2”
FFT. So rather than a 1000-point FFT, one will see a 1024-point
FFT. In practice, this common restriction to powers of two is not
a problem.
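As a small illustration of the power-of-two convention, the sketch below rounds a requested length up to the next power of two and zero-pads a list of samples to that length. The names next_power_of_two and zero_pad are hypothetical, and zero-padding is only one common way to reach a radix-2 length; it is an assumption here, not something the text prescribes.

```python
def next_power_of_two(n):
    """Return the smallest power of two that is >= n (n must be >= 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 << (n - 1).bit_length()


def zero_pad(samples, length):
    """Return a copy of samples extended with zeros up to the given length."""
    if length < len(samples):
        raise ValueError("length is shorter than the input")
    return list(samples) + [0.0] * (length - len(samples))


if __name__ == "__main__":
    size = next_power_of_two(1000)          # the text's 1000-point example
    padded = zero_pad([1.0] * 1000, size)   # hypothetical input data
    print(size, len(padded))
```

A length that is already a power of two is returned unchanged, so the helper can be applied unconditionally before calling a radix-2 FFT routine.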
12.3 Discrete Cosine Transform
Next, we will discuss the Discrete Cosine Transform, or DCT.
Until now we have been talking about the DFT or FFT operating
on one-dimensional signals, which are often complex (they may have
real and quadrature components). The DCT is usually used for
two-dimensional signals, such as an image represented by a rectangular
array of pixels, which is a real signal only. When we
discuss frequency, it will be in the context of how rapidly the
sample values change. We will be sampling spatially across the
image in either the vertical or horizontal direction with the DCT.
The DCT is usually applied across an N by N array of pixel
data. For example, if we take a region composed of eight by eight
pixels, or 64 pixels total, we can transform this into a set of 64 DCT
coefficients, which is the spatial frequency representation of the
eight by eight region of the image. This is very similar to the DFT.
However, instead of expressing the signal as a combination of
complex exponentials of various frequencies, we will be
expressing the image data as a combination of cosines of various
frequencies, in both vertical and horizontal dimensions.
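To make the eight by eight example concrete, here is a minimal sketch that turns an 8 x 8 block of pixel values into 64 coefficients. It assumes the common orthonormal DCT-II definition, which this section has not yet stated, so the scaling and the index names u and v are assumptions; the function name dct_2d and the flat gray test block are hypothetical.

```python
import math

N = 8  # block size used in the text's example


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an n x n block (assumed definition)."""
    n = len(block)

    def scale(k):
        # Orthonormal scaling: the zero-frequency term gets a smaller weight.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):          # vertical spatial frequency index
        for v in range(n):      # horizontal spatial frequency index
            total = 0.0
            for x in range(n):      # pixel row
                for y in range(n):  # pixel column
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


if __name__ == "__main__":
    flat = [[100.0] * N for _ in range(N)]   # hypothetical uniform gray block
    F = dct_2d(flat)
    print(sum(len(row) for row in F))        # number of coefficients
    print(round(F[0][0], 6))                 # zero-frequency coefficient
```

For a block with no spatial variation, only the zero-frequency coefficient F[0][0] is non-zero, which matches the idea that frequency here measures how rapidly the sample values change. The direct four-loop form is chosen for readability, not speed.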
The DFT representation is for a periodic signal, or one that is
assumed to be periodic. Imagine connecting a series of identical
signals together, end to end. Where the end of one sequence
connects to the beginning of the next, there will generally be a
discontinuity, or a step function. This step introduces high-frequency content.
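The end-to-end picture can be sketched in a few lines. The example below places copies of a signal back to back and reports the size of the step at the joint; the eight-sample ramp and the function names are hypothetical, chosen only to make the jump easy to see.

```python
def periodic_extension(signal, copies=2):
    """Place identical copies of the signal end to end."""
    return list(signal) * copies


def boundary_jump(signal):
    """Step size where the end of one copy meets the start of the next."""
    return signal[0] - signal[-1]


if __name__ == "__main__":
    ramp = [0, 1, 2, 3, 4, 5, 6, 7]  # hypothetical eight-sample test signal
    print(periodic_extension(ramp))
    print(boundary_jump(ramp))
```

Inside each copy the ramp changes by 1 per sample, while at the joint it changes by 7 in a single step, and that abrupt change is what shows up as high-frequency content.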
For the DCT, we make an assumption that the signal is folded
over on itself. So an eight-sample signal becomes 16 samples long
when its mirror image is appended. This 16-long signal is then symmetric
about the midpoint. Cosine waves have this same property. A
cosine is symmetric about the midpoint, which is at π (since the
period is from 0 to 2π). This property is preserved for higher
frequency cosines, as shown by the figures below, showing the
sampled cosine waves. The waveform is eight samples long, and if
folded over will create a 16-long sampled waveform which is
symmetric, starts and ends with the same value, and has “u” cycles
across the 16 samples.
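The fold-over property can be checked numerically. The sketch below samples a cosine at eight points, folds it over, and compares the result with the same cosine simply continued for 16 samples. The half-sample offset in the sampling formula is the usual DCT-II convention and is an assumption here, since the text has not yet given the formula; the frequency index is called u, and the function names are hypothetical.

```python
import math


def fold_over(signal):
    """Append the reversed signal, giving a sequence twice as long."""
    return list(signal) + list(reversed(signal))


def sampled_cosine(u, count, n=8):
    """Samples of cos((2x + 1) * u * pi / (2n)) for x = 0 .. count - 1.

    The half-sample offset is an assumed (DCT-II style) sampling convention.
    """
    return [math.cos((2 * x + 1) * u * math.pi / (2 * n))
            for x in range(count)]


if __name__ == "__main__":
    for u in range(4):
        folded = fold_over(sampled_cosine(u, 8))
        continued = sampled_cosine(u, 16)
        matches = all(math.isclose(a, b, abs_tol=1e-12)
                      for a, b in zip(folded, continued))
        same_ends = math.isclose(folded[0], folded[-1], abs_tol=1e-12)
        print(u, len(folded), matches, same_ends)
```

Under this sampling convention the folded eight-sample waveform coincides with the cosine continued over all 16 samples, so folding introduces no step at the midpoint or at the ends.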
To continue, some terminology is needed. The
value of a pixel at row x and column y is designated as f(x, y), as
shown in Figure 12.3. We will compute the DCT coefficients,