FREQUENCY DOMAIN REPRESENTATION - Digital Video Processing for Engineers

This is also the reason why FFT sizes are almost always in

powers of 2 (2, 4, 8, 16, 32, 64, ...). This is sometimes called a “radix

2” FFT. So rather than a 1000 point FFT, one will see a 1024 point

FFT. In practice, this common restriction to powers of two is not

a problem.
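As a small illustration of the 1000-point versus 1024-point example, the sketch below rounds a sample count up to the next power of two. The function names are hypothetical, and padding the input with zeros to reach that size is an assumption here (one common way to meet the restriction), not something the text prescribes.

```python
def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1 << (n - 1).bit_length()


def zero_pad(samples, size):
    """Append zeros so the sequence reaches the requested FFT size (assumed approach)."""
    return list(samples) + [0.0] * (size - len(samples))


fft_size = next_power_of_two(1000)          # size a radix-2 FFT would use
padded = zero_pad([1.0] * 1000, fft_size)   # 1000 real samples plus trailing zeros
print(fft_size, len(padded))
```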

12.3 Discrete Cosine Transform

Next, we will discuss the Discrete Cosine Transform, or DCT.

Until now we have been talking about the DFT or FFT operating

on one-dimensional signals, which are often complex (may have

real and quadrature components). The DCT is usually used for

two-dimensional signals, such as an image represented by a rectangular

array of pixels, and which is a real signal only. When we

discuss frequency, it will be in the context of how rapidly the

sample values change. We will be sampling spatially across the

image in either the vertical or horizontal direction with the DCT.

The DCT is usually applied across an N by N array of pixel

data. For example, if we take a region composed of eight by eight

pixels, or 64 pixels total, we can transform this into a set of 64 DCT

coefficients, which is the spatial frequency representation of the

eight by eight region of the image. This is very similar to the DFT.

However, instead of expressing the signal as a combination of the

complex exponentials of various frequencies, we will be

expressing the image data as a combination of cosines of various

frequencies, in both vertical and horizontal dimensions.
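To make the eight by eight case concrete, here is a minimal sketch of a two-dimensional DCT written directly as a sum of cosine products. It assumes the widely used DCT-II form with orthonormal scaling; the scaling factors, the function name dct_2d, and the index names u and v are assumptions of this sketch and may differ from the formula the text develops.

```python
import math

N = 8  # block size from the eight by eight example


def dct_2d(block):
    """2-D DCT-II of an n x n block, with orthonormal scaling (assumed)."""
    n = len(block)

    def scale(k):
        # Scaling that makes the transform orthonormal (an assumption).
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):          # vertical spatial frequency index
        for v in range(n):      # horizontal spatial frequency index
            total = 0.0
            for x in range(n):      # pixel row
                for y in range(n):  # pixel column
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


flat_block = [[100.0] * N for _ in range(N)]   # a region with no variation
coeffs = dct_2d(flat_block)
count = sum(len(row) for row in coeffs)         # 64 pixels in, 64 coefficients out
print(count, round(coeffs[0][0], 6))
```

A perfectly flat region changes at no spatial frequency, so under this scaling only the coefficient at u = 0, v = 0 is nonzero.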

DFT representation is for a periodic signal, or one that is

assumed to be periodic. Imagine connecting a series of identical

signals together, end to end. Where the end of the sequence

connects to the beginning of the next, there will be a discontinuity,

or a step function. This step represents high-frequency content.
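The effect of the periodic assumption can be seen with a short sketch. The eight-sample ramp used here is a hypothetical test signal chosen only because its first and last values differ.

```python
signal = [float(k) for k in range(8)]      # ramp: 0, 1, ..., 7 (hypothetical data)

# Two copies joined end to end, as the DFT implicitly assumes.
periodic = signal + signal

# Sample-to-sample change across the joined sequence.
jumps = [abs(periodic[i + 1] - periodic[i]) for i in range(len(periodic) - 1)]

seam_jump = jumps[len(signal) - 1]         # last sample of one copy to first of the next
largest_other = max(j for i, j in enumerate(jumps) if i != len(signal) - 1)
print(seam_jump, largest_other)
```

Inside each copy the ramp changes by one unit per sample, while the seam between copies drops by the full range of the ramp in a single step.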

For the DCT, we make an assumption that the signal is folded

over on itself. So an eight-sample signal becomes 16 samples long

when a flipped copy of it is appended. This 16-long signal is then symmetric

about the midpoint. Cosine waves have the same property. A

cosine is symmetric about the midpoint, which is at π (since the

period is from 0 to 2π). This property is preserved for higher

frequency cosines, as shown by the figures below, showing the

sampled cosine waves. The waveform is eight samples long, and if

folded over will create a 16-long sampled waveform which will be

symmetric, start and end with the same value, and have “u” cycles

across the 16 samples.
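The folding idea can be checked numerically. The sketch below assumes the half-sample cosine spacing of the common DCT-II, cos((2x + 1)uπ / 16), with u as a hypothetical name for the frequency index; the helper names are also assumptions. It folds an eight-sample signal into 16 samples and compares a folded sampled cosine with the same cosine simply continued for 16 samples.

```python
import math

N = 8  # samples before folding


def fold(samples):
    """Append a flipped copy, giving a sequence symmetric about its midpoint."""
    return list(samples) + list(reversed(samples))


def cosine_samples(u, length):
    """Samples of cos((2x + 1) * u * pi / (2N)) for x = 0 .. length - 1 (assumed form)."""
    return [math.cos((2 * x + 1) * u * math.pi / (2 * N)) for x in range(length)]


def folded_matches_extended(u):
    """True if folding eight cosine samples equals sampling the cosine for 16 points."""
    folded_cos = fold(cosine_samples(u, N))
    extended = cosine_samples(u, 2 * N)
    return all(abs(a - b) < 1e-9 for a, b in zip(folded_cos, extended))


ramp = [float(k) for k in range(N)]          # hypothetical test signal
folded_ramp = fold(ramp)
seam_jump = abs(folded_ramp[N] - folded_ramp[N - 1])   # change at the fold point
print(len(folded_ramp), seam_jump, all(folded_matches_extended(u) for u in range(N)))
```

Because the folded sequence mirrors itself, there is no step at the fold point, and the first and last of the 16 samples are equal.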

To continue requires explanation of some terminology. The

value of a pixel at row x and column y is designated as f(x, y), as

shown in Figure 12.3. We will compute the DCT coefficients,
