Unfolding and Folding of Architectures - Digital Design of Signal Processing Systems: A Practical Approach - page 353


Figure 8.7 Unrolling an FIR filter. (a) Four-coefficient FIR filter. (b) The filter is unrolled by a factor of 2

(Figure labels: inputs x_2n and x_2n+1; coefficients h_0 to h_3; outputs y_n, y_2n and y_2n+1; compression tree (CT) blocks.)

additional resources. The unfolded architecture can now be explored for further optimization.

Figure 8.7 shows a 4-coefficient FIR filter and a design after unfolding by a factor of 2. The designer can now treat two CSD multipliers and two adders as a single computational unit. This unit can be implemented as a compression tree producing a sum and a carry. The architecture can also further exploit common sub-expression elimination (CSE) techniques (see Chapter 6).
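The factor-2 unfolding described above can be sketched behaviorally in Python. This is only an illustrative model, not the hardware design from the text: the function names `fir_reference` and `fir_unfolded_by_2` are hypothetical, integer arithmetic stands in for the CSD multipliers and compression trees, and samples before the start of the input are assumed to be zero.

```python
def fir_reference(x, h):
    """Direct-form FIR: y[n] = sum_k h[k] * x[n-k], with x[n] = 0 for n < 0."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def fir_unfolded_by_2(x, h):
    """Process two samples per iteration, producing y[2n] and y[2n+1] together."""
    assert len(h) == 4, "sketch is written for the 4-coefficient case"
    assert len(x) % 2 == 0, "input length must be even for factor-2 unfolding"
    h0, h1, h2, h3 = h
    y = []
    # State carried between iterations (assumed zero initially):
    # xm1 = x[2n-1], xm2 = x[2n-2], xm3 = x[2n-3]
    xm1 = xm2 = xm3 = 0
    for n in range(len(x) // 2):
        x_even = x[2 * n]      # x[2n]
        x_odd = x[2 * n + 1]   # x[2n+1]
        # Two copies of the filter datapath evaluated in the same iteration.
        y_even = h0 * x_even + h1 * xm1 + h2 * xm2 + h3 * xm3
        y_odd = h0 * x_odd + h1 * x_even + h2 * xm1 + h3 * xm2
        y.extend([y_even, y_odd])
        # Advance the shared delay line by two sample positions.
        xm1, xm2, xm3 = x_odd, x_even, xm1
    return y


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6]
    h = [1, 2, 3, 4]
    print(fir_unfolded_by_2(x, h) == fir_reference(x, h))
```

Each loop iteration corresponds to one clock cycle of the unfolded design: it consumes two inputs and produces two outputs, which is where the doubled throughput and the duplicated multiply-add hardware both come from.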

It is important to point out that the design can also be pipelined to increase effective throughput. In many designs, simple pipelining without any folding may cost less in terms of HW than unfolding, because unfolding creates a number of copies of the entire design.

8.4.5 Unfolding for Effective Use of FPGA Resources

Consider a design instance where the throughput must be increased by a factor of 2. Assume the designer is using an FPGA with embedded DSP48 blocks. The designer can easily add additional pipeline registers and retime them between a multiplier and an adder, as shown in Figure 8.8. The

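Figure 8.8 Pipelined FIR filter for effective mapping on FPGAs with DSP48 blocks

(Figure labels: input x_n; coefficients h_0 to h_3; registers mul_reg[0] and add_reg[0] to add_reg[3]; constant 0 feeding the adder chain; output y_n; annotation "Each unit mapped on DSP48".)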
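The pipelined structure in Figure 8.8 can be approximated with a small cycle-based Python model. This is a sketch under stated assumptions rather than the book's implementation: it assumes a transposed-form filter in which the input is broadcast to all multipliers, every product is captured in a `mul_reg` stage, and every adder output is captured in an `add_reg` stage, with the constant 0 entering at the `h[3]` end of the chain and `y` taken from `add_reg[0]`. The function names are hypothetical.

```python
def fir_reference(x, h):
    """Direct-form FIR: y[n] = sum_k h[k] * x[n-k], with x[n] = 0 for n < 0."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def fir_pipelined(x, h):
    """Cycle-based model: one multiplier register and one adder register per tap."""
    taps = len(h)
    mul_reg = [0] * taps  # registered products h[k] * x
    add_reg = [0] * taps  # registered partial sums; add_reg[0] drives the output
    out = []
    for sample in x:
        # Output seen during this cycle, before the clock edge.
        out.append(add_reg[0])
        # Next-state values are computed from the current register contents,
        # so all registers update together as on a single clock edge.
        next_add = []
        for k in range(taps):
            upstream = add_reg[k + 1] if k + 1 < taps else 0  # constant 0 at the far end
            next_add.append(upstream + mul_reg[k])
        next_mul = [h[k] * sample for k in range(taps)]
        mul_reg, add_reg = next_mul, next_add
    return out


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6]
    h = [1, 2, 3, 4]
    latency = 2  # one cycle through mul_reg plus one through add_reg[0]
    y = fir_pipelined(x + [0] * latency, h)[latency:]
    print(y == fir_reference(x, h))
```

In this model each tap consists of a multiplier, a register, an adder, and a second register, which is the kind of multiply-add unit with optional pipeline registers that a DSP48 slice provides. The cost of the extra registers is two cycles of latency; the output sequence itself is unchanged.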
