Unfolding and Folding of Architectures - Digital Design of Signal Processing Systems: A Practical Approach - page 358


[Figure 8.10, panels (a), (b), (c). Labels recovered from the figure: pipeline adder, pipeline multiplier; inputs x[n], x_2n, x_2n+1; outputs y[n], y_2n, y_2n+1; coefficients a_0, a_1; labels L_1, L_b; reset rst_n.]

Figure 8.10 Unfolding and retiming of a feedback DFG. (a) Recursive DFG with seven algorithmic
registers. (b) Retiming of registers for associating algorithmic registers with computational nodes for
effective unfolding. (c) Unfolded design for optimal utilization of algorithmic registers.

unfolding factor J. This increase arises because, although every computational node is replicated J
times, the number of registers in the unfolded DFG stays the same. For feedback designs,
unfolding is effective when the design has abundant algorithmic registers available for
pipelining the combinational nodes. In such designs, unfolding followed by retiming
gives the flexibility to place these algorithmic registers in the unfolded design while optimizing
timing. Similarly, for feedforward designs, pipeline registers are first added, and the design is then
unfolded and retimed for effective placement of the registers, as explained in Section 8.4.5.
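The claim that unfolding replicates nodes but not registers can be checked with a small sketch. It assumes the standard unfolding rule, in which an edge U -> V with w delays becomes J edges U_i -> V_((i+w) mod J), each carrying floor((i+w)/J) delays. The two-node loop, the node names, and the helper functions below are hypothetical and are not taken from Figure 8.10.

```python
from collections import namedtuple

# A DFG edge: source node, destination node, number of registers (delays).
Edge = namedtuple("Edge", ["src", "dst", "delays"])


def unfold(edges, J):
    """Unfold a DFG by factor J using the standard edge rule (assumed here)."""
    unfolded = []
    for e in edges:
        for i in range(J):
            j = (i + e.delays) % J    # index of the destination copy
            w = (i + e.delays) // J   # delays on the unfolded edge
            unfolded.append(Edge(f"{e.src}_{i}", f"{e.dst}_{j}", w))
    return unfolded


def total_registers(edges):
    """Sum of delays over all edges."""
    return sum(e.delays for e in edges)


# Hypothetical feedback loop: adder -> multiplier (no delay),
# multiplier -> adder through seven algorithmic registers.
loop = [Edge("add", "mul", 0), Edge("mul", "add", 7)]

for J in (1, 2, 3):
    u = unfold(loop, J)
    # Prints: unfolding factor, number of edges, total registers.
    print(J, len(u), total_registers(u))
# prints:
# 1 2 7
# 2 4 7
# 3 6 7
```

The edge count grows with J while the register total stays at seven, which is why a design needs spare algorithmic registers before unfolding can help its timing.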

The registers in a DFG can be retimed for effective pipelining of the combinational cloud. When
the designer uses embedded computational units, or computational units with
limited pipeline support, there may be extra registers that cannot be used to reduce the critical
path of the design. In these designs the critical path is greater than the IPB. For example, the designer
might intend to use building blocks already embedded in an FPGA, such as DSP48 blocks. These blocks
have a fixed pipeline option, and extra registers do not help in achieving the IPB. By unfolding and
retiming, the unfolded design can be mapped onto the embedded blocks so that all the registers
are used effectively.
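A rough numerical sketch of why limited pipeline support leaves the critical path above the iteration period bound (IPB). It rests on several assumptions that are not from the text: the unit delays are made-up numbers, a unit with delay T that absorbs r registers is taken to split evenly into r stages, and every unit is limited to the same number of registers.

```python
from fractions import Fraction


def iteration_period_bound(loops):
    """IPB = max over loops of (combinational delay in loop / registers in loop)."""
    return max(Fraction(sum(delays), regs) for delays, regs in loops)


def limited_pipelining(unit_delays, loop_regs, regs_per_unit):
    """Critical path and idle registers when each unit absorbs a fixed number
    of registers (simplified model: a unit splits evenly into that many stages)."""
    used = regs_per_unit * len(unit_delays)
    if used > loop_regs:
        raise ValueError("loop does not contain enough registers")
    critical_path = max(Fraction(t, regs_per_unit) for t in unit_delays)
    return critical_path, loop_regs - used


# Hypothetical loop: adder (2 time units) and multiplier (4 time units)
# with seven algorithmic registers; each unit can use only two of them.
unit_delays = [2, 4]
loop_regs = 7

ipb = iteration_period_bound([(unit_delays, loop_regs)])
cp, idle = limited_pipelining(unit_delays, loop_regs, regs_per_unit=2)
print(ipb, cp, idle)  # prints: 6/7 2 3
```

In this model three of the seven registers sit idle and the critical path of 2 time units stays well above the IPB of 6/7. Unfolding creates more computational units, which gives those idle registers somewhere useful to go.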

Figure 8.10(a) shows a design with seven algorithmic registers. The registers can be retimed so
that each computational unit has two registers available as pipeline registers, as shown in
Figure 8.10(b). The design is then unfolded and the registers are retimed for optimal HW mapping, as
shown in Figure 8.10(c). The RTL Verilog code of the three designs is listed here:

/* IIR filter of Fig. 8.10(a), having excessive
   algorithmic registers */
module IIRFilter
(
    input clk, rst_n,
    input signed [15:0] xn, // Q1.15
