Digital Signal Processing Reference
In-Depth Information
Figure 7.4
(a) Feedforward cut-set. (b) Pipeline registers added on each edge of the cut-set
Maintaining data coherency is a critical issue, but a cut-set eases the task of the designer in figuring
out the coherency issues in a complex design.
Figure 7.4 shows an example of a feedforward cut-set. If the two feedforward edges 1 → 2
and 1 → 3 are removed, the graph becomes disjoint, consisting of node 1 in one graph and
nodes 2 and 3 in the other graph, and the two parallel paths from input to output,
in → 1 → 2 → out1 and in → 1 → 3 → out2, each intersect this cut-set line once.
Adding N registers to each edge of a feedforward cut-set of a DFG maintains data coherency,
but the respective output is delayed by N cycles. Figure 7.4(b) shows three pipeline registers that
are added on each edge of the cut-set of Figure 7.4(a). The respective outputs are delayed by three
cycles.
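The delay property stated above can be checked with a small cycle-by-cycle simulation. The following is only a sketch: the operations performed by nodes 1, 2 and 3 are not given in the text, so the functions node1, node2 and node3 are hypothetical placeholders, and each pipeline register chain is modelled as a FIFO that is assumed to reset to None.

```python
from collections import deque

# Hypothetical node operations; the text does not specify them.
def node1(x):
    return x + 1

def node2(v):
    return 2 * v

def node3(v):
    return v * v

def run_dfg(inputs, n_regs):
    """Simulate in -> 1 -> {2, 3} with n_regs registers on edges 1->2 and 1->3."""
    # One register chain per cut-set edge, assumed to hold None after reset.
    edge12 = deque([None] * n_regs)
    edge13 = deque([None] * n_regs)
    out1, out2 = [], []
    for x in inputs:
        v = node1(x)
        # Every clock cycle a new value enters each chain and the oldest leaves.
        edge12.append(v)
        edge13.append(v)
        a = edge12.popleft()
        b = edge13.popleft()
        out1.append(None if a is None else node2(a))
        out2.append(None if b is None else node3(b))
    return out1, out2

xs = [1, 2, 3, 4, 5, 6]
ref1, ref2 = run_dfg(xs, 0)   # original DFG, no pipeline registers
pip1, pip2 = run_dfg(xs, 3)   # three registers on each cut-set edge
print(ref1, ref2)
print(pip1, pip2)
```

Because both edges of the cut-set receive the same number of registers, out1 and out2 stay aligned with each other; each is simply the unpipelined output shifted by three cycles.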
Example: The difference equation and corresponding transfer function H(z) of a 5-coefficient
FIR filter are:
$$y_n = h_0 x_n + h_1 x_{n-1} + h_2 x_{n-2} + h_3 x_{n-3} + h_4 x_{n-4} \tag{7.1a}$$
$$H(z) = h_0 + h_1 z^{-1} + h_2 z^{-2} + h_3 z^{-3} + h_4 z^{-4} \tag{7.1b}$$
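As a behavioural reference for the difference equation, a direct evaluation might look like the sketch below. The function name fir_filter and the coefficient values are illustrative assumptions, and samples before n = 0 are assumed to be zero.

```python
def fir_filter(h, x):
    """Evaluate y[n] = sum_k h[k] * x[n - k], taking x[n] = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y

# Illustrative coefficients only; the text does not give values for h_0..h_4.
h = [1, 2, 3, 4, 5]
impulse = [1, 0, 0, 0, 0, 0]
print(fir_filter(h, impulse))  # the impulse response reproduces the coefficients
```

Feeding a unit impulse returns the coefficients in order, which matches reading h_0 to h_4 off the transfer function H(z).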
Assuming that general-purpose multipliers and adders are used, the critical path delay of the
DFG consists of the accumulated delay of the combinational cloud of one multiplier, T_mult, and
four adders, 4 × T_adder. Assuming T_mult is 2 time units (tu) and T_adder = 1 tu, the critical
path of the design is 6 tu. It is desired to reduce this critical path by partitioning the design
into two levels of pipeline stages. One feedforward cut-set can be used for appropriately adding
one level of pipeline registers in the design. Two possible cut-set lines that reduce the critical
path of the design to 4 tu are shown in Figure 7.5(a). Cut-set line 2 is selected for adding
pipelining as it requires the addition of only two registers. The registers are added and the
pipelined design is shown in Figure 7.5(b). This reflects that the pipeline improvement is limited
by the slowest pipeline stage. In this example the slowest pipeline stage consists of a multiplier
and two adders.
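The 6 tu and 4 tu figures can be reproduced by computing the longest register-free path through the DFG. The sketch below makes several assumptions: Figure 7.5 is not reproduced here, so the graph is taken to be a direct-form FIR (multipliers m0 to m4 feeding an adder chain a1 to a4), and the selected cut-set is assumed to cross the adder chain between a2 and a3. The node names and the helper functions are hypothetical.

```python
from functools import lru_cache

T_MULT, T_ADD = 2, 1  # delays in time units (tu), as assumed in the text

def build_fir_dfg(registered_edges=()):
    """Return (delay, edges) for an assumed 5-tap direct-form FIR.

    edges maps (src, dst) to the number of pipeline registers on that edge.
    The tap delay line is not modelled: its edges already carry registers,
    so it does not contribute to any combinational path.
    """
    delay = {f"m{k}": T_MULT for k in range(5)}
    delay.update({f"a{k}": T_ADD for k in range(1, 5)})
    edges = {("m0", "a1"): 0}
    for k in range(1, 5):
        edges[(f"m{k}", f"a{k}")] = 0
        if k < 4:
            edges[(f"a{k}", f"a{k + 1}")] = 0
    for e in registered_edges:
        edges[e] += 1
    return delay, edges

def critical_path(delay, edges):
    """Longest accumulated node delay along edges that carry no register."""
    succ = {}
    for (u, v), regs in edges.items():
        if regs == 0:
            succ.setdefault(u, []).append(v)

    @lru_cache(maxsize=None)
    def longest_from(u):
        return delay[u] + max((longest_from(v) for v in succ.get(u, [])), default=0)

    return max(longest_from(u) for u in delay)

original = critical_path(*build_fir_dfg())
# Assumed cut: one register on the adder-chain edge a2 -> a3. The second
# register of the cut-set would sit on the tap delay line, which is not modelled.
pipelined = critical_path(*build_fir_dfg([("a2", "a3")]))
print(original, pipelined, original / pipelined)  # 6 4 1.5
```

Under these assumptions both stages contain one multiplier and two adders, so the clock period drops from 6 tu to 4 tu, a speed-up of 1.5.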
Although two-level pipelining could ideally double the speed of the original design, this
speed-up is not achieved owing to the unbalanced length of the pipeline stages. The optimal
speed-up requires breaking the critical path into two equal segments of 3 tu each. This requires
pipelining inside the adder, which may not be very convenient to implement or may require