Digital Signal Processing Reference
In-Depth Information
exceeds the multiplier size in the Altera DSP blocks when the wordlength grows
above 18 bits, as clearly seen in Fig. 3b, so the number of blocks goes up, as does
the number of ALMs used as registers and LUTs. With the Xilinx device, the number of
DSP48 blocks remains constant, as they can support one input word size of 25 bits,
as can be seen in Fig. 3a. The register and LUT counts then increase accordingly
with word size. This highlights the importance of relating the wordlength chosen
during the algorithmic development stages to the implementation on a specific FPGA
platform.
3.2 Retiming
The throughput rate of the FIR filter implementations is not comparable to the
performance of a single DSP block. This is because the critical path of the FIR
filter structure consists of one multiplier and 63 additions. Of course, a key
advantage is that this speed is not the clock rate but the sampling rate, as 64
multiplications and 63 additions are performed on each clock cycle, unlike a
sequential processor where the clock rate has to be divided by the number of
operations.
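The comparison above can be made concrete with a little arithmetic. The sketch below is illustrative only: the multiplier and adder delays and the processor clock are hypothetical placeholders, not figures from the text, and the sequential model assumes one operation per clock cycle.

```python
# Illustrative arithmetic only: all delay and clock figures passed in below are
# hypothetical placeholders, not measurements from any device.

def critical_path_ns(taps, t_mult_ns, t_add_ns):
    """Critical path of the unpipelined filter: one multiplier followed by
    (taps - 1) additions in series."""
    return t_mult_ns + (taps - 1) * t_add_ns


def parallel_sample_rate_mhz(taps, t_mult_ns, t_add_ns):
    """One output sample per clock, with the clock limited by the critical path."""
    return 1e3 / critical_path_ns(taps, t_mult_ns, t_add_ns)


def sequential_sample_rate_mhz(taps, clock_mhz):
    """Assumes one multiply or add per clock cycle on a sequential processor."""
    operations = taps + (taps - 1)  # 64 multiplications + 63 additions
    return clock_mhz / operations


# Hypothetical figures: 2.0 ns multiplier, 0.5 ns adder, 500 MHz processor.
print(critical_path_ns(64, 2.0, 0.5))
print(parallel_sample_rate_mhz(64, 2.0, 0.5))
print(sequential_sample_rate_mhz(64, 500.0))
```

Under these assumed figures the long adder chain limits the clock, yet the parallel structure still produces one sample per clock, whereas the sequential model spends 127 cycles per sample.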
Pipelining can be applied to improve throughput by performing
retiming [5], as highlighted in chapter [3]. Retiming has also been applied in
synchronous designs for clock period reduction [5], power consumption reduction
[9], and logic synthesis in general.
For a circuit with two nodes U and V and an edge e carrying ω delays between them,
a retimed circuit can be derived where ω_r delays are now present on the retimed
edge. The ω_r value is computed using Eq. (2), where r(U) and r(V) are the retiming
values for nodes U and V respectively.
$$\omega_r(e) = \omega(e) + r(U) - r(V) \qquad (2)$$
Retiming has a number of properties [11].
1. Weight of any retimed path is given by Eq. (2).
2. Retiming does not change the number of delays in a cycle.
3. Retiming does not alter the iteration bound in a DFG as the number of delays in
a cycle does not change.
4. Adding the constant value, j, to the retiming value of each node does not alter the
number of delays in the edges of the retimed graph.
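Properties 2 and 4 can be checked numerically on a toy graph. The following is a minimal sketch, assuming Eq. (2) has the form ω_r(e) = ω(e) + r(U) − r(V), with each edge stored under the key (U, V) and directed from U to V. The three-node loop and its retiming values are hypothetical, and some texts attach the signs to r(U) and r(V) the other way round.

```python
# Minimal sketch: the graph and retiming values are hypothetical examples.

def retime(edges, r):
    """Apply Eq. (2) to every edge.

    edges maps an edge key (U, V) to its delay count w(e);
    r maps each node to its retiming value.
    """
    return {(u, v): w + r[u] - r[v] for (u, v), w in edges.items()}


def total_delays(edges, cycle):
    """Sum the delays along a cycle given as a list of edge keys."""
    return sum(edges[e] for e in cycle)


# Hypothetical loop A -> B -> C -> A carrying three delays in total.
edges = {("A", "B"): 1, ("B", "C"): 0, ("C", "A"): 2}
cycle = [("A", "B"), ("B", "C"), ("C", "A")]
r = {"A": 0, "B": 1, "C": 0}

retimed = retime(edges, r)

# Property 2: the r terms cancel around a cycle, so its delay count is unchanged.
assert total_delays(retimed, cycle) == total_delays(edges, cycle)

# Property 4: adding the same constant j to every r value changes no edge.
j = 5
shifted = retime(edges, {node: value + j for node, value in r.items()})
assert shifted == retimed

# A retiming is only realisable if no edge ends up with a negative delay count.
assert all(w >= 0 for w in retimed.values())
```

Here the delay on the edge (A, B) moves to (B, C), while the loop still holds three delays, which is why the iteration bound in property 3 is unaffected.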
Of course, the key trick is to be able to work out the retiming values r(U) and
r(V) which result in a better retimed graph. A number of retiming routines have
been developed but the most intuitive one is the cut-set retiming technique defined
in [4].
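The cut-set idea can be expressed as a special case of Eq. (2). The sketch below is not the routine from [4]. It assumes the commonly used formulation in which a cut splits the nodes into two groups, every node in one group gets the same retiming value k, and every other node gets 0. It also assumes Eq. (2) reads ω_r(e) = ω(e) + r(U) − r(V) for an edge stored under the key (U, V). The function name and the example graph are hypothetical.

```python
# Sketch of cut-set retiming as a special case of Eq. (2); names are hypothetical.

def cutset_retime(edges, group1, k):
    """Retime across a cut that separates group1 from the remaining nodes.

    Every node in group1 receives r = k and every other node r = 0, so edges
    keyed (U, V) with only U in group1 gain k delays, edges with only V in
    group1 lose k delays, and edges inside either group are unchanged.
    """
    nodes = {node for edge in edges for node in edge}
    r = {node: (k if node in group1 else 0) for node in nodes}
    retimed = {(u, v): w + r[u] - r[v] for (u, v), w in edges.items()}
    if any(w < 0 for w in retimed.values()):
        raise ValueError("this cut would need a negative number of delays")
    return retimed


# Hypothetical loop A -> B -> C -> A with both delays on the edge (C, A).
edges = {("A", "B"): 0, ("B", "C"): 0, ("C", "A"): 2}

# Cut node A away from {B, C} and shift one delay across the cut.
retimed = cutset_retime(edges, group1={"A"}, k=1)
print(retimed)
```

In this toy case one delay moves from (C, A) to (A, B), which breaks up the delay-free path from A to C without changing the total number of delays in the loop.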