Digital Signal Processing Reference
Table 4  Performance comparison for the fourth order IIR filter

             # of cycles                        SQNR (dB)
Processor    Floating-p.   Integer   Speed-up   Floating-p.   Integer
'C50               2,980       100       29.8             -      49.3
'C60               3,659         9      406.6             -      57.9
56000             26,282       921       28.5             -      78.5
Table 5  Shift reduction results of the fourth order IIR filter

                                       IWL increment upper bound
Processor    Measure                   0 (No shift reduction)   3      Infinite
'C50, 'C60   # of shifts in C codes    7                        4      2
'C50         # of cycles               100                      96     94
             Speedup                   -                        4%     6%
             SQNR (dB)                 49.3                     51.2   54.1
'C60         # of cycles               9                        6      8
             Speedup                   -                        33%    11%
             SQNR (dB)                 57.9                     57.1   54.2
56000        # of shifts in C codes    5                        3      2
             # of cycles               921                      675    577
             Speedup                   -                        27%    37%
             SQNR (dB)                 78.5                     78.5   78.5
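The two tables appear to use "speed-up" in two different senses. The figures in Table 4 match a ratio of floating-point cycles to integer cycles, while the percentages in Table 5 match the fraction of cycles saved relative to the code without shift reduction. The second definition is inferred from the numbers and is not stated in the text. The following C sketch reproduces both calculations; the function names speedup_ratio and cycle_reduction_percent are hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Table 4 style "speed-up": floating-point cycle count divided by
 * integer cycle count. Hypothetical helper, not from the source. */
double speedup_ratio(long float_cycles, long int_cycles)
{
    assert(int_cycles > 0);
    return (double)float_cycles / (double)int_cycles;
}

/* Table 5 style "speedup": percentage of cycles saved relative to the
 * baseline code. This definition is an assumption inferred from the
 * tabulated values. */
double cycle_reduction_percent(long baseline_cycles, long reduced_cycles)
{
    assert(baseline_cycles > 0);
    return 100.0 * (double)(baseline_cycles - reduced_cycles)
                 / (double)baseline_cycles;
}
```

For example, speedup_ratio(3659, 9) evaluates to about 406.6, the 'C60 entry in Table 4, and cycle_reduction_percent(9, 6) evaluates to about 33.3, which rounds to the 33% listed for the 'C60 in Table 5.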
In the fourth order IIR filter, the speed-up, which is the ratio of the execution time
of the floating-point version to that of the integer version, was 29.8, 406.6, and 28.5
for the 'C50, 'C60, and Motorola 56000, respectively, as shown in Table 4. The remarkable
speed-up of the 'C60 is mainly due to its deeply pipelined VLIW architecture with a large
register file and an efficient C compiler. This machine can execute up to 8 integer
operations in one cycle and keep all the variables of a small loop kernel in
registers, but it needs a large number of no-operation cycles to flush the pipeline
registers on floating-point function calls. The compiler for the 'C60 is very efficient
because it has several compiler-friendly components, such as large general-purpose
register files, an orthogonal instruction set, and a VLIW scheduler [7]. The developed
shift reduction technique was applied to this example. Without imposing an upper bound
on the IWL increment, the number of shift operations in the converted C code is reduced
from 7 to 2 for TI's 'C50 and 'C60. The number of shifts in the C code for the Motorola
56000 differs from that of the TI DSPs because the IWL of multiplication results is
different, as described in the previous section. The cycle counts of the shift-reduced
codes are shown in Table 5. As the table shows, the 'C60 achieves a speed-up of up to
33% from the shift optimization. The 'C50 shows a relatively low speed-up because
its shifts can be performed by load-store instructions with no additional cycles.
For the Motorola 56000, a high speed-up can be achieved because its shift
cost is much higher than that of the other DSPs, which employ barrel shifters. The
SQNR of the fixed-point implementations was measured as 49.3 dB, 57.9 dB and
 
 
 