Digital Signal Processing Reference
In-Depth Information
Figure 15.2
The Montium tile processor and network interface.
[Figure labels: memories M01 to M10; ALU1 to ALU5, each with inputs A, B, C, D, outputs Out1 and Out2, and E/W ports to neighboring ALUs; instruction decoding; sequencer; communication and configuration unit.]
Furthermore, there are multiple ALUs (ALU1 … ALU5) and multiple memories
(M01 … M10). A single ALU has four inputs (A, B, C, D). Each input has a private input
register file that can store up to four operands. The input register file cannot be bypassed,
i.e., an operand is always read from an input register. Input registers can be written by
various sources via a flexible interconnect. An ALU has two outputs (OUT1, OUT2),
which are connected to the interconnect. The ALU is entirely combinational, and
consequently, there are no pipeline registers within the ALU. Neighboring ALUs can also
communicate directly: the west output (W) of an ALU connects to the east input (E) of
the ALU neighboring on the left.
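
The datapath described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the class `Alu`, the helper `run_chain`, and the multiply-accumulate function `mac` are hypothetical names introduced for illustration, and the text does not say which functions the real ALU implements. Only the four inputs, the four-entry input register files, the two outputs, and the west-to-east link come from the description.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

REGS_PER_INPUT = 4  # each input register file stores up to four operands

# (a, b, c, d, east) -> (out1, out2, west); a purely combinational function
AluFunc = Callable[[int, int, int, int, int], Tuple[int, int, int]]


@dataclass
class Alu:
    # One private register file per input A, B, C, D.
    regs: Dict[str, List[int]] = field(
        default_factory=lambda: {p: [0] * REGS_PER_INPUT for p in "ABCD"}
    )

    def write(self, port: str, index: int, value: int) -> None:
        """Write an operand into an input register (the interconnect's job)."""
        self.regs[port][index] = value

    def evaluate(self, select: Dict[str, int], east: int,
                 func: AluFunc) -> Tuple[int, int, int]:
        """Operands are always read from the input registers (no bypass)."""
        a, b, c, d = (self.regs[p][select[p]] for p in "ABCD")
        return func(a, b, c, d, east)


def mac(a: int, b: int, c: int, d: int, east: int) -> Tuple[int, int, int]:
    """Hypothetical ALU function: multiply-accumulate over the E/W chain."""
    out1 = a * b + east
    out2 = c + d
    return out1, out2, out1  # forward the partial sum on the west output


def run_chain(alus: List[Alu], select: Dict[str, int],
              func: AluFunc) -> List[Tuple[int, int]]:
    """Evaluate right to left: W of each ALU drives E of its left neighbor."""
    east = 0  # assumption: the rightmost ALU sees 0 on its east input
    results: List[Tuple[int, int]] = [(0, 0)] * len(alus)
    for i in reversed(range(len(alus))):
        out1, out2, west = alus[i].evaluate(select, east, func)
        results[i] = (out1, out2)
        east = west
    return results
```

With five `Alu` instances whose A registers hold 1 to 5 and whose B registers all hold 10, `run_chain` accumulates the products from right to left, so the leftmost ALU's first output carries the full sum of products.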
The ALUs support both signed integer and signed fixed-point arithmetic. The five
identical ALUs in a tile can exploit spatial concurrency to enhance performance. This
parallelism demands a very high memory bandwidth, which is obtained by having ten
local memories in parallel.
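
To make the difference between signed integer and signed fixed-point arithmetic concrete, the following sketch multiplies two values in both interpretations. The 16-bit word width, the Q15 format (15 fractional bits), and saturation on overflow are assumptions made for illustration; the text above does not specify them, and the function names are hypothetical.

```python
WORD_BITS = 16   # assumed word width, not stated in the text
FRAC_BITS = 15   # assumed Q15 format: 1 sign bit, 15 fractional bits

INT_MIN = -(1 << (WORD_BITS - 1))
INT_MAX = (1 << (WORD_BITS - 1)) - 1


def saturate(x: int) -> int:
    """Clamp a result to the signed word range (assumed overflow behavior)."""
    return max(INT_MIN, min(INT_MAX, x))


def to_q15(x: float) -> int:
    """Convert a real value in [-1, 1) to its Q15 integer representation."""
    return saturate(round(x * (1 << FRAC_BITS)))


def int_mul(a: int, b: int) -> int:
    """Signed integer multiply: the product is used as is, then saturated."""
    return saturate(a * b)


def q15_mul(a: int, b: int) -> int:
    """Signed fixed-point multiply: rescale the product by 2**-15."""
    return saturate((a * b) >> FRAC_BITS)
```

The same bit patterns give different results in the two modes: `q15_mul(to_q15(0.5), to_q15(0.5))` yields the Q15 encoding of 0.25, whereas `int_mul` on the same operands overflows and saturates.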
An address generation unit (AGU; not shown in Figure 15.2) accompanies each
memory. The AGU can generate the typical memory access patterns found in common DSP
algorithms, e.g., incremental, decremental, and bit-reversal addressing. It is also possible
to use the memory as a lookup table for complicated functions that cannot be calculated
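
The three access patterns named above (incremental, decremental, and bit-reversal addressing) can be sketched in software as follows. The function names, the `step` parameter, and the wraparound at the memory size are assumptions for illustration; the text does not describe how the AGU is actually configured.

```python
from typing import List


def incremental(start: int, count: int, size: int, step: int = 1) -> List[int]:
    """Addresses start, start+step, ... wrapping at the memory size (assumed)."""
    return [(start + i * step) % size for i in range(count)]


def decremental(start: int, count: int, size: int, step: int = 1) -> List[int]:
    """Addresses start, start-step, ... wrapping at the memory size (assumed)."""
    return [(start - i * step) % size for i in range(count)]


def bit_reversed(bits: int) -> List[int]:
    """All 2**bits addresses in bit-reversed order, as used by radix-2 FFTs."""
    return [int(format(i, f"0{bits}b")[::-1], 2) for i in range(1 << bits)]
```

For an 8-entry buffer, `bit_reversed(3)` gives `[0, 4, 2, 6, 1, 5, 3, 7]`, which is the order in which a radix-2 FFT reads or writes its samples.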
 