Digital Signal Processing Reference
In-Depth Information
using an ALU such as sine or division with a single constant value. A memory can be
used for both integer and fixed-point lookups.
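As a minimal sketch of the lookup idea, the fragment below precomputes a fixed-point sine table and a table for division by one constant, so that each evaluation becomes a single memory read. The table size, the Q15 number format, the divisor, and all names are assumptions chosen for illustration; they are not details of the Montium memories.

```python
import math

TABLE_SIZE = 256       # assumed number of memory entries
Q = 15                 # assumed fixed-point format (Q15)
SCALE = (1 << Q) - 1   # largest positive Q15 value, 32767

# Fixed-point lookup: entry i holds sin(2*pi*i/TABLE_SIZE) as a Q15 integer.
SINE_LUT = [round(math.sin(2 * math.pi * i / TABLE_SIZE) * SCALE)
            for i in range(TABLE_SIZE)]

# Integer lookup: entry i holds i divided by a single constant.
DIVISOR = 3            # assumed constant
DIV_LUT = [i // DIVISOR for i in range(TABLE_SIZE)]


def sine_q15(index):
    """Return the Q15 sine of phase index/TABLE_SIZE of a full turn."""
    return SINE_LUT[index % TABLE_SIZE]


def div_by_const(value):
    """Return value // DIVISOR for 0 <= value < TABLE_SIZE via one table read."""
    return DIV_LUT[value]
```

The table trades memory for computation: the cost of the function moves to the moment the table is filled, and the run-time cost is one read regardless of how expensive the function is.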
The reconfigurable elements within the Montium are the sequencer, the instruction
decoding block, and the AGUs. Their functionality can be changed at runtime. The
Montium is programmed in two steps. In the first step, the AGUs are configured and
a limited set of instructions is defined by configuring the instruction decoding block.
In the second step, the sequencer is instructed to select the required instructions in
sequence. A predefined instruction set is available in the form of assembly-style
mnemonics (Montium assembly). A compiler has been constructed to convert a Montium
assembly program into configuration data for both the instruction decoding block and
the sequencer.
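The two-step scheme can be pictured with a toy model: first a small table of instructions is loaded into a decoder, then a sequencer program selects entries from that table in order. This is only a sketch under stated assumptions. The names DECODER_CONFIG, SEQUENCER_PROGRAM, and run are hypothetical, the AGU configuration is left out, and nothing here reflects actual Montium assembly or the format of the compiler's configuration data.

```python
import operator

# Step 1 (hypothetical): configure the decoder with a limited instruction set.
# Each small index selects one operation.
DECODER_CONFIG = {
    0: operator.add,
    1: operator.mul,
    2: operator.sub,
}

# Step 2 (hypothetical): the sequencer program is just the order in which
# the configured instructions are selected.
SEQUENCER_PROGRAM = [1, 0, 2]


def run(decoder, sequence, acc, operands):
    """Apply the selected instructions one after another to an accumulator."""
    for index, operand in zip(sequence, operands):
        acc = decoder[index](acc, operand)
    return acc


result = run(DECODER_CONFIG, SEQUENCER_PROGRAM, 2, [3, 4, 5])
```

With these inputs the accumulator goes from 2 to 6 (multiply by 3), to 10 (add 4), and to 5 (subtract 5). The point of the split is that the sequencer only needs short indices, because the wide control information lives in the decoder configuration.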
15.2.2.2.1 Energy Consumption
Using power estimation tooling, the dynamic power consumption of a typical multiply-
accumulate (MAC) operation in the Montium is estimated to be about 0.5 mW/MHz,
realized in 130 nm complementary metal oxide semiconductor (CMOS) technology. The
area of a single Montium TP is about 2 mm² in this technology [26].
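To make the mW/MHz figure concrete, the short calculation below converts it to power at a given clock frequency and to energy per clock cycle. The 100 MHz clock in the usage line is an assumed example value, not a figure from the text, and the function names are illustrative.

```python
POWER_PER_MHZ_MW = 0.5  # dynamic power per MHz, as quoted in the text


def dynamic_power_mw(clock_mhz):
    """Dynamic power in mW at the given clock frequency in MHz."""
    return POWER_PER_MHZ_MW * clock_mhz


def energy_per_cycle_nj():
    """Energy per clock cycle in nJ.

    1 mW/MHz = 1e-3 W / 1e6 Hz = 1e-9 J, so the mW/MHz figure
    is numerically equal to nJ per cycle.
    """
    return POWER_PER_MHZ_MW


power_at_100_mhz = dynamic_power_mw(100)  # assumed example clock
```

At the assumed 100 MHz clock this gives 50 mW, and the energy per clock cycle is 0.5 nJ at any frequency. Reading that as energy per MAC operation additionally assumes that one MAC completes per cycle.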
15.2.3 Tiled Architecture
In a tiled architecture, relatively complex elements (tiles) are replicated on a single
integrated circuit (IC). The tiles are interconnected via an on-chip network. Tiled
architectures are becoming increasingly popular because a tile has to be designed only
once, after which it can be copied onto a single IC multiple times. By adding more tiles
to the chip, it is relatively easy to profit from shrinking feature sizes: the computation
model, programming model, interconnection structure, and memory organization
can stay the same. Below, the following prominent tiled architectures are discussed: the
RAW processor [49], the Cell processor [29], the Polaris processor [47], and the Tile64
processor (see www.tilera.com). Furthermore, the Chameleon heterogeneous tiled
architecture is introduced.
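The scaling argument can be sketched in a few lines: one tile description is instantiated many times, and the same wiring rule connects the copies whatever the chip size. The two-dimensional mesh with nearest-neighbor links is an assumption made for this illustration, since the text only says that tiles are connected by an on-chip network. Tile and build_mesh are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Tile:
    """One replicated tile; designed once, instantiated many times."""
    row: int
    col: int
    neighbors: list = field(default_factory=list)  # coordinates of linked tiles


def build_mesh(rows, cols):
    """Instantiate rows*cols identical tiles and link nearest neighbors."""
    tiles = {(r, c): Tile(r, c) for r in range(rows) for c in range(cols)}
    for (r, c), tile in tiles.items():
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            if (r + dr, c + dc) in tiles:
                tile.neighbors.append((r + dr, c + dc))
    return tiles


small_chip = build_mesh(2, 2)   # 4 tiles
larger_chip = build_mesh(4, 4)  # 16 tiles, same tile and same wiring rule
```

Moving from the small to the larger chip changes only the two arguments of build_mesh; the tile definition and the interconnection rule are untouched.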
15.2.3.1 The RAW Processor
The RAW processor is one of the earliest tiled architectures. A RAW processor consists
of a set of relatively simple tiles interconnected by a set of switches. Each tile contains
instruction memory, data memories, an ALU, registers, configurable logic, and a
programmable switch with an associated instruction memory. The general idea is that the
internal hardware structure of both the tile and the switches is exposed to the compiler.
As a result, there are two sets of control logic: operation control for the processor and
sequencing of routing instructions for the switches. A consequence is that the burden on
the compiler is high, which leads to relatively long compile times. The configurable logic
in each tile supports a few wide-word or many narrow-word operations and is
coarser-grained than that of FPGA-based processors.
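The idea of two separately programmed control streams per tile can be illustrated with a toy model in which a compiler-produced processor program and a compiler-produced switch program advance together, one entry per cycle. This is a sketch under stated assumptions: the operation names, the port names, the lockstep execution, and the function run_tile are all hypothetical and do not describe the real RAW instruction formats.

```python
def run_tile(proc_program, switch_program, inputs):
    """Step a toy tile: each cycle runs one processor op and one routing op.

    proc_program:   list of (op, operand_index) pairs for the processor.
    switch_program: list of output port names, or None for "no route".
    inputs:         operand values available to the processor.
    """
    acc = 0
    outputs = {"north": [], "east": [], "south": [], "west": []}
    for (op, operand_index), route in zip(proc_program, switch_program):
        value = inputs[operand_index]
        if op == "add":
            acc += value
        elif op == "mul":
            acc *= value
        else:
            raise ValueError(f"unknown operation: {op}")
        if route is not None:
            # The switch forwards the current result to the chosen port.
            outputs[route].append(acc)
    return acc, outputs


final_acc, routed = run_tile(
    [("add", 0), ("mul", 1), ("add", 2)],  # processor instruction stream
    [None, "east", "south"],               # switch instruction stream
    [2, 3, 4],
)
```

Both lists have to be produced ahead of time and kept consistent with each other, which hints at why exposing the switches to the compiler increases its workload.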
15.2.3.2 The Cell Processor
The Cell processor consists of eight replicas of a synergistic processor element (SPE)
and a single power processor element (PPE) with a Power core. An SPE consists of a
 