Digital Signal Processing Reference
processors. All processors are interconnected via a reconfigurable communication
network. The adaptive computing machine by QuickSilver [25] is a heterogeneous
system-on-chip. Nodes with different degrees of flexibility (adaptive nodes, domain-specific
[ASIC-like] nodes, and programmable nodes) can be interconnected by means of a mesh
interconnect. The XPP extreme processor platform [4] consists of clusters of computing elements
and a packet-oriented communication network. A cluster consists of a set of
parameterizable tiles. There are two types of tiles: memory tiles and ALU tiles. A memory tile has
a capacity of 256 or 512 words. The word size can be 16, 24, or 32 bits. The
architecture uses a large number of processors, which makes the chip suitable for high-end
applications. Silicon Hive provides reconfigurable accelerators designed according to a
hierarchical approach. At the lowest level, the basic component is a VLIW-like
processing and storage element (PSE). At the next level, multiple PSEs form a cell. A cell is a
processor capable of executing complete algorithms. One level higher, multiple cells can
be combined. The accelerators are intended to be integrated into an SoC. The ADRES architecture
template [6] consists of a tightly coupled VLIW processor and a coarse-grained
reconfigurable array. The reconfigurable array is intended to process computationally
intensive kernels of applications. The VLIW host processor uses parts of the reconfigurable
array to execute its instructions. The host processor and reconfigurable array are
therefore tightly coupled and share resources. This architecture is dedicated to applications
that require tight control of data flow operations. A similar approach is used in the
Chimaera architecture [24]. In this architecture, the reconfigurable part has direct access to
the host processor's register file. The Kilocore KC256 chip is a commercial version of the
PipeRench chip [21]. The chips are characterized by a multicore computing kernel where
cores can be cascaded to constitute multiple processing pipelines. Besides the
configurable ALU, a processing core contains only a register file and no memory.
Configuration and information data are stored in separate SRAMs. Because PipeRench is specifically
designed for pipelined applications, best performance is achieved when pipeline stages
are identical or perfectly balanced. A specific coarse-grained reconfigurable
architecture, developed at the University of Twente, is the Montium processor. The Montium
is described in more detail below and is used throughout this chapter in the
examples of coarse-grained reconfigurable processors.
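
The remark above that pipelined architectures such as PipeRench perform best with identical or perfectly balanced stages can be illustrated with a small calculation. The following is a minimal sketch under a simplifying assumption that is not taken from the source: every stage advances in lock step, so the slowest stage sets the rate for the whole pipeline. The function names and the stage delays are hypothetical.

```python
def pipeline_throughput(stage_delays):
    """Results per time unit in steady state; the slowest stage sets the pace."""
    return 1.0 / max(stage_delays)


def pipeline_utilization(stage_delays):
    """Fraction of the stage time slots that is spent on useful work."""
    slowest = max(stage_delays)
    return sum(stage_delays) / (slowest * len(stage_delays))


# Two hypothetical four-stage pipelines with the same total work (8.0 time units).
balanced = [2.0, 2.0, 2.0, 2.0]
unbalanced = [1.0, 1.0, 1.0, 5.0]

for name, delays in (("balanced", balanced), ("unbalanced", unbalanced)):
    print(name, pipeline_throughput(delays), pipeline_utilization(delays))
```

With equal total work, the balanced pipeline keeps every stage busy, while in the unbalanced one the three fast stages idle most of the time waiting for the slow stage.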
15.2.2.2 The Montium
The Montium is described in detail in [27]; this section discusses its general structure.
A single Montium processing tile is depicted in Figure 15.2.
The lower part of Figure 15.2 shows the communication and configuration unit
(CCU), which deals with the off-tile communication and configuration of the upper
part, the reconfigurable tile processor (TP). The TP is the computing part that can be
dynamically reconfigured to implement a particular algorithm. At first glance the TP
has a VLIW structure. However, the control structure of the Montium is very different.
For (energy) efficiency it is imperative to minimize the control overhead. This is, for
example, accomplished by scheduling instructions statically at compile time. A relatively
simple sequencer controls the entire tile processor. The sequencer selects configurable
tile instructions that are stored in the instruction decoding block (see Figure 15.2).
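
The idea of a simple sequencer that selects preconfigured instructions according to a compile-time schedule can be sketched as follows. This is a toy model and not the Montium's actual control path, which is described in [27]. The names `TileInstruction`, `Sequencer`, `decoder_table`, and `schedule`, as well as the instruction names, are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class TileInstruction:
    """Hypothetical stand-in for one configured tile instruction."""
    name: str


class Sequencer:
    """Toy sequencer: replays a schedule that was fixed at compile time."""

    def __init__(self, decoder_table: List[TileInstruction], schedule: List[int]):
        # Reject schedules that point outside the configured instruction set.
        for index in schedule:
            if not 0 <= index < len(decoder_table):
                raise ValueError(f"schedule refers to unconfigured slot {index}")
        self.decoder_table = decoder_table
        self.schedule = schedule

    def run(self) -> Iterator[Tuple[int, TileInstruction]]:
        # No run-time scheduling decisions: each cycle is a plain table lookup.
        for cycle, index in enumerate(self.schedule):
            yield cycle, self.decoder_table[index]


# Configuration step: a small set of instructions is loaded once.
decoder_table = [
    TileInstruction("multiply"),
    TileInstruction("accumulate"),
    TileInstruction("idle"),
]
# Compile-time schedule: one index into the decoder table per clock cycle.
schedule = [0, 1, 0, 1, 2]

trace = [instruction.name for _, instruction in Sequencer(decoder_table, schedule).run()]
print(trace)
```

Because the schedule is fixed before execution, the per-cycle control work in this sketch reduces to an index lookup, which is the kind of control-overhead reduction the text refers to.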