Design Space Exploration of Parallel Architectures - Multi-objective Design Space Exploration of Multiprocessor SoC Architectures

Fig. 8.6 The ICT Many-Core architecture (transformer). [Diagram not reproduced; labels: node, R, MCU, router.]

which implements the interconnection network. Specifically, each node implements a two-issue out-of-order pipelined processor architecture with an independent program counter and a two-level cache hierarchy. The memory sub-system is distributed (and thus not shared among cores) to simplify the architecture of the system.

8.3.2 Node Architecture and Instruction Set

Each node has separate level 1 instruction and data caches and a unified level 2 cache. Because multimedia applications need 64-bit computing, the 64-bit MIPS-III ISA is chosen for the node. Some MAC (Multiply-Accumulate) instructions, which need three source operands, are also supported by the node.
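To illustrate why a MAC instruction needs three source operands, the following is a minimal Python sketch of a 64-bit multiply-accumulate. The text does not specify the exact semantics of the node's MAC instructions (signedness, operand encoding, overflow behaviour), so the wrap-around to 64 bits and the names `mac64` and `dot_product` are assumptions made only for illustration.

```python
MASK64 = (1 << 64) - 1  # assumption: results wrap to 64 bits


def mac64(acc: int, a: int, b: int) -> int:
    """Hypothetical MAC: reads three sources (acc, a, b), returns acc + a * b."""
    return (acc + a * b) & MASK64


def dot_product(xs, ys) -> int:
    """Typical multimedia-style use: one MAC per pair of elements."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = mac64(acc, x, y)
    return acc
```

A plain two-source multiply followed by a separate add would need two instructions per element; folding the accumulator in as a third source lets the inner loop use one.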

Each node implements a two-issue out-of-order pipelined processor architecture, as shown in Fig. 8.7. The pipeline is divided into six stages: fetch, decode, map, issue, execute and write back. Memory operations pass through one additional stage, data cache, between execute and write back. To achieve reasonable performance while keeping the complexity of the node low, a scoreboard-based out-of-order pipeline is used. The scoreboard unit in each node accepts decoded instructions from the map stage and issues them to the functional units (address generators, ALUs and FPUs) while satisfying the dependencies among the instructions. To this end, the main element of the scoreboard is the instruction queue, which holds decoded instructions and issues them once the resources they require are free and their dependencies have been cleared.
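The issue rule described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the node's actual scoreboard logic: it assumes one functional unit of each kind, that operands are read at issue, that a unit stays busy for the whole latency of an instruction, and it invents the names `Instr`, `Scoreboard`, `accept` and `step` for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    unit: str        # "agu" (address generator), "alu" or "fpu"
    dest: str        # destination register
    srcs: tuple      # source registers
    latency: int = 1  # cycles the unit stays busy (assumed values)


class Scoreboard:
    def __init__(self, units=("agu", "alu", "fpu"), issue_width=2):
        self.units = set(units)   # assumption: one unit of each kind
        self.issue_width = issue_width
        self.queue = []           # decoded instructions, in program order
        self.in_flight = []       # [instr, remaining_cycles] pairs

    def accept(self, instr):
        """Accept a decoded instruction from the map stage."""
        if instr.unit not in self.units:
            raise ValueError(f"unknown functional unit: {instr.unit}")
        self.queue.append(instr)

    def _ready(self, index):
        instr = self.queue[index]
        busy_units = {i.unit for i, _ in self.in_flight}
        pending = {i.dest for i, _ in self.in_flight}
        if instr.unit in busy_units:
            return False          # required resource is not free
        if instr.dest in pending or pending & set(instr.srcs):
            return False          # WAW or RAW on an in-flight result
        for older in self.queue[:index]:
            # Do not overtake an older queued instruction we depend on
            # (RAW), that writes our destination (WAW) or reads it (WAR).
            if (older.dest in instr.srcs or older.dest == instr.dest
                    or instr.dest in older.srcs):
                return False
        return True

    def step(self):
        """Advance one cycle; return the names of instructions issued."""
        for entry in self.in_flight:
            entry[1] -= 1
        self.in_flight = [e for e in self.in_flight if e[1] > 0]
        issued, index = [], 0
        while index < len(self.queue) and len(issued) < self.issue_width:
            if self._ready(index):
                instr = self.queue.pop(index)
                self.in_flight.append([instr, instr.latency])
                issued.append(instr.name)
            else:
                index += 1
        return issued


def run(program, cycles):
    """Feed a program to a fresh scoreboard and record issues per cycle."""
    sb = Scoreboard()
    for instr in program:
        sb.accept(instr)
    return [sb.step() for _ in range(cycles)]
```

For example, given a three-cycle FPU multiply `i1`, a dependent FPU add `i2`, an independent ALU add `i3` and an address computation `i4` that depends on `i3`, the model issues `i3` and `i4` ahead of `i2`, which waits until the multiply has finished and the FPU is free.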
