in a (separate) concurrent process. With few exceptions such as goto statements
and dynamic memory allocations, any compound statement in C is allowed inside
the thread scope. With minimal code modifications, the sequential C code is quickly turned into a parallel specification of the application. The TCT execution model is a program-driven MIMD (Multiple Instruction Multiple Data) model in which each thread is statically allocated to one processor. The execution model supports multiple simultaneous flows of control, which enables the modeling of different parallelism schemes such as functional pipelining and task parallelism. Three thread synchronization instructions implement the decentralized thread interaction in the generated parallel code. The TCT compiler takes the annotated C source code as input, performs dependency analysis among the threads, and generates parallel code with thread communications inserted to guarantee correct concurrent execution. The TCT MPSoC, which is based on an existing AMBA SoC platform with a homogeneous distributed-memory multiprocessor array, is supported as the code-generation backend: a hardware prototype and a software tool called the TCT trace scheduler are available for the designer to simulate the partitioned application and obtain its performance results.
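As a rough illustration, the following sketch shows how a compound statement in the sequential C code might be enclosed in a thread scope. The THREAD(name) annotation is an assumption made here for illustration only and need not match the actual TCT notation; with the empty macro below, the annotated code still builds and runs sequentially.

/* Sketch only: THREAD(name) is assumed to mark a compound statement as a
 * thread scope that the TCT compiler maps onto its own processor.        */
#define THREAD(name)   /* annotation consumed by the parallelizing compiler */

#define N 256

static int fir(int x)      { return x / 2; }   /* stand-in filter    */
static int quantize(int x) { return x & ~3; }  /* stand-in quantizer */

void process_frame(const int *in, int *out)
{
    int tmp[N];

    THREAD(stage_filter) {        /* candidate first pipeline stage  */
        for (int i = 0; i < N; i++)
            tmp[i] = fir(in[i]);
    }

    THREAD(stage_quant) {         /* candidate second pipeline stage */
        for (int i = 0; i < N; i++)
            out[i] = quantize(tmp[i]);
    }
}

In such a partitioning, the compiler would detect the dependency on tmp between the two thread scopes and insert the corresponding thread communication so that the pipeline executes correctly.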
MPA
The MPSoC Parallelization Assist (MPA) tool [60] was developed at IMEC to help designers map applications onto an embedded multicore platform efficiently and explore the solution space quickly. The starting point of the exploration flow is the sequential C source code of an application. The initial parallelization specification is guided by profiling the application, either with an instruction set simulator (ISS) or through source code instrumentation and execution on a target platform. The key input to the tool is the
Parallelization Specification (ParSpec), which is a separate file from the application
source code and orchestrates the parallelization. The ParSpec is composed of one or
more parallel sections to specify the computational partitioning of the application:
outside a parallel section the code is sequentially executed, whereas inside a parallel
section all code must be assigned to at least one thread. Both functional and data-
level splits can be expressed in the ParSpec. Keeping the ParSpec separate from the
application code allows the designer to try out different mapping strategies easily.
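The concrete ParSpec syntax is not reproduced here; the fragment below only illustrates, on plain C and with hypothetical thread names, what a functional split and a data-level split of the same application code would assign to threads.

/* Illustration only: the actual ParSpec is a separate file with its own
 * syntax; the comments indicate which code each split would assign where. */
#define ROWS 64
#define COLS 64

void smooth(int img[ROWS][COLS])
{
    /* Functional split: the horizontal pass goes to one thread and the
     * vertical pass to another, forming a two-stage pipeline.            */
    for (int r = 0; r < ROWS; r++)            /* e.g. thread "hpass" */
        for (int c = 1; c < COLS; c++)
            img[r][c] = (img[r][c] + img[r][c - 1]) / 2;

    for (int c = 0; c < COLS; c++)            /* e.g. thread "vpass" */
        for (int r = 1; r < ROWS; r++)
            img[r][c] = (img[r][c] + img[r - 1][c]) / 2;

    /* Data-level split: alternatively, the iterations of one loop nest are
     * divided among identical threads, e.g. rows 0..31 to one thread and
     * rows 32..63 to another.                                             */
}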
The MPA tool then generates the parallel C code, performing scalar dataflow analysis to insert communication primitives into the generated code. Shared variables have to be explicitly specified by the designer, and the necessary synchronization has to be added using primitives such as LoopSync. The MPA tool targets two execution environments for the generated parallel code: the hardware or a virtual platform when the platform is available, and a High-Level Simulator when it is not yet ready. An execution trace is generated, which the designer analyzes to decide whether the ParSpec needs changes to improve the mapping result.
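To make the effect of the inserted communication concrete, the sketch below shows, with hypothetical fifo_send/fifo_recv primitives rather than the actual MPA runtime API, the kind of code that results when a scalar produced in one thread is consumed in another. The trivial FIFO implementation only serves to keep the sketch self-contained and runnable sequentially.

#include <assert.h>

/* Hypothetical FIFO primitives standing in for the communication API of
 * the generated code; implemented here as a single-channel buffer so the
 * sketch compiles and runs on its own.                                   */
#define FIFO_CAP 128
static int fifo_buf[FIFO_CAP];
static int fifo_head, fifo_tail;

static void fifo_send(int value) { assert(fifo_tail < FIFO_CAP); fifo_buf[fifo_tail++] = value; }
static int  fifo_recv(void)      { assert(fifo_head < fifo_tail); return fifo_buf[fifo_head++]; }

/* Producer thread: 'energy' is a scalar that the dataflow analysis would
 * find live-out of this thread, so a send is inserted after it is set.   */
static void thread_producer(void)
{
    for (int i = 0; i < 100; i++) {
        int energy = i * i;
        fifo_send(energy);          /* inserted communication */
    }
}

/* Consumer thread: the matching receive is inserted before the first use. */
static void thread_consumer(int *out)
{
    for (int i = 0; i < 100; i++) {
        int energy = fifo_recv();   /* inserted communication */
        out[i] = energy / 2;
    }
}

int main(void)
{
    int out[100];
    thread_producer();              /* sequential stand-in for the    */
    thread_consumer(out);           /* concurrent producer/consumer   */
    return out[0] == 0 ? 0 : 1;
}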