in a (separate) concurrent process. With few exceptions such as goto statements
and dynamic memory allocations, any compound statement in C is allowed inside
the thread scope. With minimal code modifications, the sequential C code is quickly turned into a parallel specification of the application. The TCT execution model is a program-driven MIMD (Multiple Instruction Multiple Data) model in which each thread is statically allocated to one processor. The execution model supports multiple simultaneous flows of control, which enables the modeling of different parallelism schemes such as functional pipelining and task parallelism. Three thread synchronization instructions implement the decentralized thread interaction in the generated parallel code. The TCT compiler takes the annotated C source code as input, performs dependency analysis among the threads, and generates parallel code with thread communications inserted to guarantee correct concurrent execution. The TCT MPSoC, which is based on an existing AMBA SoC platform with a homogeneous distributed-memory multiprocessor array, is supported as the code-generation backend: a hardware prototype and a software tool called the TCT trace scheduler are available for the designer to simulate the partitioned application and obtain its performance results.
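As a rough illustration, the following sketch shows how a compound statement in the sequential C code might be enclosed in a thread scope. The THREAD(name) annotation is an assumption made here for illustration only and need not match the actual TCT notation; with the empty macro below, the annotated code still builds and runs sequentially.

/* Sketch only: THREAD(name) is assumed to mark a compound statement as a
 * thread scope that the TCT compiler maps onto its own processor.        */
#define THREAD(name)   /* annotation consumed by the parallelizing compiler */

#define N 256

static int fir(int x)      { return x / 2; }   /* stand-in filter    */
static int quantize(int x) { return x & ~3; }  /* stand-in quantizer */

void process_frame(const int *in, int *out)
{
    int tmp[N];

    THREAD(stage_filter) {        /* candidate first pipeline stage  */
        for (int i = 0; i < N; i++)
            tmp[i] = fir(in[i]);
    }

    THREAD(stage_quant) {         /* candidate second pipeline stage */
        for (int i = 0; i < N; i++)
            out[i] = quantize(tmp[i]);
    }
}

In such a partitioning, the compiler would detect the dependency on tmp between the two thread scopes and insert the corresponding thread communication so that the pipeline executes correctly.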
MPA
The MPSoC Parallelization Assist (MPA) tool [60] was developed at IMEC to help designers map applications onto an embedded multicore platform efficiently and explore the solution space quickly. The starting point of the exploration flow is the sequential C source code of an application. The initial parallelization specification is guided by profiling the application, either with an instruction set simulator (ISS) or through source code instrumentation and execution on a target platform. The key input to the tool is the
Parallelization Specification (ParSpec), which is a separate file from the application
source code and orchestrates the parallelization. The ParSpec is composed of one or
more parallel sections to specify the computational partitioning of the application:
outside a parallel section the code is sequentially executed, whereas inside a parallel
section all code must be assigned to at least one thread. Both functional and data-
level splits can be expressed in the ParSpec. Keeping the ParSpec separate from the
application code allows the designer to try out different mapping strategies easily.
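The concrete ParSpec syntax is not reproduced here; the fragment below only illustrates, on plain C and with hypothetical thread names, what a functional split and a data-level split of the same application code would assign to threads.

/* Illustration only: the actual ParSpec is a separate file with its own
 * syntax; the comments indicate which code each split would assign where. */
#define ROWS 64
#define COLS 64

void smooth(int img[ROWS][COLS])
{
    /* Functional split: the horizontal pass goes to one thread and the
     * vertical pass to another, forming a two-stage pipeline.            */
    for (int r = 0; r < ROWS; r++)            /* e.g. thread "hpass" */
        for (int c = 1; c < COLS; c++)
            img[r][c] = (img[r][c] + img[r][c - 1]) / 2;

    for (int c = 0; c < COLS; c++)            /* e.g. thread "vpass" */
        for (int r = 1; r < ROWS; r++)
            img[r][c] = (img[r][c] + img[r - 1][c]) / 2;

    /* Data-level split: alternatively, the iterations of one loop nest are
     * divided among identical threads, e.g. rows 0..31 to one thread and
     * rows 32..63 to another.                                             */
}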
The MPA tool then generates the parallel C code, performing scalar dataflow analysis to insert communication primitives into the generated code. Shared variables have to be explicitly specified by the designer, and the necessary synchronization has to be added using primitives such as LoopSync. The MPA tool targets two execution environments for the generated parallel code: the hardware or a virtual platform when the platform is available, and a High-Level Simulator when it is not yet ready. An execution trace is generated, which the designer analyzes to decide whether the ParSpec needs changes to improve the mapping result.
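To make the effect of the inserted communication concrete, the sketch below shows, with hypothetical fifo_send/fifo_recv primitives rather than the actual MPA runtime API, the kind of code that results when a scalar produced in one thread is consumed in another. The trivial FIFO implementation only serves to keep the sketch self-contained and runnable sequentially.

#include <assert.h>

/* Hypothetical FIFO primitives standing in for the communication API of
 * the generated code; implemented here as a single-channel buffer so the
 * sketch compiles and runs on its own.                                   */
#define FIFO_CAP 128
static int fifo_buf[FIFO_CAP];
static int fifo_head, fifo_tail;

static void fifo_send(int value) { assert(fifo_tail < FIFO_CAP); fifo_buf[fifo_tail++] = value; }
static int  fifo_recv(void)      { assert(fifo_head < fifo_tail); return fifo_buf[fifo_head++]; }

/* Producer thread: 'energy' is a scalar that the dataflow analysis would
 * find live-out of this thread, so a send is inserted after it is set.   */
static void thread_producer(void)
{
    for (int i = 0; i < 100; i++) {
        int energy = i * i;
        fifo_send(energy);          /* inserted communication */
    }
}

/* Consumer thread: the matching receive is inserted before the first use. */
static void thread_consumer(int *out)
{
    for (int i = 0; i < 100; i++) {
        int energy = fifo_recv();   /* inserted communication */
        out[i] = energy / 2;
    }
}

int main(void)
{
    int out[100];
    thread_producer();              /* sequential stand-in for the    */
    thread_consumer(out);           /* concurrent producer/consumer   */
    return out[0] == 0 ? 0 : 1;
}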