(register allocation). Scheduling refers to the process of organizing the instructions
in a timed sequence. The schedule can be computed statically (for RISC, DSPs
and VLIWs) or dynamically at runtime (for superscalars), whereas the mapping
of operations to instructions is always computed statically. The main purpose of
mapping and scheduling in uni-processor compilers has always been to improve
performance. Code size is also an important objective for embedded (especially
VLIW) processors. Only recently has power consumption become an issue; however,
the reduction in power consumption achievable with back-end techniques has little
impact on the overall system power consumption.
In an MPSoC compiler, similar operations have to be performed. Mapping, in
this context, refers to the process of assigning code blocks to PEs and logical
communication links to physical ones. In contrast to the uni-processor case,
mapping can also be dynamic: a code block could be mapped at runtime to different
PEs, depending on the availability of resources. Scheduling for MPSoCs has a similar
meaning as for uni-processors, but instead of scheduling instructions, the compiler
has to schedule code blocks. The presence of different application classes, e.g. real-time
applications, adds complexity to the optimizations in the compiler. In particular, there is
much more room for improving power consumption in an MPSoC; after all, power
consumption is one of the MPSoC drivers in the first place.
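Dynamic mapping can be pictured as a simple runtime dispatcher that assigns each code block to whichever PE currently has the most available resources. The following sketch illustrates the idea; the PE names, block names and cost values are hypothetical, and real runtime systems use far more elaborate resource models.

```python
# A minimal sketch of dynamic mapping: each incoming code block is
# assigned at runtime to the currently least-loaded PE. Block costs
# (estimated execution times) and PE names are invented for illustration.

def dynamic_map(blocks, pes):
    """Assign each (name, cost) block to the least-loaded PE."""
    load = {pe: 0 for pe in pes}       # accumulated execution time per PE
    mapping = {}
    for name, cost in blocks:
        pe = min(load, key=load.get)   # PE with the most available capacity
        mapping[name] = pe
        load[pe] += cost
    return mapping, load

blocks = [("f1", 4), ("f2", 3), ("f3", 5), ("f4", 2)]
mapping, load = dynamic_map(blocks, ["RISC1", "RISC2", "RISC3"])
print(mapping)  # {'f1': 'RISC1', 'f2': 'RISC2', 'f3': 'RISC3', 'f4': 'RISC2'}
```

A static mapping, by contrast, would fix this assignment at compile time regardless of the actual runtime load.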
The result of scheduling and mapping is typically represented in the form of a Gantt
chart, similar to the ones presented in Fig. 10. The PEs are represented on the
vertical axis and time on the horizontal axis. Code blocks are placed in the plane
according to the mapping and scheduling information. In Fig. 10a, functions f1
and f2 are mapped to processor RISC1, functions f3 and sum are mapped to
processor RISC2, and function f4 to processor RISC3.
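The mapping of Fig. 10a can be encoded directly as a per-PE list of scheduled blocks and rendered as a textual Gantt chart. The start times and durations below are invented purely for illustration; the figure itself does not specify them.

```python
# A sketch of a Gantt-chart representation: each PE holds a list of
# (block, start, duration) entries. Times are hypothetical.

schedule = {
    "RISC1": [("f1", 0, 3), ("f2", 3, 3)],
    "RISC2": [("f3", 0, 4), ("sum", 6, 2)],
    "RISC3": [("f4", 0, 5)],
}

for pe, entries in schedule.items():
    row = " ".join(f"{name}[{s}-{s + d}]" for name, s, d in entries)
    print(f"{pe}: {row}")
```

Running this prints one row per PE, e.g. `RISC1: f1[0-3] f2[3-6]`, which is exactly the information a graphical Gantt chart conveys.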
Given that code blocks have a higher timing variability than instructions, schedul-
ing can rarely be performed statically. A purely static schedule requires full knowledge
of the timing behavior and is only possible for very predictable architectures and
regular computations, as in the case of systolic arrays [48]. If a purely static schedule
cannot be obtained, some kind of synchronization is needed, and different
scheduling approaches require different synchronization schemes with different
associated performance overheads. In the example of the previous section, the timing
information of task T3 is not known precisely, as shown in Fig. 10b. Therefore, the
exact starting time of function sum cannot be determined, and a synchronization
primitive has to be inserted to ensure correctness of the result. In this example, a
simple barrier is enough to ensure that the execution of T3 on processors
RISC3, RISC4 and RISC5 has finished before function sum executes.
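The barrier pattern can be sketched with ordinary threads standing in for the PEs. Here, three workers play the roles of the T3 instances on RISC3, RISC4 and RISC5; the partial computation assigned to each worker is a made-up placeholder.

```python
# A minimal sketch of barrier synchronization: sum may only execute
# after all three T3 instances have produced their partial results.
import threading

results = [0, 0, 0]
barrier = threading.Barrier(4)     # 3 workers + the thread running sum

def t3(worker_id):
    # Placeholder for the real T3 computation on one PE.
    results[worker_id] = (worker_id + 1) ** 2
    barrier.wait()                 # signal that this T3 instance is done

threads = [threading.Thread(target=t3, args=(i,)) for i in range(3)]
for t in threads:
    t.start()

barrier.wait()                     # blocks until every T3 instance finishes
total = sum(results)               # corresponds to function sum in Fig. 10
for t in threads:
    t.join()
print(total)                       # 1 + 4 + 9 = 14
```

Without the barrier, `sum` could read `results` while some worker is still writing, producing an incorrect total; the barrier makes the completion of all T3 instances a precondition for executing `sum`.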
2.4.1 Scheduling Approaches
Which scheduling approach to utilize depends on the characteristics of the applica-
tion and the properties of the underlying MoC used to describe it. Apart from pure
static schedules, one can distinguish among the following scheduling approaches: