Digital Signal Processing Reference
Fig. 9 [figure not reproduced: graph with nodes T1, T2 and T3 and a legend distinguishing control and data edges; caption truncated, ending in "…dependencies"]
defined value will actually be used. With the dependence information, it is possible
to represent the precedence constraints along the execution of the whole program f.
With this partitioning, it is possible to identify two different kinds of parallelism:
T1 and T2 are a good example of TLP, whereas T3 displays DLP. This is a good
example where flow and dependence analysis help determine a partitioning that
exposes coarse-grained parallelism.
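To illustrate how precedence constraints expose task-level parallelism, the following minimal sketch groups tasks into levels such that tasks in the same level have no mutual dependence. The function name parallel_levels and the dependence graph are hypothetical; in particular, the assumption that T3 depends on both T1 and T2 is made only for illustration and is not taken from the figure.

```python
def parallel_levels(deps):
    """Group tasks into levels of mutually independent tasks.

    deps maps each task to the set of tasks it must wait for.
    Tasks that end up in the same level can run concurrently (TLP).
    """
    remaining = {task: set(pre) for task, pre in deps.items()}
    levels = []
    while remaining:
        # A task is ready once all of its predecessors have been scheduled.
        ready = sorted(t for t, pre in remaining.items() if not pre)
        if not ready:
            raise ValueError("unsatisfiable or cyclic precedence constraints")
        levels.append(ready)
        for t in ready:
            del remaining[t]
        for pre in remaining.values():
            pre.difference_update(ready)
    return levels


# Hypothetical graph (an assumption): T1 and T2 are independent, T3 needs both.
deps = {"T1": set(), "T2": set(), "T3": {"T1", "T2"}}
print(parallel_levels(deps))  # [['T1', 'T2'], ['T3']]
```

Under this assumed graph, T1 and T2 land in the same level and can be mapped to different processing elements. The sketch says nothing about DLP inside T3, which would require looking at the iterations of T3 rather than at the task graph.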
Due to the complexity of static analyses, research groups have recently started
to explore dynamic data flow analysis (DDFA). Unlike static
analyses, where dependencies are determined at compile time, DDFA uses traces
obtained from profiling runs. This analysis is of course not sound and cannot be
used directly to generate code. Instead, it provides a coarse measure of
the data flowing among different portions of the application, from which
plausible partitions can be derived and DLP, TLP and/or PLP identified. Because
DDFA is a profile-based technique, its quality depends on a careful selection of
the input stimuli. In interactive programming environments, DDFA can provide hints
to the programmer about where to modify the code to expose more parallelism.
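The idea of measuring data flow from a profiling trace can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of any particular tool: the trace format (region, operation, address), the function name data_flow_from_trace and the sample trace are all hypothetical.

```python
from collections import Counter


def data_flow_from_trace(trace):
    """Count values flowing between code regions in one profiling trace.

    trace is an iterable of (region, op, address) tuples, with op being
    "w" for a write and "r" for a read (an assumed, simplified format).
    """
    last_writer = {}   # address -> region that last wrote it
    flow = Counter()   # (producer, consumer) -> number of observed reads
    for region, op, addr in trace:
        if op == "w":
            last_writer[addr] = region
        elif op == "r":
            producer = last_writer.get(addr)
            # Only count flow that crosses a region boundary.
            if producer is not None and producer != region:
                flow[(producer, region)] += 1
    return flow


# Hypothetical trace of a single run; a different input may yield other edges.
trace = [
    ("T1", "w", 0x10), ("T2", "w", 0x14),
    ("T3", "r", 0x10), ("T3", "r", 0x14),
    ("T3", "w", 0x18), ("T3", "r", 0x18),
]
flow = data_flow_from_trace(trace)
print(dict(flow))  # {('T1', 'T3'): 1, ('T2', 'T3'): 1}
```

In this sample trace no data flows between T1 and T2, which hints that they are candidates for concurrent execution. The hint holds only for the observed input, which is why the result is not sound and depends on the chosen stimuli.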
Summary
Traditional compilers work at the basic block granularity, which is well suited for
ILP. MPSoC compilers, in turn, need to be equipped with powerful flow analysis
techniques that allow them to partition the application at a suitable granularity.