Digital Signal Processing Reference
MRRG-based compiler techniques are easily retargetable to a wide range of
architectures, such as those of the ADRES template, and they can support many
programming languages. Different architectures can simply be modeled with different
MRRGs. It has even been demonstrated that by using the appropriate modulo
constraints during the mapping of a DDG on an MRRG, compilers can generate a
single code version that can be executed on CGRAs of different sizes [58]. This
is particularly interesting for the PPA architecture, which can switch dynamically
between different array sizes [56]. To support different programming languages
like C and Fortran, the techniques only require a compiler front-end that is able to
generate DDGs for the loop bodies. Obviously, the appropriate loop transformations
need to be applied before generating the DDG in order to generate one that maps
well onto the MRRG of the architecture. Such loop transformations are discussed in
detail in Sect. 4.1.
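As a rough illustration of the modulo constraint mentioned above, the following Python sketch models an MRRG as a table of resource slots indexed by (cycle mod II, functional unit). The class `ModuloReservationTable`, the method `try_reserve` and the initiation-interval parameter `ii` are hypothetical names chosen for this example; they are not taken from any of the cited compilers, and routing resources are left out entirely.

```python
class ModuloReservationTable:
    """Toy stand-in for an MRRG: one slot per (cycle mod II, functional unit)."""

    def __init__(self, ii, num_units):
        self.ii = ii                  # initiation interval (hypothetical parameter)
        self.num_units = num_units
        self.slots = {}               # (cycle % ii, unit) -> operation name

    def try_reserve(self, op, cycle, unit):
        """Reserve `unit` for `op` at `cycle`; fail if the modulo slot is taken."""
        if not 0 <= unit < self.num_units:
            raise ValueError("no such functional unit")
        key = (cycle % self.ii, unit)
        if key in self.slots:
            return False
        self.slots[key] = op
        return True


table = ModuloReservationTable(ii=2, num_units=2)
results = [
    table.try_reserve("load", cycle=0, unit=0),
    table.try_reserve("mul", cycle=1, unit=0),
    table.try_reserve("add", cycle=2, unit=0),   # 2 % 2 == 0: collides with "load"
    table.try_reserve("add", cycle=2, unit=1),   # a different unit is still free
]
print(results)  # [True, True, False, True]
```

Because successive loop iterations start every `ii` cycles, an operation placed at cycle 2 competes with one placed at cycle 0 when `ii` is 2, which is why the third reservation is rejected.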
The aforementioned algorithms have been extended to consider not only the costs
of utilized resources inside the CGRA during scheduling, but also the bank
conflicts that may occur when multiple memory accesses are scheduled in
the same cycle [34, 35].
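A minimal sketch of how such a bank-conflict term could be computed for a candidate schedule is given below. It assumes word-interleaved memory banks (bank = address mod number of banks) and a hypothetical function name `bank_conflict_cost`; the cost models actually used in the cited work may differ.

```python
from collections import Counter


def bank_conflict_cost(accesses, num_banks):
    """Count extra accesses that hit an already-used bank in the same cycle.

    `accesses` is a list of (cycle, address) pairs. Banks are assumed to be
    word-interleaved, i.e. bank = address % num_banks (an assumption).
    """
    per_slot = Counter((cycle, address % num_banks) for cycle, address in accesses)
    return sum(count - 1 for count in per_slot.values())


# Cycle 0: addresses 0 and 4 both map to bank 0, address 1 maps to bank 1.
# Cycle 1: addresses 4 and 5 map to banks 0 and 1, so there is no conflict.
schedule = [(0, 0), (0, 4), (0, 1), (1, 4), (1, 5)]
cost = bank_conflict_cost(schedule, num_banks=4)
print(cost)  # 1
```

A scheduler could add this value to its resource cost when comparing candidate placements of memory operations.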
Many other CGRA compiler techniques have been proposed, most of which are
restricted to specific architectures. Static reconfigurable architectures like RaPiD
and PACT have been targeted by compiler algorithms [14, 22, 76] based on
placement and routing techniques that also map DDGs on RRGs. These techniques
support subsets of the C programming language (no pointers, no structs, ...) and
require the use of special C functions to program the IO in the loop bodies to be
mapped onto the CGRA. The latter requirement follows from the specific IO support
in the architectures and the modeling thereof in the RRGs.
For the MorphoSys architecture, with its emphasis on SIMD across ISs, compiler
techniques have been developed for the SA-C language [73]. In this language the
supported types of available parallelism are specified by means of loop language
constructs. These constructs are translated into control code for the CGRA, which
is mapped onto the ISs together with the DDGs of the loop bodies.
CGRA code generation techniques based on integer linear programming (ILP) have
been proposed for several architectures, both for spatial [2] and for temporal
mapping [41, 77]. Basically, the ILP formulation consists of all the requirements or
constraints that must be met by a valid schedule. This formulation is built from a
DDG and a hardware description, and can hence be used to compile many source
languages. It is unclear, however, to what extent the ILP formulation and its solution
rely on specific architecture features, and hence to what extent it would be possible
to retarget the ILP formulation to different CGRA designs. A similar situation
occurs for the constraint-based compilation method developed for the Silicon Hive
architecture template [67], of which no detailed information is public. Furthermore,
ILP-based compilation is known to be unreasonably slow, so in practice it can only
be used for small loop kernels.
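To make the notion of "requirements that must be met by a valid schedule" more concrete, the sketch below checks two constraint families that a modulo-scheduling ILP would typically encode: dependence constraints derived from the DDG and resource constraints derived from the hardware description. It only validates a given schedule; it does not solve an ILP. The function `is_valid_schedule`, its parameters and the example operations are hypothetical, and routing constraints, which matter on real CGRAs, are deliberately omitted.

```python
from collections import Counter


def is_valid_schedule(start, edges, latency, ii, num_units):
    """Check dependence and resource constraints of a modulo schedule.

    start:   op -> issue cycle
    edges:   list of (producer, consumer, iteration_distance) from the DDG
    latency: op -> cycles until the result is available
    """
    # Dependence constraints: a consumer may not issue before its operand
    # is ready; loop-carried edges gain `ii` cycles per iteration of distance.
    for src, dst, distance in edges:
        if start[dst] + ii * distance < start[src] + latency[src]:
            return False
    # Resource constraints: at most `num_units` operations per modulo slot.
    usage = Counter(cycle % ii for cycle in start.values())
    return all(count <= num_units for count in usage.values())


edges = [("load", "mul", 0), ("mul", "add", 0), ("add", "add", 1)]
latency = {"load": 2, "mul": 1, "add": 1}
good = {"load": 0, "mul": 2, "add": 3}
bad = {"load": 0, "mul": 1, "add": 3}   # "mul" issues before "load" completes

print(is_valid_schedule(good, edges, latency, ii=2, num_units=2))  # True
print(is_valid_schedule(bad, edges, latency, ii=2, num_units=2))   # False
```

An ILP solver searches over all assignments of issue cycles (and, for CGRAs, placements and routes) that satisfy such constraints, which gives some intuition for why solving time grows quickly with the size of the loop kernel.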
Code generation techniques for CGRAs based on instruction-selection pattern
matching and list-scheduling techniques have also been proposed [26, 27]. It is
unclear to what extent these techniques rely on a specific architecture because we
...