C Compilers and Code Optimization for DSPs - Signal Processing Systems

Digital Signal Processing Reference

In-Depth Information

3.1

Address Code Optimizations

Typical code contains a large number of instruction for the computation of addresses

of data stored in memory. These auxiliary computations are necessary, but compete

for processing resources with the computations on actual user data. Hence, address

code optimizations have been developed that aim at making address computations

more efficient and, thus, increasing overall performance. In the following two

paragraphs we discuss two of these optimizations that support the compiler in

making better use of the address generation units common to many DSPs.

3.1.1

Pointer-to-Array Conversion

Many DSP architectures have specialized Address Generation Units (AGUs)

(see [ 45 ] ), but early compilers were unable to generate efficient code for them,

especially in programs containing explicit array references. Programmers, therefore,

used pointer-based accesses and pointer arithmetic within their programs in order to

give “hints” to the early compiler on how and when to use post-increment/decrement

addressing modes available in AGUs. For instance, consider example in Fig. 1 a ,

a kernel loop of the DSPstone benchmark matrix2 . Here the pointer accesses

“encourage” the compiler to utilize the post-increment address modes of the AGU

of a DSP.

If, however, further analysis and optimization is needed before code generation,

then such a formulation is problematic as such techniques often rely on explicit

array index representations and cannot cope with pointer references. In order to

maintain semantic correctness compilers use conservative optimization strategies,

i.e. many possible array access optimizations are not applied in the presence of

pointers. Obviously, this limits the maximal performance of the produced code. It

is highly desirable to overcome this drawback, without adversely affecting AGU

utilization.

Although general array access and pointer analysis are without further restric-

tions equally powerful and intractable, it is easier to find suitable restrictions of the

array data dependence problem while keeping the resulting algorithm applicable

to real-world programs. Furthermore, as array-based analysis is more mature than

pointer-based analysis within available commercial compilers, programs containing

arrays rather than pointers are more likely to be efficiently implemented.

Pointer-to-Array Conversions [ 15 ] collects information from pointer-based code

in order to regenerate the original accesses with explicit indexes that are suitable

for further analyses. Example 1 ( b) shows the loop with explicit array indexes that is

semantically equivalent to the previous loop in Example 1 ( a). Not only it is easier to

read and understand for a human reader, but it is amendable to existing array data-

flow analyses. The data-flow information collected by these analyses can be used for

e.g. redundant load/store eliminations, software-pipelining and loop parallelization.

Search WWH ::

Custom Search

Home