Instruction Set Principles - Computer Architecture: A Quantitative Approach

Hardware Reference

In-Depth Information

optimization finds two instances of an expression that compute the same value and saves the

value of the first computation in a temporary. It then uses the temporary value, eliminating

the second computation of the common expression.

For this optimization to be significant, the temporary must be allocated to a register. Other-

wise, the cost of storing the temporary in memory and later reloading it may negate the sav-

ings gained by not recomputing the expression. There are, in fact, cases where this optimiz-

ation actually slows down code when the temporary is not register allocated. Phase ordering

complicates this problem because register allocation is typically done near the end of the glob-

al optimization pass, just before code generation. Thus, an optimizer that performs this optim-

ization must assume that the register allocator will allocate the temporary to a register.

Optimizations performed by modern compilers can be classified by the style of the trans-

formation, as follows:

■ High-level optimizations are often done on the source with output fed to later optimization

passes.

■ Local optimizations optimize code only within a straight-line code fragment (called a basic

block by compiler people).

■ Global optimizations extend the local optimizations across branches and introduce a set of

transformations aimed at optimizing loops.

■ Register allocation associates registers with operands.

■ Processor-dependent optimizations atempt to take advantage of speciic architectural know-

ledge.

Register Allocation

Because of the central role that register allocation plays, both in speeding up the code and in

making other optimizations useful, it is one of the most important—if not the most import-

ant—of the optimizations. Register allocation algorithms today are based on a technique called

graph coloring . The basic idea behind graph coloring is to construct a graph representing the

possible candidates for allocation to a register and then to use the graph to allocate registers.

Roughly speaking, the problem is how to use a limited set of colors so that no two adjacent

nodes in a dependency graph have the same color. The emphasis in the approach is to achieve

100% register allocation of active variables. The problem of coloring a graph in general can

take exponential time as a function of the size of the graph (NP-complete). There are heuristic

algorithms, however, that work well in practice, yielding close allocations that run in near-lin-

ear time.

Graph coloring works best when there are at least 16 (and preferably more) general-purpose

registers available for global allocation for integer variables and additional registers for loat-

ing point. Unfortunately, graph coloring does not work very well when the number of re-

gisters is small because the heuristic algorithms for coloring the graph are likely to fail.

Impact Of Optimizations On Performance

It is sometimes difficult to separate some of the simpler optimizations—local and processor-

dependent optimizations—from transformations done in the code generator. Examples of typ-

ical optimizations are given in Figure A.20 . The last column of Figure A.20 indicates the fre-

quency with which the listed optimizing transforms were applied to the source program.

Search WWH ::

Custom Search

Home