Hardware Reference
In-Depth Information
optimization finds two instances of an expression that compute the same value and saves the
value of the first computation in a temporary. It then uses the temporary value, eliminating
the second computation of the common expression.
For this optimization to be significant, the temporary must be allocated to a register. Other-
wise, the cost of storing the temporary in memory and later reloading it may negate the sav-
ings gained by not recomputing the expression. There are, in fact, cases where this optimiz-
ation actually slows down code when the temporary is not register allocated. Phase ordering
complicates this problem because register allocation is typically done near the end of the glob-
al optimization pass, just before code generation. Thus, an optimizer that performs this optim-
ization must assume that the register allocator will allocate the temporary to a register.
Optimizations performed by modern compilers can be classified by the style of the trans-
formation, as follows:
High-level optimizations are often done on the source with output fed to later optimization
passes.
Local optimizations optimize code only within a straight-line code fragment (called a basic
block by compiler people).
Global optimizations extend the local optimizations across branches and introduce a set of
transformations aimed at optimizing loops.
Register allocation associates registers with operands.
Processor-dependent optimizations atempt to take advantage of speciic architectural know-
ledge.
Register Allocation
Because of the central role that register allocation plays, both in speeding up the code and in
making other optimizations useful, it is one of the most important—if not the most import-
ant—of the optimizations. Register allocation algorithms today are based on a technique called
graph coloring . The basic idea behind graph coloring is to construct a graph representing the
possible candidates for allocation to a register and then to use the graph to allocate registers.
Roughly speaking, the problem is how to use a limited set of colors so that no two adjacent
nodes in a dependency graph have the same color. The emphasis in the approach is to achieve
100% register allocation of active variables. The problem of coloring a graph in general can
take exponential time as a function of the size of the graph (NP-complete). There are heuristic
algorithms, however, that work well in practice, yielding close allocations that run in near-lin-
ear time.
Graph coloring works best when there are at least 16 (and preferably more) general-purpose
registers available for global allocation for integer variables and additional registers for loat-
ing point. Unfortunately, graph coloring does not work very well when the number of re-
gisters is small because the heuristic algorithms for coloring the graph are likely to fail.
Impact Of Optimizations On Performance
It is sometimes difficult to separate some of the simpler optimizations—local and processor-
dependent optimizations—from transformations done in the code generator. Examples of typ-
ical optimizations are given in Figure A.20 . The last column of Figure A.20 indicates the fre-
quency with which the listed optimizing transforms were applied to the source program.
Search WWH ::




Custom Search