Java Reference
In-Depth Information
An approach based on bottom-up rewriting systems (BURS) [PLG88]
theory allows very fast instruction selectors (and code generators) to be built.
Code generators built using BURS theory can be extremely fast because all
dynamic programming is done in advance when a special BURS automaton is
built. During compilation, it is only necessary to make two traversals of the
IR tree: one bottom-up traversal to label each node with a state that encodes
all optimal matches and a second top-down traversal that uses these states
to select and generate code. It has been reported that careful encodings can
produce an automaton that executes fewer than 90 RISC instructions per node
to do both traversals.
The automaton that labels the tree is a simple finite state machine, similar
to that used in shift-reduce parsers (Chapter 6). A bottom-up walk of the tree
is performed, and the label for any given node is determined by a table lookup
given the operator at the node and the states that label each of its children. The
automaton that emits code is equally simple in design. The code to be emitted
is determined by the state that labels a node and by the nonterminal to which
that node should be reduced—another table lookup.
As an example, the instruction patterns of Figure 13.27 would all be given
acostof1exceptformul, which would be given a cost of 3. This is because
mul is actually implemented by the MIPS assembler using three hardware
instructions, whereas all the other instructions are implemented using a single
instruction. Returning to the example of Figure 13.26, all the leaves would
be labeled with a state indicating that no reductions of individual leaf nodes
are possible. Visiting i's parent with its state, the fetch would be labeled with
a state indicting that application of an lw pattern is possible, at a cost of 1.
Going in turn to its parent (a * operator), the state reached would show that,
although two reductions are possible (patterns for both mul and sll match),
the two reductions are not of equal cost. sll is cheaper and will apply. That
is, the instruction selector has recognized a well-known trick: multiplication
by a power of two can often be implemented more e
ciently by doing a left
shift rather than an explicit multiply.
The automaton continues labeling the rest of the nodes; the remaining
matches are identical to those illustrated in Figure 13.28. The state labeling the
root tells us that the final instruction to be generated (to implement the assign-
ment) will be an sw. The two subtrees are visited to generate the instructions
needed to implement them. We therefore generate the root's instruction after
returning from recursive visits to both children, guaranteeing that the store's
operands are computed prior to its execution. We generate the code shown in
Figure 13.30.
Two di
ciently
generating the states and state transition tables (because all potential dynamic
programming decisions are done at table-generation time, they must be done
e
culties arise in creating a BURS-style code generator: e
ciently), and creating an e
cient encoding of the automata for use in a com-
 
Search WWH ::




Custom Search