Target Code Generation - Crafting a Compiler

Java Reference

In-Depth Information

An approach based on bottom-up rewriting systems (BURS) [PLG88]

theory allows very fast instruction selectors (and code generators) to be built.

Code generators built using BURS theory can be extremely fast because all

dynamic programming is done in advance when a special BURS automaton is

built. During compilation, it is only necessary to make two traversals of the

IR tree: one bottom-up traversal to label each node with a state that encodes

all optimal matches and a second top-down traversal that uses these states

to select and generate code. It has been reported that careful encodings can

produce an automaton that executes fewer than 90 RISC instructions per node

to do both traversals.

The automaton that labels the tree is a simple finite state machine, similar

to that used in shift-reduce parsers (Chapter 6). A bottom-up walk of the tree

is performed, and the label for any given node is determined by a table lookup

given the operator at the node and the states that label each of its children. The

automaton that emits code is equally simple in design. The code to be emitted

is determined by the state that labels a node and by the nonterminal to which

that node should be reduced—another table lookup.

As an example, the instruction patterns of Figure 13.27 would all be given

acostof1exceptformul, which would be given a cost of 3. This is because

mul is actually implemented by the MIPS assembler using three hardware

instructions, whereas all the other instructions are implemented using a single

instruction. Returning to the example of Figure 13.26, all the leaves would

be labeled with a state indicating that no reductions of individual leaf nodes

are possible. Visiting i's parent with its state, the fetch would be labeled with

a state indicting that application of an lw pattern is possible, at a cost of 1.

Going in turn to its parent (a * operator), the state reached would show that,

although two reductions are possible (patterns for both mul and sll match),

the two reductions are not of equal cost. sll is cheaper and will apply. That

is, the instruction selector has recognized a well-known trick: multiplication

by a power of two can often be implemented more e

ciently by doing a left

shift rather than an explicit multiply.

The automaton continues labeling the rest of the nodes; the remaining

matches are identical to those illustrated in Figure 13.28. The state labeling the

root tells us that the final instruction to be generated (to implement the assign-

ment) will be an sw. The two subtrees are visited to generate the instructions

needed to implement them. We therefore generate the root's instruction after

returning from recursive visits to both children, guaranteeing that the store's

operands are computed prior to its execution. We generate the code shown in

Figure 13.30.

Two di

ciently

generating the states and state transition tables (because all potential dynamic

programming decisions are done at table-generation time, they must be done

e

culties arise in creating a BURS-style code generator: e

ciently), and creating an e

cient encoding of the automata for use in a com-

Crafting a Compiler

Search WWH ::

Custom Search

Home