Java Reference
In-Depth Information
13.5 Automatic Instruction Selection
An important aspect of code generation is instruction selection .Afteratrans-
lation for a particular construct is determined, the machine-level instructions
that implement the translation must be chosen. Thus, we may decide to im-
plement a switch statement using a jump table. If so, instructions that index
into the jump table and execute an indirect jump must be generated.
Often several di
erent instruction sequences can implement a particular
translation. Even something as simple as a+1 can be implemented by load-
ing 1 into a register and generating an add instruction, or by generating an
increment or add-immediate instruction. We normally want the smallest or
fastest instruction sequence. Thus, an add-immediate instruction is preferred
because it avoids an explicit load.
In simple RISC architectures, the choice of potential instruction sequences
is limited because almost all operands must be loaded into registers before
they can be used (immediate operands being a notable exception). Further,
the variety of addressing modes provided is also spartan; often only absolute
and indexed addresses are allowed.
Older architectures, like the Motorola 680x0 and Intel x86, are much more
elaborate. Many di
ff
erent operation codes are provided, and a wide variety of
addressing modes are available. Operands do not always need to be loaded
into registers; addressing modes can fetch operands indirectly and can incre-
ment and decrement registers. Di
ff
ff
erent register classes (e.g., address registers
and data registers) are used in di
erent instructions (in a non-interchangeable
manner) and particular registers are sometimes “wired into” certain instruc-
tions. The JVM also has a relatively elaborate instruction set because its design
was based on achieving compact code sequences. As a result, there are several
ways to increment the contents of a register, but one sequence of bytecode
instructions achieves that e
ff
ect most compactly.
For very complex architectures, amethod of systematizing and automating
instruction selection is vital. Even for simpler architectures, itmay be necessary
to “extend” a code generator when a successor architecture is introduced. Very
ambitious compilersmay aimto compile intomore than one target architecture,
mandating alternative instruction sequences for di
ff
erent target machines.
Instruction selection is often simplified by translating source language con-
structs into a very low-level tree-structured intermediate representation (IR).
In this IR, leaves represent registers, memory locations, or literal values,
and internal nodes represent basic operations on operand values. Detailed
data access patterns and manipulations are exposed. Consider the statement
b[i]=a+1,whereb is an array of integers, i is a global integer variable, and a is
a local variable accessed through the frame register, $fp. The statement's tree-
structured IR is shown in Figure 13.26. This representation is very similar to
ff
 
 
Search WWH ::




Custom Search