2.23 to 2.65 MIPS/MHz. As a result, the SH-X4 achieved 1,717 MIPS at 648 MHz.
The 648 MHz is only slightly higher than the 600 MHz of the SH-X3, but the SH-X4
achieved it in a low-power process. As a result, the typical power consumption
is 106 mW, and the power efficiency reaches 16 GIPS/W.
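The quoted figures can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative.

```python
# Figures quoted in the text above.
mips_per_mhz = 2.65   # architectural performance of the SH-X4
freq_mhz = 648        # operating frequency in MHz
power_mw = 106        # typical power consumption in mW

# Performance is the per-MHz figure multiplied by the clock frequency.
mips = mips_per_mhz * freq_mhz

# Power efficiency: GIPS (MIPS / 1000) divided by watts (mW / 1000).
gips_per_watt = (mips / 1000) / (power_mw / 1000)

print(f"{mips:.0f} MIPS, {gips_per_watt:.1f} GIPS/W")  # 1717 MIPS, 16.2 GIPS/W
```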
3.1.8.2 Efficient ISA Extension
The 16-bit fixed-length ISA of the SH cores is an excellent feature that enables a
higher code density than the 32-bit fixed-length ISAs of conventional RISCs. However,
we made some trade-offs to establish the 16-bit ISA. Operand fields were carefully
shortened to fit the instructions into 16 bits, based on code analysis of
typical embedded programs in the early 1990s. The 16-bit ISA was the best choice
at that time and for the following two decades. However, the required performance
grew steadily, and program sizes and the data to be handled grew larger and larger.
Therefore, we decided to extend the ISA with some prefix codes.
The weak points of the 16-bit ISA are (1) short immediate operands, (2) the lack of
three-operand operation instructions, and (3) implicit fixed-register operands. With
short immediates, a long immediate requires a two-instruction sequence, a
long-immediate load followed by an instruction that uses the loaded data, instead of
a single long-immediate instruction. A three-operand operation becomes a
two-instruction sequence of a move instruction and a two-operand instruction. The
implicit fixed-register operands make register allocation difficult and inefficient.
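To make the first weak point concrete, the sketch below counts the instructions needed to add a constant to a register. The 8-bit signed immediate width is an assumption chosen for illustration and is not stated in the text; the function names are hypothetical.

```python
def fits_signed(value: int, bits: int) -> bool:
    """Return True if value is representable in a signed field of `bits` bits."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return lo <= value <= hi


def add_immediate_cost(value: int, imm_bits: int = 8) -> int:
    """Instruction count for 'Rn += value' (imm_bits is an assumed field width)."""
    if fits_signed(value, imm_bits):
        return 1  # a single add-immediate instruction suffices
    return 2      # long-immediate load into a register, then a register add


for constant in (100, -128, 128, 1000):
    print(constant, add_immediate_cost(constant))
```

Under this assumption, any constant outside the range -128 to 127 costs an extra instruction and an extra issue slot, which is the overhead the extension aims to remove.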
A popular way to extend a 16-bit ISA is to make it variable-length. For example,
IA-32 is a well-known variable-length ISA, and ARM Thumb-2 is a variable-length ISA
of 16 and 32 bits. However, a variable-length instruction consists of multiple
unit-length codes, and each unit-length code can have multiple meanings depending on
the preceding codes. Therefore, a variable-length ISA requires serial code analysis,
which makes parallel issue complicated, large, and slow.
Another way is to use prefix codes. IA-32 uses some prefixes as well as
variable-length instructions, so prefix codes are a conventional technique.
However, if we use prefix codes without variable-length instructions, we
can implement parallel instruction decoding easily. The SH-X4 introduced some
16-bit prefix codes to extend the 16-bit fixed-length ISA.
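The decoding advantage can be sketched as follows. Because every code is 16 bits wide, each unit can be classified as "prefix" or "not prefix" from its own bits alone, so all units in a fetch group can be classified independently before they are paired. The prefix encoding used here (top four bits equal to 0xF) and the sample values are hypothetical placeholders, not the actual SH-X4 encoding, and the sketch assumes a prefix is never followed by another prefix.

```python
def is_prefix(unit: int) -> bool:
    """Hypothetical rule: a unit is a prefix if its top 4 bits are 0xF."""
    return (unit >> 12) == 0xF


def group_instructions(units):
    """Pair each prefix with the 16-bit code that follows it."""
    # Step 1: classify every unit on its own; no unit needs its neighbors,
    # so hardware could evaluate all of these checks in parallel.
    flags = [is_prefix(u) for u in units]

    # Step 2: form instructions from the precomputed flags.
    instructions = []
    i = 0
    while i < len(units):
        if flags[i] and i + 1 < len(units):
            instructions.append((units[i], units[i + 1]))  # prefix + base code
            i += 2
        else:
            instructions.append((units[i],))               # plain 16-bit code
            i += 1
    return instructions


fetch_group = [0x1234, 0xF012, 0x2345, 0x3456]  # arbitrary sample values
print(group_instructions(fetch_group))
```

In a variable-length ISA, by contrast, the meaning of a unit depends on the units before it, so step 1 cannot be done independently per unit.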
Figure 3.42 shows some examples of the ISA extension. The first example is the
operation “Rc = Ra + Rb (Ra, Rb, Rc: registers),” which requires a two-instruction
sequence of “MOV Ra, Rc (Rc = Ra)” and “ADD Rb, Rc (Rc += Rb)” before the extension,
but only one instruction “ADD Ra, Rb, Rc” after the extension. The new instruction is
made from “ADD Ra, Rb” by adding a prefix that changes the destination register
operand from Rb to a new register operand Rc. The code sizes are the same, but the
number of issue slots is reduced from two to one. The next instruction can then be
issued simultaneously if there is no other pipeline stall factor.
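The first example can be modeled with a small register-file simulation. This is an illustrative sketch, not the SH-X4 implementation: the register names and the function names are hypothetical, and the sizes simply count 16-bit codes as described above.

```python
def run_before(regs, ra, rb, rc):
    """Two-instruction sequence before the extension."""
    regs = dict(regs)
    regs[rc] = regs[ra]    # MOV Ra, Rc  (Rc = Ra)
    regs[rc] += regs[rb]   # ADD Rb, Rc  (Rc += Rb)
    return regs


def run_after(regs, ra, rb, rc):
    """Single prefixed instruction after the extension."""
    regs = dict(regs)
    regs[rc] = regs[ra] + regs[rb]   # ADD Ra, Rb, Rc  (Rc = Ra + Rb)
    return regs


# (number of 16-bit codes, number of issue slots) for each form
before_cost = (2, 2)   # MOV + ADD: two codes, two issue slots
after_cost = (2, 1)    # prefix + ADD: two codes, one issue slot

regs = {"R1": 5, "R2": 7, "R3": 0}
assert run_before(regs, "R1", "R2", "R3") == run_after(regs, "R1", "R2", "R3")
print(run_after(regs, "R1", "R2", "R3")["R3"])  # 12
```

Both forms occupy 32 bits of code, so the gain is in issue bandwidth rather than code size.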
The second example is the operation “Rc = @(Ra + Rb),” which requires a
two-instruction sequence of “MOV Rb, R0 (R0 = Rb)” and “MOV.L @(Ra, R0), Rc
(Rc = @(Ra + R0))” before the extension, but only one instruction “MOV.L @(Ra, Rb),