Highly Efficient Instruction Set Architecture
Since the beginning of the RISC architecture, all RISC processors had adopted a
32-bit fixed-length instruction set architecture (ISA). However, such a RISC ISA
required larger code than a conventional CISC (complex instruction set
computer) ISA. Larger program memories and instruction caches were needed to
hold that code, and efficiency decreased. The SH architecture, with its 16-bit
fixed-length ISA, was defined in this situation to achieve compact code. The
16-bit fixed-length approach later spread to the ISAs of other processors, such
as ARM Thumb and MIPS16.
On the other hand, a CISC ISA is variable length so that it can define instructions
of various complexities, from simple to complicated ones. Variable length is
good for achieving compact code, but it is not suitable for decoding multiple
instructions in parallel for superscalar issue, because the start of each
instruction is known only after the lengths of the preceding ones have been
determined. Therefore, the 16-bit fixed-length ISA is good for both compact
code and superscalar architecture.
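
The decoding argument above can be illustrated with a minimal Python sketch. The function names and the toy variable-length encoding (the low two bits of the first byte give the instruction length minus one) are assumptions made up for this example and do not correspond to any real ISA. The sketch only shows where the instruction boundaries come from in each case.

```python
def fixed_length_starts(code, width=2):
    """Start offsets for a fixed-length ISA (width in bytes, 2 = 16 bits).

    Every offset is a multiple of the width, so all of them can be
    computed independently, which is what parallel decoders rely on.
    """
    return list(range(0, len(code) - width + 1, width))


def toy_length(code, pc):
    """Hypothetical encoding: low two bits of the first byte = length - 1."""
    return (code[pc] & 0x3) + 1


def variable_length_starts(code, length_of):
    """Start offsets for a variable-length ISA.

    Each start depends on the length of the previous instruction,
    so the boundaries have to be found one after another.
    """
    starts = []
    pc = 0
    while pc < len(code):
        starts.append(pc)
        pc += length_of(code, pc)
    return starts


if __name__ == "__main__":
    print(fixed_length_starts(bytes(8)))
    stream = bytes([0x01, 0xFF, 0x00, 0x02, 0xAA, 0xBB])
    print(variable_length_starts(stream, toy_length))
```

In the fixed-length case a second decoder can simply be wired to offset 2, while in the variable-length case it has to wait for, or speculate on, the length of the first instruction.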
As always, the selection has pros and cons, and the 16-bit fixed-length ISA has
some drawbacks: the restricted number of operands and the short literal length in
the code. For example, a binary operation overwrites one of its operands, so an
extra data transfer instruction is necessary if the original value of that operand
must be kept. A literal load instruction is necessary to use a literal longer than
the one that fits in an instruction. Further, some instructions use an implicitly
defined register, which increases the number of operands without an extra operand
field, but requires special treatment to identify it and spoils the orthogonality
of register number decoding. Therefore, careful implementation is necessary to
handle such special features.
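
A small Python sketch may make the first two restrictions concrete. It assumes a hypothetical two-operand form "op src,dst" meaning dst = dst op src, and an 8-bit signed immediate field. The helper names, the pseudo-assembly strings, and the immediate width are illustrative assumptions, not details taken from the text.

```python
def lower_to_two_operand(dst, op, a, b):
    """Lower 'dst = a <op> b' to destructive two-operand pseudo-assembly.

    Assumed form: '<op> src,dst' computes dst = dst <op> src.
    """
    if dst == a:
        # The first source may be overwritten: one instruction is enough.
        return [f"{op} {b},{dst}"]
    if dst == b:
        # Copying a into dst would destroy b; a temporary would be needed.
        raise ValueError("dst aliases the second source operand")
    # The first source must be kept: copy it, then operate on the copy.
    return [f"mov {a},{dst}", f"{op} {b},{dst}"]


def needs_literal_load(value, imm_bits=8):
    """True if value does not fit a signed immediate of imm_bits bits (assumed width)."""
    low = -(1 << (imm_bits - 1))
    high = (1 << (imm_bits - 1)) - 1
    return not (low <= value <= high)


if __name__ == "__main__":
    print(lower_to_two_operand("r2", "add", "r0", "r1"))
    print(lower_to_two_operand("r0", "add", "r0", "r1"))
    print(needs_literal_load(100), needs_literal_load(1000))
```

The extra "mov" in the first case is the data transfer instruction mentioned above, and a value that fails the range check would have to be fetched with a separate literal load.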
Since conventional superscalar processors gave priority to performance, superscalar
architecture was considered inefficient, and scalar architecture was still
popular for embedded processors. However, this is not always true. For the SH-4
design, the superscalar architecture was tuned by selecting a microarchitecture
appropriate for an embedded processor, with serious consideration for efficiency.
Table 3.1 summarizes the selected microarchitecture.
First, dual-issue superscalar architecture was chosen because it was difficult
for a general-purpose program to make effective use of the simultaneous issue of
more than two instructions. Then, in-order issue architecture was chosen even
though out-of-order issue architecture was popular for high-end processors. This
was because the performance enhancement was not enough to compensate for the
hardware increase required by out-of-order issue. The in-order dual-issue
architecture could maintain the efficiency of the conventional scalar-issue one.
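
The point about issue width can be illustrated with a toy in-order issue model in Python. This is a rough sketch under strong assumptions: single-cycle latency, no structural hazards, and an instruction may not issue in the same cycle as an earlier instruction that writes one of its sources. The function name and the five-instruction program are hypothetical and are not a measurement of any real processor.

```python
def cycles_in_order(instrs, width):
    """Count issue cycles for in-order issue of up to `width` instructions per cycle.

    instrs is a list of (dest, sources) tuples in program order.
    """
    cycles = 0
    i = 0
    while i < len(instrs):
        written = set()   # registers written by instructions issued this cycle
        issued = 0
        while i < len(instrs) and issued < width:
            dest, srcs = instrs[i]
            if any(s in written for s in srcs):
                break     # depends on a same-cycle result: wait for the next cycle
            written.add(dest)
            issued += 1
            i += 1
        cycles += 1
    return cycles


# Hypothetical dependency pattern: short chains, as is common in general-purpose code.
PROGRAM = [
    ("r1", ("r0",)),
    ("r2", ("r1",)),
    ("r3", ("r0",)),
    ("r4", ("r3",)),
    ("r5", ("r2", "r4")),
]

if __name__ == "__main__":
    for width in (1, 2, 4):
        print(width, cycles_in_order(PROGRAM, width))
```

For this particular program, widening the issue from two to four instructions per cycle does not reduce the cycle count in the model, because the dependencies rarely leave more than two independent instructions adjacent in program order.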
Further, asymmetric superscalar architecture was chosen to duplicate as few
resources as possible, minimizing the overhead and maximizing the efficiency. The symmetric
architecture was not chosen, because it required duplicating execution resources, even