Table 3.1 Microarchitecture selections of SH-4

                     | Selections                         | Other candidates                                                                   | Merits
Number of issues     | Dual                               | Scalar, triple, quad                                                               | Maintaining high efficiency
Issue order          | In-order                           | Out-of-order                                                                       |
Resource duplication | Asymmetric                         | Duplicated (symmetric)                                                             |
Important category   | Transfer                           | Memory access, arithmetic                                                          | Good for two-operand ISA
Latency concealing   | Zero-cycle transfer                | Delayed execution, store buffers                                                   |
Internal memories    | Harvard architecture               | Unified cache                                                                      | Simultaneous access
Branch acceleration  | Delayed branch, early-stage branch | Branch prediction, out-of-order issue, branch target buffer, separated instructions | Simple, small, compatible

the duplicated resources were not often used simultaneously, and the architecture would
not achieve high efficiency.
All instructions were categorized to reduce pipeline hazards caused by resource
conflicts, which a symmetric architecture would avoid at the expense of resource
duplication. In particular, transfer instructions for literal or register values are
important for the 16-bit fixed-length ISA, so they were categorized as a type that
could use both the execution and the load/store pipelines as appropriate.
Furthermore, a zero-cycle transfer operation was implemented for the transfer
instructions, which contributes to reducing the hazards.
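
The categorization described above can be illustrated with a minimal sketch. It is a toy model, not the SH-4's actual issue logic: the category names ("arith", "mem", "transfer"), the pipeline labels ("ex", "ls"), and the function can_dual_issue are all hypothetical, and only two pipelines are assumed. It shows why a category that may use either pipeline removes resource conflicts without duplicating resources.

```python
# Toy model of asymmetric dual issue. All names here are hypothetical
# and chosen for illustration; they are not the SH-4's real groups.
PIPES = {
    "arith": {"ex"},           # needs the execution pipeline
    "mem": {"ls"},             # needs the load/store pipeline
    "transfer": {"ex", "ls"},  # may use either pipeline
}


def can_dual_issue(first, second):
    """Return True if the two categories can be given distinct pipelines."""
    return any(p != q for p in PIPES[first] for q in PIPES[second])


if __name__ == "__main__":
    for pair in [("arith", "arith"), ("arith", "mem"),
                 ("transfer", "arith"), ("transfer", "transfer")]:
        print(pair, can_dual_issue(*pair))
```

In this model two arithmetic instructions conflict over the single execution pipeline, while a transfer instruction can be paired with any other category, including another transfer.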
As for the memory architecture, the Harvard architecture was popular for PC/server
processors because it enables simultaneous accesses to the instruction and data
caches, whereas a unified cache architecture was popular for embedded processors
because it reduces hardware cost and uses a relatively small cache efficiently. The
SH-4 adopted the Harvard architecture, which was necessary to avoid the memory
access conflicts that increase with superscalar issue.
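
The access conflict mentioned above can be sketched with a small counting model. The assumptions are loud ones: the unified cache is taken to be single-ported, each conflict is taken to cost exactly one cycle, and the function name stall_cycles and the trace format are hypothetical. None of this describes the SH-4's real cache timing.

```python
def stall_cycles(trace, unified):
    """Count cycles lost to cache port conflicts in a toy model.

    trace:   list of (fetch, data) booleans, one pair per cycle, saying
             whether an instruction fetch and a data access are requested.
    unified: True models a single-ported unified cache (an assumption),
             False models separate instruction and data caches.
    """
    if not unified:
        # Separate caches serve both requests in the same cycle.
        return 0
    # A single port serves one request; the other waits one cycle.
    return sum(1 for fetch, data in trace if fetch and data)


if __name__ == "__main__":
    trace = [(True, True), (True, False), (True, True)]
    print("unified:", stall_cycles(trace, unified=True))
    print("harvard:", stall_cycles(trace, unified=False))
```

The more instructions are issued per cycle, the more often a fetch and a data access fall in the same cycle, which is why the conflict grows with superscalar issue in this model.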
The SH architecture adopted a delayed branch to reduce the branch penalty
cycles. In addition, the SH-4 adopted an early-stage branch to reduce the penalty
further. The penalty increased with superscalar issue, but it was smaller than that
of a superpipelined processor with deep pipeline stages, so the SH-4 did not adopt
more expensive methods such as a branch target buffer (BTB), out-of-order issue of
branch instructions, or branch prediction. The SH-4 maintained backward
compatibility and did not adopt methods that require an ISA change, such as
using multiple instructions for a branch.
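
One simple way to read the penalty comparison above is to count wasted issue slots per taken branch. The sketch below is a simplified model under stated assumptions: the function lost_issue_slots and its parameters are hypothetical, every bubble cycle is assumed to waste a full issue width of slots, and the numbers in the usage lines are illustrative rather than SH-4 figures.

```python
def lost_issue_slots(bubble_cycles, issue_width, delay_slots_filled=0):
    """Issue slots wasted by one taken branch in a simplified model.

    bubble_cycles:      cycles before the branch target can be issued
    issue_width:        instructions issued per cycle (1 = scalar)
    delay_slots_filled: useful instructions placed in delay slots
    """
    wasted = bubble_cycles * issue_width - delay_slots_filled
    return max(wasted, 0)


if __name__ == "__main__":
    # Illustrative numbers only.
    print(lost_issue_slots(2, 1))     # scalar pipeline
    print(lost_issue_slots(2, 2))     # dual issue, same bubble length
    print(lost_issue_slots(2, 2, 1))  # dual issue with one filled delay slot
    print(lost_issue_slots(5, 2))     # deeper pipeline, longer bubble
```

In this model, doubling the issue width doubles the wasted slots for the same bubble length, a filled delay slot recovers one of them, and a deeper pipeline that resolves branches later wastes more.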
As a result of these selections, the SH-4 adopted an in-order, dual-issue,
asymmetric, five-stage superscalar pipeline and a Harvard architecture, with special
treatment of transfer instructions, including the zero-cycle transfer method.
 