us to move loads past stores at runtime. Support for speculative memory references can
help overcome the conservatism of the compiler, but unless such approaches are used care-
fully, the overhead of the recovery mechanisms may swamp the advantages.
■ Hardware-based speculation works better when control flow is unpredictable and when
hardware-based branch prediction is superior to software-based branch prediction done at
compile time. These properties hold for many integer programs. For example, a good static
predictor has a misprediction rate of about 16% for four major integer SPEC92 programs,
and a hardware predictor has a misprediction rate of under 10%. Because speculated
instructions may slow down the computation when the prediction is incorrect, this
difference is significant. One result of this difference is that even statically scheduled processors
normally include dynamic branch predictors.
■ Hardware-based speculation maintains a completely precise exception model even for
speculated instructions. Recent software-based approaches have added special support to
allow this as well.
■ Hardware-based speculation does not require compensation or bookkeeping code, which
is needed by ambitious software speculation mechanisms.
■ Compiler-based approaches may benefit from the ability to see further in the code
sequence, resulting in better code scheduling than a purely hardware-driven approach.
■ Hardware-based speculation with dynamic scheduling does not require different code se-
quences to achieve good performance for different implementations of an architecture. Al-
though this advantage is the hardest to quantify, it may be the most important in the long
run. Interestingly, this was one of the motivations for the IBM 360/91. On the other hand,
more recent explicitly parallel architectures, such as IA-64, have added flexibility that re-
duces the hardware dependence inherent in a code sequence.
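The branch-prediction point in the list above can be illustrated with a small sketch. It compares a fixed compile-time prediction against a single 2-bit saturating counter, one common form of dynamic predictor. The trace is a made-up example of a branch whose bias flips between program phases; it is not SPEC92 data, and the function names are hypothetical.

```python
def static_mispredictions(trace, predict_taken=True):
    """Count mispredictions for a fixed, compile-time prediction."""
    return sum(1 for taken in trace if taken != predict_taken)


def two_bit_mispredictions(trace, state=2):
    """Count mispredictions for one 2-bit saturating counter.

    The counter ranges over 0..3. States 2 and 3 predict taken,
    states 0 and 1 predict not taken. The starting state is an assumption.
    """
    misses = 0
    for taken in trace:
        predicted_taken = state >= 2
        if predicted_taken != taken:
            misses += 1
        # Move toward the observed outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return misses


# Hypothetical trace: taken 50 times, then not taken 50 times.
trace = [True] * 50 + [False] * 50

static_misses = static_mispredictions(trace)
dynamic_misses = two_bit_mispredictions(trace)
print(static_misses, dynamic_misses)
```

On this trace the static prediction is wrong for the entire second phase, while the counter adapts after two mispredictions. Real workloads are far less regular, so the gap is illustrative only.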
The major disadvantage of supporting speculation in hardware is the complexity and ad-
ditional hardware resources required. This hardware cost must be evaluated against both the
complexity of a compiler for a software-based approach and the amount and usefulness of the
simplifications in a processor that relies on such a compiler.
Some designers have tried to combine the dynamic and compiler-based approaches to
achieve the best of each. Such a combination can generate interesting and obscure interactions.
For example, if conditional moves are combined with register renaming, a subtle side effect
appears. A conditional move that is annulled must still copy a value to the destination register,
since it was renamed earlier in the instruction pipeline. These subtle interactions complicate
the design and verification process and can also reduce performance.
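The conditional-move interaction can be sketched with a toy rename model. Everything here is a hypothetical simplification: the class and method names are invented, physical registers are never freed, and instructions execute immediately. The point it shows is that the destination receives a new physical register before the condition is known, so an annulled move must copy the old value into it.

```python
class RenamedCore:
    """Toy register-renaming model (an assumption, not a real design)."""

    def __init__(self, num_arch, num_phys):
        self.phys = [0] * num_phys
        # Architectural register i initially maps to physical register i.
        self.rename_map = list(range(num_arch))
        self.free_list = list(range(num_arch, num_phys))

    def read(self, arch_reg):
        return self.phys[self.rename_map[arch_reg]]

    def write(self, arch_reg, value):
        new_phys = self.free_list.pop(0)
        self.phys[new_phys] = value
        self.rename_map[arch_reg] = new_phys

    def cmov(self, dest, src, condition):
        """Conditional move: dest = src if condition else old dest."""
        old_phys = self.rename_map[dest]
        src_phys = self.rename_map[src]
        # Renaming happens early, before the condition is evaluated.
        new_phys = self.free_list.pop(0)
        self.rename_map[dest] = new_phys
        # Even when annulled, the new register must be filled, so the
        # old destination value becomes an extra source operand.
        chosen = src_phys if condition else old_phys
        self.phys[new_phys] = self.phys[chosen]
        return new_phys


core = RenamedCore(num_arch=4, num_phys=8)
core.write(1, 11)
core.write(2, 22)
core.cmov(dest=1, src=2, condition=False)  # annulled move
value_after_annulled = core.read(1)
core.cmov(dest=1, src=2, condition=True)   # move that takes effect
value_after_taken = core.read(1)
```

In this model the annulled move still consumes a physical register and a copy operation, which is one way such an interaction can cost performance.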
The Intel Itanium processor was the most ambitious computer ever designed based on the
software support for ILP and speculation. It did not deliver on the hopes of the designers,
especially for general-purpose, nonscientific code. As designers' ambitions for exploiting ILP
were reduced in light of the difficulties discussed in Section 3.10, most architectures settled on
hardware-based mechanisms with issue rates of three to four instructions per clock.
Speculative Execution And The Memory System
Inherent in processors that support speculative execution or conditional instructions is the
possibility of generating invalid addresses that would not occur without speculative execu-
tion. Not only would this be incorrect behavior if protection exceptions were taken, but the
benefits of speculative execution would be swamped by false exception overhead. Hence, the
memory system must identify speculatively executed instructions and conditionally executed
instructions and suppress the corresponding exception.
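One possible way to picture this suppression is to have a speculative load record a fault instead of raising it, and to raise the exception only if the instruction turns out to be on the correct path. The sketch below assumes that deferred scheme; the text does not prescribe a specific mechanism, and the memory contents, addresses, and function names are hypothetical.

```python
class ProtectionFault(Exception):
    """Raised only for a faulting access that is actually committed."""


def speculative_load(memory, address):
    """Perform a load without raising; return (value, faulted)."""
    if address in memory:
        return memory[address], False
    # Invalid address: note the fault but do not take the exception yet.
    return None, True


def commit(result, on_correct_path):
    """Resolve a speculative load once the speculation outcome is known."""
    value, faulted = result
    if not on_correct_path:
        # Squashed instruction: discard the result and any recorded fault.
        return None
    if faulted:
        raise ProtectionFault("invalid address on committed load")
    return value


# Hypothetical memory with a single valid address.
memory = {0x1000: 42}

good = commit(speculative_load(memory, 0x1000), on_correct_path=True)
squashed = commit(speculative_load(memory, 0xDEAD), on_correct_path=False)
```

The invalid access on the wrong path produces no exception, so no false exception overhead is incurred, while the same access on the correct path would still raise `ProtectionFault`.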