Instruction-Level Parallelism and Its Exploitation - Computer Architecture: A Quantitative Approach

us to move loads past stores at runtime. Support for speculative memory references can

help overcome the conservatism of the compiler, but unless such approaches are used carefully,

the overhead of the recovery mechanisms may swamp the advantages.

■ Hardware-based speculation works better when control flow is unpredictable and when

hardware-based branch prediction is superior to software-based branch prediction done at

compile time. These properties hold for many integer programs. For example, a good static

predictor has a misprediction rate of about 16% for four major integer SPEC92 programs,

and a hardware predictor has a misprediction rate of under 10%. Because speculated instructions

may slow down the computation when the prediction is incorrect, this difference

is significant. One result of this difference is that even statically scheduled processors

normally include dynamic branch predictors.

■ Hardware-based speculation maintains a completely precise exception model even for

speculated instructions. Recent software-based approaches have added special support to

allow this as well.

■ Hardware-based speculation does not require compensation or bookkeeping code, which

is needed by ambitious software speculation mechanisms.

■ Compiler-based approaches may benefit from the ability to see further in the code sequence,

resulting in better code scheduling than a purely hardware-driven approach.

■ Hardware-based speculation with dynamic scheduling does not require different code sequences

to achieve good performance for different implementations of an architecture. Although

this advantage is the hardest to quantify, it may be the most important in the long

run. Interestingly, this was one of the motivations for the IBM 360/91. On the other hand,

more recent explicitly parallel architectures, such as IA-64, have added flexibility that reduces

the hardware dependence inherent in a code sequence.
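
The gap between static and hardware branch prediction described in the list above can be illustrated with a small sketch. It compares a fixed compile-time guess against a single 2-bit saturating counter. The trace is synthetic and hypothetical, not taken from SPEC92, so the resulting rates do not reproduce the 16% and under-10% figures quoted above; the function names are likewise illustrative.

```python
def static_mispredictions(trace, predict_taken=True):
    """Count mispredictions for one fixed, compile-time guess."""
    return sum(1 for taken in trace if taken != predict_taken)


def two_bit_mispredictions(trace, state=2):
    """Count mispredictions for one 2-bit saturating counter.

    States 0 and 1 predict not taken; states 2 and 3 predict taken.
    """
    misses = 0
    for taken in trace:
        if (state >= 2) != taken:
            misses += 1
        # Move the counter toward the observed outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return misses


# Hypothetical trace: a branch taken for 50 executions, then not taken
# for 50 executions (for example, an input-dependent phase change).
trace = [True] * 50 + [False] * 50

static_rate = static_mispredictions(trace) / len(trace)
dynamic_rate = two_bit_mispredictions(trace) / len(trace)
print(f"static: {static_rate:.0%}, dynamic: {dynamic_rate:.0%}")
```

On this trace the fixed guess is wrong for the whole second phase, while the counter mispredicts only until it has adapted to the new behavior. Real predictors and real programs are far less clear-cut.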

The major disadvantage of supporting speculation in hardware is the complexity and additional

hardware resources required. This hardware cost must be evaluated against both the

complexity of a compiler for a software-based approach and the amount and usefulness of the

simplifications in a processor that relies on such a compiler.

Some designers have tried to combine the dynamic and compiler-based approaches to

achieve the best of each. Such a combination can generate interesting and obscure interactions.

For example, if conditional moves are combined with register renaming, a subtle side effect

appears. A conditional move that is annulled must still copy a value to the destination register,

since it was renamed earlier in the instruction pipeline. These subtle interactions complicate

the design and verification process and can also reduce performance.
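
The conditional-move interaction can be made concrete with a toy model of register renaming. This is a minimal sketch under stated assumptions: the class `RenamedCore`, its methods, and the `copy_when_annulled` switch are hypothetical names invented for illustration, physical registers are never freed, and no real pipeline is modeled. It shows only that once the destination has been renamed, an annulled conditional move has to copy the old value into the new physical register, or later readers see an unwritten register.

```python
class RenamedCore:
    """Toy register file with renaming (illustrative, not a real design)."""

    def __init__(self, num_arch, num_phys):
        self.phys = [None] * num_phys            # None marks "never written"
        self.map = list(range(num_arch))         # architectural -> physical
        self.free = list(range(num_arch, num_phys))
        for p in range(num_arch):
            self.phys[p] = 0                     # initial architectural state

    def write(self, rd, value):
        """Ordinary write: allocate a new physical register for rd."""
        p = self.free.pop(0)
        self.phys[p] = value
        self.map[rd] = p

    def read(self, r):
        return self.phys[self.map[r]]

    def cmov(self, rd, rs, cond, copy_when_annulled=True):
        """Conditional move rd <- rs, renamed before cond is known."""
        src = self.map[rs]
        old = self.map[rd]
        new = self.free.pop(0)
        self.map[rd] = new                       # later readers now use `new`
        if cond:
            self.phys[new] = self.phys[src]
        elif copy_when_annulled:
            self.phys[new] = self.phys[old]      # annulled: copy the old value
        # Otherwise `new` stays unwritten, which is the bug being illustrated.


core = RenamedCore(num_arch=4, num_phys=8)
core.write(1, 10)
core.write(2, 20)
core.cmov(1, 2, cond=False)
print(core.read(1))   # the old value of r1 survives the annulled move
```

The extra copy is work that an unrenamed machine would not perform for an annulled instruction, which is one way such interactions can reduce performance.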

The Intel Itanium processor was the most ambitious computer ever designed based on the

software support for ILP and speculation. It did not deliver on the hopes of the designers,

especially for general-purpose, nonscientific code. As designers' ambitions for exploiting ILP

were reduced in light of the difficulties discussed in Section 3.10, most architectures settled on

hardware-based mechanisms with issue rates of three to four instructions per clock.

Speculative Execution And The Memory System

Inherent in processors that support speculative execution or conditional instructions is the

possibility of generating invalid addresses that would not occur without speculative execution.

Not only would this be incorrect behavior if protection exceptions were taken, but the

benefits of speculative execution would be swamped by false exception overhead. Hence, the

memory system must identify speculatively executed instructions and conditionally executed

instructions and suppress the corresponding exception.
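
One way to picture this requirement is a memory model that distinguishes speculative from non-speculative loads. The sketch below is an assumption-laden illustration, not a description of any particular processor: `Memory`, `POISON`, `ProtectionFault`, and `commit` are hypothetical names, and deferring the fault through a poison marker until the value is actually needed is one possible policy chosen for the example. The text above requires only that the exception be suppressed.

```python
class ProtectionFault(Exception):
    """Raised when an invalid address is accessed non-speculatively."""


POISON = object()   # marker returned in place of a suppressed fault


class Memory:
    def __init__(self, contents):
        self.contents = dict(contents)           # valid address -> value

    def load(self, addr, speculative=False):
        if addr in self.contents:
            return self.contents[addr]
        if speculative:
            return POISON                        # suppress the exception
        raise ProtectionFault(f"invalid address {addr:#x}")


def commit(value):
    """Use a loaded value once the instruction is known to be needed."""
    if value is POISON:
        raise ProtectionFault("deferred fault from speculative load")
    return value


mem = Memory({0x1000: 42})

# A load hoisted above a null-pointer check executes speculatively
# with an invalid address, but no exception is taken.
pointer_is_valid = False
hoisted = mem.load(0x0, speculative=True)
result = commit(hoisted) if pointer_is_valid else None
print(result)
```

Because the check fails in this run, the poisoned value is never committed and the program sees no fault, which matches what the unspeculated code would have done.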
