Hardware Reference
In-Depth Information
Speculative execution introduces some interesting problems. For one, it is es-
sential that none of the speculative instructions have irrevocable results because it
may turn out later that they should not have been executed. In Fig. 4-45, it is fine
to fetch evensum and oddsum, and it is also fine to do the addition as soon as k is
available (even before the if statement), but it is not fine to store the results back in
memory. In more complicated code sequences, one common way of preventing
speculative code from overwriting registers before it is known whether this is desired is
to rename all the destination registers used by the speculative code. In this way,
only scratch registers are modified, so there is no problem if the code ultimately is
not needed. If the code is needed, the scratch registers are copied to the true desti-
nation registers. As you can imagine, the scoreboarding to keep track of all this is
not simple, but given enough hardware, it can be done.
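The renaming scheme just described can be sketched in C as a small software model. This is only an illustration under stated assumptions, not a description of any real processor: the RegFile structure, the one-scratch-register-per-destination layout, and the function names spec_write, spec_read, commit, and squash are all hypothetical and do not appear in the text.

```c
#include <stdbool.h>

#define NUM_REGS 8   /* assumed register count, chosen for illustration */

/* Hypothetical model: each true destination register has one scratch
   register that receives speculative results. */
typedef struct {
    long arch[NUM_REGS];     /* true destination registers */
    long scratch[NUM_REGS];  /* renamed targets for speculative results */
    bool pending[NUM_REGS];  /* scratch[r] holds an uncommitted result */
} RegFile;

/* Speculative write: only the scratch register is modified. */
void spec_write(RegFile *rf, int r, long value) {
    rf->scratch[r] = value;
    rf->pending[r] = true;
}

/* Value seen by later speculative code: the pending scratch value if
   there is one, otherwise the true register. */
long spec_read(const RegFile *rf, int r) {
    return rf->pending[r] ? rf->scratch[r] : rf->arch[r];
}

/* The code was needed: copy scratch registers to the true registers. */
void commit(RegFile *rf) {
    for (int r = 0; r < NUM_REGS; r++) {
        if (rf->pending[r]) {
            rf->arch[r] = rf->scratch[r];
            rf->pending[r] = false;
        }
    }
}

/* The code was not needed: discard the speculative results. */
void squash(RegFile *rf) {
    for (int r = 0; r < NUM_REGS; r++)
        rf->pending[r] = false;
}
```

In this model, a spec_write followed by squash leaves the true register untouched, while a spec_write followed by commit makes the speculative value visible in it. Real hardware must also track which instructions are waiting on which renamed registers, which is the bookkeeping the text refers to.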
However, there is another problem introduced by speculative code that cannot
be solved by register renaming. What happens if a speculatively executed instruc-
tion causes an exception? A painful, but not fatal, example is a LOAD instruction
that causes a cache miss on a machine with a large cache line size (say, 256 bytes)
and a memory far slower than the CPU and cache. If a LOAD that is actually need-
ed stops the machine dead in its tracks for many cycles while the cache line is
being loaded, well, that's life, since the word is needed. However, stalling the ma-
chine to fetch a word that turns out not to be needed is counterproductive. Too
many of these ''optimizations'' may make the CPU slower than if it did not have
them at all. (If the machine has virtual memory, which is discussed in Chap. 6, a
speculative LOAD might even cause a page fault, which requires a disk operation to
bring in the needed page. False page faults can have a terrible effect on per-
formance, so it is important to avoid them.)
One solution present in a number of modern machines is to have a special
SPECULATIVE-LOAD instruction that tries to fetch the word from the cache, but if it
is not there, just gives up. If the value is there when it is actually needed, it can be
used, but if it is not, the hardware must go out and get it on the spot. If the value
turns out not to be needed, no penalty has been paid for the cache miss.
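The difference between an ordinary LOAD and a SPECULATIVE-LOAD can be sketched with a toy cache model in C. The sketch makes several simplifying assumptions that are not in the text: a direct-mapped cache with one word per line, a miss_count field standing in for the stall penalty, and the hypothetical names Machine, cache_lookup, load, and speculative_load.

```c
#include <stdbool.h>
#include <stddef.h>

#define MEM_WORDS   64  /* assumed sizes, chosen for illustration */
#define CACHE_LINES 8

/* Hypothetical machine: main memory plus a direct-mapped, one-word-per-line
   cache. miss_count counts the slow trips to memory. */
typedef struct {
    int      memory[MEM_WORDS];
    int      data[CACHE_LINES];
    size_t   tag[CACHE_LINES];
    bool     valid[CACHE_LINES];
    unsigned miss_count;
} Machine;

/* Returns true and stores the word in *out only if addr is in the cache. */
bool cache_lookup(const Machine *m, size_t addr, int *out) {
    size_t line = addr % CACHE_LINES;
    if (m->valid[line] && m->tag[line] == addr) {
        *out = m->data[line];
        return true;
    }
    return false;
}

/* Ordinary LOAD: on a miss, pay the penalty and fill the cache line. */
int load(Machine *m, size_t addr) {
    int value;
    if (cache_lookup(m, addr, &value))
        return value;
    size_t line = addr % CACHE_LINES;
    m->miss_count++;                  /* the machine would stall here */
    m->data[line]  = m->memory[addr];
    m->tag[line]   = addr;
    m->valid[line] = true;
    return m->data[line];
}

/* SPECULATIVE-LOAD: try the cache; on a miss, just give up.
   The machine state is never changed, so no penalty is paid. */
bool speculative_load(const Machine *m, size_t addr, int *out) {
    return cache_lookup(m, addr, out);
}
```

A caller would first try speculative_load ahead of time and, if it reports a miss and the value later proves necessary, fall back on load at the point of use. If the value is never needed, miss_count stays unchanged.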
A far worse situation can be illustrated with the following statement:
if (x > 0) z = y/x;
where x, y, and z are floating-point variables. Suppose that the variables are all
fetched into registers in advance and that the (slow) floating-point division is
hoisted above the if test. Unfortunately, x is 0 and the resulting divide-by-zero trap
terminates the program. The net result is that speculation has caused a correct pro-
gram to fail. Worse yet, the programmer put in explicit code to prevent this situa-
tion and it happened anyway. This situation is not likely to lead to a happy pro-
grammer.
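The hazard can be made concrete with a short C sketch that compares the statement in program order with a version in which the division has been hoisted. Because floating-point division by zero does not trap by default in most C environments, the trap is modeled here by a flag. The names trap_raised, trapping_div, in_order, and hoisted are hypothetical and serve only this illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the hardware trap: set when a divide by zero
   is attempted. On the machine in the text, the program would terminate. */
bool trap_raised = false;

/* Models a floating-point divide on a machine where dividing by zero traps. */
double trapping_div(double y, double x) {
    if (x == 0.0) {
        trap_raised = true;
        return 0.0;              /* placeholder; a real trap would not return */
    }
    return y / x;
}

/* Program order: the test guards the division, so x == 0 never divides. */
double in_order(double x, double y, double z) {
    if (x > 0)
        z = trapping_div(y, x);
    return z;
}

/* Division hoisted above the test: it now executes even when x == 0. */
double hoisted(double x, double y, double z) {
    double t = trapping_div(y, x);   /* executed speculatively */
    if (x > 0)
        z = t;
    return z;
}
```

With x equal to 0, in_order leaves trap_raised false, while hoisted sets it even though the quotient is never used. For positive x the two functions return the same value, which is why the transformation looks harmless until the guarded case actually occurs.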
One possible solution is to have special versions of instructions that might
cause exceptions. In addition, a bit, called a poison bit, is added to each register.
When a special speculative instruction fails, instead of causing a trap, it sets the
 