ate unpredictable delays, such as cache misses, by executing other code while waiting for the
miss to resolve. In Section 3.6, we explore hardware speculation, a technique with additional
performance advantages, which builds on dynamic scheduling. As we will see, the advantages
of dynamic scheduling are gained at the cost of a significant increase in hardware complexity.
Although a dynamically scheduled processor cannot change the data flow, it tries to avoid
stalling when dependences are present. In contrast, static pipeline scheduling by the compiler
(covered in Section 3.2) tries to minimize stalls by separating dependent instructions so that
they will not lead to hazards. Of course, compiler pipeline scheduling can also be used on code
destined to run on a processor with a dynamically scheduled pipeline.
Dynamic Scheduling: The Idea
A major limitation of simple pipelining techniques is that they use in-order instruction issue
and execution: Instructions are issued in program order, and if an instruction is stalled in the
pipeline no later instructions can proceed. Thus, if there is a dependence between two closely
spaced instructions in the pipeline, this will lead to a hazard and a stall will result. If there are
multiple functional units, these units could lie idle. If instruction j depends on a long-running
instruction i, currently in execution in the pipeline, then all instructions after j must be stalled
until i is finished and j can execute. For example, consider this code:
DIV.D F0,F2,F4
ADD.D F10,F0,F8
SUB.D F12,F8,F14
The SUB.D instruction cannot execute because the dependence of ADD.D on DIV.D causes the
pipeline to stall; yet, SUB.D is not data dependent on anything in the pipeline. This hazard
creates a performance limitation that can be eliminated by not requiring instructions to
execute in program order.
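The cost of the stall can be made concrete with a small timing model. The sketch below is illustrative only: the latencies (10 cycles for DIV.D, 2 cycles for ADD.D and SUB.D), the one-instruction-per-cycle issue rate, and the function name start_cycles are assumptions introduced here, not values from the text. It computes the cycle in which each instruction of the example begins execution, first when execution must start in program order and then when an instruction may start as soon as its operands are ready.

```python
# Hypothetical latencies in cycles (assumed for illustration, not from the text).
LATENCY = {"DIV.D": 10, "ADD.D": 2, "SUB.D": 2}

# (opcode, destination, source 1, source 2)
PROGRAM = [
    ("DIV.D", "F0", "F2", "F4"),
    ("ADD.D", "F10", "F0", "F8"),
    ("SUB.D", "F12", "F8", "F14"),
]


def start_cycles(program, in_order):
    """Return the cycle in which each instruction begins execution.

    Assumes one instruction is issued per cycle in program order and
    that functional units are always free. WAR and WAW hazards are
    ignored; this example contains none.
    """
    ready = {}       # register -> cycle at which its new value is available
    starts = []
    prev_start = 0   # start cycle of the previous instruction
    for idx, (op, dst, src1, src2) in enumerate(program):
        issue = idx + 1
        operands_ready = max(ready.get(src1, 0), ready.get(src2, 0))
        start = max(issue, operands_ready)
        if in_order:
            # An instruction may not begin before its predecessor has begun.
            start = max(start, prev_start + 1)
        starts.append(start)
        ready[dst] = start + LATENCY[op]
        prev_start = start
    return starts


print(start_cycles(PROGRAM, in_order=True))   # [1, 11, 12]
print(start_cycles(PROGRAM, in_order=False))  # [1, 11, 3]
```

Under these assumed latencies, SUB.D waits until cycle 12 when execution is in order, even though its operands F8 and F14 are available from the start; once the ordering constraint is dropped, it begins in cycle 3, while ADD.D still waits for F0.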
In the classic five-stage pipeline, both structural and data hazards could be checked during
instruction decode (ID): When an instruction could execute without hazards, it was issued
from ID knowing that all data hazards had been resolved.
To allow us to begin executing the SUB.D in the above example, we must separate the issue
process into two parts: checking for any structural hazards and waiting for the absence of a
data hazard. Thus, we still use in-order instruction issue (i.e., instructions issued in program
order), but we want an instruction to begin execution as soon as its data operands are
available. Such a pipeline does out-of-order execution, which implies out-of-order completion.
Out-of-order execution introduces the possibility of WAR and WAW hazards, which do not
exist in the five-stage integer pipeline and its logical extension to an in-order floating-point
pipeline. Consider the following MIPS floating-point code sequence:
DIV.D F0,F2,F4
ADD.D F6,F0,F8
SUB.D F8,F10,F14
MUL.D F6,F10,F8
There is an antidependence between the ADD.D and the SUB.D, and if the pipeline executes the
SUB.D before the ADD.D (which is waiting for the DIV.D), it will violate the antidependence,
yielding a WAR hazard. Likewise, to avoid violating output dependences, such as the write of F6
by MUL.D, WAW hazards must be handled. As we will see, both these hazards are avoided by the
use of register renaming.
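The dependences in this sequence can be enumerated mechanically. The following is a minimal sketch, not the hardware mechanism developed later in the chapter: find_dependences compares every pair of instructions by register name (a simplification that ignores intervening writes, which is harmless for this four-instruction sequence), and rename gives each result a fresh temporary name. Both function names and the temporaries T0 through T3 are hypothetical, introduced only for illustration.

```python
# (opcode, destination, source 1, source 2)
PROGRAM = [
    ("DIV.D", "F0", "F2", "F4"),
    ("ADD.D", "F6", "F0", "F8"),
    ("SUB.D", "F8", "F10", "F14"),
    ("MUL.D", "F6", "F10", "F8"),
]


def find_dependences(program):
    """List (kind, earlier index, later index, register) for every pair."""
    deps = []
    for j in range(len(program)):
        _, dst_j, *srcs_j = program[j]
        for i in range(j):
            _, dst_i, *srcs_i = program[i]
            if dst_i in srcs_j:
                deps.append(("RAW", i, j, dst_i))  # true data dependence
            if dst_j in srcs_i:
                deps.append(("WAR", i, j, dst_j))  # antidependence
            if dst_i == dst_j:
                deps.append(("WAW", i, j, dst_i))  # output dependence
    return deps


def rename(program):
    """Give every destination a fresh temporary name (T0, T1, ...).

    Sources are redirected to the most recent temporary that holds the
    register's value, so true dependences are preserved.
    """
    mapping = {}  # architectural register -> current temporary name
    renamed = []
    for n, (op, dst, src1, src2) in enumerate(program):
        src1 = mapping.get(src1, src1)
        src2 = mapping.get(src2, src2)
        new_dst = f"T{n}"
        mapping[dst] = new_dst
        renamed.append((op, new_dst, src1, src2))
    return renamed


for dep in find_dependences(PROGRAM):
    print(dep)
# ('RAW', 0, 1, 'F0')
# ('WAR', 1, 2, 'F8')
# ('WAW', 1, 3, 'F6')
# ('RAW', 2, 3, 'F8')

for dep in find_dependences(rename(PROGRAM)):
    print(dep)
# ('RAW', 0, 1, 'T0')
# ('RAW', 2, 3, 'T2')
```

After renaming, only the two true data dependences remain: the antidependence on F8 and the output dependence on F6 arose from reuse of register names rather than from the flow of data.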
Out-of-order completion also creates major complications in handling exceptions. Dynamic
scheduling with out-of-order completion must preserve exception behavior in the sense that