Instruction-Level Parallelism and Its Exploitation - Computer Architecture: A Quantitative Approach

ate unpredictable delays, such as cache misses, by executing other code while waiting for the

miss to resolve. In Section 3.6, we explore hardware speculation, a technique with additional

performance advantages, which builds on dynamic scheduling. As we will see, the advantages

of dynamic scheduling are gained at the cost of a significant increase in hardware complexity.

Although a dynamically scheduled processor cannot change the data flow, it tries to avoid

stalling when dependences are present. In contrast, static pipeline scheduling by the compiler

(covered in Section 3.2) tries to minimize stalls by separating dependent instructions so that

they will not lead to hazards. Of course, compiler pipeline scheduling can also be used on code

destined to run on a processor with a dynamically scheduled pipeline.

Dynamic Scheduling: The Idea

A major limitation of simple pipelining techniques is that they use in-order instruction issue

and execution: Instructions are issued in program order, and if an instruction is stalled in the

pipeline, no later instructions can proceed. Thus, if there is a dependence between two closely

spaced instructions in the pipeline, this will lead to a hazard and a stall will result. If there are

multiple functional units, these units could lie idle. If instruction j depends on a long-running

instruction i, currently in execution in the pipeline, then all instructions after j must be stalled

until i is finished and j can execute. For example, consider this code:

DIV.D F0,F2,F4

ADD.D F10,F0,F8

SUB.D F12,F8,F14

The SUB.D instruction cannot execute because the dependence of ADD.D on DIV.D causes the

pipeline to stall; yet, SUB.D is not data dependent on anything in the pipeline. This hazard creates

a performance limitation that can be eliminated by not requiring instructions to execute

in program order.
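The cost of the stall can be made concrete with a small timing model. The following Python sketch is only an illustration under stated assumptions: the latencies in `LATENCY` are invented (the text gives none), one instruction enters the pipeline per cycle, and functional units are never a bottleneck. The function `start_cycles` and its `in_order` flag are hypothetical names introduced here.

```python
# Assumed latencies in cycles; the text gives no numbers, so these are illustrative.
LATENCY = {"DIV.D": 10, "ADD.D": 2, "SUB.D": 2}

# (opcode, destination, source1, source2), in program order
PROGRAM = [
    ("DIV.D", "F0", "F2", "F4"),
    ("ADD.D", "F10", "F0", "F8"),
    ("SUB.D", "F12", "F8", "F14"),
]


def start_cycles(program, in_order):
    """Return the cycle in which each instruction begins execution.

    Simplified model: one instruction enters the pipeline per cycle and an
    instruction needs its source operands before it can start. With
    in_order=True it also cannot start before the previous instruction has
    started. WAR and WAW hazards are ignored; this sequence has none.
    """
    ready = {}        # register -> cycle in which its pending value is available
    starts = []
    prev_start = -1
    for index, (op, dest, src1, src2) in enumerate(program):
        start = index                                  # one instruction per cycle
        for src in (src1, src2):
            start = max(start, ready.get(src, 0))      # wait for true (RAW) dependences
        if in_order:
            start = max(start, prev_start + 1)         # stuck behind earlier instructions
        starts.append(start)
        ready[dest] = start + LATENCY[op]
        prev_start = start
    return starts


in_order_starts = start_cycles(PROGRAM, in_order=True)
dataflow_starts = start_cycles(PROGRAM, in_order=False)
for (op, *_), a, b in zip(PROGRAM, in_order_starts, dataflow_starts):
    print(f"{op:6s} in-order start={a:2d}  data-flow start={b:2d}")
```

Under these assumed latencies, SUB.D starts in cycle 11 when execution is in order but in cycle 2 when only data flow constrains it, while ADD.D starts in cycle 10 either way because it truly depends on DIV.D.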

In the classic five-stage pipeline, both structural and data hazards could be checked during

instruction decode (ID): When an instruction could execute without hazards, it was issued

from ID knowing that all data hazards had been resolved.

To allow us to begin executing the SUB.D in the above example, we must separate the issue

process into two parts: checking for any structural hazards and waiting for the absence of a

data hazard. Thus, we still use in-order instruction issue (i.e., instructions issued in program

order), but we want an instruction to begin execution as soon as its data operands are available.

Such a pipeline does out-of-order execution, which implies out-of-order completion.
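The two-part issue process can be sketched as a small scheduler. This is a simplified model, not the hardware described in later sections: the unit counts, the latencies in `OPS`, and the function `schedule` are all assumptions made for illustration, and WAR and WAW hazards are ignored (the three-instruction example has none). Part one issues in program order and waits only for a free functional unit; part two waits for operands.

```python
# opcode -> (functional-unit kind, assumed latency in cycles); values are illustrative.
OPS = {"DIV.D": ("div", 10), "ADD.D": ("add", 2), "SUB.D": ("add", 2)}

# (opcode, destination, source1, source2), in program order
PROGRAM = [
    ("DIV.D", "F0", "F2", "F4"),
    ("ADD.D", "F10", "F0", "F8"),
    ("SUB.D", "F12", "F8", "F14"),
]


def schedule(program, units):
    """Return (opcode, issue, start, finish) cycles for each instruction.

    units maps a functional-unit kind to how many such units exist.
    A unit is held from issue until its instruction finishes.
    """
    free_at = {kind: [0] * count for kind, count in units.items()}
    ready = {}            # register -> cycle in which its value is available
    rows = []
    prev_issue = -1
    for op, dest, *srcs in program:
        kind, latency = OPS[op]
        pool = free_at[kind]
        slot = min(range(len(pool)), key=pool.__getitem__)   # earliest-free unit
        # Part 1: issue in program order once no structural hazard remains.
        issue = max(prev_issue + 1, pool[slot])
        # Part 2: begin execution once no data hazard remains.
        start = max([issue] + [ready.get(s, 0) for s in srcs])
        finish = start + latency
        pool[slot] = finish
        ready[dest] = finish
        rows.append((op, issue, start, finish))
        prev_issue = issue
    return rows


for row in schedule(PROGRAM, {"div": 1, "add": 2}):
    print(row)
```

With two adders assumed, SUB.D issues in cycle 2 and finishes in cycle 4, well before ADD.D finishes in cycle 12, which is out-of-order completion. With a single adder (`{"div": 1, "add": 1}`), SUB.D cannot issue until cycle 12, because in this model the waiting ADD.D holds the only adder, a structural hazard.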

Out-of-order execution introduces the possibility of WAR and WAW hazards, which do not

exist in the five-stage integer pipeline and its logical extension to an in-order floating-point

pipeline. Consider the following MIPS floating-point code sequence:

DIV.D F0,F2,F4

ADD.D F6,F0,F8

SUB.D F8,F10,F14

MUL.D F6,F10,F8

There is an antidependence between the ADD.D and the SUB.D (ADD.D reads F8, which SUB.D writes), and if the

pipeline executes the SUB.D before the ADD.D (which is waiting for the DIV.D), it will violate the

antidependence, yielding a WAR hazard. Likewise, to avoid violating output dependences, such as the write of F6 by

MUL.D after the write of F6 by ADD.D, WAW hazards must be handled. As we will see, both these hazards are avoided by the

use of register renaming.
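The dependences in this sequence, and the effect of renaming on them, can be checked mechanically. The sketch below is an illustration only: `dependences` compares every pair of instructions and ignores intervening writes, which is adequate for a sequence this short, and `rename` uses invented temporary names `T0`, `T1`, ... rather than any mechanism the text has described so far.

```python
# (opcode, destination, source1, source2), in program order
PROGRAM = [
    ("DIV.D", "F0", "F2", "F4"),
    ("ADD.D", "F6", "F0", "F8"),
    ("SUB.D", "F8", "F10", "F14"),
    ("MUL.D", "F6", "F10", "F8"),
]


def dependences(program):
    """List (kind, earlier_index, later_index, register) for each instruction pair."""
    found = []
    for i, (_, dest_i, *srcs_i) in enumerate(program):
        for j in range(i + 1, len(program)):
            _, dest_j, *srcs_j = program[j]
            if dest_i in srcs_j:
                found.append(("RAW", i, j, dest_i))   # true data dependence
            if dest_j in srcs_i:
                found.append(("WAR", i, j, dest_j))   # antidependence
            if dest_i == dest_j:
                found.append(("WAW", i, j, dest_i))   # output dependence
    return found


def rename(program):
    """Give every destination a fresh name and redirect later reads to it."""
    current = {}      # architectural register -> latest renamed register
    renamed = []
    for n, (op, dest, *srcs) in enumerate(program):
        new_srcs = [current.get(s, s) for s in srcs]  # read the newest version
        new_dest = f"T{n}"                            # hypothetical temporary name
        current[dest] = new_dest
        renamed.append((op, new_dest, *new_srcs))
    return renamed


print(dependences(PROGRAM))
print(rename(PROGRAM))
print(dependences(rename(PROGRAM)))
```

On the original sequence the check reports the WAR hazard on F8 between ADD.D and SUB.D and the WAW hazard on F6 between ADD.D and MUL.D, alongside two true dependences. After renaming, only the two true (RAW) dependences remain, since no two instructions write the same name and no instruction writes a name that an earlier one reads.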

Out-of-order completion also creates major complications in handling exceptions. Dynamic

scheduling with out-of-order completion must preserve exception behavior in the sense that
