exactly those exceptions that would arise if the program were executed in strict program order
actually do arise. Dynamically scheduled processors preserve exception behavior by delaying
the notification of an associated exception until the processor knows that the instruction
should be the next one completed.
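The delayed notification described above can be sketched in a few lines. This is a minimal illustration and not a description of any real processor: the function name, the use of program-order indices to identify instructions, and the representation of faults as a set are all assumptions made for the example.

```python
def first_reported_exception(finish_order, faulting):
    """Return the program-order index of the first exception reported.

    finish_order: program-order indices, listed in the order in which the
                  instructions finish execution (possibly out of order).
    faulting:     set of indices whose execution raised an exception.
    (Both parameters are hypothetical, chosen for illustration.)
    """
    finished = set()
    next_to_complete = 0  # oldest instruction that has not yet completed
    for idx in finish_order:
        finished.add(idx)
        # An exception is only reported once its instruction is the
        # next one to complete in program order.
        while next_to_complete in finished:
            if next_to_complete in faulting:
                return next_to_complete
            next_to_complete += 1
    return None


# Instruction 2 finishes (and faults) first, but instruction 1 also faults.
print(first_reported_exception([2, 0, 1, 3], {1, 2}))  # prints 1
```

Although instruction 2 raises its exception earlier in time, the exception reported is that of instruction 1, which is the one that would arise under strict program-order execution.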
Although exception behavior must be preserved, dynamically scheduled processors could
generate imprecise exceptions. An exception is imprecise if the processor state when an exception
is raised does not look exactly as if the instructions were executed sequentially in strict
program order. Imprecise exceptions can occur because of two possibilities:
1. The pipeline may have already completed instructions that are later in program order than
the instruction causing the exception.
2. The pipeline may have not yet completed some instructions that are earlier in program order
than the instruction causing the exception.
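The two conditions above can be stated as a small check. This is a sketch under stated assumptions: instructions are identified by their program-order index, and `classify_exception` and its arguments are hypothetical names introduced only for illustration.

```python
def classify_exception(completed, faulting):
    """Classify the processor state at the moment an exception is raised.

    completed: set of program-order indices of completed instructions.
    faulting:  program-order index of the instruction causing the exception.
    """
    # Possibility 1: later instructions have already completed.
    later_completed = sorted(i for i in completed if i > faulting)
    # Possibility 2: earlier instructions have not yet completed.
    earlier_incomplete = [i for i in range(faulting) if i not in completed]
    return {
        "precise": not later_completed and not earlier_incomplete,
        "later_completed": later_completed,
        "earlier_incomplete": earlier_incomplete,
    }


# Instruction 3 faults; 4 has already completed and 1 has not.
print(classify_exception({0, 2, 4}, faulting=3))
```

The state is precise only when both lists are empty, that is, when exactly the instructions before the faulting one have completed.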
Imprecise exceptions make it difficult to restart execution after an exception. Rather than address
these problems in this section, we will discuss a solution that provides precise exceptions
in the context of a processor with speculation in Section 3.6. For floating-point exceptions, other
solutions have been used, as discussed in Appendix J.
To allow out-of-order execution, we essentially split the ID pipe stage of our simple five-stage
pipeline into two stages:
1. Issue: Decode instructions, check for structural hazards.
2. Read operands: Wait until no data hazards, then read operands.
An instruction fetch stage precedes the issue stage and may fetch either into an instruction register
or into a queue of pending instructions; instructions are then issued from the register or
queue. The execution stage follows the read operands stage, just as in the five-stage pipeline.
Execution may take multiple cycles, depending on the operation.
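To make the split between the issue and read operands stages concrete, the following is a minimal cycle-by-cycle sketch. It is a deliberately simplified model, not the control logic of any real machine: it assumes one instruction issues per cycle, each named functional unit holds one instruction from issue until completion, and only RAW data hazards are checked (WAR and WAW hazards are ignored). The `Instr` record, the unit names, and the latencies are hypothetical.

```python
from collections import namedtuple

# Hypothetical instruction record: name, destination register,
# source registers, functional unit, execution latency in cycles.
Instr = namedtuple("Instr", "name dest srcs unit latency")


def simulate(program):
    """Return instruction names in the order they begin execution."""
    unit_busy = set()   # functional units currently held
    waiting = []        # issued instructions waiting in read operands
    executing = {}      # instruction index -> cycle when execution ends
    done = set()        # indices of completed instructions
    start_order = []
    next_issue = 0
    cycle = 0
    while len(done) < len(program):
        cycle += 1
        # Completion: free the unit and make the result available.
        for i in [i for i, end in executing.items() if end <= cycle]:
            del executing[i]
            unit_busy.discard(program[i].unit)
            done.add(i)
        # Read operands: wait until no earlier, uncompleted instruction
        # writes one of our sources (RAW hazard), then begin execution.
        for i in list(waiting):
            raw_hazard = any(
                j not in done and program[j].dest in program[i].srcs
                for j in range(i)
            )
            if not raw_hazard:
                waiting.remove(i)
                executing[i] = cycle + program[i].latency
                start_order.append(program[i].name)
        # Issue: strictly in program order, stalling on a structural hazard.
        if next_issue < len(program):
            instr = program[next_issue]
            if instr.unit not in unit_busy:
                unit_busy.add(instr.unit)
                waiting.append(next_issue)
                next_issue += 1
    return start_order


program = [
    Instr("DIV.D", "F0", ("F2", "F4"), "div", 10),
    Instr("ADD.D", "F10", ("F0", "F8"), "add", 2),   # needs F0 from DIV.D
    Instr("MUL.D", "F12", ("F8", "F14"), "mul", 3),  # independent
]
print(simulate(program))  # prints ['DIV.D', 'MUL.D', 'ADD.D']
```

All three instructions issue in program order, but MUL.D begins execution before ADD.D because ADD.D waits in the read operands stage for the result of DIV.D.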
We distinguish when an instruction begins execution and when it completes execution; between
the two times, the instruction is in execution. Our pipeline allows multiple instructions to be in
execution at the same time; without this capability, a major advantage of dynamic scheduling
is lost. Having multiple instructions in execution at once requires multiple functional units,
pipelined functional units, or both. Since these two capabilities—pipelined functional units
and multiple functional units—are essentially equivalent for the purposes of pipeline control,
we will assume the processor has multiple functional units.
In a dynamically scheduled pipeline, all instructions pass through the issue stage in order
(in-order issue); however, they can be stalled or bypass each other in the second stage (read
operands) and thus enter execution out of order. Scoreboarding is a technique for allowing instructions
to execute out of order when there are sufficient resources and no data dependences;
it is named after the CDC 6600 scoreboard, which developed this capability. Here, we focus
on a more sophisticated technique, called Tomasulo's algorithm. The primary difference is
that Tomasulo's algorithm handles antidependences and output dependences by effectively
renaming the registers dynamically. Additionally, Tomasulo's algorithm can be extended to
handle speculation, a technique to reduce the effect of control dependences by predicting the
outcome of a branch, executing instructions at the predicted destination address, and taking
corrective actions when the prediction was wrong. While the use of scoreboarding is probably
sufficient to support a simple two-issue superscalar like the ARM A8, a more aggressive processor,
like the four-issue Intel i7, benefits from the use of out-of-order execution.
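The effect of register renaming on antidependences and output dependences can be illustrated with a short sketch. This is only a software illustration of the renaming idea, applied statically to a list of instructions; it is not a model of how Tomasulo's algorithm performs renaming in hardware. The function `rename`, the tuple layout `(op, dest, src1, src2)`, and the tag names `T0`, `T1`, and so on are assumptions made for the example.

```python
def rename(program):
    """Give every destination a fresh tag; sources read the latest tag.

    program: list of (op, dest, src1, src2) tuples in program order.
    """
    latest = {}    # architectural register -> tag of its most recent writer
    renamed = []
    for n, (op, dest, src1, src2) in enumerate(program):
        # Sources use the newest tag, so true (RAW) dependences are kept.
        s1 = latest.get(src1, src1)
        s2 = latest.get(src2, src2)
        # A fresh destination tag removes WAR and WAW name conflicts.
        tag = f"T{n}"
        latest[dest] = tag
        renamed.append((op, tag, s1, s2))
    return renamed


program = [
    ("DIV.D", "F0", "F2", "F4"),
    ("ADD.D", "F6", "F0", "F8"),    # reads F8
    ("SUB.D", "F8", "F10", "F14"),  # writes F8: antidependence with ADD.D
    ("MUL.D", "F6", "F10", "F8"),   # writes F6: output dependence with ADD.D
]
for instr in rename(program):
    print(instr)
```

After renaming, SUB.D writes `T2` rather than `F8`, so it no longer conflicts with the earlier read of `F8` by ADD.D, and ADD.D and MUL.D write the distinct tags `T1` and `T3`. The true dependences remain: ADD.D still reads the result of DIV.D (`T0`), and MUL.D reads the result of SUB.D (`T2`).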