Pipelining: Basic and Intermediate Concepts - Computer Architecture: A Quantitative Approach


tion occurs and the state must be rolled back earlier than some instruction that completed out of order, the original value of the register can be restored from the history file. A similar technique is used for autoincrement and autodecrement addressing on processors such as VAXes. Another approach, the future file, proposed by Smith and Pleszkun [1988], keeps the newer value of a register; when all earlier instructions have completed, the main register file is updated from the future file. On an exception, the main register file has the precise values for the interrupted state. In Chapter 3, we saw extensions of this idea which are used in processors such as the PowerPC 620 and the MIPS R10000 to allow overlap and reordering while preserving precise exceptions.
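The future-file idea can be illustrated with a small software model. This is only a sketch under stated assumptions, not a description of any real processor: the class name, the dictionary layout, and the four-register file are hypothetical, and the model ignores WAW hazards (each in-flight instruction is assumed to write a different register).

```python
class FutureFileMachine:
    """Toy model: results reach a future file as soon as they are produced,
    while the main register file is updated strictly in program order."""

    def __init__(self, num_regs=4):
        self.main = [0] * num_regs     # precise, in-order architectural state
        self.future = [0] * num_regs   # newest values, possibly out of order
        self.pending = []              # in-flight instructions, program order
        self.exception = None          # entry that raised, if any

    def issue(self, dest):
        entry = {"dest": dest, "value": None, "done": False, "fault": False}
        self.pending.append(entry)
        return entry

    def complete(self, entry, value=None, fault=False):
        entry.update(value=value, done=True, fault=fault)
        if not fault:
            self.future[entry["dest"]] = value
        self._retire()

    def _retire(self):
        # Update the main file only once every earlier instruction has completed.
        while self.pending and self.pending[0]["done"]:
            head = self.pending.pop(0)
            if head["fault"]:
                self.exception = head
                self.pending.clear()            # squash younger instructions
                self.future = list(self.main)   # drop their out-of-order results
                return
            self.main[head["dest"]] = head["value"]


m = FutureFileMachine()
slow = m.issue(dest=1)        # older, long-running instruction
fast = m.issue(dest=2)        # younger instruction
m.complete(fast, 7)           # completes out of order
snapshot = (m.main[2], m.future[2])
m.complete(slow, fault=True)  # older instruction raises an exception
```

In this run the younger result is visible in the future file but never reaches the main register file, so the main file still holds the state that existed before the faulting instruction.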

A third technique in use is to allow the exceptions to become somewhat imprecise, but to keep enough information so that the trap-handling routines can create a precise sequence for the exception. This means knowing what operations were in the pipeline and their PCs. Then, after handling the exception, the software finishes any instructions that precede the latest instruction completed, and the sequence can restart. Consider the following worst-case code sequence:

Instruction 1: A long-running instruction that eventually interrupts execution.

Instruction 2, …, Instruction n−1: A series of instructions that are not completed.

Instruction n: An instruction that is finished.

Given the PCs of all the instructions in the pipeline and the exception return PC, the software can find the state of instruction 1 and instruction n. Because instruction n has completed, we will want to restart execution at instruction n+1. After handling the exception, the software must simulate the execution of instruction 1, …, instruction n−1. Then we can return from the exception and restart at instruction n+1. The complexity of executing these instructions properly by the handler is the major difficulty of this scheme.
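The recovery steps described above can be sketched in a few lines. This is a minimal illustration rather than a real trap handler: the function name, the dictionary layout, and the `emulate` callables are hypothetical, a fixed 4-byte instruction width is assumed, at least one in-flight instruction is assumed to have completed, and the sketch assumes instruction n did not overwrite any operand that the earlier instructions still need.

```python
INSTRUCTION_BYTES = 4  # assumption: fixed-width, MIPS-like encoding


def complete_in_software(in_flight, regs):
    """Finish uncompleted instructions and return the PC to resume at.

    in_flight is a program-order list of dicts with keys 'pc',
    'completed' (bool), and 'emulate' (a callable that updates regs).
    Assumes at least one entry has completed.
    """
    # Find instruction n: the latest instruction that hardware completed.
    last_done = max(i for i, ins in enumerate(in_flight) if ins["completed"])
    # Simulate every earlier instruction that hardware did not complete.
    for ins in in_flight[:last_done]:
        if not ins["completed"]:
            ins["emulate"](regs)
            ins["completed"] = True
    # Resume at instruction n+1.
    return in_flight[last_done]["pc"] + INSTRUCTION_BYTES


# r1 already holds the result of instruction n, which completed in hardware.
regs = {"f0": 0.0, "f2": 0.0, "r1": 9}
in_flight = [
    {"pc": 0x100, "completed": False,
     "emulate": lambda r: r.update(f0=1.5)},   # instruction 1 (interrupting)
    {"pc": 0x104, "completed": False,
     "emulate": lambda r: r.update(f2=2.5)},   # instruction 2 (not completed)
    {"pc": 0x108, "completed": True,
     "emulate": lambda r: r.update(r1=9)},     # instruction n (finished)
]
resume_pc = complete_in_software(in_flight, regs)
```

Even in this toy form, the handler needs a software emulation of every instruction that may be left incomplete, which is where the complexity of the scheme comes from.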

There is an important simplification for simple MIPS-like pipelines: If instruction 2, …, instruction n are all integer instructions, we know that if instruction n has completed then all of instruction 2, …, instruction n−1 have also completed. Thus, only FP operations need to be handled.

To make this scheme tractable, the number of floating-point instructions that can be overlapped in execution can be limited. For example, if we only overlap two instructions, then only the interrupting instruction need be completed by software. This restriction may reduce the potential throughput if the FP pipelines are deep or if there are a significant number of FP functional units. This approach is used in the SPARC architecture to allow overlap of floating-point and integer operations.

The final technique is a hybrid scheme that allows the instruction issue to continue only if it is certain that all the instructions before the issuing instruction will complete without causing an exception. This guarantees that when an exception occurs, no instructions after the interrupting one will be completed and all of the instructions before the interrupting one can be completed. This sometimes means stalling the CPU to maintain precise exceptions. To make this scheme work, the floating-point functional units must determine if an exception is possible early in the EX stage (in the first 3 clock cycles in the MIPS pipeline), so as to prevent further instructions from completing. This scheme is used in the MIPS R2000/3000, the R4000, and the Intel Pentium. It is discussed further in Appendix J.
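The cost of this issue rule can be illustrated with a simple cycle-counting model. It is a sketch under assumptions, not a model of any of the processors named above: the function name and the per-instruction "check latency" are hypothetical, at most one instruction issues per cycle, and an instruction issued in cycle c with check latency k is taken to be known exception-free from cycle c + k onward.

```python
def issue_cycles(check_latencies):
    """Return the cycle in which each instruction issues.

    check_latencies[i] is the number of cycles after instruction i issues
    before it is known that it cannot raise an exception (0 for simple
    integer operations in this toy model).
    """
    issue_at = []
    safe_at = 0   # first cycle in which all earlier instructions are known safe
    cycle = 0
    for latency in check_latencies:
        cycle = max(cycle, safe_at)   # stall until earlier instructions are safe
        issue_at.append(cycle)
        safe_at = max(safe_at, cycle + latency)
        cycle += 1                    # at most one issue per cycle
    return issue_at


# An FP operation that needs 3 cycles to rule out an exception delays
# the instructions that follow it.
with_fp = issue_cycles([0, 3, 0, 0])
integer_only = issue_cycles([0, 0, 0])
```

Under these assumptions the instruction after the FP operation issues in cycle 4 instead of cycle 2, which corresponds to the stalls the text mentions as the price of keeping exceptions precise.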

Performance of a MIPS FP Pipeline

The MIPS FP pipeline of Figure C.35 on page C-54 can generate both structural stalls for the divide unit and stalls for RAW hazards (it also can have WAW hazards, but this rarely occurs in practice). Figure C.39 shows the number of stall cycles for each type of floating-point op-
