Hardware Reference
In-Depth Information
Dynamic Scheduling Using Tomasulo's Approach
The IBM 360/91 floating-point unit used a sophisticated scheme to allow out-of-order execu-
tion. This scheme, invented by Robert Tomasulo, tracks when operands for instructions are
available to minimize RAW hazards and introduces register renaming in hardware to minim-
ize WAW and WAR hazards. There are many variations on this scheme in modern processors,
although the key concepts of tracking instruction dependences to allow execution as soon as
operands are available and renaming registers to avoid WAR and WAW hazards are common
characteristics.
IBM's goal was to achieve high floating-point performance from an instruction set and from
compilers designed for the entire 360 computer family, rather than from specialized compilers
for the high-end processors. The 360 architecture had only four double-precision loating-
point registers, which limits the effectiveness of compiler scheduling; this fact was another
motivation for the Tomasulo approach. In addition, the IBM 360/91 had long memory accesses
and long floating-point delays, which Tomasulo's algorithm was designed to overcome. At the
end of the section, we will see that Tomasulo's algorithm can also support the overlapped ex-
ecution of multiple iterations of a loop.
We explain the algorithm, which focuses on the floating-point unit and load-store unit, in
the context of the MIPS instruction set. The primary difference between MIPS and the 360 is
the presence of register-memory instructions in the later architecture. Because Tomasulo's al-
gorithm uses a load functional unit, no significant changes are needed to add register-memory
addressing modes. The IBM 360/91 also had pipelined functional units, rather than multiple
functional units, but we describe the algorithm as if there were multiple functional units. It is
a simple conceptual extension to also pipeline those functional units.
As we will see, RAW hazards are avoided by executing an instruction only when its oper-
ands are available, which is exactly what the simpler scoreboarding approach provides. WAR
and WAW hazards, which arise from name dependences, are eliminated by register renam-
ing. Register renaming eliminates these hazards by renaming all destination registers, includ-
ing those with a pending read or write for an earlier instruction, so that the out-of-order write
does not affect any instructions that depend on an earlier value of an operand.
To beter understand how register renaming eliminates WAR and WAW hazards, consider
the following example code sequence that includes potential WAR and WAW hazards:
DIV.D F0,F2,F4
ADD.D F6,F0,F8
S.D F6,0(R1)
SUB.D F8,F10,F14
MUL.D F6,F10,F8
There are two antidependences: between the ADD.D and the SUB.D and between the S.D and the
MUL.D . There is also an output dependence between the ADD.D and the MUL.D , leading to three pos-
sible hazards: WAR hazards on the use of F8 by ADD.D and the use of F6 by the SUB.D , as well as
a WAW hazard since the ADD.D may finish later than the MUL.D . There are also three true data
dependences: between the DIV.D and the ADD.D , between the SUB.D and the MUL.D , and between the
ADD.D and the S.D .
These three name dependences can all be eliminated by register renaming. For simplicity,
assume the existence of two temporary registers, S and T . Using S and T , the sequence can be
rewritten without any dependences as:
DIV.D
F0,F2,F4
ADD.D
S,F0,F8
S.D
S,0(R1)
Search WWH ::




Custom Search