sumption after exceptions will find this material useful, since they are key to understanding
the more advanced approaches in Chapter 3.
Section C.5 discusses how the five-stage pipeline can be extended to handle longer-running
floating-point instructions. Section C.6 puts these concepts together in a case study of a deeply
pipelined processor, the MIPS R4000/4400, including both the eight-stage integer pipeline and
the floating-point pipeline.
Section C.7 introduces the concept of dynamic scheduling and the use of scoreboards to
implement dynamic scheduling. It is introduced as a crosscutting issue, since it can
serve as an introduction to the core concepts in Chapter 3, which focused on dynamically
scheduled approaches. Section C.7 is also a gentle introduction to the more complex
Tomasulo's algorithm covered in Chapter 3. Although Tomasulo's algorithm can be covered and
understood without introducing scoreboarding, the scoreboarding approach is simpler and
easier to comprehend.
What Is Pipelining?
Pipelining is an implementation technique whereby multiple instructions are overlapped in
execution; it takes advantage of parallelism that exists among the actions needed to execute an
instruction. Today, pipelining is the key implementation technique used to make fast CPUs.
A pipeline is like an assembly line. In an automobile assembly line, there are many steps,
each contributing something to the construction of the car. Each step operates in parallel with
the other steps, although on a different car. In a computer pipeline, each step in the pipeline
completes a part of an instruction. Like the assembly line, different steps are completing
different parts of different instructions in parallel. Each of these steps is called a pipe stage or a
pipe segment. The stages are connected one to the next to form a pipe: instructions enter at one
end, progress through the stages, and exit at the other end, just as cars would in an assembly
line.
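
The overlap described above can be made concrete with a small occupancy chart. The following is a minimal sketch, not taken from the text: the function name `pipeline_chart` and the stage names `IF`, `ID`, `EX`, `MEM`, `WB` are assumptions chosen for illustration, and the model ignores stalls entirely. Each row is one instruction and each column is one processor cycle.

```python
def pipeline_chart(num_instructions, stages):
    """Return one row per instruction showing the stage it occupies in each cycle."""
    total_cycles = num_instructions + len(stages) - 1
    rows = []
    for i in range(num_instructions):
        # Instruction i enters the first stage in cycle i and then
        # advances exactly one stage per cycle (no stalls are modeled).
        row = ["."] * total_cycles
        for s, name in enumerate(stages):
            row[i + s] = name
        rows.append(row)
    return rows


# Assumed stage names, for illustration only.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

chart = pipeline_chart(3, STAGES)
for i, row in enumerate(chart):
    print(f"instr {i}: " + " ".join(f"{cell:>3}" for cell in row))
```

Reading down any single column shows several instructions in different stages during the same cycle, which is the assembly-line behavior the analogy describes.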
In an automobile assembly line, throughput is defined as the number of cars per hour and is
determined by how often a completed car exits the assembly line. Likewise, the throughput of
an instruction pipeline is determined by how often an instruction exits the pipeline. Because
the pipe stages are hooked together, all the stages must be ready to proceed at the same time,
just as we would require in an assembly line. The time required between moving an instruction
one step down the pipeline is a processor cycle. Because all stages proceed at the same time,
the length of a processor cycle is determined by the time required for the slowest pipe stage,
just as in an auto assembly line the longest step would determine the time between advancing
the line. In a computer, this processor cycle is usually 1 clock cycle (sometimes it is 2, rarely
more).
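
As a rough illustration of how the slowest stage sets the processor cycle, the sketch below uses hypothetical stage latencies (the numbers and the function names `processor_cycle` and `throughput_per_ns` are assumptions, not values from the text). It assumes steady state, where one instruction exits the pipeline every processor cycle.

```python
def processor_cycle(stage_latencies_ns):
    """All stages advance together, so the slowest stage sets the cycle length."""
    return max(stage_latencies_ns)


def throughput_per_ns(stage_latencies_ns):
    """In steady state, one instruction exits the pipeline per processor cycle."""
    return 1.0 / processor_cycle(stage_latencies_ns)


# Hypothetical latencies for a five-stage pipeline, in nanoseconds.
latencies = [1.0, 1.0, 2.0, 1.0, 1.0]

cycle = processor_cycle(latencies)
rate = throughput_per_ns(latencies)
print(f"cycle = {cycle} ns, throughput = {rate} instructions/ns")
```

With these assumed numbers, the single 2 ns stage forces every stage to wait, even though the other four stages need only 1 ns each.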
The pipeline designer's goal is to balance the length of each pipeline stage, just as the
designer of the assembly line tries to balance the time for each step in the process. If the stages are
perfectly balanced, then the time per instruction on the pipelined processor (assuming ideal
conditions) is equal to

$$\frac{\text{Time per instruction on unpipelined machine}}{\text{Number of pipe stages}}$$
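
A short numeric sketch of the balanced case may help. Under the ideal conditions stated above, the time per instruction on the pipelined processor is the unpipelined time per instruction divided by the number of pipe stages. The input values and the function name `ideal_pipelined_time` below are hypothetical, chosen only to show the arithmetic.

```python
def ideal_pipelined_time(unpipelined_time_ns, num_stages):
    """Ideal time per instruction when all stages are perfectly balanced."""
    return unpipelined_time_ns / num_stages


# Hypothetical values: 5 ns per instruction unpipelined, split into 5 equal stages.
unpipelined = 5.0
stages = 5

pipelined = ideal_pipelined_time(unpipelined, stages)
speedup = unpipelined / pipelined
print(f"pipelined time per instruction = {pipelined} ns, speedup = {speedup}x")
```

In this idealized setting the speedup equals the number of stages; the figure is an upper bound that holds only under the perfect-balance assumption.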