COMPUTER SYSTEMS ORGANIZATION - Structured Computer Organization

Getting back to our pipeline of Fig. 2-4, suppose that the cycle time of this machine is 2 nsec. Then it takes 10 nsec for an instruction to progress all the way through the five-stage pipeline. At first glance, with an instruction taking 10 nsec, it might appear that the machine can run at 100 MIPS, but in fact it does much better than this. At every clock cycle (2 nsec), one new instruction is completed, so the actual rate of processing is 500 MIPS, not 100 MIPS.
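The gap between 100 MIPS and 500 MIPS can be checked with a small calculation. The following is a minimal sketch, assuming an ideal pipeline that never stalls; the function name `pipeline_stats` and its parameters are hypothetical and do not come from the text.

```python
def pipeline_stats(cycle_ns, stages, instructions):
    """Return (latency_ns, total_ns, mips) for an ideal pipeline with no stalls."""
    # One instruction needs `stages` cycles to travel through the whole pipeline.
    latency_ns = stages * cycle_ns
    # The first instruction completes after `stages` cycles; every later
    # instruction completes exactly one cycle after its predecessor.
    total_ns = (stages + instructions - 1) * cycle_ns
    # instructions per nsec, times 1000, gives millions of instructions per second.
    mips = instructions * 1000 / total_ns
    return latency_ns, total_ns, mips


if __name__ == "__main__":
    for count in (1, 10, 1_000_000):
        latency, total, mips = pipeline_stats(cycle_ns=2, stages=5, instructions=count)
        print(f"{count:>9} instructions: {total} nsec total, {mips:.3f} MIPS")
```

With a single instruction the rate is only 100 MIPS, because the pipeline is mostly empty. As the instruction count grows, the fill time of the first instruction becomes negligible and the rate approaches 500 MIPS.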

Pipelining allows a trade-off between latency (how long it takes to execute an instruction) and processor bandwidth (how many MIPS the CPU has). With a cycle time of T nsec and n stages in the pipeline, the latency is nT nsec because each instruction passes through n stages, each of which takes T nsec.

Since one instruction completes every clock cycle and there are $10^9/T$ clock cycles/second, the number of instructions executed per second is $10^9/T$. For example, if $T = 2$ nsec, 500 million instructions are executed each second. To get the number of MIPS, we have to divide the instruction execution rate by 1 million to get $(10^9/T)/10^6 = 1000/T$ MIPS. Theoretically, we could measure instruction execution rate in BIPS instead of MIPS, but nobody does that, so we will not either.
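The two formulas above, latency of nT nsec and bandwidth of 1000/T MIPS, can be written as a pair of small functions. This is only an illustrative sketch; the names `latency_ns` and `bandwidth_mips` are made up for the example, and the loop values other than T = 2 are arbitrary.

```python
def latency_ns(stages, cycle_ns):
    """Latency of one instruction: n stages, each taking T nsec."""
    return stages * cycle_ns


def bandwidth_mips(cycle_ns):
    """Steady-state rate: (10^9 / T) instructions/sec divided by 10^6."""
    return 1000 / cycle_ns


if __name__ == "__main__":
    # Five-stage pipeline, as in the running example, at several cycle times.
    for cycle_ns in (4, 2, 1):
        print(f"T = {cycle_ns} nsec: "
              f"latency = {latency_ns(5, cycle_ns)} nsec, "
              f"bandwidth = {bandwidth_mips(cycle_ns)} MIPS")
```

Note that the bandwidth depends only on T, while the latency depends on both n and T.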

Superscalar Architectures

If one pipeline is good, then surely two pipelines are better. One possible design for a dual pipeline CPU, based on Fig. 2-4, is shown in Fig. 2-5. Here a single instruction fetch unit fetches pairs of instructions together and puts each one into its own pipeline, complete with its own ALU for parallel operation. To be able to run in parallel, the two instructions must not conflict over resource usage (e.g., registers), and neither must depend on the result of the other. As with a single pipeline, either the compiler must guarantee this situation to hold (i.e., the hardware does not check and gives incorrect results if the instructions are not compatible), or conflicts must be detected and eliminated during execution using extra hardware.
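The pairing condition described above can be sketched as a simple check on register usage. This is a hedged illustration, not the actual logic of any real compiler or CPU: the `Instr` record, its fields, and `can_issue_together` are hypothetical, and only register conflicts are modeled (other shared resources are ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: one destination, several sources."""
    dest: str      # register written by the instruction
    srcs: tuple    # registers read by the instruction


def can_issue_together(first, second):
    """Return True if the pair may run in the two pipelines in parallel."""
    # The second instruction needs the result of the first.
    if first.dest in second.srcs:
        return False
    # The second instruction overwrites a register the first still has to read.
    if second.dest in first.srcs:
        return False
    # Both instructions write the same register (a resource conflict).
    if first.dest == second.dest:
        return False
    return True


if __name__ == "__main__":
    add = Instr(dest="r1", srcs=("r2", "r3"))
    use = Instr(dest="r4", srcs=("r1", "r5"))   # reads r1, produced by add
    other = Instr(dest="r6", srcs=("r7", "r8"))
    print(can_issue_together(add, use))
    print(can_issue_together(add, other))
```

A compiler that guarantees compatibility would run a check like this when ordering instructions; hardware that detects conflicts would perform the equivalent comparison at run time.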

[Figure: a single instruction fetch unit (S1) feeds two parallel pipelines, each consisting of an instruction decode unit (S2), an operand fetch unit (S3), an instruction execution unit (S4), and a write back unit (S5).]

Figure 2-5. Dual five-stage pipelines with a common instruction fetch unit.

Although pipelines, single or double, were originally used on RISC machines

(the 386 and its predecessors did not have any), starting with the 486 Intel began
