Pitfall: Extensive pipelining can impact other aspects of a design, leading to overall worse cost-performance.
The best example of this phenomenon comes from two implementations of the VAX, the 8600
and the 8700. When the 8600 was initially delivered, it had a cycle time of 80 ns. Subsequently,
a redesigned version, called the 8650, with a 55 ns clock was introduced. The 8700 has a much
simpler pipeline that operates at the microinstruction level, yielding a smaller CPU with a
faster clock cycle of 45 ns. The overall outcome is that the 8650 has a CPI advantage of about
20%, but the 8700 has a clock rate that is about 20% faster. Thus, the 8700 achieves the same
performance with much less hardware.
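The claim can be checked with the usual CPU-time relation (CPU time = instruction count × CPI × clock cycle time). The following is a minimal sketch: the cycle times come from the text, while reading the "about 20%" CPI advantage as a CPI ratio of 1.2 is an assumption, and the function and constant names are ours.

```python
def relative_cpu_time(cpi_ratio, cycle_a_ns, cycle_b_ns):
    """Return time_b / time_a for the same instruction count.

    cpi_ratio is CPI_b / CPI_a; the instruction count cancels out.
    """
    return cpi_ratio * cycle_b_ns / cycle_a_ns


CYCLE_8650_NS = 55.0  # clock cycle of the 8650, from the text
CYCLE_8700_NS = 45.0  # clock cycle of the 8700, from the text

# Assumption: the 8650's "about 20%" CPI advantage is read as
# CPI_8700 = 1.2 * CPI_8650.
CPI_RATIO_8700_TO_8650 = 1.2

clock_speedup = CYCLE_8650_NS / CYCLE_8700_NS
time_ratio = relative_cpu_time(
    CPI_RATIO_8700_TO_8650, CYCLE_8650_NS, CYCLE_8700_NS
)

print(f"8700 clock rate advantage: {clock_speedup:.2f}x")
print(f"8700 CPU time / 8650 CPU time: {time_ratio:.2f}")
```

Under these assumptions the ratio of CPU times is close to 1, which is consistent with the two machines delivering about the same performance.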
Pitfall: Evaluating dynamic or static scheduling on the basis of unoptimized code.
Unoptimized code—containing redundant loads, stores, and other operations that might be
eliminated by an optimizer—is much easier to schedule than “tight” optimized code. This
holds for scheduling both control delays (with delayed branches) and delays arising from
RAW hazards. In gcc running on an R3000, which has a pipeline almost identical to that of
Section C.1, the frequency of idle clock cycles increases by 18% from the unoptimized and
scheduled code to the optimized and scheduled code. Of course, the optimized program is much
faster, since it has fewer instructions. To fairly evaluate a compile-time scheduler or runtime
dynamic scheduling, you must use optimized code, since in the real system you will derive
good performance from other optimizations in addition to scheduling.
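A toy calculation shows how the idle-cycle fraction and the total execution time can move in opposite directions. All instruction and stall counts below are hypothetical, chosen only for illustration; they are not measurements of gcc or the R3000. The helper name is ours, and it assumes an ideal CPI of 1 plus stall cycles.

```python
def pipeline_stats(instructions, stall_cycles):
    """Return (total cycles, idle fraction), assuming an ideal CPI of 1."""
    total = instructions + stall_cycles
    return total, stall_cycles / total


# Hypothetical counts, for illustration only.
# Unoptimized code: many redundant instructions the scheduler can use
# to fill delay slots, so relatively few idle cycles per instruction.
unopt_total, unopt_idle = pipeline_stats(instructions=1000, stall_cycles=100)

# Optimized code: fewer instructions, but also fewer independent ones
# available to hide delays, so more of the remaining cycles are idle.
opt_total, opt_idle = pipeline_stats(instructions=600, stall_cycles=80)

print(f"unoptimized: {unopt_total} cycles, {unopt_idle:.1%} idle")
print(f"optimized:   {opt_total} cycles, {opt_idle:.1%} idle")
```

With these made-up numbers the optimized program spends a larger share of its cycles idle yet still finishes in fewer cycles, which is why a scheduler judged only on unoptimized code looks better than it would in a real system.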
C.9 Concluding Remarks
At the beginning of the 1980s, pipelining was a technique reserved primarily for
supercomputers and large multimillion dollar mainframes. By the mid-1980s, the first pipelined
microprocessors appeared and helped transform the world of computing, allowing microprocessors
to bypass minicomputers in performance and eventually to take on and outperform
mainframes. By the early 1990s, high-end embedded microprocessors embraced pipelining, and
desktops were headed toward the use of the sophisticated dynamically scheduled,
multiple-issue approaches discussed in Chapter 3. The material in this appendix, which was considered
reasonably advanced for graduate students when this text first appeared in 1990, is now
considered basic undergraduate material and can be found in processors costing less than $2!
C.10 Historical Perspective and References
Section L.5 (available online) features a discussion on the development of pipelining and
instruction-level parallelism covering both this appendix and the material in Chapter 3. We
provide numerous references for further reading and exploration of these topics.
Updated Exercises by Diana Franklin
C.1 [15/15/15/15/25/10/15] <A.2> Use the following code fragment:
Loop:   LD      R1,0(R2)    ;load R1 from address 0+R2