Memory Hierarchy Design - Computer Architecture: A Quantitative Approach

Hardware Reference

In-Depth Information

Figures B.8 and B.9 on pages B-24 and B-25 show the relative frequency of cache misses broken

down by the three Cs. As we will see in Chapters 3 and 5 , multithreading and multiple cores

add complications for caches, both increasing the potential for capacity misses as well as

adding a fourth C, for coherency misses due to cache flushes to keep multiple caches coherent

in a multiprocessor; we will consider these issues in Chapter 5 .

Alas, miss rate can be a misleading measure for several reasons. Hence, some designers

prefer measuring misses per instruction rather than misses per memory reference (miss rate).

These two are related:

(It is often reported as misses per 1000 instructions to use integers instead of fractions.)

The problem with both measures is that they don't factor in the cost of a miss. A beter meas-

ure is the average memory access time :

where hit time is the time to hit in the cache and miss penalty is the time to replace the block

from memory (that is, the cost of a miss). Average memory access time is still an indirect meas-

ure of performance; although it is a beter measure than miss rate, it is not a substitute for

execution time. In Chapter 3 we will see that speculative processors may execute other instruc-

tions during a miss, thereby reducing the effective miss penalty. The use of multithreading

(introduced in Chapter 3 ) also allows a processor to tolerate missses without being forced to

idle. As we will examine shortly, to take advantage of such latency tolerating techniques we

need caches that can service requests while handling an outstanding miss.

If this material is new to you, or if this quick review moves too quickly, see Appendix B . It

covers the same introductory material in more depth and includes examples of caches from

real computers and quantitative evaluations of their efectiveness.

Section B.3 in Appendix B presents six basic cache optimizations, which we quickly review

here. The appendix also gives quantitative examples of the benefits of these optimizations. We

also comment briefly on the power implications of these trade-offs.1.

1. Larger block size to reduce miss rate —The simplest way to reduce the miss rate is to take ad-

vantage of spatial locality and increase the block size. Larger blocks reduce compulsory

misses, but they also increase the miss penalty. Because larger blocks lower the number

of tags, they can slightly reduce static power. Larger block sizes can also increase capacity

or conflict misses, especially in smaller caches. Choosing the right block size is a complex

trade-of that depends on the size of cache and the miss penalty.

2. Bigger caches to reduce miss rate —The obvious way to reduce capacity misses is to increase

cache capacity. Drawbacks include potentially longer hit time of the larger cache memory

and higher cost and power. Larger caches increase both static and dynamic power.

3. Higher associativity to reduce miss rate —Obviously, increasing associativity reduces conflict

misses. Greater associativity can come at the cost of increased hit time. As we will see

shortly, associativity also increases power consumption.

Search WWH ::

Custom Search

Home