The simplest way out of this dilemma is for the read miss to wait until the write buffer is
empty. The alternative is to check the contents of the write buffer on a read miss, and if there
are no conflicts and the memory system is available, let the read miss continue. Virtually all
desktop and server processors use the latter approach, giving reads priority over writes.
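As a rough illustration, the conflict check on a read miss might look like the following C sketch; the structure, field names, and buffer depth are invented for illustration, and real hardware performs this comparison across all entries in parallel rather than in a loop:

    #include <stdbool.h>
    #include <stdint.h>

    #define WRITE_BUFFER_ENTRIES 4  /* assumed buffer depth */

    /* Hypothetical write buffer entry: a pending store awaiting memory. */
    typedef struct {
        bool     valid;
        uint64_t block_address;  /* address of the block being written */
    } WriteBufferEntry;

    /* On a read miss, scan the buffer; if any pending write targets the
       missed block, the read must wait (or forward the buffered data).
       Otherwise the read may bypass the writes and go to memory first. */
    bool read_miss_may_proceed(const WriteBufferEntry buf[WRITE_BUFFER_ENTRIES],
                               uint64_t miss_block_address)
    {
        for (int i = 0; i < WRITE_BUFFER_ENTRIES; i++) {
            if (buf[i].valid && buf[i].block_address == miss_block_address)
                return false;  /* conflict: a pending write covers this block */
        }
        return true;  /* no conflict: give the read priority over the writes */
    }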
The cost of writes by the processor in a write-back cache can also be reduced. Suppose a
read miss will replace a dirty memory block. Instead of writing the dirty block to memory, and
then reading memory, we could copy the dirty block to a buffer, then read memory, and then
write memory. This way the processor read, for which the processor is probably waiting, will
finish sooner. As in the previous situation, if a read miss occurs, the processor can either
stall until the buffer is empty or check the addresses of the words in the buffer for conflicts.
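A minimal sketch of that reordering, using a toy byte array in place of main memory and invented names:

    #include <stdint.h>
    #include <string.h>

    #define BLOCK_BYTES 64              /* assumed cache block size */
    #define MEMORY_BYTES (1 << 20)      /* toy 1 MiB main memory */

    static uint8_t memory[MEMORY_BYTES];  /* stand-in for main memory */

    /* Handle a read miss that evicts a dirty block: stash the dirty data
       in a buffer, service the processor's read first, then drain the
       buffered write to memory afterward. */
    void handle_miss_evicting_dirty(uint8_t cache_block[BLOCK_BYTES],
                                    uint64_t dirty_addr, uint64_t miss_addr)
    {
        uint8_t buffer[BLOCK_BYTES];

        memcpy(buffer, cache_block, BLOCK_BYTES);              /* 1. save dirty block   */
        memcpy(cache_block, &memory[miss_addr], BLOCK_BYTES);  /* 2. read miss first    */
        memcpy(&memory[dirty_addr], buffer, BLOCK_BYTES);      /* 3. write back later   */
    }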
Now that we have five optimizations that reduce cache miss penalties or miss rates, it is time
to look at reducing the final component of average memory access time. Hit time is critical be-
cause it can affect the clock rate of the processor; in many processors today the cache access
time limits the clock cycle time, even for processors that take multiple clock cycles to access the
cache. Hence, a fast hit time is multiplied in importance beyond the average memory access
time formula because it helps everything.
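For reference, the formula in question is

    Average memory access time = Hit time + Miss rate x Miss penalty

so a shorter hit time lowers the first term on every access, not just on misses.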
Sixth Optimization: Avoiding Address Translation During
Indexing of the Cache to Reduce Hit Time
Even a small and simple cache must cope with the translation of a virtual address from the
processor to a physical address to access memory. As described in Section B.4, processors treat
main memory as just another level of the memory hierarchy, and thus the address of the vir-
tual memory that exists on disk must be mapped onto the main memory.
The guideline of making the common case fast suggests that we use virtual addresses for the
cache, since hits are much more common than misses. Such caches are termed virtual caches,
with physical cache used to identify the traditional cache that uses physical addresses. As we
will shortly see, it is important to distinguish two tasks: indexing the cache and comparing ad-
dresses. Thus, the issues are whether a virtual or physical address is used to index the cache
and whether a virtual or physical address is used in the tag comparison. Full virtual address-
ing for both indices and tags eliminates address translation time from a cache hit. Then why
doesn't everyone build virtually addressed caches?
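Before turning to the reasons, it may help to make the two tasks concrete. The C sketch below, with invented parameters, shows how a direct-mapped cache splits an address into index and tag fields. In a physical cache both fields come from the translated address, so translation sits on the hit path; a fully virtual cache applies the same split to the untranslated virtual address, removing translation from the hit entirely.

    #include <stdint.h>

    /* Toy parameters (assumed): 64-byte blocks, 256 sets, direct mapped. */
    #define BLOCK_OFFSET_BITS 6
    #define INDEX_BITS        8

    typedef struct {
        uint64_t index;  /* selects the cache set                  */
        uint64_t tag;    /* compared against the stored block tag  */
    } CacheFields;

    /* Split an address into the fields the cache uses.  Whether addr is
       virtual or physical is exactly the design choice discussed above. */
    CacheFields split_address(uint64_t addr)
    {
        CacheFields f;
        f.index = (addr >> BLOCK_OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
        f.tag   = addr >> (BLOCK_OFFSET_BITS + INDEX_BITS);
        return f;
    }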
One reason is protection. Page-level protection is checked as part of the virtual to physical
address translation, and it must be enforced no matter what. One solution is to copy the pro-
tection information from the TLB on a miss, add a field to hold it, and check it on every access
to the virtually addressed cache.
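A rough sketch of that solution, with invented field names: the TLB's protection bits travel with the block into the cache's tag array and are rechecked on every access.

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical tag entry for a virtually addressed cache, widened with
       protection bits copied from the TLB when the block was fetched. */
    typedef struct {
        bool     valid;
        uint64_t vtag;      /* virtual tag                            */
        bool     writable;  /* page-protection bit copied from the TLB */
        bool     user_ok;   /* page accessible in user mode            */
    } VCacheTag;

    /* Check protection on every access, just as the TLB would have. */
    bool access_permitted(const VCacheTag *e, bool is_write, bool user_mode)
    {
        if (!e->valid)                return false;
        if (is_write && !e->writable) return false;  /* write to read-only page  */
        if (user_mode && !e->user_ok) return false;  /* user touch of kernel page */
        return true;
    }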
Another reason is that every time a process is switched, the virtual addresses refer to dif-
ferent physical addresses, requiring the cache to be flushed. Figure B.16 shows the impact on
miss rates of this flushing. One solution is to increase the width of the cache address tag with a
process-identifier tag (PID). If the operating system assigns these tags to processes, it need only
flush the cache when a PID is recycled; that is, the PID distinguishes whether or not the data
in the cache are for this program. Figure B.16 shows the improvement in miss rates by using
PIDs to avoid cache flushes.
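A minimal sketch of a PID-widened tag check, again with invented names and field sizes:

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical tag entry widened with a process-identifier (PID) field. */
    typedef struct {
        bool     valid;
        uint8_t  pid;   /* which process's address space this block belongs to */
        uint64_t vtag;  /* virtual tag */
    } PidTag;

    /* A hit now requires the PID to match as well, so blocks left behind by
       another process simply miss instead of returning the wrong data; the
       cache needs flushing only when the OS recycles a PID value. */
    bool is_hit(const PidTag *e, uint8_t current_pid, uint64_t vtag)
    {
        return e->valid && e->pid == current_pid && e->vtag == vtag;
    }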