Snooping Caches
While the performance arguments given above are certainly true, we have glossed a bit too quickly over a fundamental problem. Suppose that memory is sequentially consistent. What happens if CPU 1 has a line in its cache, and then CPU 2 tries to read a word in the same cache line? In the absence of any special rules, it, too, would get a copy in its cache. In principle, having the same line cached twice is acceptable. Now suppose that CPU 1 modifies the line and then, immediately thereafter, CPU 2 reads its copy of the line from its cache. It will get stale data, thus violating the contract between the software and memory. The program running on CPU 2 will not be happy.
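To make the scenario concrete, the toy simulation below (a sketch only, with made-up names such as cpu_read and cpu_write; it models software state, not real hardware) gives each CPU a private one-word cache and applies no coherence rules at all. After CPU 1 modifies its cached copy, CPU 2 still reads the old value.

    #include <stdio.h>

    /* Toy model: one shared memory word and two private caches with no
     * coherence rules at all.  All names here are illustrative only.     */
    struct cache { int valid; int value; };

    static int memory_word = 7;                 /* the word both CPUs touch      */
    static struct cache cache[2];               /* cache[0] = CPU 1, [1] = CPU 2 */

    static int cpu_read(int cpu)
    {
        if (!cache[cpu].valid) {                /* read miss: fetch from memory  */
            cache[cpu].value = memory_word;
            cache[cpu].valid = 1;
        }
        return cache[cpu].value;                /* read hit: use cached copy     */
    }

    static void cpu_write(int cpu, int v)
    {
        cache[cpu].value = v;                   /* only the local copy changes;  */
        cache[cpu].valid = 1;                   /* memory and the other cache    */
    }                                           /* never hear about the write    */

    int main(void)
    {
        cpu_read(0);                            /* CPU 1 caches the line         */
        cpu_read(1);                            /* CPU 2 caches the same line    */
        cpu_write(0, 42);                       /* CPU 1 modifies its copy       */
        printf("CPU 2 sees %d\n", cpu_read(1)); /* prints 7: stale data          */
        return 0;
    }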
This problem, known in the literature as the cache coherence or cache consistency problem, is extremely serious. Without a solution, caching cannot be used, and bus-oriented multiprocessors would be limited to two or three CPUs. As a consequence of its importance, many solutions have been proposed over the years (e.g., Goodman, 1983, and Papamarcos and Patel, 1984). Although all these caching algorithms, called cache coherence protocols, differ in the details, all of them prevent different versions of the same cache line from appearing simultaneously in two or more caches.
In all solutions, the cache controller is specially designed to allow it to eavesdrop on the bus, monitoring all bus requests from other CPUs and caches and taking action in certain cases. These devices are called snooping caches or sometimes snoopy caches because they "snoop" on the bus. The set of rules implemented by the caches, CPUs, and memory for preventing different versions of the data from appearing in multiple caches forms the cache coherence protocol. The unit of transfer and storage for a cache is called a cache line and is typically 32 or 64 bytes.
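As a quick illustration of line granularity (assuming 32-byte lines here; 64-byte lines work the same way), the line that holds a given byte address is found by integer division, and the position within the line by the remainder:

    #include <stdio.h>

    #define LINE_SIZE 32u        /* assumed line size in bytes; 64 is equally common */

    int main(void)
    {
        unsigned addr = 100;     /* an arbitrary byte address */
        printf("address %u -> line %u, offset %u\n",
               addr, addr / LINE_SIZE, addr % LINE_SIZE);   /* line 3, offset 4 */
        return 0;
    }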
The simplest cache coherence protocol is called write through. It can best be understood by distinguishing the four cases shown in Fig. 8-27. When a CPU tries to read a word that is not in its cache (i.e., a read miss), its cache controller loads the line containing that word into the cache. The line is supplied by the memory, which in this protocol is always up to date. Subsequent reads (i.e., read hits) can be satisfied out of the cache.
    Action        Local request                 Remote request
    Read miss     Fetch data from memory
    Read hit      Use data from local cache
    Write miss    Update data in memory
    Write hit     Update cache and memory       Invalidate cache entry

Figure 8-27. The write-through cache coherence protocol. The empty boxes indicate that no action is taken.
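The four cases of Fig. 8-27 translate directly into a small state machine. The sketch below (an illustrative simulation under the same toy assumptions as before; names such as snoop_write are made up) handles each local request as the figure prescribes and lets every other cache snoop the bus, invalidating its copy when it sees a remote write hit. Rerunning the earlier stale-data scenario now gives CPU 2 the fresh value, because its invalidated copy forces a read miss that is satisfied from the up-to-date memory.

    #include <stdio.h>

    #define NCPU 2

    /* Toy write-through model: one shared memory word and one-word caches
     * that snoop writes on a shared "bus".  All names are illustrative.   */
    struct cache { int valid; int value; };

    static int memory_word = 7;
    static struct cache cache[NCPU];

    /* Remote request: another CPU's write is seen on the bus.  On a write
     * hit in a snooping cache, that cache invalidates its entry.          */
    static void snoop_write(int writer)
    {
        for (int cpu = 0; cpu < NCPU; cpu++)
            if (cpu != writer && cache[cpu].valid)
                cache[cpu].valid = 0;           /* invalidate cache entry       */
    }

    static int cpu_read(int cpu)
    {
        if (!cache[cpu].valid) {                /* read miss: fetch from memory */
            cache[cpu].value = memory_word;
            cache[cpu].valid = 1;
        }
        return cache[cpu].value;                /* read hit: use local cache    */
    }

    static void cpu_write(int cpu, int v)
    {
        memory_word = v;                        /* write through: memory is always updated  */
        if (cache[cpu].valid)
            cache[cpu].value = v;               /* write hit: update cache and memory       */
        snoop_write(cpu);                       /* write miss: memory only; others snoop    */
    }

    int main(void)
    {
        cpu_read(0);                            /* CPU 1 caches the line           */
        cpu_read(1);                            /* CPU 2 caches the same line      */
        cpu_write(0, 42);                       /* CPU 1 writes; CPU 2 invalidated */
        printf("CPU 2 sees %d\n", cpu_read(1)); /* read miss -> prints 42          */
        return 0;
    }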