CPU chip can be made. Thus the choice comes down to having a small amount of
fast memory or a large amount of slow memory. What we would prefer is a large
amount of fast memory at a low price.
Interestingly enough, techniques are known for combining a small amount of
fast memory with a large amount of slow memory to get the speed of the fast
memory (almost) and the capacity of the large memory at a moderate price. The
small, fast memory is called a cache (from the French cacher, meaning to hide,
and pronounced "cash"). Below we will briefly describe how caches are used and
how they work. A more detailed description will be given in Chap. 4.
The basic idea behind a cache is simple: the most heavily used memory words
are kept in the cache. When the CPU needs a word, it first looks in the cache.
Only if the word is not there does it go to main memory. If a substantial fraction of
the words are in the cache, the average access time can be greatly reduced.
Success or failure thus depends on what fraction of the words are in the cache.
For years, people have known that programs do not access their memories com-
pletely at random. If a given memory reference is to address A , it is likely that the
next memory reference will be in the general vicinity of A . A simple example is
the program itself. Except for branches and procedure calls, instructions are
fetched from consecutive locations in memory. Furthermore, most program execu-
tion time is spent in loops, in which a limited number of instructions are executed
over and over. Similarly, a matrix manipulation program is likely to make many
references to the same matrix before moving on to something else.
The observation that the memory references made in any short time interval
tend to use only a small fraction of the total memory is called the locality princi-
ple and forms the basis for all caching systems. The general idea is that when a
word is referenced, it and some of its neighbors are brought from the large slow
memory into the cache, so that the next time it is used, it can be accessed quickly.
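The effect of fetching a word together with its neighbors can be seen in a toy model. The sketch below is a minimal direct-mapped cache simulator (the class name, line sizes, and address ranges are all illustrative assumptions, not a description of any real hardware): each cache line holds several consecutive memory words, so a sequential sweep through memory hits often, while random references to a large memory almost always miss.

```python
import random

class DirectMappedCache:
    """Tiny direct-mapped cache model: each line holds line_words
    consecutive memory words, so fetching one word also caches its
    neighbors (the locality principle in action)."""

    def __init__(self, num_lines=64, line_words=8):
        self.num_lines = num_lines
        self.line_words = line_words
        self.tags = [None] * num_lines   # which memory block each line holds
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        block = addr // self.line_words  # block of consecutive words
        line = block % self.num_lines    # line the block maps to
        if self.tags[line] == block:
            self.hits += 1
        else:                            # miss: whole block is fetched
            self.tags[line] = block
            self.misses += 1

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

# A loop sweeping an array in order shows strong spatial locality:
# one miss per 8-word block, then 7 hits.
seq = DirectMappedCache()
for addr in range(4096):
    seq.access(addr)
print(f"sequential hit ratio: {seq.hit_ratio():.3f}")  # → 0.875

# Random references spread over a large memory mostly miss.
rnd = DirectMappedCache()
random.seed(1)
for _ in range(4096):
    rnd.access(random.randrange(1_000_000))
print(f"random hit ratio: {rnd.hit_ratio():.3f}")
```

With 8 words per line, the sequential sweep pays one miss per block and then hits on the remaining 7 words, which is exactly the payoff the locality principle predicts.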
A common arrangement of the CPU, cache, and main memory is illustrated in
Fig. 2-16. If a word is read or written k times in a short interval, the computer will
need 1 reference to slow memory and k − 1 references to fast memory. The larger
k is, the better the overall performance.
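This payoff can be quantified with a simple mean-access-time calculation: every reference pays the cache access time, and only the misses also pay the main-memory time. The sketch below assumes the hit ratio is (k − 1)/k, as implied by the counting above; the 1 ns and 20 ns timings are purely illustrative values, not figures from the text.

```python
def mean_access_time(c, m, h):
    """Average time per reference: every access pays the cache time c;
    the fraction (1 - h) that misses also pays the memory time m."""
    return c + (1 - h) * m

k = 10                 # each word is used k times in a short interval
h = (k - 1) / k        # 1 miss plus k - 1 hits -> hit ratio 0.9
# Illustrative timings (assumed): 1 ns cache, 20 ns main memory.
print(round(mean_access_time(c=1, m=20, h=h), 3))  # → 3.0
```

With these assumed numbers, a cache that satisfies 9 out of 10 references brings the average access time from 20 ns down to about 3 ns, even though main memory itself is no faster.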
Figure 2-16. The cache is logically between the CPU and main memory. Physi-
cally, there are several possible places it could be located.