CPU chip can be made. Thus the choice comes down to having a small amount of
fast memory or a large amount of slow memory. What we would prefer is a large
amount of fast memory at a low price.
Interestingly enough, techniques are known for combining a small amount of
fast memory with a large amount of slow memory to get the speed of the fast
memory (almost) and the capacity of the large memory at a moderate price. The
small, fast memory is called a cache (from the French cacher, meaning to hide,
and pronounced "cash"). Below we will briefly describe how caches are used and
how they work. A more detailed description will be given in Chap. 4.
The basic idea behind a cache is simple: the most heavily used memory words
are kept in the cache. When the CPU needs a word, it first looks in the cache.
Only if the word is not there does it go to main memory. If a substantial fraction of
the words are in the cache, the average access time can be greatly reduced.
Success or failure thus depends on what fraction of the words are in the cache.
For years, people have known that programs do not access their memories com-
pletely at random. If a given memory reference is to address A , it is likely that the
next memory reference will be in the general vicinity of A . A simple example is
the program itself. Except for branches and procedure calls, instructions are
fetched from consecutive locations in memory. Furthermore, most program execu-
tion time is spent in loops, in which a limited number of instructions are executed
over and over. Similarly, a matrix manipulation program is likely to make many
references to the same matrix before moving on to something else.
The observation that the memory references made in any short time interval
tend to use only a small fraction of the total memory is called the locality princi-
ple and forms the basis for all caching systems. The general idea is that when a
word is referenced, it and some of its neighbors are brought from the large slow
memory into the cache, so that the next time it is used, it can be accessed quickly.
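The effect of fetching a word together with its neighbors can be seen in a toy model. The sketch below is a minimal direct-mapped cache simulator (the class name, line sizes, and address ranges are all illustrative assumptions, not a description of any real hardware): each cache line holds several consecutive memory words, so a sequential sweep through memory hits often, while random references to a large memory almost always miss.

```python
import random

class DirectMappedCache:
    """Tiny direct-mapped cache model: each line holds line_words
    consecutive memory words, so fetching one word also caches its
    neighbors (the locality principle in action)."""

    def __init__(self, num_lines=64, line_words=8):
        self.num_lines = num_lines
        self.line_words = line_words
        self.tags = [None] * num_lines   # which memory block each line holds
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        block = addr // self.line_words  # block of consecutive words
        line = block % self.num_lines    # line the block maps to
        if self.tags[line] == block:
            self.hits += 1
        else:                            # miss: whole block is fetched
            self.tags[line] = block
            self.misses += 1

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

# A loop sweeping an array in order shows strong spatial locality:
# one miss per 8-word block, then 7 hits.
seq = DirectMappedCache()
for addr in range(4096):
    seq.access(addr)
print(f"sequential hit ratio: {seq.hit_ratio():.3f}")  # → 0.875

# Random references spread over a large memory mostly miss.
rnd = DirectMappedCache()
random.seed(1)
for _ in range(4096):
    rnd.access(random.randrange(1_000_000))
print(f"random hit ratio: {rnd.hit_ratio():.3f}")
```

With 8 words per line, the sequential sweep pays one miss per block and then hits on the remaining 7 words, which is exactly the payoff the locality principle predicts.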
A common arrangement of the CPU, cache, and main memory is illustrated in
Fig. 2-16. If a word is read or written k times in a short interval, the computer will
need 1 reference to slow memory and k − 1 references to fast memory. The larger
k is, the better the overall performance.
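This payoff can be quantified with a simple mean-access-time calculation: every reference pays the cache access time, and only the misses also pay the main-memory time. The sketch below assumes the hit ratio is (k − 1)/k, as implied by the counting above; the 1 ns and 20 ns timings are purely illustrative values, not figures from the text.

```python
def mean_access_time(c, m, h):
    """Average time per reference: every access pays the cache time c;
    the fraction (1 - h) that misses also pays the memory time m."""
    return c + (1 - h) * m

k = 10                 # each word is used k times in a short interval
h = (k - 1) / k        # 1 miss plus k - 1 hits -> hit ratio 0.9
# Illustrative timings (assumed): 1 ns cache, 20 ns main memory.
print(round(mean_access_time(c=1, m=20, h=h), 3))  # → 3.0
```

With these assumed numbers, a cache that satisfies 9 out of 10 references brings the average access time from 20 ns down to about 3 ns, even though main memory itself is no faster.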
Figure 2-16. The cache is logically between the CPU and main memory. Physi-
cally, there are several possible places it could be located.