cern of the cache, while main memory bandwidth is the primary concern of multiprocessors
and I/O.
Although caches benefit from low-latency memory, it is generally easier to improve
memory bandwidth with new organizations than it is to reduce latency. The popularity of
multilevel caches and their larger block sizes make main memory bandwidth important to
caches as well. In fact, cache designers increase block size to take advantage of the high
memory bandwidth.
The previous sections describe what can be done with cache organization to reduce this pro-
cessor-DRAM performance gap, but simply making caches larger or adding more levels of
caches cannot eliminate the gap. Innovations in main memory are needed as well.
In the past, the innovation was how to organize the many DRAM chips that made up the
main memory, such as multiple memory banks. Higher bandwidth is available using memory
banks, by making memory and its bus wider, or by doing both. Ironically, as capacity per
memory chip increases, there are fewer chips in the same-sized memory system, reducing pos-
sibilities for wider memory systems with the same capacity.
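As a rough sketch of the bank idea above, the following Python (the bank count and all names are hypothetical, not from the text) shows low-order interleaving, where consecutive block addresses fall in different banks and so can be serviced in parallel:

```python
# Sketch (illustrative only): low-order interleaving across memory banks.
# With NUM_BANKS banks, consecutive block addresses land in different
# banks, so independent accesses can overlap in time.

NUM_BANKS = 4  # hypothetical bank count

def bank_of(block_address: int) -> int:
    """Bank holding a given memory block under low-order interleaving."""
    return block_address % NUM_BANKS

def offset_in_bank(block_address: int) -> int:
    """Row within that bank."""
    return block_address // NUM_BANKS

# Eight consecutive blocks cycle through the four banks:
mapping = [bank_of(a) for a in range(8)]  # 0,1,2,3 then 0,1,2,3 again
```

Widening the memory bus is the complementary option: instead of overlapping accesses across banks, each access simply returns more bytes per cycle.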
To allow memory systems to keep up with the bandwidth demands of modern processors,
memory innovations started happening inside the DRAM chips themselves. This section de-
scribes the technology inside the memory chips and those innovative, internal organizations.
Before describing the technologies and options, let's go over the performance metrics.
With the introduction of burst transfer memories, now widely used in both Flash and
DRAM, memory latency is quoted using two measures: access time and cycle time. Access
time is the time between when a read is requested and when the desired word arrives, and
cycle time is the minimum time between unrelated requests to memory.
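The two metrics can be illustrated with a toy model (the numbers below are hypothetical, not from the text): for a stream of unrelated reads, the first word arrives after the access time, while the cycle time sets the minimum spacing between successive requests.

```python
# Illustrative sketch: access time vs. cycle time for unrelated reads.
# access_ns = delay until the requested word arrives;
# cycle_ns  = minimum spacing between unrelated requests.

def stream_time_ns(n_requests: int, access_ns: float, cycle_ns: float) -> float:
    """Time to complete n back-to-back unrelated reads: each request may
    start only cycle_ns after the previous one, and the last word arrives
    access_ns after the last request starts."""
    if n_requests == 0:
        return 0.0
    return (n_requests - 1) * cycle_ns + access_ns

# Hypothetical DRAM-like numbers where cycle time exceeds access time:
t = stream_time_ns(4, access_ns=30.0, cycle_ns=50.0)  # 3*50 + 30 = 180 ns
```

When cycle time exceeds access time, as in classic DRAM, throughput is set by the cycle time even though an individual word arrives sooner.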
Virtually all computers since 1975 have used DRAMs for main memory and SRAMs for
cache, with one to three levels integrated onto the processor chip with the CPU. In PMDs,
the memory technology often balances power and speed, with higher-end systems using fast,
high-bandwidth memory technology.
SRAM Technology
The first letter of SRAM stands for static. The dynamic nature of the circuits in DRAM requires
data to be written back after being read, hence the difference between the access time and the
cycle time as well as the need to refresh. SRAMs don't need to refresh, so the access time is
very close to the cycle time. SRAMs typically use six transistors per bit to prevent the inform-
ation from being disturbed when read. SRAM needs only minimal power to retain the charge
in standby mode.
In earlier times, most desktop and server systems used SRAM chips for their primary, sec-
ondary, or tertiary caches; today, all three levels of caches are integrated onto the processor
chip. Currently, the largest on-chip, third-level caches are 12 MB, while the memory system
for such a processor is likely to have 4 to 16 GB of DRAM. The access times for large, third-
level, on-chip caches are typically two to four times that of a second-level cache, which is still
three to five times faster than accessing DRAM memory.
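Plugging in hypothetical latencies consistent with those ratios (assumed for illustration, not taken from the text), a simple average-memory-access-time calculation shows why the L3 miss rate matters so much:

```python
# Illustrative sketch with assumed numbers: L2 at 4 ns, L3 at 4x that
# (16 ns), DRAM at 5x L3 (80 ns), per the ratios quoted above.

def amat_ns(hit_time_ns: float, miss_rate: float, miss_penalty_ns: float) -> float:
    """Average memory access time at one cache level: hit time plus the
    miss penalty weighted by the miss rate."""
    return hit_time_ns + miss_rate * miss_penalty_ns

# For accesses reaching L3, with an assumed 20% L3 miss rate:
t = amat_ns(hit_time_ns=16.0, miss_rate=0.2, miss_penalty_ns=80.0)
# 16 + 0.2 * 80 = 32 ns on average
```

Even a modest L3 miss rate doubles the average latency here, which is why the innovations inside DRAM chips described next matter to overall performance.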
DRAM Technology
As early DRAMs grew in capacity, the cost of a package with all the necessary address lines
was an issue. The solution was to multiplex the address lines, thereby cutting the number of
address pins in half. Figure 2.12 shows the basic DRAM organization. One-half of the address
is sent first during the row access strobe (RAS). The other half of the address, sent during the