cern of the cache, while main memory bandwidth is the primary concern of multiprocessors
and I/O.
Although caches benefit from low-latency memory, it is generally easier to improve
memory bandwidth with new organizations than it is to reduce latency. The popularity of
multilevel caches and their larger block sizes make main memory bandwidth important to
caches as well. In fact, cache designers increase block size to take advantage of the high
memory bandwidth.
The previous sections describe what can be done with cache organization to reduce this pro-
cessor-DRAM performance gap, but simply making caches larger or adding more levels of
caches cannot eliminate the gap. Innovations in main memory are needed as well.
In the past, the innovation was how to organize the many DRAM chips that made up the
main memory, such as multiple memory banks. Higher bandwidth is available using memory
banks, by making memory and its bus wider, or by doing both. Ironically, as capacity per
memory chip increases, there are fewer chips in the same-sized memory system, reducing pos-
sibilities for wider memory systems with the same capacity.
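As a rough sketch of the bank idea above, the following Python (the bank count and all names are hypothetical, not from the text) shows low-order interleaving, where consecutive block addresses fall in different banks and so can be serviced in parallel:

```python
# Sketch (illustrative only): low-order interleaving across memory banks.
# With NUM_BANKS banks, consecutive block addresses land in different
# banks, so independent accesses can overlap in time.

NUM_BANKS = 4  # hypothetical bank count

def bank_of(block_address: int) -> int:
    """Bank holding a given memory block under low-order interleaving."""
    return block_address % NUM_BANKS

def offset_in_bank(block_address: int) -> int:
    """Row within that bank."""
    return block_address // NUM_BANKS

# Eight consecutive blocks cycle through the four banks:
mapping = [bank_of(a) for a in range(8)]  # 0,1,2,3 then 0,1,2,3 again
```

Widening the memory bus is the complementary option: instead of overlapping accesses across banks, each access simply returns more bytes per cycle.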
To allow memory systems to keep up with the bandwidth demands of modern processors,
memory innovations started happening inside the DRAM chips themselves. This section de-
scribes the technology inside the memory chips and those innovative, internal organizations.
Before describing the technologies and options, let's go over the performance metrics.
With the introduction of burst transfer memories, now widely used in both Flash and
DRAM, memory latency is quoted using two measures: access time and cycle time. Access
time is the time between when a read is requested and when the desired word arrives, and
cycle time is the minimum time between unrelated requests to memory.
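The two metrics can be illustrated with a toy model (the numbers below are hypothetical, not from the text): for a stream of unrelated reads, the first word arrives after the access time, while the cycle time sets the minimum spacing between successive requests.

```python
# Illustrative sketch: access time vs. cycle time for unrelated reads.
# access_ns = delay until the requested word arrives;
# cycle_ns  = minimum spacing between unrelated requests.

def stream_time_ns(n_requests: int, access_ns: float, cycle_ns: float) -> float:
    """Time to complete n back-to-back unrelated reads: each request may
    start only cycle_ns after the previous one, and the last word arrives
    access_ns after the last request starts."""
    if n_requests == 0:
        return 0.0
    return (n_requests - 1) * cycle_ns + access_ns

# Hypothetical DRAM-like numbers where cycle time exceeds access time:
t = stream_time_ns(4, access_ns=30.0, cycle_ns=50.0)  # 3*50 + 30 = 180 ns
```

When cycle time exceeds access time, as in classic DRAM, throughput is set by the cycle time even though an individual word arrives sooner.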
Virtually all computers since 1975 have used DRAMs for main memory and SRAMs for
cache, with one to three levels integrated onto the processor chip with the CPU. In PMDs,
the memory technology often balances power and speed, with higher-end systems using fast,
high-bandwidth memory technology.
SRAM Technology
The first letter of SRAM stands for static. The dynamic nature of the circuits in DRAM requires
data to be written back after being read, hence the difference between the access time and the
cycle time as well as the need to refresh. SRAMs don't need to refresh, so the access time is
very close to the cycle time. SRAMs typically use six transistors per bit to prevent the inform-
ation from being disturbed when read. SRAM needs only minimal power to retain the charge
in standby mode.
In earlier times, most desktop and server systems used SRAM chips for their primary, sec-
ondary, or tertiary caches; today, all three levels of caches are integrated onto the processor
chip. Currently, the largest on-chip, third-level caches are 12 MB, while the memory system
for such a processor is likely to have 4 to 16 GB of DRAM. The access times for large, third-
level, on-chip caches are typically two to four times that of a second-level cache, which is still
three to five times faster than accessing DRAM memory.
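Plugging in hypothetical latencies consistent with those ratios (assumed for illustration, not taken from the text), a simple average-memory-access-time calculation shows why the L3 miss rate matters so much:

```python
# Illustrative sketch with assumed numbers: L2 at 4 ns, L3 at 4x that
# (16 ns), DRAM at 5x L3 (80 ns), per the ratios quoted above.

def amat_ns(hit_time_ns: float, miss_rate: float, miss_penalty_ns: float) -> float:
    """Average memory access time at one cache level: hit time plus the
    miss penalty weighted by the miss rate."""
    return hit_time_ns + miss_rate * miss_penalty_ns

# For accesses reaching L3, with an assumed 20% L3 miss rate:
t = amat_ns(hit_time_ns=16.0, miss_rate=0.2, miss_penalty_ns=80.0)
# 16 + 0.2 * 80 = 32 ns on average
```

Even a modest L3 miss rate doubles the average latency here, which is why the innovations inside DRAM chips described next matter to overall performance.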
DRAM Technology
As early DRAMs grew in capacity, the cost of a package with all the necessary address lines
was an issue. The solution was to multiplex the address lines, thereby cutting the number of
address pins in half. Figure 2.12 shows the basic DRAM organization. One-half of the address
is sent first during the row access strobe (RAS). The other half of the address, sent during the