Hardware Reference
In-Depth Information
a. [12] <2.2> How many bytes wide should each write buffer entry be?
b. [15] <2.2> What speedup could be expected in the steady state by using a merging
write buffer instead of a nonmerging buffer when zeroing memory by the execution
of 64-bit stores if all other instructions could be issued in parallel with the stores and
the blocks are present in the L2 cache?
c. [15] <2.2> What would the effect of possible L1 misses be on the number of required
write buffer entries for systems with blocking and nonblocking caches?
2.13 [10/10/10] <2.3> Consider a desktop system with a processor connected to a 2 GB DRAM
with error-correcting code (ECC). Assume that there is only one memory channel of width 72
bits: 64 bits for data and 8 bits for ECC.
a. [10] <2.3> How many DRAM chips are on the DIMM if 1 GB DRAM chips are used,
and how many data I/Os must each DRAM have if only one DRAM connects to each
DIMM data pin?
b. [10] <2.3> What burst length is required to support 32 B L2 cache blocks?
c. [10] <2.3> Calculate the peak bandwidth for DDR2-667 and DDR2-533 DIMMs for
reads from an active page excluding the ECC overhead.
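The arithmetic behind parts (b) and (c) can be sketched as follows. This is only an illustrative setup, not a worked solution from the text: the function names are hypothetical, the data width is taken as the 64 non-ECC bits of the channel, and the DDR2 speed grade is read as the nominal transfer rate in millions of transfers per second (so DDR2-667 is treated as exactly 667 MT/s).

```python
# Sketch of the burst-length and peak-bandwidth arithmetic for exercise 2.13.
# Assumptions (not stated as code in the source): 64 data bits per transfer,
# ECC bits excluded, and the speed grade used directly as MT/s.

def burst_length(block_bytes: int, data_width_bits: int = 64) -> int:
    """Number of transfers needed to move one cache block over the data pins."""
    bytes_per_transfer = data_width_bits // 8
    return block_bytes // bytes_per_transfer


def peak_bandwidth_mb_per_s(mega_transfers_per_s: int,
                            data_width_bits: int = 64) -> int:
    """Peak data bandwidth in MB/s, counting only the data bits (no ECC)."""
    bytes_per_transfer = data_width_bits // 8
    return mega_transfers_per_s * bytes_per_transfer


if __name__ == "__main__":
    print("burst length for a 32 B block:", burst_length(32))
    for grade in (667, 533):
        print(f"DDR2-{grade}: {peak_bandwidth_mb_per_s(grade)} MB/s")
```

The peak figure assumes back-to-back bursts from an already open page, so no activate or precharge time is charged against it.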
2.14 [10/10] <2.3> A sample DDR2 SDRAM timing diagram is shown in Figure 2.31. tRCD is
the time required to activate a row in a bank, and column address strobe (CAS) latency
(CL) is the number of cycles required to read out a column in a row. Assume that the RAM
is on a standard DDR2 DIMM with ECC, having 72 data lines. Also assume burst lengths
of 8, which read out 8 bits per data line, or a total of 64 B from the DIMM. Assume tRCD = CAS (or CL) *
clock_frequency, and clock_frequency = transfers_per_second/2. The on-chip latency on a cache
miss through levels 1 and 2 and back, not including the DRAM access, is 20 ns.
a. [10] <2.3> How much time is required from presentation of the activate command until
the last requested bit of data from the DRAM transitions from valid to invalid for
the DDR2-667 1 GB CL = 5 DIMM? Assume that for every request we automatically
prefetch another adjacent cacheline in the same page.
b. [10] <2.3> What is the relative latency when using the DDR2-667 DIMM of a read
requiring a bank activate versus one to an already open page, including the time
required to process the miss inside the processor?
FIGURE 2.31 DDR2 SDRAM timing diagram.
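One way to set up the timing arithmetic for exercise 2.14 is sketched below. It rests on several interpretations that the exercise does not spell out, so treat it as a starting point rather than a reference answer: tRCD is taken to last CL clock cycles (that is, CL divided by the clock frequency when expressed as a time), each clock carries two transfers, the automatic prefetch of the adjacent cache line is modeled as a second back-to-back burst, and the open-page case simply omits tRCD. All function and parameter names are hypothetical.

```python
# Sketch of the DDR2 latency arithmetic for exercise 2.14.
# Assumptions: tRCD lasts CL clock cycles, two transfers per clock,
# and the prefetched adjacent line is a second back-to-back burst.

def cycle_time_ns(mega_transfers_per_s: float) -> float:
    """Clock period in ns, with clock frequency = transfer rate / 2."""
    clock_mhz = mega_transfers_per_s / 2
    return 1000.0 / clock_mhz


def dram_access_ns(mega_transfers_per_s: float, cl_cycles: int,
                   burst_transfers: int = 8, lines_fetched: int = 2,
                   needs_activate: bool = True) -> float:
    """Time from the first DRAM command to the end of the last data transfer."""
    cycle = cycle_time_ns(mega_transfers_per_s)
    activate_cycles = cl_cycles if needs_activate else 0  # tRCD, assumed = CL
    data_cycles = lines_fetched * burst_transfers / 2     # 2 transfers per clock
    return (activate_cycles + cl_cycles + data_cycles) * cycle


def relative_latency(mega_transfers_per_s: float, cl_cycles: int,
                     on_chip_ns: float = 20.0, **kwargs) -> float:
    """Latency with a bank activate divided by latency to an open page."""
    closed = on_chip_ns + dram_access_ns(mega_transfers_per_s, cl_cycles,
                                         needs_activate=True, **kwargs)
    opened = on_chip_ns + dram_access_ns(mega_transfers_per_s, cl_cycles,
                                         needs_activate=False, **kwargs)
    return closed / opened


if __name__ == "__main__":
    print("activate to last bit (ns):", dram_access_ns(667, 5))
    print("activate vs open page    :", relative_latency(667, 5))
```

Passing `lines_fetched=1` to either function drops the prefetch, which is useful for checking how much the second burst contributes to the ratio in part (b).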
2.15 [15] <2.3> Assume that a DDR2-667 2 GB DIMM with CL = 5 is available for $130 and a
DDR2-533 2 GB DIMM with CL = 4 is available for $100. Assume that two DIMMs are used
in a system, and the rest of the system costs $800. Consider the performance of the system
 