each cache size.
a. [12] <2.6> What is the system page size?
b. [15] <2.6> How many entries are there in the translation lookaside buffer (TLB)?
c. [15] <2.6> What is the miss penalty for the TLB?
d. [20] <2.6> What is the associativity of the TLB?
2.6 [20/20] <2.6> In multiprocessor memory systems, the lower levels of the memory hierarchy
may not be saturable by a single processor, but they should be saturable by multiple
processors working together. Modify the code in Figure 2.29 and run multiple copies at the
same time (one way to do this is sketched after part (b) below). Can you determine:
a. [20] <2.6> How many actual processors are in your computer system and how many
system processors are just additional multithreaded contexts?
b. [20] <2.6> How many memory controllers does your system have?
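One simple way to run several copies at once is to fork one child process per copy. The listing below is a minimal sketch of that idea, not the code of Figure 2.29 itself: the stride-walk loop stands in for the timing code of that figure, and NCOPIES, ARRAY_BYTES, and STRIDE are illustrative values you would vary on your own machine.

/* Minimal sketch: fork NCOPIES processes, each walking a large array, so that
 * lower levels of the memory hierarchy see concurrent demand.  The walk loop
 * is only a stand-in for the timing code of Figure 2.29. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define NCOPIES     4                        /* illustrative: number of concurrent copies      */
#define ARRAY_BYTES (256UL * 1024 * 1024)    /* illustrative: larger than the last-level cache */
#define STRIDE      64                       /* one access per block, assuming 64-byte blocks  */

static void walk_memory(void)
{
    char *buf = malloc(ARRAY_BYTES);
    volatile unsigned long sum = 0;          /* volatile so the reads are not optimized away */
    if (!buf) { perror("malloc"); exit(1); }
    memset(buf, 1, ARRAY_BYTES);             /* touch every page so memory is really allocated */
    for (int pass = 0; pass < 16; pass++)
        for (unsigned long i = 0; i < ARRAY_BYTES; i += STRIDE)
            sum += buf[i];                   /* roughly one miss per block once the array exceeds the cache */
    free(buf);
    printf("pid %d done (checksum %lu)\n", (int)getpid(), (unsigned long)sum);
}

int main(void)
{
    for (int c = 0; c < NCOPIES; c++) {
        pid_t pid = fork();
        if (pid < 0) { perror("fork"); return 1; }
        if (pid == 0) { walk_memory(); return 0; }   /* each child runs one copy */
    }
    for (int c = 0; c < NCOPIES; c++)
        wait(NULL);                                  /* parent waits for all copies */
    return 0;
}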
2.7 [20] <2.6> Can you think of a way to test some of the characteristics of an instruction cache
using a program? Hint: The compiler may generate a large number of nonobvious instructions
from a piece of code. Try to use simple arithmetic instructions of known length in your
instruction set architecture (ISA).
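One way to act on the hint is to let the preprocessor replicate a simple arithmetic statement a known number of times, producing functions whose code footprint is roughly known, and then compare the time per executed add for a body that fits in the instruction cache against one that does not. The listing below is a minimal sketch of this approach; the repetition counts, call counts, and the assumption that each add compiles to a fixed-length instruction sequence are illustrative, and the code should be compiled with optimization disabled so the adds are not folded away.

/* Minimal sketch: build functions whose straight-line code size is roughly
 * known (N copies of one add), then time repeated calls.  When the code
 * footprint exceeds the instruction cache, the time per add rises.
 * Compile without optimization (e.g., -O0) so the adds are not folded. */
#include <stdio.h>
#include <time.h>

#define REP10(x)   x x x x x x x x x x
#define REP100(x)  REP10(REP10(x))
#define REP1000(x) REP10(REP100(x))

#define ADD s += 1;          /* one simple arithmetic statement of (roughly) known length */

static long body_small(long s)  { REP100(ADD)  return s; }    /* ~hundreds of bytes of code */
static long body_large(long s)  { REP1000(ADD) REP1000(ADD) REP1000(ADD) REP1000(ADD)
                                  REP1000(ADD) REP1000(ADD) REP1000(ADD) REP1000(ADD)
                                  return s; }                  /* ~tens of KB of code        */

static double time_calls(long (*f)(long), int calls)
{
    struct timespec t0, t1;
    long s = 0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < calls; i++) s = f(s);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("(checksum %ld)\n", s);           /* keep the calls from being discarded */
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
    /* Report time per executed add so the two bodies are directly comparable. */
    printf("small body (~100 adds):  %.2f ns/add\n",
           1e9 * time_calls(body_small, 200000) / (200000.0 * 100));
    printf("large body (~8000 adds): %.2f ns/add\n",
           1e9 * time_calls(body_large, 2000)   / (2000.0 * 8000));
    return 0;
}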
Exercises
2.8 [12/12/15] <2.2> The following questions investigate the impact of small and simple caches
using CACTI, and assume a 65 nm (0.065 μm) technology. (CACTI is available in an online
form at http://quid.hpl.hp.com:9081/cacti/.)
a. [12] <2.2> Compare the access times of 64 KB caches with 64 byte blocks and a single
bank. What are the relative access times of two-way and four-way set associative
caches in comparison to a direct mapped organization?
b. [12] <2.2> Compare the access times of four-way set associative caches with 64 byte
blocks and a single bank. What are the relative access times of 32 KB and 64 KB caches
in comparison to a 16 KB cache?
c. [15] <2.2> For a 64 KB cache, find the cache associativity between 1 and 8 with the lowest
average memory access time, given that the misses per instruction for a certain workload
suite are 0.00664 for a direct-mapped cache, 0.00366 for a two-way set associative cache,
0.000987 for a four-way set associative cache, and 0.000266 for an eight-way set associative
cache. Overall, there are 0.3 data references per instruction. Assume cache misses take
10 ns in all models. To calculate the hit time in cycles, use the cycle time output by
CACTI, which corresponds to the maximum frequency at which a cache can operate without
any bubbles in the pipeline.
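The calculation in part (c) uses the usual formula, average memory access time = hit time + miss rate × miss penalty, where the miss rate is the misses per instruction divided by the 0.3 data references per instruction, and both the hit time and the 10 ns miss penalty are converted to cycles using the CACTI cycle time. The listing below is a sketch of that arithmetic only; the access-time and cycle-time values are placeholders that must be replaced with the numbers CACTI reports for each 64 KB configuration.

/* Sketch of the AMAT comparison for Exercise 2.8(c).  The access times and
 * cycle times are placeholders: fill them in with the values CACTI reports
 * for a 64 KB cache at each associativity.  The miss-per-instruction numbers
 * come from the exercise.  Link with -lm for ceil(). */
#include <stdio.h>
#include <math.h>

int main(void)
{
    const int    assoc[4]          = { 1, 2, 4, 8 };
    const double miss_per_inst[4]  = { 0.00664, 0.00366, 0.000987, 0.000266 };
    const double access_time_ns[4] = { /* from CACTI */ 1.0, 1.0, 1.0, 1.0 };  /* placeholders */
    const double cycle_time_ns[4]  = { /* from CACTI */ 1.0, 1.0, 1.0, 1.0 };  /* placeholders */
    const double refs_per_inst     = 0.3;    /* data references per instruction (given) */
    const double miss_time_ns      = 10.0;   /* cache miss service time (given)         */

    for (int i = 0; i < 4; i++) {
        double miss_rate   = miss_per_inst[i] / refs_per_inst;
        double hit_cyc     = ceil(access_time_ns[i] / cycle_time_ns[i]);   /* round up to whole cycles */
        double penalty_cyc = ceil(miss_time_ns / cycle_time_ns[i]);
        double amat_cycles = hit_cyc + miss_rate * penalty_cyc;
        double amat_ns     = amat_cycles * cycle_time_ns[i];   /* ns allows comparison across clock rates */
        printf("%d-way: AMAT = %.4f cycles = %.4f ns\n", assoc[i], amat_cycles, amat_ns);
    }
    return 0;
}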
2.9 [12/15/15/10] <2.2> You are investigating the possible benefits of a way-predicting L1
cache. Assume that a 64 KB four-way set associative, single-banked L1 data cache is the
cycle time limiter in a system. As an alternative cache organization, you are considering a
way-predicted cache modeled as a 64 KB direct-mapped cache with 80% prediction accuracy.
Unless stated otherwise, assume that a mispredicted way access that hits in the cache
takes one more cycle. Assume the miss rates and the miss penalties from Exercise 2.8 part (c).
a. [12] <2.2> What is the average memory access time of the current cache (in cycles)
versus the way-predicted cache?
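For part (a), the way-predicted cache's effective hit time is a weighted average of the correctly predicted case and the mispredicted-but-hitting case: hit time = accuracy × t_fast + (1 − accuracy) × (t_fast + 1) cycles. If both organizations see the same miss rate and miss penalty (taken from Exercise 2.8 part (c)), the AMAT difference comes from this hit-time term. The listing below sketches that blend only; the one-cycle fast hit time is an assumption to be replaced with the value implied by your CACTI results.

/* Sketch for Exercise 2.9(a): effective hit time of the way-predicted cache. */
#include <stdio.h>

int main(void)
{
    const double accuracy      = 0.80;  /* way-prediction accuracy (given)                        */
    const double hit_fast_cyc  = 1.0;   /* placeholder: hit time when the predicted way is right  */
    const double mispred_extra = 1.0;   /* one additional cycle for a mispredicted way that hits  */

    double hit_wp = accuracy * hit_fast_cyc
                  + (1.0 - accuracy) * (hit_fast_cyc + mispred_extra);
    printf("effective way-predicted hit time: %.2f cycles\n", hit_wp);
    return 0;
}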