and another to data. Separate caches are found in most recent processors, including the
Opteron. Hence, it has a 64 KB instruction cache as well as the 64 KB data cache.
The processor knows whether it is issuing an instruction address or a data address, so there
can be separate ports for both, thereby doubling the bandwidth between the memory
hierarchy and the processor. Separate caches also offer the opportunity of optimizing each cache
separately: Different capacities, block sizes, and associativities may lead to better performance.
(In contrast to the instruction caches and data caches of the Opteron, the terms unified or mixed
are applied to caches that can contain either instructions or data.)
Figure B.6 shows that instruction caches have lower miss rates than data caches. Separating
instructions and data removes misses due to conflicts between instruction blocks and data
blocks, but the split also fixes the cache space devoted to each type. Which is more important
to miss rates? A fair comparison of separate instruction and data caches to unified caches
requires the total cache size to be the same. For example, a separate 16 KB instruction cache and
16 KB data cache should be compared to a 32 KB unified cache. Calculating the average miss
rate with separate instruction and data caches necessitates knowing the percentage of memory
references to each cache. From the data in Appendix A we find the split is 100%/(100% + 26%
+ 10%) or about 74% instruction references to (26% + 10%)/(100% + 26% + 10%) or about 26%
data references. Splitting affects performance beyond what is indicated by the change in miss
rates, as we will see shortly.
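
The reference split above, and the way it weights the two miss rates, can be checked with a few lines of Python. This is a minimal sketch: the 26% and 10% figures are the data-reference frequencies quoted above, while the function names and the two per-cache miss rates in the usage lines are hypothetical placeholders, not values read from Figure B.6.

```python
def reference_split(data_refs_per_instr):
    """Return (instruction_fraction, data_fraction) of all memory references.

    Every instruction needs one instruction fetch; data_refs_per_instr is the
    average number of data references per instruction (0.26 + 0.10 in the text).
    """
    total = 1.0 + data_refs_per_instr
    return 1.0 / total, data_refs_per_instr / total


def split_cache_miss_rate(instr_miss_rate, data_miss_rate, data_refs_per_instr):
    """Average miss rate of split caches, weighted by the reference mix."""
    instr_frac, data_frac = reference_split(data_refs_per_instr)
    return instr_frac * instr_miss_rate + data_frac * data_miss_rate


instr_frac, data_frac = reference_split(0.26 + 0.10)
print(f"instruction refs: {instr_frac:.1%}, data refs: {data_frac:.1%}")

# Hypothetical miss rates, chosen only to exercise the formula.
overall = split_cache_miss_rate(0.004, 0.11, 0.36)
print(f"overall miss rate: {overall:.4f}")
```

Weighting by the reference mix matters because the instruction cache sees roughly three times as many references as the data cache, so its miss rate dominates the average even when the data cache misses more often per reference.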
FIGURE B.6 Misses per 1000 instructions for instruction, data, and unified caches of
different sizes. The percentage of instruction references is about 74%. The data are for
two-way associative caches with 64-byte blocks for the same computer and benchmarks as
Figure B.4.
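
Because Figure B.6 reports misses per 1000 instructions rather than miss rates, comparing its entries with a miss rate takes one conversion: divide misses per instruction by memory references per instruction. The sketch below assumes the reference mix used in the text (one instruction fetch plus 0.36 data references per instruction); the function name and the sample input are illustrative and are not figures from the table.

```python
def miss_rate_from_mpki(misses_per_1000_instr, refs_per_instr):
    """Convert misses per 1000 instructions into a miss rate per reference."""
    misses_per_instr = misses_per_1000_instr / 1000.0
    return misses_per_instr / refs_per_instr


# References per instruction seen by each kind of cache, from the text's mix:
# 1 instruction fetch, plus 0.26 + 0.10 data references per instruction.
REFS_PER_INSTR = {"instruction": 1.00, "data": 0.36, "unified": 1.36}

sample_mpki = 40.0  # hypothetical value, not taken from Figure B.6
for kind, refs in REFS_PER_INSTR.items():
    print(f"{kind}: miss rate {miss_rate_from_mpki(sample_mpki, refs):.4f}")
```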
B.2 Cache Performance
Because instruction count is independent of the hardware, it is tempting to evaluate processor
performance using that number. Such indirect performance measures have waylaid many a
computer designer. The corresponding temptation for evaluating memory hierarchy performance
is to concentrate on miss rate because it, too, is independent of the speed of the hardware.
As we will see, miss rate can be just as misleading as instruction count. A better measure of
memory hierarchy performance is the average memory access time:

Average memory access time = Hit time + Miss rate × Miss penalty
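
As a rough illustration of why miss rate alone can mislead, the sketch below evaluates average memory access time as hit time plus miss rate times miss penalty for two caches. The function name, the cycle counts, and the miss rates are all hypothetical, chosen only so that the cache with the lower miss rate ends up with the slower average access.

```python
def average_memory_access_time(hit_time, miss_rate, miss_penalty):
    """Hit time + miss rate * miss penalty, with both times in the same unit."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be a fraction between 0 and 1")
    return hit_time + miss_rate * miss_penalty


# Hypothetical cache A: lower miss rate, but a slower hit (times in clock cycles).
amat_a = average_memory_access_time(hit_time=2.0, miss_rate=0.020, miss_penalty=100.0)

# Hypothetical cache B: higher miss rate, but a faster hit.
amat_b = average_memory_access_time(hit_time=1.0, miss_rate=0.025, miss_penalty=100.0)

print(f"cache A: {amat_a:.2f} cycles, cache B: {amat_b:.2f} cycles")
```

With these assumed numbers, cache B has the higher miss rate yet the lower average access time, which is the kind of outcome a miss-rate-only comparison would hide.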
 
 