[Figure 4.29. Left panel: MRU. Right panel: Multi-MRU. Both panels show cache ways 0 to 3 for each set; the example set holds tags *11, *01, *00, and *00 in ways 0, 1, 2, and 3.]
MRU: one mru vector per set (here 0001). The way prediction is the MRU way of the set (e.g., way 3).
Multi-MRU: N MRU tables. MMRU uses the log2(N) least-significant tag bits to select an MRU table (here 2 bits for 4 tables). Presence vectors for tag bits 00, 01, 10, 11: 0001, 0100, 0010, 1000.
Way prediction for tag:
*00 is 3 (note the other “non-MRU” *00 tag in way 2)
*01 is 1 (tag *01 is in its DM position)
*10 is 2 (although no tag *10 is in the set)
*11 is 0
FIGURE 4.29: Multi-MRU way-predictor employs N MRU predictors (typically N = assoc) to disambiguate on a few least-significant tag bits.
Powell et al. report that SDM combined with way prediction yields significant savings by
accessing mostly the direct-mapped or the predicted way. Despite some performance penalty
(less than 3%) due to mispredictions, the reduction in EDP is on the order of 64% and 69% for the
4-way 16KB instruction L1 and data L1, respectively. For their processor models, the overall
reduction in EDP for this technique is 8%, while perfect prediction would be only 2 percentage
points better (10%) [183].
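
The figures above can be related through the definition EDP = energy x delay. The following is a minimal sketch, not taken from Powell et al.: the function names are hypothetical, and it treats the 3% upper bound on the performance penalty as if it were the actual delay increase of the cache, which is a simplification.

```python
def relative_edp(energy_ratio: float, delay_ratio: float) -> float:
    """EDP of the new design divided by EDP of the baseline (EDP = energy * delay)."""
    return energy_ratio * delay_ratio


def implied_energy_ratio(edp_reduction: float, slowdown: float) -> float:
    """Energy ratio needed to reach a given EDP reduction at a given slowdown."""
    return (1.0 - edp_reduction) / (1.0 + slowdown)


# Illustrative inputs only: a 64% EDP reduction (from the text) and a 3% slowdown
# (the text's upper bound, assumed here to be the actual delay increase).
energy = implied_energy_ratio(edp_reduction=0.64, slowdown=0.03)
print(f"implied energy ratio: {energy:.3f}")
print(f"check, relative EDP:  {relative_edp(energy, 1.03):.3f}")
```

Under these assumptions, a small slowdown barely changes the result: nearly all of the EDP reduction has to come from the energy term.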
Multi-MRU: The multi-MRU (MMRU) proposal of Zhang et al. [242] (later also appearing
in Zhu et al. [249]) is also an extension of most recently used (MRU) way prediction [43, 48].
MRU simply returns the most recently accessed way of a set as its prediction (Figure 4.29, left
diagram), whereas MMRU uses multiple MRU predictors to disambiguate among tags (Figure 4.29,
right diagram). All tags in a set that have the same low-order bits are tracked by the same MRU
table. For example, in Figure 4.29, the two tags ending in 00 are tracked by the leftmost MRU
table. The prediction is the cache way of the MRU tag among them (e.g., way 3 in Figure 4.29).
In theory, MMRU can disambiguate on any number of tag bits, but in practice the technique is
limited by the cost of the MRU tables.
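
The table-selection step described above can be sketched as follows. This is a minimal illustration for a single cache set, not the implementation of Zhang et al.: the class name `MultiMRUSet` and its methods are hypothetical, and each table stores a way number instead of the presence vectors drawn in Figure 4.29. Setting `n_tables=1` gives plain MRU.

```python
from typing import List, Optional


class MultiMRUSet:
    """Way predictor for one cache set, with n_tables MRU entries (hypothetical sketch)."""

    def __init__(self, n_tables: int = 4) -> None:
        # n_tables must be a power of two so that the mask below selects log2(N) bits.
        assert n_tables > 0 and n_tables & (n_tables - 1) == 0
        self.n_tables = n_tables
        # One entry per value of the low-order tag bits; None means "no history yet".
        self.mru_way: List[Optional[int]] = [None] * n_tables

    def _table(self, tag: int) -> int:
        # Select the MRU table with the log2(N) least-significant tag bits.
        return tag & (self.n_tables - 1)

    def predict(self, tag: int) -> Optional[int]:
        return self.mru_way[self._table(tag)]

    def update(self, tag: int, way: int) -> None:
        # Record the way that actually served this access.
        self.mru_way[self._table(tag)] = way


# Access history chosen to end in the state of Figure 4.29: a tag ending in 10
# once lived in way 2 and was later replaced by a tag ending in 00.
s = MultiMRUSet(n_tables=4)
for tag, way in [(0b1011, 0), (0b0101, 1), (0b0110, 2), (0b0100, 2), (0b1100, 3)]:
    s.update(tag, way)
predictions = [s.predict(t) for t in (0b00, 0b01, 0b10, 0b11)]
print(predictions)
```

The stale entry for tags ending in 10 shows why the predictor can name way 2 even though no such tag is resident: an MRU table is only updated by accesses that map to it.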
It is interesting to note that, according to the published results, MMRU is about equal
in predictive power to selective direct-mapping (SDM) when log2(associativity) tag bits are used
(i.e., as many MRU tables as the associativity of the cache). SDM
aims to place as many lines as it can in their direct-mapped positions and handles the rest
with a way-predictor. MMRU tracks all lines, both those in their direct-mapped positions
and those in set-associative positions, yielding approximately the same prediction accuracy: an
average of 92% first-probe hits for 4-way caches [183, 242, 249].
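
To make the "first-probe hit" metric concrete, here is a small self-contained sketch that counts how often the predicted way is the correct one among cache hits. It does not reproduce the published 92% figure. The function name, the LRU replacement policy, the default prediction of way 0, and the synthetic trace are all assumptions made for illustration.

```python
from collections import OrderedDict


def first_probe_hit_rate(trace, n_sets=1, assoc=4, n_tables=4):
    """Fraction of cache hits whose way was predicted correctly.

    trace is an iterable of (set_index, tag) pairs. LRU replacement is assumed.
    n_tables=1 models plain MRU; n_tables=assoc models MMRU as discussed above.
    """
    # Per set: tag -> way, ordered from least to most recently used.
    lines = [OrderedDict() for _ in range(n_sets)]
    # Per set: one predicted way per value of the low-order tag bits (default: way 0).
    mru = [[0] * n_tables for _ in range(n_sets)]
    hits = correct = 0
    for set_index, tag in trace:
        ways = lines[set_index]
        table = tag & (n_tables - 1)
        if tag in ways:
            hits += 1
            way = ways[tag]
            correct += int(mru[set_index][table] == way)
            ways.move_to_end(tag)                  # mark as most recently used
        else:
            if len(ways) < assoc:
                way = len(ways)                    # fill the next empty way
            else:
                _, way = ways.popitem(last=False)  # evict the LRU line, reuse its way
            ways[tag] = way
        mru[set_index][table] = way                # train on the way actually used
    return correct / hits if hits else 0.0


# Two tags that differ in their low-order bits alternate in the same set.
trace = [(0, 0b00), (0, 0b01)] * 4
mru_rate = first_probe_hit_rate(trace, n_tables=1)
mmru_rate = first_probe_hit_rate(trace, n_tables=4)
print(mru_rate, mmru_rate)
```

On this deliberately adversarial trace, a single MRU entry always points at the other tag's way, while separate tables keep one entry per tag. Real workloads fall between these extremes.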
A weakness in all the way prediction techniques mentioned so far is that they do not
do well on misses. MRU, MMRU, and SDM incur the maximum latency and energy just to
 