[Figure 5.5: three histogram panels (gcc, compress, vortex); horizontal axis: time (x100 cycles), 0 to 90.]
FIGURE 5.5: Inter-access interval distributions for gcc, compress and vortex from the SPEC2000 suite.
Reproduced from [127]. Copyright 2001 IEEE.
The main task of cache decay is to gauge the idle time of a cache line in relation to its
inter-access times. When the idle time of a cache line exceeds a limit called the decay interval
(which is set to be beyond the cluster of the small inter-access times), the cache line is predicted
to be in its dead time and is shut off using a gated-Vdd sleep transistor (see footnote 8).
Implementation: While there are a few possible implementations for cache decay (including
some analog varieties), one of the better-known methods uses a scheme of hierarchical
counters [127]. The idea is to use counters in the cache lines to measure their idle time. A
counter works like a stopwatch: it starts ticking after an access; if the cache line is accessed
again, it is reset; if, however, it ticks uninterrupted until it reaches the decay interval, then
the cache line is pronounced "dead." It is evident from Figure 5.5 that the decay interval, that
is, the idle time needed to safely determine entry into the dead time, is of the order of several
thousand cycles. A counter that counts into the thousands would, however, incur too much
overhead to include with each and every cache line.
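The stopwatch behavior described above, and the cost of counting every cycle, can be sketched as follows. This is a minimal illustrative model, not the implementation from [127]: the names `DecayLine` and `counter_bits` are hypothetical, and the sketch assumes a decayed line is powered back on by its next access.

```python
class DecayLine:
    """Hypothetical model of one cache line with a per-cycle idle counter."""

    def __init__(self, decay_interval):
        self.decay_interval = decay_interval  # idle cycles before shut-off
        self.idle = 0                         # the "stopwatch"
        self.powered = True

    def access(self):
        # An access resets the stopwatch. Assumption: accessing a decayed
        # line powers it back on (in hardware this access would miss).
        self.idle = 0
        self.powered = True

    def tick(self):
        # Called once per cycle; counts only while the line is powered.
        if self.powered:
            self.idle += 1
            if self.idle >= self.decay_interval:
                self.powered = False  # line is pronounced "dead"


def counter_bits(decay_interval):
    """Bits needed for a counter able to hold the value decay_interval."""
    return decay_interval.bit_length()


line = DecayLine(decay_interval=4)
for _ in range(3):
    line.tick()
line.access()            # reset before the interval expires
for _ in range(4):
    line.tick()          # four uninterrupted ticks: the line decays
print(line.powered, counter_bits(4096))
```

Under these assumptions, a per-cycle counter for a decay interval of 4096 cycles needs 13 bits in every cache line, which is the overhead the text refers to.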
The solution is to use much smaller, coarser-grain counters in the cache lines (Figure 5.6).
These small counters advance every few hundred (or even thousand) cycles rather than every
single cycle. The beat is given by a single global cycle counter, which counts these larger
intervals. So, for example, if the global counter counts 1024 cycles and the local cache line
counters are 2 bits, then they count decay intervals of up to 4 × 1024 = 4096 cycles. This scheme
minimizes overhead, since the global counter can be easily piggybacked on cycle counters commonly
found inside processor cores and the local cache line counters can be implemented asynchronously
(possibly with an efficient coding such as Gray coding) to minimize switching overhead [127].
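The hierarchical arrangement can be sketched in the same spirit. This is a simplified software model under stated assumptions, not the circuit of [127]: the class name `HierarchicalDecayCache` and its parameters are hypothetical, the local counters are modeled as plain integers rather than Gray-coded asynchronous counters, and a line is assumed to decay when its local counter has received `2**local_bits` global ticks since its last access.

```python
class HierarchicalDecayCache:
    """Hypothetical model: one global cycle counter drives small per-line counters."""

    def __init__(self, num_lines, global_period=1024, local_bits=2):
        self.global_period = global_period   # cycles per global tick
        self.local_max = 1 << local_bits     # 4 ticks for a 2-bit counter
        self.global_count = 0
        self.local = [0] * num_lines         # coarse per-line idle counters
        self.powered = [True] * num_lines

    def access(self, line):
        # An access resets only the small local counter of that line.
        self.local[line] = 0
        self.powered[line] = True

    def tick(self):
        # Called once per cycle; only the global counter advances every cycle.
        self.global_count += 1
        if self.global_count == self.global_period:
            self.global_count = 0
            # Global tick: advance every powered line's local counter.
            for i in range(len(self.local)):
                if self.powered[i]:
                    self.local[i] += 1
                    if self.local[i] == self.local_max:
                        self.powered[i] = False  # decayed


cache = HierarchicalDecayCache(num_lines=2)
for _ in range(4 * 1024):
    cache.tick()
print(cache.powered)  # both lines idle for 4 x 1024 cycles, so both decayed
```

One consequence of the coarse granularity in this model is that the measured idle time is only approximate: a line accessed just before a global tick decays after a little more than 3 × 1024 cycles, whereas one accessed just after a tick decays after almost 4 × 1024 cycles.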
8 It is interesting to note that cache decay works similarly to the way most other electronic devices are put into a
sleep mode: by detecting that the device has been idle for a period longer than its average interactivity time. Typical
examples are hard disks and laptop displays, which are shut down by the operating system after preset
periods of inactivity.