FIGURE 5.13: Drowsy cache. Reproduced from [77]. Copyright 2002 IEEE.
cache. It first has to be voltage-scaled back to full Vdd. Because this is not instantaneous, there
is a penalty, albeit small, in accessing drowsy cells.
High-level policies for drowsy caches : Because state is preserved in drowsy mode, there is no
danger of incurring long miss latencies when accessing drowsy cache lines. The penalty to
voltage-scale a drowsy cache line back to full Vdd is relatively small: a few (single-digit) cycles.
Whereas it would matter significantly which cache lines are put into low-leakage mode in a
non-state-preserving technique, with the drowsy mode it does not matter; mistakes cost very little.
This makes sophisticated techniques that determine the idleness of cache lines unnecessary,
especially if one factors in their dynamic power cost. Flautner et al. thus propose a very simple
policy—fittingly called Simple —for the drowsy mode: the whole cache is periodically put into
drowsy mode—all of the cache lines regardless of usefulness or idleness. The small percentage
of active cache lines then exits the drowsy mode on demand, incurring a small latency
penalty. Since this latency is experienced on hits, programs that are sensitive to hit latency
are hurt the most. A variable hit latency can also complicate instruction scheduling
in an out-of-order core, further degrading performance [180].
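The Simple policy described above can be sketched as a small behavioral model. This is only an illustration under stated assumptions, not the implementation of Flautner et al.: the class name DrowsyCache, the window parameter (cycles between global drowsy transitions), and the wake_penalty parameter (extra cycles to restore full Vdd) are all hypothetical, and the model tracks only one drowsy bit per line, ignoring tags, data, and misses.

```python
class DrowsyCache:
    """Toy model of a periodic, whole-cache drowsy policy (names are hypothetical)."""

    def __init__(self, num_lines, window=4000, wake_penalty=1):
        self.num_lines = num_lines
        self.window = window              # assumed: cycles between global drowsy transitions
        self.wake_penalty = wake_penalty  # assumed: extra cycles to return a line to full Vdd
        self.drowsy = [False] * num_lines # one drowsy bit per cache line
        self.cycle = 0
        self.extra_cycles = 0             # total wake-up latency paid so far

    def tick(self, cycles=1):
        """Advance time; at every window boundary put ALL lines into drowsy mode."""
        for _ in range(cycles):
            self.cycle += 1
            if self.cycle % self.window == 0:
                # No attempt to tell active lines from idle ones.
                self.drowsy = [True] * self.num_lines

    def access(self, line):
        """Access a line, waking it on demand. Returns the added hit latency."""
        if self.drowsy[line]:
            self.drowsy[line] = False     # state was preserved, so only a wake-up is needed
            self.extra_cycles += self.wake_penalty
            return self.wake_penalty
        return 0

    def drowsy_fraction(self):
        """Fraction of lines currently in drowsy (low-leakage) mode."""
        return sum(self.drowsy) / self.num_lines


cache = DrowsyCache(num_lines=8, window=10)
cache.tick(10)                   # window boundary: the whole cache goes drowsy
first = cache.access(3)          # first touch pays the wake-up penalty
second = cache.access(3)         # line is now awake, no penalty
fraction = cache.drowsy_fraction()
```

In this toy run only the touched line leaves drowsy mode, so the untouched lines keep saving leakage until the next window boundary puts the whole cache back to sleep. A line that is touched repeatedly pays the wake-up penalty at most once per window.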
The Simple policy is quite effective: it can put 80-90% of a 32KB L1 data cache into
drowsy mode while incurring a slight performance penalty of 1%. These numbers are for a
four-instruction-wide out-of-order core and assume a very aggressive one-cycle penalty for
accessing drowsy cache lines. The Simple policy does not perform as well with instruction
caches, which need to be handled differently.
Improvements on the drowsy policy : Petit, Sahuquillo, Such, and Kaeli [181] improved
on the Simple policy of Flautner et al. by applying a few smart heuristics. Their approach
preserves the low complexity of the initial idea by adding very little hardware.
The goal is to improve on the Simple policy, which blindly puts all cache lines in drowsy
mode. In the Simple policy, no effort is spent to distinguish between active (important) and idle
(useless) cache lines. On the other hand, precisely determining the individual status of each