lines. The protection of these lines from the leakage-control policy is immediately revoked and
has to be re-established anew in the current window. In addition, the BTB frequency counters are
halved (by a 1-bit shift) at the end of each window to allow the "hotspot" working set to
gradually change. A new time window can start sooner than its preset time interval if a loop
is detected; in this case, there is no need to wait until the end of a full time interval to detect
additional hotspot instructions.
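To make the window mechanism concrete, a minimal sketch follows; the table size, counter width, and threshold value are assumptions chosen for illustration, not the parameters used by Hu et al.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define BTB_ENTRIES       512
#define HOTSPOT_THRESHOLD 16   /* assumed value, not the authors' parameter */

/* Per-entry access-frequency counters used for hotspot detection. */
static uint8_t freq[BTB_ENTRIES];

/* Count an access to BTB entry 'idx' (saturating). */
static void btb_access(int idx) {
    if (freq[idx] < UINT8_MAX)
        freq[idx]++;
}

/* An entry belongs to the hotspot working set, and its lines are
   protected from deactivation, once its counter crosses the threshold. */
static bool is_hotspot(int idx) {
    return freq[idx] >= HOTSPOT_THRESHOLD;
}

/* At the end of a window (or earlier, when a loop is detected), halve all
   counters with a 1-bit shift; protection must be re-earned in the new window. */
static void end_of_window(void) {
    for (int i = 0; i < BTB_ENTRIES; i++)
        freq[i] >>= 1;
}

int main(void) {
    for (int i = 0; i < 20; i++) btb_access(7);
    printf("hot before window end: %d\n", is_hotspot(7));
    end_of_window();
    printf("hot after one halving: %d\n", is_hotspot(7));
    return 0;
}
```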
Similar to the next-bank prediction of Kim et al., which tries to hide the re-activation
latency, Hu et al. also propose, at a much finer granularity, just-in-time activation of individual
cache lines. Since their proposal works at the cache-line level, a simple sequential activation
mechanism, which activates the cache line succeeding the one being accessed (the next index),
takes care of straight-line code. In set-associative caches, however, way prediction is needed
to avoid waking up a whole set [104].
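A minimal sketch of this sequential, way-predicted wake-up is given below; the cache geometry, the per-line sleep bit, and the trivial last-way predictor are illustrative choices rather than the exact structures of Hu et al.

```c
#include <stdint.h>

#define NUM_SETS 256
#define NUM_WAYS 4

/* 1 = line is in low-leakage mode, 0 = line is active. */
static uint8_t asleep[NUM_SETS][NUM_WAYS];

/* Last way that hit in each set, reused as a simple way predictor so that
   only one way of the next set is woken instead of the whole set. */
static uint8_t last_way[NUM_SETS];

/* Re-activate one line (this is where the wake-up latency would be paid). */
static void wake(int set, int way) {
    asleep[set][way] = 0;
}

/* On an access to (set, way): wake the line itself, remember the way, and
   pre-activate the predicted way of the succeeding index so that straight-line
   code finds its next line already active. */
void access_line(int set, int way) {
    wake(set, way);
    last_way[set] = (uint8_t)way;

    int next_set = (set + 1) % NUM_SETS;
    wake(next_set, last_way[next_set]);
}

int main(void) {
    access_line(41, 2);   /* demo: also wakes the predicted way of set 42 */
    return 0;
}
```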
The most sophisticated scheme proposed by Hu et al., employing hotspot detection, just-in-time
cache-line activation, and bank activation to detect spatial changes in the working set,
outperforms the coarse-grain technique of Kim et al. at the bank level, as well as a compiler
approach discussed in Section 5.3.6. This is not surprising, since at the cache-line level there
is potential for much greater energy savings while maintaining the full performance advantage.
The proposed scheme results in a 63% reduction in EDP over the unoptimized base case, a 48%
reduction over the bank-level technique, and a 38% reduction over the compiler-level technique.
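For readers less familiar with the metric, EDP is simply the product of energy and execution time, and the percentages above are relative reductions against each baseline; the decomposition below uses invented numbers purely to show the arithmetic, not data from the paper.

```c
#include <stdio.h>

/* Energy-delay product; inputs below are placeholder values, not measurements. */
static double edp(double energy, double delay) {
    return energy * delay;
}

int main(void) {
    double base = edp(1.00, 1.000);   /* unoptimized base case (normalized)  */
    double opt  = edp(0.40, 0.925);   /* one hypothetical optimized point    */
    printf("EDP reduction vs. base: %.0f%%\n", 100.0 * (base - opt) / base);
    return 0;
}
```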
5.3.3 State-Preserving versus No-State-Preserving
The proposal for the drowsy cache was put forth to address the main weakness of the gated-Vdd
mechanism used in decay policies. In contrast to gated-Vdd, the drowsy mode preserves
the state of the cache lines and results in a much smaller penalty when accessing deactivated
(drowsy) lines. However, it is not without disadvantages: it does not save as much leakage as
completely cutting off the power supply to the cache lines, and it reduces reliability by making
the memory cells more susceptible to soft errors. These two characteristics make for interesting
comparisons between the two approaches, and for even more interesting hybrid schemes employing
both.
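The trade-off can be sketched with a back-of-the-envelope cost model; every constant below (residual leakage in each mode, wake-up and refetch penalties, the energy charged per stall cycle) is an invented placeholder, not a technology or paper value.

```c
#include <stdio.h>

/* Toy per-line cost model contrasting gated-Vdd (decay) with drowsy mode.
   Every constant is an illustrative assumption, normalized so that an active
   line leaks 1 unit of energy per cycle. */

#define LEAK_GATED   0.0   /* residual leakage in gated-Vdd mode          */
#define LEAK_DROWSY  0.1   /* residual leakage in drowsy mode             */
#define PEN_DROWSY   1.0   /* cycles to wake a drowsy line                */
#define PEN_DECAY    8.0   /* cycles to refetch a decayed line (fast L2)  */
#define E_PER_STALL  2.0   /* assumed energy charged per stall cycle      */

int main(void) {
    double off_cycles = 100.0;  /* cycles the line spends deactivated     */
    double reuse_prob = 0.5;    /* chance the line is accessed again      */

    double cost_decay  = LEAK_GATED  * off_cycles + reuse_prob * PEN_DECAY  * E_PER_STALL;
    double cost_drowsy = LEAK_DROWSY * off_cycles + reuse_prob * PEN_DROWSY * E_PER_STALL;

    printf("gated-Vdd: %.1f   drowsy: %.1f   (arbitrary energy units)\n",
           cost_decay, cost_drowsy);
    return 0;
}
```

With these made-up numbers, gated-Vdd comes out ahead (8 vs. 11 units) because the line stays off for a long time and the refetch is cheap; raising PEN_DECAY to model a slow L2, or shortening off_cycles, flips the comparison in favor of drowsy mode, which is exactly the kind of sensitivity examined next.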
Decay versus drowsy: Parikh, Zhang, Sankaranarayanan, Skadron, and Stan examined
energy savings for L1 data caches under both the drowsy and the cache decay mechanisms [178].
Their work shows that non-state-preserving techniques can outperform state-preserving ones
under certain conditions. More specifically, with fast L2 caches (5-8 cycle latency), cache decay
in the L1 is better than a drowsy L1 in terms of both performance and energy savings.
For the drowsy cache, Parikh et al. abandon the Simple policy of periodically putting
all cache lines in drowsy mode, in favor of the more sophisticated decay policy based on
the generational behavior of cache lines. The drowsy cache is therefore a decaying cache but