lines. The protection of these lines from the leakage-control policy is immediately revoked and
has to be re-established anew in the current window. In addition, the BTB frequency counters are
halved (by a 1-bit shift) at the end of each window to allow the "hotspot" working set to
gradually change. A new time window can start sooner than its preset time interval if a loop
is detected; in this case, there is no need to wait until the end of a full time interval to detect
additional hotspot instructions.
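To make the window mechanism concrete, a minimal sketch follows; the table size, counter width, and threshold value are assumptions chosen for illustration, not the parameters used by Hu et al.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define BTB_ENTRIES       512
#define HOTSPOT_THRESHOLD 16   /* assumed value, not the authors' parameter */

/* Per-entry access-frequency counters used for hotspot detection. */
static uint8_t freq[BTB_ENTRIES];

/* Count an access to BTB entry 'idx' (saturating). */
static void btb_access(int idx) {
    if (freq[idx] < UINT8_MAX)
        freq[idx]++;
}

/* An entry belongs to the hotspot working set, and its lines are
   protected from deactivation, once its counter crosses the threshold. */
static bool is_hotspot(int idx) {
    return freq[idx] >= HOTSPOT_THRESHOLD;
}

/* At the end of a window (or earlier, when a loop is detected), halve all
   counters with a 1-bit shift; protection must be re-earned in the new window. */
static void end_of_window(void) {
    for (int i = 0; i < BTB_ENTRIES; i++)
        freq[i] >>= 1;
}

int main(void) {
    for (int i = 0; i < 20; i++) btb_access(7);
    printf("hot before window end: %d\n", is_hotspot(7));
    end_of_window();
    printf("hot after one halving: %d\n", is_hotspot(7));
    return 0;
}
```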
Similar to the next-bank prediction of Kim et al., which tries to hide the re-activation
latency, Hu et al. also propose, at a much finer granularity, just-in-time activation of individual
cache lines. Since their proposal works at the cache-line level, a simple sequential activation
mechanism, which activates the cache line succeeding the one being accessed (the next index),
takes care of straight-line code. In set-associative caches, however, way prediction is needed
to avoid waking up a whole set [104].
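A minimal sketch of this sequential, way-predicted wake-up is given below; the cache geometry, the per-line sleep bit, and the trivial last-way predictor are illustrative choices rather than the exact structures of Hu et al.

```c
#include <stdint.h>

#define NUM_SETS 256
#define NUM_WAYS 4

/* 1 = line is in low-leakage mode, 0 = line is active. */
static uint8_t asleep[NUM_SETS][NUM_WAYS];

/* Last way that hit in each set, reused as a simple way predictor so that
   only one way of the next set is woken instead of the whole set. */
static uint8_t last_way[NUM_SETS];

/* Re-activate one line (this is where the wake-up latency would be paid). */
static void wake(int set, int way) {
    asleep[set][way] = 0;
}

/* On an access to (set, way): wake the line itself, remember the way, and
   pre-activate the predicted way of the succeeding index so that straight-line
   code finds its next line already active. */
void access_line(int set, int way) {
    wake(set, way);
    last_way[set] = (uint8_t)way;

    int next_set = (set + 1) % NUM_SETS;
    wake(next_set, last_way[next_set]);
}

int main(void) {
    access_line(41, 2);   /* demo: also wakes the predicted way of set 42 */
    return 0;
}
```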
The most sophisticated scheme proposed by Hu et al., employing hotspot detection, just-in-time
cache-line activation, and bank activation to detect spatial changes in the working set,
outperforms the coarse-grain technique of Kim et al. at the bank level, as well as a compiler
approach discussed in Section 5.3.6. This is not surprising, since at the cache-line level there
is potential for much greater energy savings while maintaining the full performance advantage.
The proposed scheme results in a 63% reduction in EDP over the unoptimized base case, a 48%
reduction over the bank-level technique, and a 38% reduction over the compiler-level technique.
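For readers less familiar with the metric, EDP is simply the product of energy and execution time, and the percentages above are relative reductions against each baseline; the decomposition below uses invented numbers purely to show the arithmetic, not data from the paper.

```c
#include <stdio.h>

/* Energy-delay product; inputs below are placeholder values, not measurements. */
static double edp(double energy, double delay) {
    return energy * delay;
}

int main(void) {
    double base = edp(1.00, 1.000);   /* unoptimized base case (normalized)  */
    double opt  = edp(0.40, 0.925);   /* one hypothetical optimized point    */
    printf("EDP reduction vs. base: %.0f%%\n", 100.0 * (base - opt) / base);
    return 0;
}
```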
5.3.3 State-Preserving versus No-State-Preserving
The proposal for the drowsy cache was put forth to address the main weakness of the gated-Vdd
mechanism used in decay policies. In contrast to gated-Vdd, the drowsy mode preserves
the state of the cache lines and results in a much smaller penalty when accessing deactivated
(drowsy) lines. However, it is not without disadvantages: it does not save as much leakage as
completely cutting off the power supply to the cache lines, and it reduces reliability by making
the memory cells more susceptible to soft errors. These two characteristics make for interesting
comparisons between the two approaches, and for even more interesting hybrid schemes employing
both.
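The trade-off can be sketched with a back-of-the-envelope cost model; every constant below (residual leakage in each mode, wake-up and refetch penalties, the energy charged per stall cycle) is an invented placeholder, not a technology or paper value.

```c
#include <stdio.h>

/* Toy per-line cost model contrasting gated-Vdd (decay) with drowsy mode.
   Every constant is an illustrative assumption, normalized so that an active
   line leaks 1 unit of energy per cycle. */

#define LEAK_GATED   0.0   /* residual leakage in gated-Vdd mode          */
#define LEAK_DROWSY  0.1   /* residual leakage in drowsy mode             */
#define PEN_DROWSY   1.0   /* cycles to wake a drowsy line                */
#define PEN_DECAY    8.0   /* cycles to refetch a decayed line (fast L2)  */
#define E_PER_STALL  2.0   /* assumed energy charged per stall cycle      */

int main(void) {
    double off_cycles = 100.0;  /* cycles the line spends deactivated     */
    double reuse_prob = 0.5;    /* chance the line is accessed again      */

    double cost_decay  = LEAK_GATED  * off_cycles + reuse_prob * PEN_DECAY  * E_PER_STALL;
    double cost_drowsy = LEAK_DROWSY * off_cycles + reuse_prob * PEN_DROWSY * E_PER_STALL;

    printf("gated-Vdd: %.1f   drowsy: %.1f   (arbitrary energy units)\n",
           cost_decay, cost_drowsy);
    return 0;
}
```

With these made-up numbers, gated-Vdd comes out ahead (8 vs. 11 units) because the line stays off for a long time and the refetch is cheap; raising PEN_DECAY to model a slow L2, or shortening off_cycles, flips the comparison in favor of drowsy mode, which is exactly the kind of sensitivity examined next.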
Decay versus drowsy: Parikh, Zhang, Sankaranarayanan, Skadron, and Stan examined
energy savings for L1 data caches under both the drowsy and the cache decay mechanisms [178].
Their work shows that non-state-preserving techniques can outperform state-preserving ones
under certain conditions. More specifically, with fast L2 caches (5-8 cycle latency), cache decay
in the L1 is better than a drowsy L1 in terms of both performance and energy savings.
For the drowsy cache, Parikh et al. abandon the Simple policy of periodically putting
all cache lines in drowsy mode, in favor of the more sophisticated decay policy based on
the generational behavior of cache lines. The drowsy cache is therefore a decaying cache but