transitions are mostly due to function calls/returns and long distance jumps, which are, of
course, highly predictable. It is a straightforward matter of using a CAM buffer or the cache tags themselves to record which points in the code correspond to bank transitions. Upon encountering a point that indicates a transition to another bank, the target bank is reactivated.
Prediction accuracy for a CAM buffer next sub-bank predictor ranges from 51% for
32 entries to 78% for 256 entries. The resulting performance penalty is small (less than 2%
for the 32-entry buffer and less than 1% for the 256-entry). Prediction improves further when the next-bank prediction is associated with the cache tags, an approach that is both less costly and better performing than a 128-entry CAM buffer. It is also possible to do static next sub-bank prediction at compile time or link time, completely eliminating the overhead of dynamic prediction [12]. The end result is that the NSBP drowsy policies in the instruction cache work as well as state-destroying, gated-Vdd decay with a fixed decay interval, but with slightly less performance impact.
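As a concrete illustration, the following C sketch shows how a small dynamic next sub-bank predictor could operate. The table size, field names, and functions (nsbp_lookup, nsbp_update) are hypothetical choices for illustration, not details from the proposal in [12]; the sketch simply records, per transition-point PC, which sub-bank was entered the last time and reactivates that bank on the next occurrence.

/* Illustrative sketch of a dynamic next sub-bank predictor (NSBP).
 * Structure fields, function names, and sizes are assumptions. */
#include <stdint.h>
#include <stdbool.h>

#define NSBP_ENTRIES 32            /* e.g., the 32-entry CAM buffer in the text */

typedef struct {
    bool     valid;
    uint32_t transition_pc;        /* PC of a call/return/long jump that crosses banks */
    uint8_t  target_bank;          /* sub-bank control flow transferred to last time */
} nsbp_entry_t;

static nsbp_entry_t nsbp[NSBP_ENTRIES];

/* On every fetch: if the PC matches a known transition point, return the
 * predicted target sub-bank so it can be reactivated ahead of the access. */
static inline int nsbp_lookup(uint32_t pc)
{
    for (int i = 0; i < NSBP_ENTRIES; i++)
        if (nsbp[i].valid && nsbp[i].transition_pc == pc)
            return nsbp[i].target_bank;   /* hit: wake this bank early */
    return -1;                            /* miss: wake the bank on demand */
}

/* On a resolved cross-bank transfer: record (or update) the mapping so the
 * next occurrence of this transition wakes the right bank in advance. */
static void nsbp_update(uint32_t pc, uint8_t actual_bank)
{
    static unsigned victim = 0;           /* simple round-robin replacement */
    for (int i = 0; i < NSBP_ENTRIES; i++)
        if (nsbp[i].valid && nsbp[i].transition_pc == pc) {
            nsbp[i].target_bank = actual_bank;
            return;
        }
    nsbp[victim] = (nsbp_entry_t){ .valid = true,
                                   .transition_pc = pc,
                                   .target_bank = actual_bank };
    victim = (victim + 1) % NSBP_ENTRIES;
}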
Program hotspots and code sequentiality: Similarly to the ideas of Kim et al. at the bank level, Hu, Nadgir, Vijaykrishnan, Irwin, and Kandemir exploit code behavior but at a finer granularity [104]. Their approach is based on identifying the instructions comprising the working set of
executing code. Such instructions are kept active, out of reach of leakage-control policies, until
execution moves to a different working set.
The working set in this case corresponds to a program phase. Typically, program execution
occurs in phases. A program phase is identified by instructions that exhibit high temporal locality for the duration of the phase. In general, such instructions are not necessarily spatially close; they can be scattered across the address space. If a program phase persists long enough, it is considered to be a hotspot [104].
Whereas the approach of Kim et al. is at the bank level, assuming that a loop body
maps onto a cache bank and occasionally calls subroutines mapped onto other banks, the
approach of Hu et al. is at a much finer grain: cache lines containing the hotspot instructions
are individually marked as such regardless of where they are in the cache. These lines are then
excluded from leakage control.
Marking the hotspot instructions relies on an application's branch behavior. In particular,
it is accomplished using information from the Branch Target Buffer (BTB). The BTB identifies
the basic blocks that comprise a hotspot by keeping statistics on how often the basic blocks
are executed. For each BTB entry two basic blocks are traced: the basic block that starts at the
target address and the basic block at the fall-through address (when the branch is not taken).
Statistics are kept in frequency counters associated with each BTB entry and are collected
during a time window. When a frequency counter exceeds some empirically chosen threshold,
the corresponding basic block is considered hot. All cache lines fetched from that point until the next BTB access are tagged as hotspot lines.
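A minimal sketch of the counting mechanism is given below, assuming a software model of a BTB entry with two per-window frequency counters. The threshold value, counter widths, and names (next_block_is_hot, reset_window_counters) are illustrative assumptions, not parameters taken from Hu et al. [104].

/* Illustrative sketch of hotspot detection with BTB frequency counters.
 * Threshold, counter widths, and names are assumptions for illustration. */
#include <stdint.h>
#include <stdbool.h>

#define HOT_THRESHOLD 64           /* empirically chosen in the real scheme */

typedef struct {
    uint32_t branch_pc;
    uint32_t target_pc;
    uint16_t taken_count;          /* executions of the block at the target address */
    uint16_t fallthru_count;       /* executions of the fall-through block */
} btb_entry_t;

/* Called on every BTB access within the current time window.  If the block
 * about to be fetched has executed often enough, all lines fetched until the
 * next BTB access should be tagged as hotspot lines by the fetch logic. */
static bool next_block_is_hot(btb_entry_t *e, bool taken)
{
    uint16_t *ctr = taken ? &e->taken_count : &e->fallthru_count;
    if (*ctr < UINT16_MAX)
        (*ctr)++;
    return *ctr >= HOT_THRESHOLD;
}

/* Statistics are collected per time window, so counters are cleared at
 * window boundaries. */
static void reset_window_counters(btb_entry_t *btb, int nentries)
{
    for (int i = 0; i < nentries; i++) {
        btb[i].taken_count = 0;
        btb[i].fallthru_count = 0;
    }
}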
The leakage-reduction policy is the Flautner et al. Simple policy. At the end of a time
window, all the cache lines are put into drowsy mode except the lines that are tagged as hotspot lines.
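The end-of-window sweep can be pictured as in the following sketch, assuming each instruction-cache line carries a drowsy bit and a hotspot bit. Whether the hotspot tag is cleared at each window boundary, so that hotspots are re-learned in the next window, is an assumption of the sketch rather than a stated detail of the original scheme.

/* Illustrative end-of-window sweep: the Simple drowsy policy with hotspot
 * exclusion.  The per-line state and hotspot bit are assumptions. */
#include <stdbool.h>

typedef struct {
    bool drowsy;                   /* line is in low-voltage, state-preserving mode */
    bool hotspot;                  /* set while the line holds hotspot instructions */
} icache_line_state_t;

static void end_of_window_sweep(icache_line_state_t *lines, int nlines)
{
    for (int i = 0; i < nlines; i++) {
        if (!lines[i].hotspot)
            lines[i].drowsy = true;   /* everything outside the hotspot goes drowsy */
        lines[i].hotspot = false;     /* assumption: re-learn hotspots next window */
    }
}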