transitions are mostly due to function calls/returns and long distance jumps, which are, of
course, highly predictable. It is a straightforward matter of using a CAM buffer or the cache tags themselves to record which points in the code correspond to bank transitions. Upon encountering a point that indicates a transition to another bank, the target bank is reactivated.
Prediction accuracy for a CAM buffer next sub-bank predictor ranges from 51% for
32 entries to 78% for 256 entries. The resulting performance penalty is small (less than 2%
for the 32-entry buffer and less than 1% for the 256-entry). Prediction improves further when the next-bank prediction is associated with the cache tags, an approach that is both less costly and better performing than a 128-entry CAM buffer. It is also possible to do static next sub-bank prediction at compile time or link time, completely eliminating the overhead of dynamic prediction [12]. The end result is that the NSBP drowsy policies in the instruction cache work as well as state-destroying, gated-Vdd decay with a fixed decay interval, but with slightly less performance impact.
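As a concrete illustration, the following C sketch shows how a small dynamic next sub-bank predictor could operate. The table size, field names, and functions (nsbp_lookup, nsbp_update) are hypothetical choices for illustration, not details from the proposal in [12]; the sketch simply records, per transition-point PC, which sub-bank was entered the last time and reactivates that bank on the next occurrence.

/* Illustrative sketch of a dynamic next sub-bank predictor (NSBP).
 * Structure fields, function names, and sizes are assumptions. */
#include <stdint.h>
#include <stdbool.h>

#define NSBP_ENTRIES 32            /* e.g., the 32-entry CAM buffer in the text */

typedef struct {
    bool     valid;
    uint32_t transition_pc;        /* PC of a call/return/long jump that crosses banks */
    uint8_t  target_bank;          /* sub-bank control flow transferred to last time */
} nsbp_entry_t;

static nsbp_entry_t nsbp[NSBP_ENTRIES];

/* On every fetch: if the PC matches a known transition point, return the
 * predicted target sub-bank so it can be reactivated ahead of the access. */
static inline int nsbp_lookup(uint32_t pc)
{
    for (int i = 0; i < NSBP_ENTRIES; i++)
        if (nsbp[i].valid && nsbp[i].transition_pc == pc)
            return nsbp[i].target_bank;   /* hit: wake this bank early */
    return -1;                            /* miss: wake the bank on demand */
}

/* On a resolved cross-bank transfer: record (or update) the mapping so the
 * next occurrence of this transition wakes the right bank in advance. */
static void nsbp_update(uint32_t pc, uint8_t actual_bank)
{
    static unsigned victim = 0;           /* simple round-robin replacement */
    for (int i = 0; i < NSBP_ENTRIES; i++)
        if (nsbp[i].valid && nsbp[i].transition_pc == pc) {
            nsbp[i].target_bank = actual_bank;
            return;
        }
    nsbp[victim] = (nsbp_entry_t){ .valid = true,
                                   .transition_pc = pc,
                                   .target_bank = actual_bank };
    victim = (victim + 1) % NSBP_ENTRIES;
}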
Program hotspots and code sequentiality: Similarly to the ideas of Kim et al. at the bank level, Hu, Nadgir, Vijaykrishnan, Irwin, and Kandemir exploit code behavior but at a finer granularity [104]. Their approach is based on identifying the instructions comprising the working set of
executing code. Such instructions are kept active, out of reach of leakage-control policies, until
execution moves to a different working set.
The working set in this case corresponds to a program phase. Typically, program execution
occurs in phases. A program phase is identified by instructions that exhibit high temporal locality for the duration of the phase. In general, such instructions are not necessarily spatially close; they can be scattered across the address space. If a program phase persists long enough, it is considered to be a hotspot [104].
Whereas the approach of Kim et al. is at the bank level, assuming that a loop body
maps onto a cache bank and occasionally calls subroutines mapped onto other banks, the
approach of Hu et al. is at a much finer grain: cache lines containing the hotspot instructions
are individually marked as such regardless of where they are in the cache. These lines are then
excluded from leakage control.
Marking the hotspot instructions relies on an application's branch behavior. In particular,
it is accomplished using information from the Branch Target Buffer (BTB). The BTB identifies
the basic blocks that comprise a hotspot by keeping statistics on how often the basic blocks
are executed. For each BTB entry two basic blocks are traced: the basic block that starts at the
target address and the basic block at the fall-through address (when the branch is not taken).
Statistics are kept in frequency counters associated with each BTB entry and are collected
during a time window. When a frequency counter exceeds some empirically chosen threshold,
the corresponding basic block is considered hot. All cache lines fetched from that point until the next BTB access are tagged as hotspot lines.
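A minimal sketch of the counting mechanism is given below, assuming a software model of a BTB entry with two per-window frequency counters. The threshold value, counter widths, and names (next_block_is_hot, reset_window_counters) are illustrative assumptions, not parameters taken from Hu et al. [104].

/* Illustrative sketch of hotspot detection with BTB frequency counters.
 * Threshold, counter widths, and names are assumptions for illustration. */
#include <stdint.h>
#include <stdbool.h>

#define HOT_THRESHOLD 64           /* empirically chosen in the real scheme */

typedef struct {
    uint32_t branch_pc;
    uint32_t target_pc;
    uint16_t taken_count;          /* executions of the block at the target address */
    uint16_t fallthru_count;       /* executions of the fall-through block */
} btb_entry_t;

/* Called on every BTB access within the current time window.  If the block
 * about to be fetched has executed often enough, all lines fetched until the
 * next BTB access should be tagged as hotspot lines by the fetch logic. */
static bool next_block_is_hot(btb_entry_t *e, bool taken)
{
    uint16_t *ctr = taken ? &e->taken_count : &e->fallthru_count;
    if (*ctr < UINT16_MAX)
        (*ctr)++;
    return *ctr >= HOT_THRESHOLD;
}

/* Statistics are collected per time window, so counters are cleared at
 * window boundaries. */
static void reset_window_counters(btb_entry_t *btb, int nentries)
{
    for (int i = 0; i < nentries; i++) {
        btb[i].taken_count = 0;
        btb[i].fallthru_count = 0;
    }
}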
The leakage-reduction policy is the Flautner et al. Simple policy. At the end of a time
window, all the cache lines are put into drowsy mode except the lines that are tagged as hotspot lines.
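The end-of-window sweep can be pictured as in the following sketch, assuming each instruction-cache line carries a drowsy bit and a hotspot bit. Whether the hotspot tag is cleared at each window boundary, so that hotspots are re-learned in the next window, is an assumption of the sketch rather than a stated detail of the original scheme.

/* Illustrative end-of-window sweep: the Simple drowsy policy with hotspot
 * exclusion.  The per-line state and hotspot bit are assumptions. */
#include <stdbool.h>

typedef struct {
    bool drowsy;                   /* line is in low-voltage, state-preserving mode */
    bool hotspot;                  /* set while the line holds hotspot instructions */
} icache_line_state_t;

static void end_of_window_sweep(icache_line_state_t *lines, int nlines)
{
    for (int i = 0; i < nlines; i++) {
        if (!lines[i].hotspot)
            lines[i].drowsy = true;   /* everything outside the hotspot goes drowsy */
        lines[i].hotspot = false;     /* assumption: re-learn hotspots next window */
    }
}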