of the original bit-line. On the other hand, a new wire (the global bit-line) and a number
of bypass switches are introduced in the design. The capacitive load on the global bit-line,
however, is so much less than that of the original bit-line (only one bypass switch per
segment as opposed to a pass transistor per cell) that smaller prechargers/drivers and smaller
sense amps can be used. The end result is a net benefit in the power expended to operate
the combined system [125, 83].
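The capacitance argument above can be sketched numerically. The following toy model uses illustrative unit capacitances (the constants are assumptions, not values from the text): a monolithic bit-line is loaded by one pass transistor per cell, while in the segmented design only one segment's local bit-line switches, plus a global bit-line loaded by one bypass switch per segment.

```python
# Toy capacitance model for bit-line segmentation.
# C_CELL and C_SWITCH are illustrative, equal unit loads (assumptions).
C_CELL = 1.0      # pass-transistor load per cell on a bit-line
C_SWITCH = 1.0    # bypass-switch load per segment on the global bit-line

def monolithic_bitline_cap(cells):
    # every cell loads the single, full-length bit-line
    return cells * C_CELL

def segmented_bitline_cap(cells, segments):
    # only the active segment's local bit-line switches ...
    local = (cells // segments) * C_CELL
    # ... plus the global bit-line, loaded by one switch per segment
    global_line = segments * C_SWITCH
    return local + global_line

print(monolithic_bitline_cap(256))    # 256.0
print(segmented_bitline_cap(256, 8))  # 40.0 (32 local + 8 global)
```

Since switching energy scales with the capacitance driven, the smaller combined load is what permits the smaller prechargers, drivers, and sense amps mentioned above.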
4.8.5 Further Reading on Cache Reconfiguration
Ranganathan, Adve, and Jouppi proposed reconfigurable caches that can be divided into mul-
tiple partitions [189]. Their cache partitioning scheme resembles selective cache ways in
that it works at the granularity of cache ways. In contrast to selective cache ways, which
allows only two partitions (an enabled and a disabled one), this proposal permits multiple
partitions to be created; up to four partitions can be created in a 4-way set-associative cache.
Furthermore, cache partitions can be devoted to different functions rather than merely being en-
abled or disabled. The example described in the paper uses one partition as an instruction-reuse
cache, i.e., to cache the outcomes of frequently executed instructions. Supporting such diverse func-
tionality requires additional address and data buses to accommodate simultaneous access to
all the possible partitions. The proposal focuses on performance rather than power, and the
authors acknowledge that some of their design decisions may actually increase power consump-
tion. However, it is closely related to the low-power proposals discussed above, often resorting
to similar solutions for problems such as data accessibility among partitions, replacement, etc.
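Way-granularity partitioning of this kind can be sketched as follows. This is a minimal, hypothetical structure (the class, its trivial replacement policy, and the partition names are illustrative assumptions, not the paper's design): ways of a set-associative cache are assigned to named partitions, and each lookup or fill probes only its partition's ways.

```python
# Minimal sketch of a set-associative cache partitioned by ways.
class PartitionedCache:
    def __init__(self, sets, ways, partitions):
        # partitions: dict mapping a partition name -> list of way indices
        self.sets = sets
        self.partitions = partitions
        # tags[set][way]; None means an empty frame
        self.tags = [[None] * ways for _ in range(sets)]

    def _index_tag(self, addr):
        return addr % self.sets, addr // self.sets

    def lookup(self, partition, addr):
        idx, tag = self._index_tag(addr)
        # probe only the ways belonging to this partition
        return any(self.tags[idx][w] == tag
                   for w in self.partitions[partition])

    def fill(self, partition, addr):
        idx, tag = self._index_tag(addr)
        # trivial replacement: always victimize the partition's first way
        victim = self.partitions[partition][0]
        self.tags[idx][victim] = tag

# Three ways for ordinary caching, one way as an instruction-reuse partition
cache = PartitionedCache(64, 4, {"main": [0, 1, 2], "reuse": [3]})
cache.fill("reuse", 0x1234)
print(cache.lookup("reuse", 0x1234))  # True
print(cache.lookup("main", 0x1234))   # False: partitions are disjoint
```

The sketch makes visible the problems the text alludes to: each partition needs its own replacement bookkeeping, and serving partitions simultaneously would require per-partition address and data paths.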
4.9 PARALLEL SWITCHING-ACTIVITY
IN SET-ASSOCIATIVE CACHES
Besides cache resizing, which relates to cache capacity, one can attempt to optimize switching
on the basis of individual cache accesses (for a fixed capacity). Invariably, the effort to reduce
switching activity for an access centers on set-associative or fully associative organizations. There
is little opportunity to reduce switching activity in a straightforward direct-mapped orga-
nization, but the prospects for optimizing a naively designed associative cache are ample: in its
power-challenged incarnation, the associative cache consumes power in proportion to its associativity.
The parallel search in an associative cache is a prime example of parallel switching activity
purposed for performance. While it is known beforehand that all but one of the associative
ways will fail to produce a hit, all ways are still accessed in parallel for speed.
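The linear cost of this parallel search can be put in back-of-envelope form. The per-array energies below are illustrative assumptions (data arrays taken as several times costlier to read than tag arrays): a naive N-way access reads all N tag and data arrays at once, so per-access energy grows linearly with associativity even though at most one way hits.

```python
# Back-of-envelope energy model for a naive parallel associative access.
# E_TAG and E_DATA are illustrative per-array read energies (assumptions).
E_TAG = 1.0    # energy to read one tag array
E_DATA = 4.0   # energy to read one data array

def parallel_access_energy(ways):
    # all ways are probed in parallel for speed, so every tag and
    # data array is read on every access
    return ways * (E_TAG + E_DATA)

for w in (1, 2, 4, 8):
    print(w, parallel_access_energy(w))
```

Doubling associativity doubles per-access energy under this model, which is why the optimizations discussed next target the wasted reads of the ways that miss.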
Figure 4.23 depicts a simplified block diagram of a 4-way set-associative cache. Tag
and data arrays are shown for four ways. A comparator compares the tags and drives the
multiplexor for the data output. Of course, a real implementation could be markedly different
in how the tags and data arrays are combined or divided in sub-banks (e.g., the way CACTI