main characteristics for the three approaches. Equally important to the partitioning technique
is the method for selecting a cache configuration to achieve power or performance goals.
4.8.1 Trading Memory Between Cache Levels
Cache resizing was also proposed in Albonesi's paper on complexity-adaptive structures along
with instruction queue resizing [7]. Both techniques rely on structures partitioned into segments
using buffered wires. Regarding caches, the whole memory comprising the cache hierarchy is
assumed to be segmented in this manner.
Albonesi's proposal calls for a variable division between the L1 and the L2. This dynamic
division is based on assigning memory segments to be either in the L1 or in the L2.
Architecturally, the two caches are resized by increasing or decreasing their associativity—not
by changing the number of sets. Thus, cache indices remain the same throughout size changes.
This is necessary to avoid making resident data inaccessible after a change in indexing.
Furthermore, cache exclusion is imposed between the L1 and the L2, guaranteeing that data remain
unique regardless of the movable boundary between the two levels. Cache inclusion, on the
other hand, can result in the same data appearing twice in the same cache. This is possible if
two copies of the same data initially residing in the L1 and the L2, respectively, end up in the
same cache after a resizing operation.
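The interplay of way-based resizing and exclusion described above can be illustrated with a small model. This is only a sketch under stated assumptions: the line size, set count, number of ways, and every name in it (split, SegmentedHierarchy, place, level_of, resize) are hypothetical and do not come from Albonesi's paper, and both levels are assumed to share one index function.

```python
LINE_BYTES = 64   # assumed line size (not given in the text)
NUM_SETS = 256    # assumed set count; never changes when the caches are resized
TOTAL_WAYS = 8    # assumed number of memory segments, one way each


def split(addr):
    """Return (set index, tag). Neither depends on how many ways a level owns."""
    block = addr // LINE_BYTES
    return block % NUM_SETS, block // NUM_SETS


class SegmentedHierarchy:
    """Ways [0, l1_ways) belong to the L1; the remaining ways belong to the L2."""

    def __init__(self, l1_ways):
        self.l1_ways = l1_ways
        self.ways = [dict() for _ in range(TOTAL_WAYS)]  # per way: {index: tag}

    def place(self, addr, way):
        index, tag = split(addr)
        # Exclusion: drop any other copy, so a block lives in at most one way.
        for w in self.ways:
            if w.get(index) == tag:
                del w[index]
        self.ways[way][index] = tag

    def level_of(self, addr):
        index, tag = split(addr)
        for n, w in enumerate(self.ways):
            if w.get(index) == tag:
                return "L1" if n < self.l1_ways else "L2"
        return None

    def resize(self, l1_ways):
        # Only the boundary moves; stored data and indexing are untouched.
        if not 1 <= l1_ways < TOTAL_WAYS:
            raise ValueError("each level must keep at least one way")
        self.l1_ways = l1_ways


h = SegmentedHierarchy(l1_ways=2)
h.place(0x12340, way=3)       # way 3 currently belongs to the L2
level_before = h.level_of(0x12340)
h.resize(4)                   # way 3 now belongs to the L1
level_after = h.level_of(0x12340)
```

Because the block never changes its set index, moving the boundary simply relabels the way that holds it. Under inclusion, the copy removal in place would be absent, and a boundary move could leave two copies of the same block inside one level.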
The variable boundary between the L1 and the L2 is motivated by performance. Making
the L1 smaller allows for a faster clock (the latency of the cache in cycles does not change), while
making it larger increases its hit ratio. In this initial work, no attempt is made to dynamically
control the configuration of the caches. Instead, all possible configurations are studied, each
persisting throughout the execution of a program.
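The clock-versus-hit-ratio trade-off and the static exploration of configurations can be sketched as follows. Everything here is an assumption made for illustration: the cost model (average access time with fixed cycle counts), the names Config, avg_access_ns, and best_static_config, and above all the numbers, which are placeholders and not results from the paper. The model also ignores the effect of resizing on the L2's own miss rate.

```python
from typing import NamedTuple


class Config(NamedTuple):
    l1_ways: int         # L1 size expressed in ways
    cycle_ns: float      # clock period this L1 size permits (placeholder)
    l1_miss_rate: float  # fraction of L1 accesses that miss (placeholder)


def avg_access_ns(cfg, l1_cycles=2, l2_cycles=12):
    """Average access time; latencies in cycles are held fixed (assumed values).

    Only the clock period and the miss rate vary between configurations.
    """
    return cfg.cycle_ns * (l1_cycles + cfg.l1_miss_rate * l2_cycles)


def best_static_config(configs):
    """Pick the single configuration to keep for the whole program run."""
    return min(configs, key=avg_access_ns)


# Invented placeholder values: a smaller L1 clocks faster but misses more.
configs = [
    Config(l1_ways=1, cycle_ns=1.0, l1_miss_rate=0.10),
    Config(l1_ways=2, cycle_ns=1.1, l1_miss_rate=0.06),
    Config(l1_ways=4, cycle_ns=1.3, l1_miss_rate=0.04),
]
best = best_static_config(configs)
```

With these made-up inputs neither the smallest nor the largest L1 wins, which is the kind of program-dependent outcome that makes an exhaustive per-program sweep informative.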
Although this complexity-adaptive scheme yields performance benefits (depending on
the program and the configuration), no assessment is provided of its impact on power
consumption. However, the change in associativity in the L1 and the L2 (magnified by the
difference in the number of accesses between the two caches) can affect power consumption,
even though the total amount of active memory remains constant.
Following the initial proposal for the variable L1/L2 division, Balasubramonian,
Albonesi, Buyuktosunoglu, and Dwarkadas take it one step further by proposing a more specific
and more detailed cache organization to achieve the same goal [21]. More importantly, they
also propose mechanisms to control the configuration of the caches at run-time.
The organization is based on a 2MB physical cache which is partitioned into four distinct
512KB subarrays. Each subarray is further partitioned into four segments with the help of
repeaters in the wordlines. Each of these segments acts as an associative way, either allocated
to the L1 or to the L2. Figure 4.17 shows the organization of the physical cache.
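The sizes implied by this organization follow from simple arithmetic, sketched below. The constants come from the text; the partition helper and the particular split passed to it are hypothetical, since this passage does not say which L1/L2 allocations the design actually permits.

```python
KB = 1024

TOTAL_BYTES = 2 * 1024 * KB   # 2MB physical cache (from the text)
SUBARRAYS = 4                 # four 512KB subarrays (from the text)
SEGMENTS_PER_SUBARRAY = 4     # four segments per subarray (from the text)

SUBARRAY_BYTES = TOTAL_BYTES // SUBARRAYS
SEGMENT_BYTES = SUBARRAY_BYTES // SEGMENTS_PER_SUBARRAY
TOTAL_SEGMENTS = SUBARRAYS * SEGMENTS_PER_SUBARRAY


def partition(l1_segments):
    """Return (L1 bytes, L2 bytes) when l1_segments segments go to the L1.

    Hypothetical helper: it only checks that both levels get some memory,
    not whether the real design allows this particular split.
    """
    if not 0 < l1_segments < TOTAL_SEGMENTS:
        raise ValueError("both levels need at least one segment")
    l1_bytes = l1_segments * SEGMENT_BYTES
    return l1_bytes, TOTAL_BYTES - l1_bytes


l1_bytes, l2_bytes = partition(4)
print(SEGMENT_BYTES // KB, "KB per segment,", TOTAL_SEGMENTS, "segments")
print("L1:", l1_bytes // KB, "KB  L2:", l2_bytes // KB, "KB")
```

Each segment therefore holds 128KB, and whatever the split, the two levels always add up to the full 2MB.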