When a hit takes place on a line in MRU position i, the corresponding counter MRU[ i ] is incremented.
These statistics are important because hits in various MRU positions correspond to hits
in different cache configurations. Hits in the first MRU position correspond to hits in a direct-
mapped cache; the combined hits in the first and second MRU position correspond to hits in
a two-way set-associative cache; and so on. Thus, hits in any configuration of the primary and
secondary groups can be derived simply by summing up hits in the appropriate MRU positions.
This leads to one-shot configuration by allowing one to assess in one go all possible outcomes
and select the “best” configuration. In contrast, a configuration search would have to try each
and every configuration for an entire interval and then make a decision.
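The derivation of per-configuration hit counts from the MRU-position counters can be sketched as follows; the counter values and function name are illustrative, not the authors' implementation:

```python
# Hits in MRU position i correspond to hits a cache of associativity > i
# would capture, so the hits of a k-way configuration are simply the sum
# of the first k MRU-position counters.

def hits_for_associativity(mru_hits, ways):
    """Hits a `ways`-way set-associative cache would have seen."""
    return sum(mru_hits[:ways])

mru_hits = [500, 120, 40, 15]   # example counters for one interval

direct_mapped = hits_for_associativity(mru_hits, 1)  # 500
two_way = hits_for_associativity(mru_hits, 2)        # 620
four_way = hits_for_associativity(mru_hits, 4)       # 675
```

One pass over the counters thus prices every candidate associativity at once, which is what makes the configuration "one-shot" rather than a trial-and-error search.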
Here is how one-shot configuration works in more detail. Statistics are gathered in intervals of 100 000 instructions. Since the statistics are independent of the cache configuration used during the interval, they can be used to try "what if" scenarios for any cache configuration. Assuming that the statistics of one interval are a good indication of the behavior of the next, the most appropriate configuration for the next interval can thus be identified.
The “what if” scenarios use simple memory access latency and energy cost models. These
models calculate the effective memory latency and the energy of a configuration as a function
of the hits in its primary and secondary groups. The calculations are performed in a software
interrupt handler which also decides on the next configuration.
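A minimal sketch of such a "what if" evaluation is shown below. The latency and energy constants, the dictionary fields, and the primary/secondary split are assumptions for illustration; the actual cost models are not reproduced here.

```python
# Evaluate one candidate configuration from interval statistics.
# Every access probes the primary group; the secondary group is probed
# only on a primary miss; remaining accesses go to the next memory level.

def evaluate(config, mru_hits, accesses):
    primary_hits = sum(mru_hits[:config["primary_ways"]])
    secondary_hits = sum(mru_hits[config["primary_ways"]:config["total_ways"]])
    misses = accesses - primary_hits - secondary_hits

    latency = (primary_hits * config["t_primary"]
               + secondary_hits * config["t_secondary"]
               + misses * config["t_miss"])
    energy = (accesses * config["e_primary"]          # primary probed always
              + secondary_hits * config["e_secondary"]
              + misses * config["e_miss"])
    return latency, energy

cfg = {"primary_ways": 1, "total_ways": 4,
       "t_primary": 1, "t_secondary": 3, "t_miss": 20,
       "e_primary": 1.0, "e_secondary": 2.0, "e_miss": 10.0}

lat, en = evaluate(cfg, [500, 120, 40, 15], accesses=700)  # 1525, 1300.0
```

Running `evaluate` for each candidate configuration and picking the lowest-energy one whose estimated latency stays within the tolerance is exactly the decision the interrupt handler makes.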
The policy for deciding the next configuration is to go for the lowest energy consumption given a limit on the tolerated performance loss (called the tolerance level ). This sounds similar to the policy used in selective cache ways, but it goes further: it has memory. It keeps an account of what happens in each interval and builds credit or debit for both performance and energy. So, for example, if previous configurations performed better than the corresponding estimates indicated, the policy becomes more aggressive in trying to reduce energy, since it has performance credit . Conversely, if a performance deficit has accumulated from previous configurations, the policy has to make up for it, giving up on energy reduction.
This accounting scheme is a consequence of one-shot configuration relying on an estimate of what will happen in the upcoming interval. That estimate relies, in turn, on the assumption that the measured statistics do not differ noticeably from interval to interval. In reality they do differ. Accounting normalizes the differences between the estimated and the actual behavior by factoring the accumulated credit or debit into the next configuration decision.
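The credit/debit idea can be sketched as a small running account; the class and field names are hypothetical, chosen only to show the mechanism:

```python
# Track the running difference between estimated and actual performance,
# so the next decision can spend banked credit on energy savings (or must
# repay a deficit by choosing faster, less energy-efficient configurations).

class PerformanceAccount:
    def __init__(self):
        self.credit = 0  # positive: ran faster than the estimates predicted

    def settle(self, estimated_cycles, actual_cycles):
        """Called at the end of an interval to reconcile estimate vs. reality."""
        self.credit += estimated_cycles - actual_cycles

    def budget(self, baseline_cycles, tolerance):
        """Cycles the next interval may lose versus the full-performance
        baseline: the tolerated slack plus any accumulated credit."""
        return baseline_cycles * tolerance + self.credit

acct = PerformanceAccount()
acct.settle(estimated_cycles=1000, actual_cycles=900)  # banked 100 cycles
slack = acct.budget(baseline_cycles=10_000, tolerance=1 / 16)
```

With a tolerance of 1/16 and 100 cycles of credit, the next interval may tolerate 725 cycles of slowdown instead of 625, so the policy can pick a lower-energy configuration than the tolerance alone would allow.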
The accounting cache yields very good power results with a rather small impact on performance. As Figure 4.20 shows, for tolerance settings of 1/64, 1/16, and 1/4 (1.5%, 6.25%, and 25% in the graph), energy savings range from 54% to 58% for the instruction L1, 29% to 45% for the data L1, and 25% to 63% for a unified L2 with parallel tag/data access. Overall, for