FIGURE 4.34: Micro-Operation Cache (µC) in the P6 architecture. Traces are built as uops are issued after the decode stage. Uop traces are delivered to the issue stage at the same time as the normal front-end path would deliver them. From [210]. Copyright 2001 IEEE.
the uops are not delivered to the issue stage until after 4 more cycles (stages). This ensures that there is no bubble in the pipeline when switching back and forth between streaming uops out of the µC and fetching IA-32 instructions from the instruction cache and decoding them.
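A small timing sketch makes the latency-matching argument concrete. The Python fragment below is a toy model under assumed parameters (a 4-cycle decode latency and a front end that starts one block per cycle), not a description of the actual P6/Pentium-4 implementation; it only shows why equalizing the delivery latency of the two paths removes switch-over bubbles.

# Toy model of the latency matching described above. Each front-end block is
# served either by the uop cache ('hit') or by fetch+decode ('miss'), and uops
# must reach the issue stage in program order. The 4-cycle decode latency and
# one-block-per-cycle front end are illustrative assumptions.

def delivery_cycles(blocks, uc_latency, decode_latency=4):
    """Cycle at which each block's uops reach the issue stage."""
    deliveries = []
    for start, path in enumerate(blocks):        # block enters the front end at cycle 'start'
        ready = start + (uc_latency if path == 'hit' else decode_latency)
        if deliveries:                           # deliveries stay in program order
            ready = max(ready, deliveries[-1] + 1)
        deliveries.append(ready)
    return deliveries

def bubbles(deliveries):
    """Idle issue cycles between consecutive deliveries."""
    return sum(b - a - 1 for a, b in zip(deliveries, deliveries[1:]))

blocks = ['hit', 'hit', 'miss', 'miss', 'hit', 'hit']

# uC deliveries delayed to match the decode pipeline: switching paths opens no bubble.
print(bubbles(delivery_cycles(blocks, uc_latency=4)))   # -> 0

# A hypothetical eager uC that delivers immediately: switching from streaming uops
# back to fetch+decode leaves the issue stage idle while the decode pipeline refills.
print(bubbles(delivery_cycles(blocks, uc_latency=0)))   # -> 4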
The benefits for often-repeating traces, of course, are significant. Solomon et al. report that 75% of all instruction decoding (hence, uop translation) is eliminated using a moderately sized micro-operation cache (e.g., 64 sets × 6-way associativity × 6 uops/line). This translates to a 10% reduction in the processor's total power for the P6 architecture [210].

The Pentium-4 trace cache is a prime example of a power-saving technique eliminating repetitive and cacheable computation (decoding). But at the same time it is also a cache hierarchy optimization, similar to the loop cache.
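The arithmetic connecting these two figures is simple enough to write down. The sketch below is a back-of-the-envelope model only: the fetch/decode share of total power and the µC overhead are assumed values chosen so that the example lands near the reported numbers; the 75% decode elimination is the figure from Solomon et al. [210].

# Back-of-the-envelope model relating decode elimination to total power savings.
# Only the 75% figure comes from the text; the power shares below are assumptions.

def total_power_savings(decode_fraction_eliminated,
                        decode_power_share,
                        uop_cache_overhead_share):
    """Estimated fraction of total processor power saved by the uop cache.

    decode_fraction_eliminated: share of decodes served from the uC (e.g., 0.75)
    decode_power_share:         fetch/decode power as a share of total power (assumed)
    uop_cache_overhead_share:   power added by the uC itself, as a share of total (assumed)
    """
    return decode_fraction_eliminated * decode_power_share - uop_cache_overhead_share

# With ~15% of total power assumed in fetch/decode and ~1% uC overhead, eliminating
# 75% of decoding lands near the ~10% total-power reduction quoted above.
print(total_power_savings(0.75, 0.15, 0.01))   # -> ~0.10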
4.11 SPECULATIVE ACTIVITY
Speculative switching activity is a high-level form of switching activity that arises from speculative execution. Wide superscalar processors need a constant supply of instructions, not only to keep multiple functional units busy when this is feasible, but also to make forward progress in the face of costly cache misses. Although there is significant instruction-level parallelism in many programs, we have reached a point where it is a struggle to maintain an IPC of 1 at the highest frequencies.
Branch prediction is a necessity in this situation. It supplies more independent instructions to keep the functional units busy until the next cache miss. However, even sophisticated branch prediction may not be enough to avoid complete stalls [126]. Prediction, of course, leads to speculation: instructions are executed speculatively until the correct execution path is verified. Besides the actual power consumption overhead of supporting branch prediction and speculative execution (e.g., prediction structures, support for checkpointing, increased run-time state, etc.), there is also the issue of incorrect execution. Incorrect speculative execution that is discarded when the branch is resolved is, for the most part, wasted switching activity. This
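To see how wrong-path execution adds up at the level of total switching activity, a minimal back-of-the-envelope sketch follows. The branch frequency, misprediction rate, and wrong-path instruction count used here are illustrative assumptions, not measurements from the text.

# Rough model of wasted switching activity due to wrong-path (mis-speculated) work.
# All parameter values are illustrative assumptions.

def wasted_work_fraction(branch_freq, mispredict_rate, wrong_path_insts):
    """Fraction of all executed instructions (committed + squashed) that are
    wrong-path and therefore wasted activity.

    branch_freq:      branches per committed instruction (e.g., 0.2)
    mispredict_rate:  mispredictions per branch (e.g., 0.05)
    wrong_path_insts: average instructions fetched/executed down the wrong path
                      before the branch resolves (roughly pipeline depth x width)
    """
    wrong_path_per_inst = branch_freq * mispredict_rate * wrong_path_insts
    return wrong_path_per_inst / (1.0 + wrong_path_per_inst)

# Example: one branch every 5 instructions, 5% mispredictions, ~20 wrong-path
# instructions per misprediction -> roughly 17% of all activity is wasted.
print(wasted_work_fraction(0.2, 0.05, 20))   # -> ~0.17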