Optimizing Capacitance and Switching Activity to Reduce Dynamic Power - Computer Architecture Techniques for Power-Efficiency


Such a low accuracy for the estimator is disheartening for pipeline gating. Most of the time it

would stall correct execution. However, this holds only for a single low-confidence branch.

If more than one low-confidence branch enters the pipeline then the chances of going

down the wrong path increase substantially. In fact, for N low-confidence branches and an

average estimator accuracy of P (for each), the probability of going down the wrong path (i.e.,

having at least one misprediction) becomes $1 - (1 - P)^N$. Conveniently enough, evidence

shows that low-confidence predictions do tend to cluster together [88]. Pipeline gating is thus

engaged with more than one low-confidence branch in the pipeline; the actual number is called the

gating threshold. This makes the coverage of the estimator (detecting many low-confidence

branches) more important than its accuracy because it is the number of low-confidence branches

in the pipeline that matters, not the accuracy of each individual estimate. Manne et al. discuss several possible confidence

estimators for the gshare and the McFarling predictors, including

perfect (oracle) confidence estimation,

static (profiled) estimation allowing the customization of coverage versus accuracy,

Miss Distance Counter (MDC) estimator that independently keeps track of prediction correctness,

for the McFarling predictor an estimator, called “both strong”, based on the agreement of the saturating counters of the gshare and bimodal components, and

finally, for the gshare predictor a simple estimator based on the distance of a branch from the last low-confidence branch.
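Two of the estimators listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the hardware described by Manne et al.: the names `both_strong_is_confident` and `DistanceEstimator` are hypothetical, 2-bit saturating counters (0 to 3) are assumed, "both strong" is read as both counters being saturated in the same direction, and the distance estimator is assumed to measure distance from the last resolved misprediction with an arbitrary cutoff of 4 branches.

```python
STRONG_NOT_TAKEN, STRONG_TAKEN = 0, 3  # end states of an assumed 2-bit saturating counter


def both_strong_is_confident(gshare_ctr: int, bimodal_ctr: int) -> bool:
    """High confidence only when both counters are saturated and agree in direction."""
    return gshare_ctr == bimodal_ctr and gshare_ctr in (STRONG_NOT_TAKEN, STRONG_TAKEN)


class DistanceEstimator:
    """Treats branches that closely follow a misprediction as low-confidence.

    Assumption: distance is counted in branches since the last resolved
    misprediction; min_distance is a hypothetical cutoff.
    """

    def __init__(self, min_distance: int = 4):
        self.min_distance = min_distance
        self.since_last_miss = min_distance  # start out confident

    def is_confident(self) -> bool:
        return self.since_last_miss >= self.min_distance

    def update(self, prediction_was_correct: bool) -> None:
        # Reset on a miss; otherwise move one branch further away from it.
        if prediction_was_correct:
            self.since_last_miss += 1
        else:
            self.since_last_miss = 0
```

Raising `min_distance` flags more branches as low-confidence, which raises coverage at the cost of accuracy; this is the kind of trade-off the different estimators expose.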

Estimator details are not of much importance here; what matters is that different

estimators can be designed to trade coverage against accuracy. With the distance estimator for gshare,

both-strong for McFarling, and a gating threshold of 2, a significant part of incorrect

execution is eliminated without any perceptible impact on performance.
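As a rough illustration of the gating rule and of the wrong-path probability 1 - (1 - P)^N discussed earlier, the following Python sketch counts unresolved low-confidence branches and gates fetch once the count reaches the gating threshold. The names `wrong_path_probability` and `PipelineGate` are hypothetical, branches are assumed to be independent, and gating at "count >= threshold" is one reading of the text rather than a confirmed detail of the original design.

```python
def wrong_path_probability(p: float, n: int) -> float:
    """Probability that at least one of n low-confidence branches is mispredicted.

    p is the per-branch chance that a branch flagged low-confidence is really
    mispredicted; the branches are assumed to be independent.
    """
    return 1.0 - (1.0 - p) ** n


class PipelineGate:
    """Tracks unresolved low-confidence branches and decides when to gate fetch."""

    def __init__(self, gating_threshold: int = 2):
        self.gating_threshold = gating_threshold
        self.low_confidence_in_flight = 0

    def on_branch_fetched(self, is_low_confidence: bool) -> None:
        if is_low_confidence:
            self.low_confidence_in_flight += 1

    def on_branch_resolved(self, was_low_confidence: bool) -> None:
        if was_low_confidence and self.low_confidence_in_flight > 0:
            self.low_confidence_in_flight -= 1

    def fetch_is_gated(self) -> bool:
        # Assumed rule: gate once the count reaches the threshold.
        return self.low_confidence_in_flight >= self.gating_threshold
```

For an illustrative (assumed) P = 0.3, the function gives about 0.3 for one low-confidence branch and about 0.51 for two, which shows how the chance of being on a wrong path grows with each additional low-confidence branch in flight.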

To conclude this approach, one last question that needs to be addressed is the specific

pipeline stage to gate. The earlier the pipeline is gated, the more incorrect work is saved but also

the larger the penalty of stalling correct execution. This is not simply a function of the number of

pipeline stages before gating. The important factor here is the number of incorrect instructions

as we go deeper into the pipeline. Gating at the issue stage hardly saves any extraneous work

since very few incorrect instructions make it that deep in the pipeline. In contrast, the initial

stages of fetching, decoding, etc. can be full of incorrect-path instructions. With a gating

threshold of two or more, the chances of stalling correct execution are minuscule, so it pays to

gate as soon as possible (i.e., at the fetch stage).

Selective throttling: Subsequent work by Aragon, J. Gonzalez and A. Gonzalez followed

a different path. Instead of having a single mechanism to stall execution as in Manne et al.,
