The advantage of a tournament predictor is its ability to select the right predictor for a
particular branch, which is especially important for the integer benchmarks. A typical tournament
predictor will select the global predictor almost 40% of the time for the SPEC integer bench-
marks and less than 15% of the time for the SPEC FP benchmarks. In addition to the Alpha
processors that pioneered tournament predictors, recent AMD processors, including both the
Opteron and Phenom, have used tournament-style predictors.
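The per-branch selection described above can be sketched as a table of 2-bit saturating counters that choose between the two component predictions. This is a minimal illustration, not the design of any particular processor: the class name `TournamentSelector`, the 4096-entry table size, the indexing by low-order address bits, and the initial counter value are all assumptions made for the example. The global and local predictions are passed in as booleans rather than modeled here.

```python
class TournamentSelector:
    """Per-branch chooser between a global and a local prediction (illustrative)."""

    def __init__(self, entries=4096):
        # Assumed table size; each entry is a 2-bit saturating counter (0..3).
        # Values 0-1 favor the local predictor, 2-3 favor the global one.
        self.entries = entries
        self.counters = [1] * entries  # assumed start: weakly favor local

    def _index(self, branch_pc):
        # Assumed indexing: low-order bits of the branch address.
        return branch_pc % self.entries

    def uses_global(self, branch_pc):
        return self.counters[self._index(branch_pc)] >= 2

    def predict(self, branch_pc, global_pred, local_pred):
        # Forward whichever component prediction the counter currently favors.
        return global_pred if self.uses_global(branch_pc) else local_pred

    def update(self, branch_pc, global_pred, local_pred, taken):
        # Train only when the components disagree, so exactly one was right.
        if global_pred == local_pred:
            return
        i = self._index(branch_pc)
        if global_pred == taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


sel = TournamentSelector()
pc = 0x400123  # hypothetical branch address
# The global component is right twice while the local one is wrong.
for _ in range(2):
    sel.update(pc, global_pred=True, local_pred=False, taken=True)
print(sel.uses_global(pc))  # prints True
```

Because the counter moves only when the two components disagree, a branch that both predictors handle equally well leaves the selector untouched, and a single misstep by the favored predictor does not immediately flip the choice.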
Figure 3.4 looks at the performance of three different predictors (a local 2-bit predictor, a
correlating predictor, and a tournament predictor) for different numbers of bits using SPEC89
as the benchmark. As we saw earlier, the prediction capability of the local predictor does not
improve beyond a certain size. The correlating predictor shows a significant improvement,
and the tournament predictor generates slightly better performance. For more recent versions
of SPEC, the results would be similar, but the asymptotic behavior would not be reached
until slightly larger predictor sizes.
FIGURE 3.4 The misprediction rate for three different predictors on SPEC89 as the
total number of bits is increased. The predictors are a local 2-bit predictor, a correlating
predictor that is optimally structured in its use of global and local information at each point in
the graph, and a tournament predictor. Although these data are for an older version of SPEC,
data for more recent SPEC benchmarks would show similar behavior, perhaps converging to
the asymptotic limit at slightly larger predictor sizes.
The local predictor is itself a two-level predictor. The top level is a local history table
consisting of 1024 10-bit entries; each 10-bit entry corresponds to the most recent 10 branch
outcomes for the entry. That is, if the branch was taken 10 or more times in a row, the entry in
the local history table will be all 1s. If the branch is alternately taken and untaken, the history
entry consists of alternating 0s and 1s. This 10-bit history allows patterns of up to 10 branches
to be discovered and predicted. The selected entry from the local history table is used to index
 