Hardware Reference
In-Depth Information
prediction is a hint that is assumed to be correct, and fetching begins in the predicted direc-
tion. If the hint turns out to be wrong, the prediction bit is inverted and stored back.
This buffer is effectively a cache where every access is a hit, and, as we will see, the per-
formance of the buffer depends on both how often the prediction is for the branch of interest
and how accurate the prediction is when it matches. Before we analyze the performance, it is
useful to make a small, but important, improvement in the accuracy of the branch-prediction
scheme.
This simple 1-bit prediction scheme has a performance shortcoming: Even if a branch is al-
most always taken, we will likely predict incorrectly twice, rather than once, when it is not
taken, since the misprediction causes the prediction bit to be lipped.
To remedy this weakness, 2-bit prediction schemes are often used. In a 2-bit scheme, a pre-
diction must miss twice before it is changed.
Figure C.18
shows the finite-state processor for a
2-bit prediction scheme.
FIGURE C.18
The states in a 2-bit prediction scheme
. By using 2 bits rather than 1, a
branch that strongly favors taken or not taken—as many branches do—will be mispredicted
less often than with a 1-bit predictor. The 2 bits are used to encode the four states in the sys-
tem. The 2-bit scheme is actually a specialization of a more general scheme that has an
n
-bit
saturating counter for each entry in the prediction buffer. With an
n
-bit counter, the counter
can take on values between 0 and 2
n
- 1: When the counter is greater than or equal to one-
half of its maximum value (2
n
- 1), the branch is predicted as taken; otherwise, it is predicted
as untaken. Studies of
n
-bit predictors have shown that the 2-bit predictors do almost as well,
thus most systems rely on 2-bit branch predictors rather than the more general
n
-bit predict-
ors.
A branch-prediction buffer can be implemented as a small, special “cache” accessed with
the instruction address during the IF pipe stage, or as a pair of bits atached to each block in
the instruction cache and fetched with the instruction. If the instruction is decoded as a branch
and if the branch is predicted as taken, fetching begins from the target as soon as the PC is
Search WWH ::
Custom Search