Figure 10.9 The 21264 instruction pipeline (S #1 Fetch, S #2 Slot Assignment, S #3 Rename, S #4 Issue, S #5 Register Read, S #6 Execute, S #7 Memory)
cache. This technique achieves a hit ratio of more than 85%. The misprediction
penalty is a single cycle. The 21264 uses speculative branch prediction based on a
two-level scheme. The scheme rests on the observation that branches exhibit both
local and global correlation. Local correlation makes use of the branch's own past
behavior. Global correlation, on the other hand, makes use of the past behavior of
all previous branches. The combined local/global prediction used in the 21264
correlates the branch behavior pattern with local branch history, that is, the
execution of a single branch at a unique PC location, and global branch history,
that is, the execution of all previous branches. The scheme dynamically selects
between local and global branch history (Fig. 10.11).
The local branch predictor has two tables. The first is a 1024 × 10 local history
table in which each entry holds a 10-bit local history of the selected branch over
its last 10 executions. The local history table is indexed by the instruction address
(using the PC). The second table is a 1024 × 3 local prediction table, indexed by
the 10-bit local history, in which each entry has a 3-bit saturating counter to
predict the branch outcome. After a branch retires, the 21264 updates the local
history table with the true branch direction and updates the referenced counter.
This improves the chance of a correct prediction and is called predictor training.
The global branch predictor has a 4096 × 2 global prediction table in which each
entry holds a 2-bit saturating counter. It keeps track of the global history of the
last 12 branches, and this 12-bit history indexes the global prediction table. A
4096 × 2 choice prediction table, also indexed by the global history, selects
between the local and global predictions. After a branch retires, the 21264 updates
the referenced global prediction counter, improving the chance of a correct
prediction.
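The global predictor described above can be sketched in a few lines of Python. This is only an illustrative model, not the 21264's actual logic: it assumes the 4096-entry table is indexed directly by the 12-bit global history, and the class name `GlobalPredictor`, the helper `count_mispredictions`, and the initial counter value are hypothetical choices made for the example.

```python
class GlobalPredictor:
    """Illustrative model: 12-bit global history, 4096 two-bit counters."""

    HISTORY_BITS = 12
    TABLE_SIZE = 1 << HISTORY_BITS  # 4096 entries

    def __init__(self):
        self.history = 0  # outcomes of the last 12 branches, newest in bit 0
        # Assumption: counters start at 1 (weakly not-taken).
        self.counters = [1] * self.TABLE_SIZE

    def predict(self):
        # The upper half of the 2-bit range (2 or 3) predicts taken.
        return self.counters[self.history] >= 2

    def update(self, taken):
        # Train the referenced counter with the true direction (at retirement),
        # saturating at 0 and 3, then shift the outcome into the history.
        c = self.counters[self.history]
        self.counters[self.history] = min(c + 1, 3) if taken else max(c - 1, 0)
        self.history = ((self.history << 1) | int(taken)) & (self.TABLE_SIZE - 1)


def count_mispredictions(predictor, outcomes):
    """Feed a sequence of branch outcomes and count wrong predictions."""
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses
```

Feeding the model a repeating taken, taken, taken, not-taken pattern shows the training effect: once the history register has filled and each history value has been seen, every history pattern maps to a counter that already leans the right way.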
Local prediction is useful in the case of an alternating taken/not-taken sequence
of a given branch. In this case, the local history of the branch will eventually
resolve to a pattern of ten alternating zeros and ones, indicating that the branch
is taken or not taken on alternate encounters. As the branch executes multiple
times, it saturates the prediction counters corresponding to the local history
values and hence makes the prediction correct.
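The alternating case can be tried out on a small model of the two local tables. The sketch below is an assumption-laden illustration rather than the 21264's real circuitry: the PC-to-entry mapping (dropping the two low address bits), the initial counter value, and the names `LocalPredictor` and `count_mispredictions` are all hypothetical.

```python
class LocalPredictor:
    """Illustrative model: 1024 x 10-bit histories, 1024 x 3-bit counters."""

    ENTRIES = 1024
    HISTORY_BITS = 10
    HISTORY_MASK = (1 << HISTORY_BITS) - 1

    def __init__(self):
        # Local history table: one 10-bit history per entry, indexed by PC.
        self.history = [0] * self.ENTRIES
        # Local prediction table: one 3-bit counter per history pattern.
        # Assumption: counters start at 3 (weakly not-taken).
        self.counters = [3] * (1 << self.HISTORY_BITS)

    def _slot(self, pc):
        # Assumption: 4-byte instructions, so the two low PC bits are dropped.
        return (pc >> 2) % self.ENTRIES

    def predict(self, pc):
        # The upper half of the 3-bit range (4..7) predicts taken.
        return self.counters[self.history[self._slot(pc)]] >= 4

    def update(self, pc, taken):
        # Predictor training: adjust the referenced counter, saturating at
        # 0 and 7, then record the true direction in the branch's history.
        slot = self._slot(pc)
        h = self.history[slot]
        c = self.counters[h]
        self.counters[h] = min(c + 1, 7) if taken else max(c - 1, 0)
        self.history[slot] = ((h << 1) | int(taken)) & self.HISTORY_MASK


def count_mispredictions(predictor, pc, outcomes):
    """Run one branch at `pc` through a sequence of outcomes; count misses."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses
```

With a branch that alternates taken and not-taken, the history entry settles into the two 10-bit alternating patterns (0b1010101010 and 0b0101010101). Each pattern selects its own counter, one trained toward taken and the other toward not-taken, so the mispredictions stop after a short warm-up.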
Figure 10.10 The 21264 fetch stage (blocks shown: local and global branch predictors, block/set prediction, 64 KB 2-way instruction cache)