Branch PC (word address)    Outcome
454                         T
543                         NT
777                         NT
543                         NT
777                         NT
454                         T
777                         NT
454                         T
543                         T
3.18 [10] <3.9> Suppose we have a deeply pipelined processor, for which we implement a
branch-target buffer for the conditional branches only. Assume that the misprediction
penalty is always four cycles and the buffer miss penalty is always three cycles. Assume a
90% hit rate, 90% accuracy, and 15% branch frequency. How much faster is the processor
with the branch-target buffer versus a processor that has a fixed two-cycle branch penalty?
Assume a base clock cycles per instruction (CPI) without branch stalls of one.
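The arithmetic for 3.18 can be set up as a short script. This is only a sketch under one common reading of the problem: a branch that hits in the buffer but is mispredicted pays the four-cycle penalty, every buffer miss pays the three-cycle penalty, and correctly predicted hits cost nothing. The function and variable names are illustrative and do not come from the exercise.

```python
def cpi_with_btb(base_cpi, branch_freq, hit_rate, accuracy,
                 mispredict_penalty, miss_penalty):
    """CPI with a branch-target buffer (BTB).

    Assumption: a hit that is mispredicted costs `mispredict_penalty`,
    every buffer miss costs `miss_penalty`, and a correct hit costs 0.
    """
    stall_per_branch = (hit_rate * (1 - accuracy) * mispredict_penalty
                        + (1 - hit_rate) * miss_penalty)
    return base_cpi + branch_freq * stall_per_branch


def cpi_fixed_penalty(base_cpi, branch_freq, penalty):
    """CPI when every branch pays the same fixed penalty."""
    return base_cpi + branch_freq * penalty


# Values taken from the problem statement.
btb_cpi = cpi_with_btb(base_cpi=1.0, branch_freq=0.15, hit_rate=0.90,
                       accuracy=0.90, mispredict_penalty=4, miss_penalty=3)
fixed_cpi = cpi_fixed_penalty(base_cpi=1.0, branch_freq=0.15, penalty=2)

# Same clock rate and instruction count, so speedup is the CPI ratio.
speedup = fixed_cpi / btb_cpi

print(f"CPI with BTB:       {btb_cpi:.3f}")
print(f"CPI, fixed penalty: {fixed_cpi:.3f}")
print(f"Speedup:            {speedup:.3f}")
```

If the problem is read differently (for example, if only taken branches that miss in the buffer pay the miss penalty), the `stall_per_branch` expression is the one line that would need to change.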
3.19 [10/5] <3.9> Consider a branch-target buffer that has penalties of zero, two, and two
clock cycles for correct conditional branch prediction, incorrect prediction, and a buffer
miss, respectively. Consider a branch-target buffer design that distinguishes conditional
and unconditional branches, storing the target address for a conditional branch and the
target instruction for an unconditional branch.
a. [10] <3.9> What is the penalty in clock cycles when an unconditional branch is found
in the buffer?
b. [10] <3.9> Determine the improvement from branch folding for unconditional
branches. Assume a 90% hit rate, an unconditional branch frequency of 5%, and a
two-cycle penalty for a buffer miss. How much improvement is gained by this
enhancement? How high must the hit rate be for this enhancement to provide a
performance gain?
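One way to explore part (b) numerically is sketched below. It rests on assumptions the exercise does not state: a base CPI of one, a folded unconditional branch that hits in the buffer modeled as a penalty of -1 cycle (the stored target instruction replaces the branch, so one instruction slot is saved), and an unfolded hit costing 0 cycles. All names are hypothetical.

```python
def branch_cpi_term(freq, hit_rate, hit_penalty, miss_penalty):
    """Average stall cycles per instruction contributed by one branch class."""
    return freq * (hit_rate * hit_penalty + (1 - hit_rate) * miss_penalty)


BASE_CPI = 1.0       # assumption: not given in the exercise
UNCOND_FREQ = 0.05   # from the problem statement
HIT_RATE = 0.90      # from the problem statement
MISS_PENALTY = 2     # from the problem statement

# Without folding: the buffer supplies only the target address (hit costs 0).
cpi_no_fold = BASE_CPI + branch_cpi_term(UNCOND_FREQ, HIT_RATE, 0, MISS_PENALTY)

# With folding: assumed hit penalty of -1 cycle (branch slot is eliminated).
cpi_fold = BASE_CPI + branch_cpi_term(UNCOND_FREQ, HIT_RATE, -1, MISS_PENALTY)

improvement = cpi_no_fold / cpi_fold

print(f"CPI without folding: {cpi_no_fold:.3f}")
print(f"CPI with folding:    {cpi_fold:.3f}")
print(f"Improvement:         {improvement:.4f}")
```

Under these assumptions the two CPI values differ by `UNCOND_FREQ * HIT_RATE`, so varying `HIT_RATE` in the script shows how the gain depends on the hit rate. A different baseline for the comparison would give a different break-even point.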