Pipelining: Basic and Intermediate Concepts - Computer Architecture: A Quantitative Approach


FIGURE C.14 Scheduling the branch delay slot. The top box in each pair shows the code before scheduling; the bottom box shows the scheduled code. In (a), the delay slot is scheduled with an independent instruction from before the branch. This is the best choice. Strategies (b) and (c) are used when (a) is not possible. In the code sequences for (b) and (c), the use of R1 in the branch condition prevents the DADD instruction (whose destination is R1) from being moved after the branch. In (b), the branch delay slot is scheduled from the target of the branch; usually the target instruction will need to be copied because it can be reached by another path. Strategy (b) is preferred when the branch is taken with high probability, such as a loop branch. Finally, the branch may be scheduled from the not-taken fall-through as in (c). To make this optimization legal for (b) or (c), it must be OK to execute the moved instruction when the branch goes in the unexpected direction. By OK we mean that the work is wasted, but the program will still execute correctly. This is the case, for example, in (c) if R7 were an unused temporary register when the branch goes in the unexpected direction.
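The legality conditions in the caption can be checked mechanically on a toy model. The following Python sketch is an illustration only, not the figure's actual code: it assumes a tiny interpreter with a single branch delay slot, and the instruction sequences (the `BEQZ`, `OR`, and `DSUB` instructions, and every register other than R1 and R7) are hypothetical choices made to resemble the cases the caption describes. The helper names `run` and `same_result` are likewise invented here.

```python
def run(program, regs):
    """Run a toy program with ONE branch delay slot; return the final registers.

    Instructions are tuples (an assumed mini-ISA, for illustration only):
      ("DADD", rd, rs, rt)   rd = rs + rt
      ("DSUB", rd, rs, rt)   rd = rs - rt
      ("OR",   rd, rs, rt)   rd = rs | rt
      ("BEQZ", rs, target)   branch to index `target` if rs == 0,
                             but only after the next instruction has executed
      ("NOP",)
    """
    regs = dict(regs)              # work on a copy
    pc, pending = 0, None          # pending = target of a taken branch
    while 0 <= pc < len(program):
        instr = program[pc]
        # If the previous instruction was a taken branch, this instruction
        # is its delay slot, and control transfers once it completes.
        next_pc = pc + 1 if pending is None else pending
        pending = None
        op = instr[0]
        if op == "BEQZ":
            _, rs, target = instr
            if regs[rs] == 0:
                pending = target
        elif op != "NOP":
            _, rd, rs, rt = instr
            a, b = regs[rs], regs[rt]
            regs[rd] = {"DADD": a + b, "DSUB": a - b, "OR": a | b}[op]
        pc = next_pc
    return regs


def same_result(before, after, regs, dead=()):
    """True if both programs agree on every register not listed in `dead`."""
    a, b = run(before, regs), run(after, regs)
    return all(a[r] == b[r] for r in a if r not in dead)


# Strategy (a): the branch tests R2, so the DADD writing R1 is independent
# of the branch and can be moved from before it into the delay slot.
BEFORE_A = [("DADD", "R1", "R2", "R3"), ("BEQZ", "R2", 4), ("NOP",),
            ("OR", "R7", "R8", "R9"), ("DSUB", "R4", "R5", "R6")]
AFTER_A = [("BEQZ", "R2", 3), ("DADD", "R1", "R2", "R3"),
           ("OR", "R7", "R8", "R9"), ("DSUB", "R4", "R5", "R6")]

# Strategy (c): the branch tests R1, so the DADD must stay put; the slot is
# filled with the OR from the fall-through path instead.
BEFORE_C = [("DADD", "R1", "R2", "R3"), ("BEQZ", "R1", 4), ("NOP",),
            ("OR", "R7", "R8", "R9"), ("DSUB", "R4", "R5", "R6")]
AFTER_C = [("DADD", "R1", "R2", "R3"), ("BEQZ", "R1", 3),
           ("OR", "R7", "R8", "R9"), ("DSUB", "R4", "R5", "R6")]

if __name__ == "__main__":
    regs = {f"R{i}": i for i in range(1, 10)}
    taken_c = dict(regs, R2=2, R3=-2)   # makes R1 == 0, so the branch is taken
    print(same_result(BEFORE_A, AFTER_A, dict(regs, R2=0)))    # True
    print(same_result(BEFORE_C, AFTER_C, taken_c, dead=("R7",)))  # True
    print(same_result(BEFORE_C, AFTER_C, taken_c))             # False
```

In this model, strategy (a) preserves every register whichever way the branch goes, while strategy (c) changes R7 on the taken path. The rescheduled code for (c) is therefore equivalent only under the assumption that R7 is dead along that path, which is the sense of "OK" used in the caption.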

The limitations on delayed-branch scheduling arise from: (1) the restrictions on the instructions that are scheduled into the delay slots, and (2) our ability to predict at compile time whether a branch is likely to be taken or not. To improve the ability of the compiler to fill branch delay slots, most processors with conditional branches have introduced a canceling or nullifying branch. In a canceling branch, the instruction includes the direction that the branch was predicted. When the branch behaves as predicted, the instruction in the branch delay slot is simply executed as it would normally be with a delayed branch. When the branch is incorrectly predicted, the instruction in the branch delay slot is simply turned into a no-op.
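The behavior of a canceling branch reduces to a small decision rule. The Python sketch below is a minimal illustration under assumed names: `delay_slot_action` and its parameters are hypothetical and do not correspond to any particular instruction set's encoding.

```python
def delay_slot_action(predicted_taken, actually_taken, canceling=True):
    """Return what happens to the instruction in the branch delay slot.

    predicted_taken: direction encoded in the branch instruction by the compiler
    actually_taken:  direction the branch resolves to at run time
    canceling:       False models an ordinary delayed branch, whose delay-slot
                     instruction always executes
    """
    if not canceling:
        return "execute"
    # Canceling (nullifying) branch: the slot instruction survives only
    # when the run-time outcome matches the compile-time prediction.
    return "execute" if predicted_taken == actually_taken else "nop"


if __name__ == "__main__":
    for predicted in (True, False):
        for actual in (True, False):
            print(predicted, actual, delay_slot_action(predicted, actual))
```

Because a mispredicted slot instruction is nullified, the compiler no longer has to prove that the moved instruction is harmless in the unexpected direction, which relaxes restriction (1) above for slots filled from the target or the fall-through path.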
