FIGURE C.14 Scheduling the branch delay slot. The top box in each pair shows the code
before scheduling; the bottom box shows the scheduled code. In (a), the delay slot is scheduled
with an independent instruction from before the branch. This is the best choice.
Strategies (b) and (c) are used when (a) is not possible. In the code sequences for (b) and (c),
the use of R1 in the branch condition prevents the DADD instruction (whose destination is R1)
from being moved after the branch. In (b), the branch delay slot is scheduled from the target
of the branch; usually the target instruction will need to be copied because it can be reached
by another path. Strategy (b) is preferred when the branch is taken with high probability, such
as a loop branch. Finally, the branch may be scheduled from the not-taken fall-through as in
(c). To make this optimization legal for (b) or (c), it must be OK to execute the moved instruction
when the branch goes in the unexpected direction. By OK we mean that the work is
wasted, but the program will still execute correctly. This is the case, for example, in (c) if R7
were an unused temporary register when the branch goes in the unexpected direction.
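
The legality conditions in the caption can be sketched as two small checks. This is only an illustrative model, not a real compiler pass: the `Instr` type, the function names, and every register operand other than R1 (the DADD destination and branch condition) and R7 are assumptions made for the example, and the sketch ignores memory side effects, exceptions, and dependences on intervening instructions.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: opcode, destination, source registers."""
    op: str
    dest: Optional[str]
    srcs: Tuple[str, ...] = ()


def can_fill_from_before(candidate: Instr, branch: Instr) -> bool:
    """Strategy (a): an instruction from before the branch may move into the
    delay slot only if the branch condition does not read its destination."""
    return candidate.dest not in branch.srcs


def can_fill_speculatively(candidate: Instr, live_on_other_path: FrozenSet[str]) -> bool:
    """Strategies (b) and (c): the moved instruction also runs when the branch
    goes the unexpected way, so its destination must be dead on that path."""
    return candidate.dest not in live_on_other_path


# The branch tests R1, so the DADD that writes R1 cannot move after it.
dadd = Instr("DADD", "R1", ("R2", "R3"))   # source operands are assumed
branch = Instr("BEQZ", None, ("R1",))      # opcode is assumed
print(can_fill_from_before(dadd, branch))

# Case (c): an instruction writing R7 is safe if R7 is an unused temporary
# on the path taken when the branch goes in the unexpected direction.
or_r7 = Instr("OR", "R7", ("R8", "R9"))    # source operands are assumed
print(can_fill_speculatively(or_r7, frozenset({"R4", "R5"})))
print(can_fill_speculatively(or_r7, frozenset({"R7"})))
```

The live-register set is passed in directly here; a real scheduler would obtain it from liveness analysis of the path on which the moved instruction's work is wasted.
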
The limitations on delayed-branch scheduling arise from: (1) the restrictions on the instructions
that are scheduled into the delay slots, and (2) our ability to predict at compile time
whether a branch is likely to be taken or not. To improve the ability of the compiler to fill
branch delay slots, most processors with conditional branches have introduced a canceling or
nullifying branch. In a canceling branch, the instruction includes the direction that the branch
was predicted. When the branch behaves as predicted, the instruction in the branch delay slot
is simply executed as it would normally be with a delayed branch. When the branch is incorrectly
predicted, the instruction in the branch delay slot is simply turned into a no-op.
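
The difference between an ordinary delayed branch and a canceling branch can be summarized in a few lines. The following is a minimal sketch of the behavior described above, not a model of any particular processor; the function name and the string labels `"delayed"` and `"canceling"` are assumptions chosen for illustration.

```python
def delay_slot_executes(branch_kind: str, predicted_taken: bool, actually_taken: bool) -> bool:
    """Return True if the delay-slot instruction takes effect,
    False if it is turned into a no-op."""
    if branch_kind == "delayed":
        # An ordinary delayed branch always executes its delay slot.
        return True
    if branch_kind == "canceling":
        # A canceling branch carries its predicted direction; the slot
        # executes only when the branch behaves as predicted.
        return predicted_taken == actually_taken
    raise ValueError(f"unknown branch kind: {branch_kind!r}")


# Enumerate all prediction/outcome combinations for a canceling branch.
for predicted in (True, False):
    for actual in (True, False):
        executed = delay_slot_executes("canceling", predicted, actual)
        print(f"predicted_taken={predicted} actually_taken={actual} slot_executes={executed}")
```

Because a mispredicted slot instruction is nullified, the compiler no longer needs to prove that the instruction is harmless on the unexpected path, which is what relaxes restriction (1) above.
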
 