Information Technology Reference
In-Depth Information
branch address is known. Consider, for example, the execution of the following
program loop on a pipeline consisting of two stages: Fetch (F) and Execute (E).
I 1 ! Again: Load 5, R 1 ;
R 1
5;
I 2 !
Sub R 2 ;
R 2 R 2 2
1;
I 3 !
Bnn Again;
Branch to Again if result is Not Negative;
I 4 !
Add R 4 , R 5 , R 3 ;
R 3 R 4 þ R 5 ;
It should be noted that at the end of the first loop, either instruction I 1 or instruction
I 4 will have to be fetched depending on the result of executing instruction I 3 .The
way with which such a situation has been dealt will delay fetching of the next
instruction until the result of executing instruction I 3 is known. This will lead to
incurring extra delay in the pipeline. However, this extra delay may be avoided if
the sequence of instructions has been reordered to become as follows.
Again:
Sub R 2 ;
R 2 R 2 2
1;
Load 5, R 1 ;
R 1 5;
Bnn Again;
Branch to Again if result is Not Negative;
Add R 4 , R 5 , R 3 ;
R 3 R 4 þ R 5 ;
Figure 9.13 shows the Gantt's chart for executing the modified piece of code for the
case R 2 ¼
3 before entering the loop.
The figure indicates that branching takes place one instruction later than the
actual place where the branch instruction appears in the original instruction
sequence, hence the name “delayed branch.” It is also clear from Figure 9.13 that
by reordering the sequence of instructions, it was possible to fill the branch delay
time slot with a useful instruction, thus eliminating any extra delay in the pipeline.
It has been shown in a number of studies that “smart” compilers were able to make
use of one branch delay time slot in more than 80% of the cases. The use of branch
delay time slots has led to the improvement of both the speed-up and the throughput
of those processors using “smart” compilers.
PREDICTION OF THE NEXT INSTRUCTION TO FETCH This method tries to reduce the
time unit(s) that can potentially be lost due to instruction dependency by predicting
the next instruction to fetch after fetching a conditional branch instruction. The basis
is that if the branch outcomes are random, then it would be possible to save about
50% of the otherwise lost time. A simple way to carry out such a technique is to
E
F
I 2
I 1
I 3
I 2
I 1
I 3
I 2
I 1
I 3
I 2
I 1
I 3
I 4
I 2
I 1
I 3
I 2
I 1
I 3
I 2
I 1
I 3
I 2
I 1
I 3
I 4
1
2
3
4
5
6
7
8
9
10
11
12
13
14
Figure 9.13 Delayed branch
Search WWH ::




Custom Search