Java Reference
In-Depth Information
19. It is sometimes the case that we need to schedule a small block of code
that forms the body of a frequently executed loop. For example
for (i=2; a <1000000; i++)
a[i] = a[i-1]*a[i-2]/1000.0;
Operations like floating point multiplication and division sometimes
have significant delays (5 or more cycles). If a loop body is small, code
scheduling cannot do much; there are not enough instructions to cover
all the delays. In such a situation loop unrolling may help. The body
of loop is replicated n times, with loop indices and loop limits suitably
modified. For example, with n =
2, the above loop would become
for (i=2; a <999999; i+=2){
a[i] =
a[i-1]*a[i-2]/1000.0;
a[i+1] =
a[i]*a[i-1]/1000.0;}
A larger loop body gives a code scheduler more instructions that can be
placed after instructions that may stall. How can we determine the value
of n (the loop unrolling factor ) necessary to cover all (or most) of the
delays in a loop body? What factors limit how large n (or an unrolled
loop body) should be allowed to become?
20. The
DAG code scheduler is very optimistic with respect to
loads. It schedules them assuming that they always hit in the primary
cache. Real loads are not always so cooperative. Assume we can identify
load instructions most likely to miss. How should
schedule
DAG be
modified to use “probability of cache miss” information in scheduling
instructions?
schedule
21. Assume we extend the IR tree patterns defined in Figure 13.27 with the
patterns for the MIPS add and load-immediate instructions shown in
Figure 13.35.
Show how the IR tree of Figure 13.36, corresponding to
A[i+1] = (1+i)*1000,
would be matched. What MIPS instructions would be generated?
 
Search WWH ::




Custom Search