Hardware Reference
In-Depth Information
1. There is no dependence from S1 to S2 . If there were, then there would be
a cycle in the dependences and the loop would not be parallel. Since this
other dependence is absent, interchanging the two statements will not af-
fect the execution of S2 .
2. On the first iteration of the loop, statement S2 depends on the value of B[0]
computed prior to initiating the loop.
These two observations allow us to replace the loop above with the following
code sequence:
A[0] = A[0] + B[0];
for (i=0; i<99; i=i+1) {
B[i+1] = C[i] + D[i];
A[i+1] = A[i+1] + B[i+1];
}
B[100] = C[99] + D[99];
The dependence between the two statements is no longer loop carried, so that
iterations of the loop may be overlapped, provided the statements in each iter-
ation are kept in order.
Our analysis needs to begin by finding all loop-carried dependences. This dependence in-
formation is inexact , in the sense that it tells us that such dependence may exist. Consider the
following example:
for (i=0;i<100;i=i+1) {
A[i] = B[i] + C[i]
D[i] = A[i] * E[i]
}
The second reference to A in this example need not be translated to a load instruction, since
we know that the value is computed and stored by the previous statement; hence, the second
reference to A can simply be a reference to the register into which A was computed. Perform-
ing this optimization requires knowing that the two references are always to the same memory
address and that there is no intervening access to the same location. Normally, data depend-
ence analysis only tells that one reference may depend on another; a more complex analysis is
required to determine that two references must be to the exact same address. In the example
above, a simple version of this analysis suffices, since the two references are in the same basic
block.
Often loop-carried dependences are in the form of a recurrence . A recurrence occurs when
a variable is defined based on the value of that variable in an earlier iteration, often the one
immediately preceding, as in the following code fragment:
for (i=1;i<100;i=i+1) {
Y[i] = Y[i−1] + Y[i];
}
Detecting a recurrence can be important for two reasons: Some architectures (especially vec-
tor computers) have special support for executing recurrences, and, in an ILP context, it may
still be possible to exploit a fair amount of parallelism.
Search WWH ::




Custom Search