Hardware Reference
In-Depth Information
Given the values a = 2, b = 3, c = 2, and d = 0, then GCD( a,c ) = 2, and d b = −3.
Since 2 does not divide −3, no dependence is possible.
The GCD test is sufficient to guarantee that no dependence exists; however, there are cases
where the GCD test succeeds but no dependence exists. This can arise, for example, because
the GCD test does not consider the loop bounds.
In general, determining whether a dependence actually exists is NP-complete. In practice,
however, many common cases can be analyzed precisely at low cost. Recently, approaches us-
ing a hierarchy of exact tests increasing in generality and cost have been shown to be both
accurate and efficient. (A test is exact if it precisely determines whether a dependence exists.
Although the general case is NP-complete, there exist exact tests for restricted situations that
are much cheaper.)
In addition to detecting the presence of a dependence, a compiler wants to classify the type
of dependence. This classification allows a compiler to recognize name dependences and elim-
inate them at compile time by renaming and copying.
Example
The following loop has multiple types of dependences. Find all the true de-
pendences, output dependences, and antidependences, and eliminate the out-
put dependences and antidependences by renaming.
for (i=0; i<100; i=i+1) {
Y[i] = X[i] / c; /* S1 */
X[i] = X[i] + c; /* S2 */
Z[i] = Y[i] + c; /* S3 */
Y[i] = c − Y[i]; /* S4 */
}
Answer
The following dependences exist among the four statements:
1. There are true dependences from S1 to S3 and from S1 to S4 because of Y[i] .
These are not loop carried, so they do not prevent the loop from being con-
sidered parallel. These dependences will force S3 and S4 to wait for S1 to
complete.
2. There is an antidependence from S1 to S2 , based on X[i] .
3. There is an antidependence from S3 to S4 for Y[i] .
4. There is an output dependence from S1 to S4 , based on Y[i] .
The following version of the loop eliminates these false (or pseudo) depend-
ences.
for (i=0; i<100; i=i+1 {
T[i] = X[i] / c; /* Y renamed to T to remove output dependence */
Search WWH ::




Custom Search