Hardware Reference
In-Depth Information
Given the values
a
= 2,
b
= 3,
c
= 2, and
d
= 0, then GCD(
a,c
) = 2, and
d
−
b
= −3.
Since 2 does not divide −3, no dependence is possible.
The GCD test is sufficient to guarantee that no dependence exists; however, there are cases
where the GCD test succeeds but no dependence exists. This can arise, for example, because
the GCD test does not consider the loop bounds.
In general, determining whether a dependence actually exists is NP-complete. In practice,
however, many common cases can be analyzed precisely at low cost. Recently, approaches us-
ing a hierarchy of exact tests increasing in generality and cost have been shown to be both
accurate and efficient. (A test is
exact
if it precisely determines whether a dependence exists.
Although the general case is NP-complete, there exist exact tests for restricted situations that
are much cheaper.)
In addition to detecting the presence of a dependence, a compiler wants to classify the type
of dependence. This classification allows a compiler to recognize name dependences and elim-
inate them at compile time by renaming and copying.
Example
The following loop has multiple types of dependences. Find all the true de-
pendences, output dependences, and antidependences, and eliminate the out-
put dependences and antidependences by renaming.
for (i=0; i<100; i=i+1) {
Y[i] = X[i] / c; /* S1 */
X[i] = X[i] + c; /* S2 */
Z[i] = Y[i] + c; /* S3 */
Y[i] = c − Y[i]; /* S4 */
}
Answer
The following dependences exist among the four statements:
1. There are true dependences from
S1
to
S3
and from
S1
to
S4
because of
Y[i]
.
These are not loop carried, so they do not prevent the loop from being con-
sidered parallel. These dependences will force
S3
and
S4
to wait for S1 to
complete.
2. There is an antidependence from
S1
to
S2
, based on
X[i]
.
3. There is an antidependence from
S3
to
S4
for
Y[i]
.
4. There is an output dependence from
S1
to
S4
, based on
Y[i]
.
The following version of the loop eliminates these false (or pseudo) depend-
ences.
for (i=0; i<100; i=i+1 {
T[i] = X[i] / c; /* Y renamed to T to remove output dependence */
Search WWH ::
Custom Search