Digital Signal Processing Reference
In-Depth Information
Data Dependence Constraint
If index j can be reached from index i by following the path consisting of one
or more dependence vectors, then H( j ) should be scheduled after H( i ). That is, if
there exists a vector m consisting of non-negative integers such that
if j
=
i
+
Dm
,
then s
(
j
) >
s
(
i
)
(12)
where D is the dependence matrix.
Since a systolic array often assumes a one or two dimensional regular configu-
ration (cf. Fig. 1 ) , the PE index p( i ) can be associated with the lattice point in
a PE index space just as each loop body in a loop nest is associated with an
index point in the DG. To ensure the resulting systolic array features local inter-
processor communication, the localized dependence vectors in the DG should
not require global communication after the PE assignment i
p( i ). A somewhat
restrictive constraint to enforce this requirement would be
Local Mapping Constraint
If j
i
=
d k (a dependence vector), then
p
(
j
)
p
(
i
) 1
d k 1 .
(13)
A number of performance metrics may be defined to compare the merits of
different systolic array implementations. These include
Total Computing Time:
T C =
max i , j DG (
t
(
i
)
t
(
j
)) .
(14)
PE Utilization:
U PE =
N DG / (
T C ×
N PE )
(15)
where N DG is the number of index points in the DG ,and N PE is the number of
PEs in the systolic array.
Now we are ready to formally state the systolic array mapping and scheduling
problem:
Systolic Array Mapping and Scheduling Problem
Given a localized, shift invariant DG, and a systolic array configuration, find
a PE assignment mapping p( i ), and a schedule t( i ) such that the performance
is optimized, namely, the total computing time T C is minimized, and the PE
utilization U PE is maximized; subject to (i) the resource constraint, (ii) the data
dependence constraint, and (iii) the local mapping constraints.
Thus the systolic array implementation is formulated as a discrete constrained
optimization problem. By fully exploiting of the regular (shift invariant) structure
of both the DG and the systolic array, this problem can be further simplified.
Search WWH ::




Custom Search