Systolic Arrays - Signal Processing Systems

Digital Signal Processing Reference

In-Depth Information

Data Dependence Constraint

If index j can be reached from index i by following the path consisting of one

or more dependence vectors, then H( j ) should be scheduled after H( i ). That is, if

there exists a vector m consisting of non-negative integers such that

if j

=

i

+

Dm

,

then s

(

j

) >

s

(

i

)

(12)

where D is the dependence matrix.

Since a systolic array often assumes a one or two dimensional regular configu-

ration (cf. Fig. 1 ) , the PE index p( i ) can be associated with the lattice point in

a PE index space just as each loop body in a loop nest is associated with an

index point in the DG. To ensure the resulting systolic array features local inter-

processor communication, the localized dependence vectors in the DG should

not require global communication after the PE assignment i

→

p( i ). A somewhat

restrictive constraint to enforce this requirement would be

Local Mapping Constraint

If j

−

i

=

d k (a dependence vector), then

p

(

j

) −

p

(

i

) 1 ≤

d k 1 .

(13)

A number of performance metrics may be defined to compare the merits of

different systolic array implementations. These include

Total Computing Time:

T C =

max i , j ∈ DG (

t

(

i

) −

t

(

j

)) .

(14)

PE Utilization:

U PE =

N DG / (

T C ×

N PE )

(15)

where N DG is the number of index points in the DG ,and N PE is the number of

PEs in the systolic array.

Now we are ready to formally state the systolic array mapping and scheduling

problem:

Systolic Array Mapping and Scheduling Problem

Given a localized, shift invariant DG, and a systolic array configuration, find

a PE assignment mapping p( i ), and a schedule t( i ) such that the performance

is optimized, namely, the total computing time T C is minimized, and the PE

utilization U PE is maximized; subject to (i) the resource constraint, (ii) the data

dependence constraint, and (iii) the local mapping constraints.

Thus the systolic array implementation is formulated as a discrete constrained

optimization problem. By fully exploiting of the regular (shift invariant) structure

of both the DG and the systolic array, this problem can be further simplified.

Signal Processing Systems

Search WWH ::

Custom Search

Home