Information Technology Reference
In-Depth Information
are 0, A 1,2 x 2 + A 1,3 x 3 ,0, A 2,2 x 2 + A 2,3 x 3 ,0, A 3,2 x 2 + A 3,3 x 3 , 0. The successive values of S 1
are 0, 0, A 1,1 x 1 + A 1,2 x 2 + A 1,3 x 3 ,0, A 2,1 x 1 + A 2,2 x 2 + A 2,3 x 3 ,0, A 3,1 x 1 + A 3,2 x 2 + A 3,3 x 3 .
The algorithm described above to compute the matrix-vector product for a 3
×
3matrix
clearly extends to arbitrary n
×
n matrices. (See Problem 7.8 .) Since the last element of an
n
×
n matrix arrives at the array after 3 n
2 time steps, such an array will complete its task in
3 n
1 time steps. A lower bound on the time for this problem (see Problem 7.9 )canbederived
by showing that the n 2 entries of the matrix A and the n entries of the matrix x must be read
to compute A x correctly by an algorithm, whether serial or not. By Theorem 7.4.1 it follows
that all systolic algorithms using n processors require n steps. Thus, the above algorithm is
nearly optimal to within a constant multiplicative factor.
THEOREM 7.5.1 There exists a linear systolic array with n cells that computes the product of an
n
×
n matrix with an n -vector in 3 n
1 steps, and no algorithm on such an array can do this
computation in fewer than n steps.
Since the product of two n
×
n matrices can be realized as n matrix-vector products with
an n
n matrix, an n -processor systolic array exists that can multiply two matrices nearly
optimally.
×
7.5.2 Sorting on Linear Arrays
A second application of linear systolic arrays is bubble sorting of integers. A sequential version
of the bubble sort algorithm passes over the entries in a tuple ( x 1 , x 2 , ... , x n ) from left to
right multiple times. On the first pass it finds the largest element and moves it to the rightmost
position. It applies the same procedure to the first n
1 elements of the resultant list, stopping
when it finds a list containing one element. This sequential procedure takes time proportional
to n +( n
+ 2 + 1 = n ( n + 1 ) / 2.
A parallel version of bubble sort, sometimes called odd-even transposition sort ,isnatu-
rally realized on a linear systolic array. The n entries of the array are placed in n cells. Let c i
be the word in the i th cell. We assume that in one unit of time two adjacent cells can read
words stored in each other's memories ( c i and c i + 1 ), compare them, and swap them if one ( c i )
is larger than the other ( c i + 1 ). The odd-even transposition sort algorithm executes n stages.
In the even-numbered stages, integers in even-numbered cells are compared with integers in
the next higher numbered cells and swapped, if larger. In the odd-numbered stages, the same
operation is performed on integers in odd-numbered cells. (See Fig. 7.9 .) We show that in n
steps the sorting is complete.
1 )+( n
2 )+
···
THEOREM 7.5.2 Bubble sort of n elements on a linear systolic array can be done in at most n steps.
Every algorithm to sort a list of n elements on a linear systolic array requires at least n
1 steps.
Thus, bubble sort on a linear systolic array is almost optimal.
Proof To derive the upper bound we use the zero-one principle (see Theorem 6.8.1 ), which
states that if a comparator network for inputs over an ordered set
correctly sorts all binary
inputs, it correctly sorts all inputs. The bubble sort systolic array maps directly to a com-
parator network because each of its operations is data-independent, that is, oblivious .To
see that the systolic array correctly sorts binary sequences, consider the position, r ,ofthe
rightmost 1 in the array.
A
Search WWH ::




Custom Search