Information Technology Reference
In-Depth Information
φ =( 1, 0, 0, 1, 0, 1 )
y =( 2, 3, 6, 7, 12, 1 )
As shown in Problem 2.21 , a segmented prefix computation is a special case of a general prefix
computation. This is demonstrated by defining a new associative operation
on value-flag
pairs that returns another value-flag pair.
2.6.1 An Efficient Parallel Prefix Circuit
A circuit for the prefix function
( n )
P
can be realized with O ( n 2 ) instances of
if for each
j
n we naively realize y j = x 1
x 2 ···
x j with a separate circuit containing j
1
1
instances of
. If each such circuit is organized as a balanced binary tree, the depth of the
( n )
P
is the depth of the circuit for y n ,whichis
log 2 n
circuit for
.Thisisaparallelcircuit
for the prefix problem but uses many more operators than necessary. We now describe a much
more efficient circuit for this problem; it uses O ( n ) instances of
and has depth O (log n ) .
To describe this improved circuit, we let x [ r , r ]= x r and for r
s let x [ r , s ]= x r
( n )
x r + 1 ···
x s .Thenwecanwrite
P
( x ) = y where y j = x [ 1, j ] .
t< s .
We use this fact to construct the improved circuit. Let n = 2 k . Observe that if we form the
( n/ 2 ) -tuple ( x [ 1, 2 ] , x [ 3, 4 ] , x [ 5, 6 ] , ... , x [ 2 k
is associative, we observe that x [ r , s ]= x [ r , t ]
x [ t + 1, s ] for r
Because
1, 2 k ]) using the rule x [ i , i + 1 ]= x [ i , i ]
x [ i + 1, i + 1 ] for i odd and then do a prefix computation on it, we obtain the ( n/ 2 ) -tuple
( x [ 1, 2 ] , x [ 1, 4 ] , x [ 1, 6 ] , ... , x [ 1, 2 k ]) .Thisisalmostwhatisneeded.Wemustonlycompute
x [ 1, 1 ] , x [ 1, 3 ] , x [ 1, 5 ] , ... , x [ 1, 2 k
1 ] , which is easily done using the rule x [ 1, 2 i + 1 ]=
2 k− 1
x [ 1, 2 i ]
1. (See Fig. 2.13 .) The base case for this construction is
that of n = 1, for which y 1 = x 1 and no operations are needed.
If C ( k ) is the size of this circuit on n = 2 k
x 2 i + 1 for 1
i
inputs and D ( k ) is its depth, then C ( 0 )= 0,
D ( 0 )= 0and C ( k ) and D ( k ) for k
1 satisfy the following recurrences:
1 )+ 2 k
C ( k )= C ( k
1
D ( k )= D ( k
1 )+ 2
As a consequence, we have the following result.
THEOREM 2.6.1 For n = 2 k , k an integer, the parallel prefix function P
( n )
A
n
→A
n on an
:
n -vector with associative operator
can be implemented by a circuit with the following size and
depth bounds over the basis Ω=
{}
:
C Ω
( n )
P
2 n
log 2 n
2
D Ω
( n )
P
2 log 2 n
Proof The solution to the recurrence on C ( k ) is C ( k )= 2 k + 1
k
2, as the reader can
easily show. It satisfies the base case of k = 0 and the general case as well. The solution to
D ( k ) is D ( k )= 2 k .
When n isnotapowerof2,wecanstartwithacircuitforthenexthigherpowerof2and
then delete operations and edges that are not used to produce the first n outputs.
Search WWH ::




Custom Search