Information Technology Reference
In-Depth Information
that executes O (log n ) steps. Thus, a permutation is computed much more quickly (in time
O (log n ) )withtheBenes offline permutation network than it can be done on Batcher's online
bitonic sorting network (in time O (log 2 n ) ). However, the Benes network requires time to
collect the destinations at some central location, compute the switch settings, and transmit
them to the switches themselves.
To understand how the Benes network works, we provide an alternative characterization
of it. Let P n be the Beneˇsnetworkon n inputs, n = 2 k , defined as back-to-back FFT graphs
with nodes replaced by switches. Then P n may be defined recursively, as suggested in Fig. 7.20 .
P n is obtained by making two copies of P n/ 2 ,placing n/ 2 copies of a two-input, two-output
switch at the input and the same number at the output. For 1
n/ 2)
the top output of switch i is connected to the top input of the i th switch in the upper (lower)
copy of P n/ 2 and the bottom output is connected to the bottom input of the i th switch in the
lower (upper) copy of P n/ 2 . The connections of output switches are the mirror image of the
connections of the input switches.
Consider the Beneˇsnetwork P 2 . It consists of a single switch and generates the two possible
permutations of the inputs. We show by induction that P n generates all n ! permutations of its
n inputs. Assume that this property holds for n = 2, 4, ... ,2 k− 1 . We show that it holds for
m = 2 k .Let π =( π ( 1 ) , π ( 2 ) , ... , π ( m )) be an arbitrary permutation to be realized by P m .
This means that the i th input must be connected to the π ( i ) th output. Suppose that π ( 3 ) is
2, as shown in Fig. 7.20 . We can arbitrarily choose to have the third input pass through the
first or second copy of P m/ 2 . We choose the second. The path taken through the second copy
of P m/ 2 must emerge on its second output so that it can then pass to the first switch in the
column of output switches. This output switch must pass its inputs without swapping them.
The other output of this switch, namely 1, must arrive via a path through the first copy of
P m/ 2 and emerge on its first output. To determine the input at which it must arrive, we find
the input of P m associated with the output of 1 and set its switch so that it is directed to the
first copy of P m/ 2 . Since the other input to this input switch must go to the other copy of
P m/ 2 , we follow its path through P m to the output and then reason in the same way about the
other output at the output switch at which it arrives. If by tracing paths back and forth this
way we do not exhaust all inputs and outputs, we pick another input and repeat the process
until all inputs have been routed to outputs.
Now let's determine the number of switches, S ( k ) ,inaBeneˇsnetwork P n on n = 2 k
inputs. It follows that S ( 1 )= 1and
i
n/ 4( n/ 4 + 1
i
1 )+ 2 k
S ( k )= 2 S ( k
1
2 ) 2 k = n (log 2 n
1
It is straightforward to show that S ( k )=( k
2 ) .
Although a global permutation network sends messages to their destinations more quickly
than a local permutation network, the switch settings must be computed and distributed glob-
ally, both of which impose important limitations on the time to realize particular permutations.
7.9 The PRAM Model
The parallel random-access machine (PRAM) (see Fig. 7.21 ), the canonical structured par-
allel machine, consists of a bounded set of processors and a common memory containing a
potentially unlimited number of words. Each processor is similar to the random-access ma-
chine (RAM) described in Section 3.4 except that its CPU can access locations in both its local
Search WWH ::




Custom Search