Parallel Computation - Models of Computation: Exploring the Power of Computing

Information Technology Reference

In-Depth Information

that executes O (log n ) steps. Thus, a permutation is computed much more quickly (in time

O (log n ) )withtheBenes offline permutation network than it can be done on Batcher's online

bitonic sorting network (in time O (log 2 n ) ). However, the Benes network requires time to

collect the destinations at some central location, compute the switch settings, and transmit

them to the switches themselves.

To understand how the Benes network works, we provide an alternative characterization

of it. Let P n be the Beneˇsnetworkon n inputs, n = 2 k , defined as back-to-back FFT graphs

with nodes replaced by switches. Then P n may be defined recursively, as suggested in Fig. 7.20 .

P n is obtained by making two copies of P n/ 2 ,placing n/ 2 copies of a two-input, two-output

switch at the input and the same number at the output. For 1

n/ 2)

the top output of switch i is connected to the top input of the i th switch in the upper (lower)

copy of P n/ 2 and the bottom output is connected to the bottom input of the i th switch in the

lower (upper) copy of P n/ 2 . The connections of output switches are the mirror image of the

connections of the input switches.

Consider the Beneˇsnetwork P 2 . It consists of a single switch and generates the two possible

permutations of the inputs. We show by induction that P n generates all n ! permutations of its

n inputs. Assume that this property holds for n = 2, 4, ... ,2 k− 1 . We show that it holds for

m = 2 k .Let π =( π ( 1 ) , π ( 2 ) , ... , π ( m )) be an arbitrary permutation to be realized by P m .

This means that the i th input must be connected to the π ( i ) th output. Suppose that π ( 3 ) is

2, as shown in Fig. 7.20 . We can arbitrarily choose to have the third input pass through the

first or second copy of P m/ 2 . We choose the second. The path taken through the second copy

of P m/ 2 must emerge on its second output so that it can then pass to the first switch in the

column of output switches. This output switch must pass its inputs without swapping them.

The other output of this switch, namely 1, must arrive via a path through the first copy of

P m/ 2 and emerge on its first output. To determine the input at which it must arrive, we find

the input of P m associated with the output of 1 and set its switch so that it is directed to the

first copy of P m/ 2 . Since the other input to this input switch must go to the other copy of

P m/ 2 , we follow its path through P m to the output and then reason in the same way about the

other output at the output switch at which it arrives. If by tracing paths back and forth this

way we do not exhaust all inputs and outputs, we pick another input and repeat the process

until all inputs have been routed to outputs.

Now let's determine the number of switches, S ( k ) ,inaBeneˇsnetwork P n on n = 2 k

inputs. It follows that S ( 1 )= 1and

≤

n/ 4( n/ 4 + 1

≤

1 )+ 2 k

S ( k )= 2 S ( k

−

2 ) 2 k = n (log 2 n

It is straightforward to show that S ( k )=( k

2 ) .

Although a global permutation network sends messages to their destinations more quickly

than a local permutation network, the switch settings must be computed and distributed glob-

ally, both of which impose important limitations on the time to realize particular permutations.

−

7.9 The PRAM Model

The parallel random-access machine (PRAM) (see Fig. 7.21 ), the canonical structured par-

allel machine, consists of a bounded set of processors and a common memory containing a

potentially unlimited number of words. Each processor is similar to the random-access ma-

chine (RAM) described in Section 3.4 except that its CPU can access locations in both its local

Models of Computation: Exploring the Power of Computing

Search WWH ::

Custom Search

Home