Parallel Computation - Models of Computation: Exploring the Power of Computing

Information Technology Reference

In-Depth Information

An immediate consequence of Theorems 7.7.1 and 7.9.1 is that a list of n items can be

sorted on an n -processor PRAM in O (log 2 n ) steps by a normal oblivious algorithm. Data-

dependent sorting algorithms for the hypercube exist with running time O (log n ) .

It also follows from Section 7.6.1 that algorithms for trees, linear arrays, and meshes trans-

late directly into PRAM algorithms with the same running time as on these less general models.

Of course, the superior connectivity between PRAM processors might be used to produce faster

algorithms.

7.9.2 The Power of Concurrency

The CRCW PRAM is a very powerful model. As we show, any Boolean function can be

computed with it in a constant number of steps if a sufficient number of processors is available.

For this reason, the CRCW PRAM is of limited interest: it represents an extreme that does

not reflect reality as we know it. The CREW and EREW PRAMs are more realistic. We

first explore the power of the CRCW and then show that an EREW PRAM can simulate a

p -processor CRCW PRAM with a slowdown by a factor of O (log 2 p ) .

THEOREM 7.9.2 The CRCW PRAM can compute an arbitrary Boolean function in four steps.

Proof GivenaBooleanfunction f :

n

B

→B

, represent it by its disjunctive normal form;

that is, represent it as the OR of its minterms where a minterm is the AND of each literal of

f . (A literal is a variable, x i , or its complement, x i .) Assume that each variable is stored in

a separate location in the common memory.

Given a minterm, we show that it can be computed by a CRCW PRAM in two steps.

Assign one location in the common memory to the minterm and initialize it to the value 1.

Assign one processor to each literal in the minterm. The processor assigned to the j th literal

reads the value of the j th variable from the common memory. If the value of the literal is 0,

this processor writes the value 0 to the memory location associated with the minterm. Thus,

the minterm has value 1 exactly when each literal has value 1. Note that these processors read

concurrently with processors associated with other minterms and may write concurrently if

more than one of their literals has value 0.

Now assume that a common memory location has been reserved for the function itself

and initialized to 0. One processor is assigned to each minterm and if the value of its

minterm is 1, it writes the value 1 in the location associated with the function. Thus, in two

more steps the function f is computed.

Given the power of concurrency, especially as applied to writing, we now explore the cost

in performance of not allowing concurrency, whether in reading or writing.

THEOREM 7.9.3 A p -processor priority CRCW PRAM can be simulated by a p -processor EREW

PRAM with a slowdown by a factor equal to the time to sort p elements on this machine. Conse-

quently, this simulation can be done by a normal algorithm with a slowdown factor of O (log 2 p ) .

Proof The j th EREW PRAM processor simulates a memory access by the j th CRCW

PRAM processor by first writing into a special location, M j ,apair ( a j , j ) indicating that

processor j wishes to access (read or write) location a j . If processors are writing to common

memory, the value to be written is attached to this pair. If processors are reading from

common memory, a return message containing the requested value is provided. If a processor

chooses not to access any location, a dummy address larger than all other addresses is used for

Models of Computation: Exploring the Power of Computing

Search WWH ::

Custom Search

Home