Information Technology Reference
In-Depth Information
An immediate consequence of Theorems 7.7.1 and 7.9.1 is that a list of n items can be
sorted on an n -processor PRAM in O (log 2 n ) steps by a normal oblivious algorithm. Data-
dependent sorting algorithms for the hypercube exist with running time O (log n ) .
It also follows from Section 7.6.1 that algorithms for trees, linear arrays, and meshes trans-
late directly into PRAM algorithms with the same running time as on these less general models.
Of course, the superior connectivity between PRAM processors might be used to produce faster
algorithms.
7.9.2 The Power of Concurrency
The CRCW PRAM is a very powerful model. As we show, any Boolean function can be
computed with it in a constant number of steps if a sufficient number of processors is available.
For this reason, the CRCW PRAM is of limited interest: it represents an extreme that does
not reflect reality as we know it. The CREW and EREW PRAMs are more realistic. We
first explore the power of the CRCW and then show that an EREW PRAM can simulate a
p -processor CRCW PRAM with a slowdown by a factor of O (log 2 p ) .
THEOREM 7.9.2 The CRCW PRAM can compute an arbitrary Boolean function in four steps.
Proof GivenaBooleanfunction f :
n
B
→B
, represent it by its disjunctive normal form;
that is, represent it as the OR of its minterms where a minterm is the AND of each literal of
f . (A literal is a variable, x i , or its complement, x i .) Assume that each variable is stored in
a separate location in the common memory.
Given a minterm, we show that it can be computed by a CRCW PRAM in two steps.
Assign one location in the common memory to the minterm and initialize it to the value 1.
Assign one processor to each literal in the minterm. The processor assigned to the j th literal
reads the value of the j th variable from the common memory. If the value of the literal is 0,
this processor writes the value 0 to the memory location associated with the minterm. Thus,
the minterm has value 1 exactly when each literal has value 1. Note that these processors read
concurrently with processors associated with other minterms and may write concurrently if
more than one of their literals has value 0.
Now assume that a common memory location has been reserved for the function itself
and initialized to 0. One processor is assigned to each minterm and if the value of its
minterm is 1, it writes the value 1 in the location associated with the function. Thus, in two
more steps the function f is computed.
Given the power of concurrency, especially as applied to writing, we now explore the cost
in performance of not allowing concurrency, whether in reading or writing.
THEOREM 7.9.3 A p -processor priority CRCW PRAM can be simulated by a p -processor EREW
PRAM with a slowdown by a factor equal to the time to sort p elements on this machine. Conse-
quently, this simulation can be done by a normal algorithm with a slowdown factor of O (log 2 p ) .
Proof The j th EREW PRAM processor simulates a memory access by the j th CRCW
PRAM processor by first writing into a special location, M j ,apair ( a j , j ) indicating that
processor j wishes to access (read or write) location a j . If processors are writing to common
memory, the value to be written is attached to this pair. If processors are reading from
common memory, a return message containing the requested value is provided. If a processor
chooses not to access any location, a dummy address larger than all other addresses is used for
Search WWH ::




Custom Search