Information Technology Reference
In-Depth Information
From Theorem 8.5.8 we know that the log-space DTM M generating C ( M , n ) does
not execute more than p ( n ) steps, p ( n ) a polynomial in n .Since p ( n ) is assumed proper,
we can assume without loss of generality that M executes p ( n ) steps on all inputs of length
n .Thus, M has exactly
= O ( p ( n )) configurations.
The input string w is placed in the first n locations of the otherwise blank common
memory. To determine the length of the input, for each i the i th CREW PRAM processor
examines the words in locations i and i + 1. If location i + 1 is blank but location i is not,
i = n .The n th processor then computes p ( n ) in O (log 2 n ) serial steps (see Problem 8.2 )
andplacesitincommonmemory.
The circuit C ( M , n ) is constructed from representations of next-state mappings, one
mapping for every state transition. Since there are no external inputs to M (all inputs are
recorded on the input tape before the computation begins), all next-state mappings are the
same. As shown in Section 3.2 , let this one mapping be defined by a Boolean
|
Q
|
|
matrix M Δ whose rows and columns are indexed by configurations of M .Aconfiguration
of M is a tuple ( q , h 1 , h 2 , h 3 , x ) in which q is the current state, h 1 , h 2 ,and h 3 are the
positions of the heads on the input, output, and work tapes, respectively, and x is the cur-
rent contents of the work tape. Since M computes a log-space transformation, it executes a
polynomial number of steps. Thus, each configuration has length O (log n ) .Consequently,
a single CREW PRAM can determine in O (log n ) time whether an entry in row r and
column c ,where r and c are associated with configurations, has value 0 or 1. For concrete-
ness, assign PRAM processor i to row r and column c of M Δ ,where r =
|
Q
|×|
Q
i/p ( n )
and
p ( n ) , quantities that can be computed in O (log 2 n ) steps.
The circuit C ( M , n ) simulating M is obtained via a prefix computation on p ( n ) copies
of the matrix M Δ using matrix multiplication as the associative operator. (See Section 3.2 .)
Once C ( M , n ) has been written into the common memory, it can be evaluated by
assigning one processor per gate and then computing its value as many times as the depth of
C ( M , n ) . This involves a four-phase operation in which the j th processor reads each of the
at most two arguments of the j th gate in the first two phases, computes its value in the third,
and then writes it to common memory in the fourth. This process is repeated as many times
as the depth of the circuit C ( M , n ) , thereby insuring that correct values for gates propagate
throughout the circuit. Again concurrent reads and exclusive writes suffice.
c = i
r
×
These two results (and Problem 8.37 ) imply the result stated below, namely, that the bi-
nary functions computed by circuits with polynomial size and poly-logarithmic depth are the
same as those computed by the CREW PRAM with polynomially many processors and poly-
logarithmic time.
THEOREM 8.14.1 The functions f : B →B computed by circuits of polynomial-size and poly-
logarithmic depth are the same as those computed by the CREW PRAM with a polynomial number
of processors and poly-logarithmic time.
8.14.2 The Parallel Computation Thesis
A deep connection exists between serial space and parallel time. The parallel computation
thesis states that sequential space and parallel time are polynomially related; that is, if there
exists a sequential algorithm that uses space S , then there exists a parallel algorithm using time
p ( S ) for some polynomial p and vice versa. There is strong evidence that this hypothesis holds.
Search WWH ::




Custom Search