Information Technology Reference
In-Depth Information
parallel algorithms and architectures. They include the topics by Akl [
16
], Bertsekas and
Tsitsiklis [
38
], Gibbons and Spirakis [
113
], JaJ´a[
148
], Leighton [192], Quinn [265], and Reif
[
277
]. In addition, the survey article by Karp and Ramachandran [
161
] gives an overview
of parallel algorithmic methods. References to results on circuit complexity can be found in
Chapters
2
,
6
,and
9
.
Flynn introduced the taxonomy of parallel computers that carries his name [
102
]. The
data-parallel style of computing was anticipated in the APL [
146
] and FP programming lan-
guages [
26
] as well as by Preparata and Vuillemin [
262
] in their study of parallel algorithms
for networked machines. It was developed as the style of choice for programming the Connec-
tion Machine [
133
]. (See also the topics by Hatcher and Quinn [
129
] and Blelloch [
45
]on
data-parallel computing.) The simulation of the MIMD computer by a SIMD one given in
Section
7.3.1
is due to Wloka [
365
].
Amdahl's Law [
21
] and Brent's principle [
58
] are widely cited; the latter is used extensively
to design efficient parallel algorithms.
Systolic algorithms for convolution, matrix multiplication, and the fast Fourier transform
are given by Kung and Leiserson [
180
](seealso[
181
]). Odd-even transposition sort is de-
scribed by Knuth [
170
]. The lower bound on the time to multiply two matrices given in
Theorem
7.5.3
is due to Gentleman [
112
]. The shuffle network was introduced by Stone
[
318
].
Preparata and Vuillemin [
262
] give normal algorithms for a variety of problems (including
that for shifting in Section
7.7.3
) and introduce the cube-connected cycles machine. They also
give embeddings of fully normal algorithms into linear arrays and meshes. Dekel, Nassimi, and
Sahni [
85
] developed the fast algorithm for matrix multiplication on the hypercube described
in Section
7.7.7
.
Batcher [
29
] introduced odd-even and bitonic sorting methods and noted that they could
be used for routing messages in networks. Beneˇs[
36
] is the author of the Beneˇspermutation
network.
Variants of the PRAM were introduced by Fortune and Wyllie [
103
], Goldschlager [
118
],
Savitch and Stimson [
298
] as generalizations of the idealized RAM model of Cook and Reck-
how [
77
]. The method given in Theorem
7.9.3
to simulate a CRCW PRAM on an EREW
PRAM is due to Eckstein [
95
]andVishkin[
353
]. Simulations of PRAMs on networked com-
puters have been developed by Mehlhorn and Vishkin [
221
], Upfal [
340
], Upfal and Wigder-
son [341], Karlin and Upfal [158], Alt, Hagerup, Mehlhorn, and Preparata [
19
], and Ranade
[
267
]. Cypher and Plaxton [
84
] have developed a deterministic
O
(log
p
log log
p
)
-step sort-
ing algorithm for the hypercube. However, it is superior to Batcher's algorithm only for very
large and impractical values of
p
.
The bulk synchronous parallel (BSP) model [
348
] has been proposed as a bridging model
between the needs of programmers and parallel machines. The LogP model [
83
] is offered as
a more realistic variant of the BSP model. Juurlink and Wijshoff [
154
]andBilardi,Herley,
Pietracaprina, Pucci, and Spirakis [
39
] report empirical evidence that the BSP and LogP models
are about equally good as predictors of performance on real parallel computers.
Search WWH ::
Custom Search