Information Technology Reference
In-Depth Information
parallel algorithms and architectures. They include the topics by Akl [ 16 ], Bertsekas and
Tsitsiklis [ 38 ], Gibbons and Spirakis [ 113 ], JaJ´a[ 148 ], Leighton [192], Quinn [265], and Reif
[ 277 ]. In addition, the survey article by Karp and Ramachandran [ 161 ] gives an overview
of parallel algorithmic methods. References to results on circuit complexity can be found in
Chapters 2 , 6 ,and 9 .
Flynn introduced the taxonomy of parallel computers that carries his name [ 102 ]. The
data-parallel style of computing was anticipated in the APL [ 146 ] and FP programming lan-
guages [ 26 ] as well as by Preparata and Vuillemin [ 262 ] in their study of parallel algorithms
for networked machines. It was developed as the style of choice for programming the Connec-
tion Machine [ 133 ]. (See also the topics by Hatcher and Quinn [ 129 ] and Blelloch [ 45 ]on
data-parallel computing.) The simulation of the MIMD computer by a SIMD one given in
Section 7.3.1 is due to Wloka [ 365 ].
Amdahl's Law [ 21 ] and Brent's principle [ 58 ] are widely cited; the latter is used extensively
to design efficient parallel algorithms.
Systolic algorithms for convolution, matrix multiplication, and the fast Fourier transform
are given by Kung and Leiserson [ 180 ](seealso[ 181 ]). Odd-even transposition sort is de-
scribed by Knuth [ 170 ]. The lower bound on the time to multiply two matrices given in
Theorem 7.5.3 is due to Gentleman [ 112 ]. The shuffle network was introduced by Stone
[ 318 ].
Preparata and Vuillemin [ 262 ] give normal algorithms for a variety of problems (including
that for shifting in Section 7.7.3 ) and introduce the cube-connected cycles machine. They also
give embeddings of fully normal algorithms into linear arrays and meshes. Dekel, Nassimi, and
Sahni [ 85 ] developed the fast algorithm for matrix multiplication on the hypercube described
in Section 7.7.7 .
Batcher [ 29 ] introduced odd-even and bitonic sorting methods and noted that they could
be used for routing messages in networks. Beneˇs[ 36 ] is the author of the Beneˇspermutation
network.
Variants of the PRAM were introduced by Fortune and Wyllie [ 103 ], Goldschlager [ 118 ],
Savitch and Stimson [ 298 ] as generalizations of the idealized RAM model of Cook and Reck-
how [ 77 ]. The method given in Theorem 7.9.3 to simulate a CRCW PRAM on an EREW
PRAM is due to Eckstein [ 95 ]andVishkin[ 353 ]. Simulations of PRAMs on networked com-
puters have been developed by Mehlhorn and Vishkin [ 221 ], Upfal [ 340 ], Upfal and Wigder-
son [341], Karlin and Upfal [158], Alt, Hagerup, Mehlhorn, and Preparata [ 19 ], and Ranade
[ 267 ]. Cypher and Plaxton [ 84 ] have developed a deterministic O (log p log log p ) -step sort-
ing algorithm for the hypercube. However, it is superior to Batcher's algorithm only for very
large and impractical values of p .
The bulk synchronous parallel (BSP) model [ 348 ] has been proposed as a bridging model
between the needs of programmers and parallel machines. The LogP model [ 83 ] is offered as
a more realistic variant of the BSP model. Juurlink and Wijshoff [ 154 ]andBilardi,Herley,
Pietracaprina, Pucci, and Spirakis [ 39 ] report empirical evidence that the BSP and LogP models
are about equally good as predictors of performance on real parallel computers.
Search WWH ::




Custom Search