7.1 Parallel Computational Models
A parallel computer is any computer that can perform more than one operation at a time.
By this definition almost every computer is a parallel computer. For example, in the pursuit
of speed, computer architects regularly perform multiple operations in each CPU cycle: they
execute several microinstructions per cycle and overlap input and output operations (I/O) (see
Chapter 11) with arithmetic and logical operations. Architects also design parallel computers
consisting either of several CPUs and memory units attached to a common bus or of a collection
of computers connected by a network. Clearly parallelism is common in computer science today.
However, several decades of research have shown that exploiting large-scale parallelism is
very hard. Standard algorithmic techniques and their corresponding data structures do not
parallelize well, necessitating the development of new methods. In addition, when parallelism
is sought through the undisciplined coordination of a large number of tasks, the sheer number
of simultaneous activities to which one human mind must attend can be so large that it is
often difficult to ensure the correctness of a program design. The problems of parallelism are
indeed daunting.
Small illustrations of this point are seen in Section 2.7.1, which presents an O(log n)-step,
O(n)-gate addition circuit that is considerably more complex than the ripple adder given in
Section 2.7. Similarly, the fast matrix inversion straight-line algorithm of Section 6.5.5 is more
complex than other such algorithms (see Section 6.5).
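
To make the contrast concrete, the following minimal Python sketch (an illustration, not code from the text) sets the serial ripple adder beside a carry-lookahead adder built from a parallel prefix computation on (generate, propagate) pairs, the idea behind the O(log n)-step circuit. The bit-list representation and function names are assumptions made for this example.

# Sketch contrasting a ripple adder, whose carries propagate serially
# through n stages, with the parallel-prefix idea behind an
# O(log n)-step adder. Bit lists are little-endian.

def ripple_add(a, b):
    """Serial addition: each carry depends on the previous one (n stages)."""
    carry, out = 0, []
    for x, y in zip(a, b):
        out.append(x ^ y ^ carry)
        carry = (x & y) | (carry & (x ^ y))
    return out, carry

def prefix_add(a, b):
    """Carry-lookahead addition via a parallel prefix computation on
    (generate, propagate) pairs."""
    n = len(a)
    g = [x & y for x, y in zip(a, b)]   # stage generates a carry
    p = [x ^ y for x, y in zip(a, b)]   # stage propagates a carry
    d = 1
    while d < n:
        # combine each (g, p) pair with the pair d positions earlier;
        # all n combinations in a round are independent of one another
        g = [g[i] | (p[i] & g[i - d]) if i >= d else g[i] for i in range(n)]
        p = [p[i] & p[i - d] if i >= d else p[i] for i in range(n)]
        d *= 2
    carry_in = [0] + g[:-1]             # carry into each position
    out = [(a[i] ^ b[i]) ^ carry_in[i] for i in range(n)]
    return out, g[-1]

assert ripple_add([1, 1, 0, 1], [1, 0, 1, 1]) == prefix_add([1, 1, 0, 1], [1, 0, 1, 1])

The doubling loop runs about log2 n times, and within each round the n pair-combinations are independent; a circuit realizes each round as a single level of gates, which is the source of the O(log n) depth.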
In this chapter we examine forms of parallelism that are more coarse-grained than is
typically found in circuits. We assume that a parallel computer consists of multiple processors
and memories but that each processor is primarily serial. That is, although a processor may
realize its instructions with parallel circuits, it typically executes only one or a small number of
instructions simultaneously. Thus, most of the parallelism exhibited by our parallel computer
is due to parallel execution by its processors.
We also describe a few programming styles that encourage a parallel style of programming
and offer promise for user acceptance. Finally, we present various methods of analysis that
have proven useful in either determining the parallel time needed for a problem or classifying
a problem according to its need for parallel time.
Given the doubling of CPU speed every two or three years, one may ask whether we can't
just wait until CPU performance catches up with demand. Unfortunately, the appetite for
speed grows faster than increases in CPU speed alone can meet. Today many problems, especially
those involving simulation of physical systems, require teraflop computers (those performing
10^12 floating-point operations per second (FLOPS)), but it is predicted that petaflop
computers (performing 10^15 FLOPS) will be needed. Achieving such high levels of performance
with a handful of CPUs may require CPU performance beyond what is physically possible at
reasonable prices.
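
A back-of-the-envelope estimate illustrates the scale; the workload figures below are hypothetical, chosen only to show why such simulations outrun a single processor.

# Hypothetical workload: a simulation on a 1000^3 grid, doing 10^3
# floating-point operations per grid point per time step, for 10^5 steps.

grid_points = 1000 ** 3          # 10^9 points
flops_per_point = 10 ** 3        # assumed work per point per step
time_steps = 10 ** 5

total_flops = grid_points * flops_per_point * time_steps   # 10^17 operations

for rate, name in [(10 ** 9, "gigaflop"), (10 ** 12, "teraflop"),
                   (10 ** 15, "petaflop")]:
    seconds = total_flops / rate
    print(f"{name} machine: {seconds:.0f} s = {seconds / 86400:.2f} days")

Under these assumed figures, a gigaflop uniprocessor needs roughly three years, a teraflop machine about a day, and a petaflop machine under two minutes.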
7.2 Memoryless Parallel Computers
The circuit is the premier parallel memoryless computational model: input data passes through
a circuit from inputs to outputs and disappears. A circuit is described by a directed acyclic
graph in which vertices are either input or computational vertices. Input values and the re-
sults of computations are drawn from a set associated with the circuit. (In the case of logic
circuits, this set is {0, 1}.)
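
A small Python sketch may help fix this model; the dictionary representation and the example gates below are assumptions for illustration only. Evaluating the DAG level by level shows that gates at the same depth are independent, so circuit depth corresponds to parallel time and gate count to total work.

# Sketch of the circuit model: a directed acyclic graph whose vertices
# are inputs or gates. Gates at the same depth are independent, so each
# level could fire in parallel; the number of levels measures parallel time.

from operator import and_, or_, xor

# A circuit over the set {0, 1}: each gate names its operation and the
# vertices its edges come from. (Hypothetical example circuit.)
inputs = {"x": 1, "y": 0, "z": 1}
gates = {
    "g1": (and_, ["x", "y"]),
    "g2": (xor,  ["y", "z"]),
    "g3": (or_,  ["g1", "g2"]),   # depends on two depth-1 gates
}

def evaluate(inputs, gates):
    """Evaluate gates in topological order; values pass through the
    circuit from inputs to outputs, with no memory retained."""
    value, depth = dict(inputs), {v: 0 for v in inputs}
    remaining = dict(gates)
    while remaining:
        ready = [g for g, (_, args) in remaining.items()
                 if all(a in value for a in args)]
        for g in ready:
            op, args = remaining.pop(g)
            value[g] = op(*(value[a] for a in args))
            depth[g] = 1 + max(depth[a] for a in args)
    return value, max(depth.values())

values, depth = evaluate(inputs, gates)
print(values["g3"], "computed at depth", depth)   # g3 = 1, depth 2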