In the category of distributed-memory systems, each processor has a private local
memory that is inaccessible to the others. The processors have some form of inter-
connection between them, ranging from dedicated networks with high throughput
and low latency to relatively slow Ethernet. The processors communicate with
each other by explicitly sending and receiving messages, which in a programming
language typically take the form of arrays of data values. Two typical categories of
distributed-memory systems are proprietary massively parallel computers and
cost-effective PC clusters.
For example, Ethernet-connected serial computers in a computer lab fall into the
latter category. We refer to Fig. 10.1 for a schematic overview of the shared-memory
and distributed-memory parallel architectures.
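To make the message-passing model concrete, below is a minimal sketch in C using MPI, a common choice for programming distributed-memory systems (the use of MPI here, as well as the array contents and message tag, are illustrative assumptions, not prescribed by the text). Process 0 explicitly sends an array of data values to process 1, which explicitly receives it.

/* Minimal message-passing sketch (illustrative): compile with
 * mpicc and run with mpirun -np 2. Process 0 sends an array of
 * doubles to process 1, which receives it. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    double values[4] = {1.0, 2.0, 3.0, 4.0};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* explicit send: the message is an array of data values */
        MPI_Send(values, 4, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* explicit receive, matching the send above */
        MPI_Recv(values, 4, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("Process 1 received %g %g %g %g\n",
               values[0], values[1], values[2], values[3]);
    }

    MPI_Finalize();
    return 0;
}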
There are, of course, parallel systems that fall in between the two main categories.
For example, a cluster of SMP machines is a hybrid system. A multicore-based
PC cluster is, strictly speaking, also a hybrid system where memory is distributed
among the PC nodes, while one or several multicore chips share the memory
within each node. Furthermore, the cores inside a node can be heterogeneous;
for example, general-purpose graphics processing units can be combined with
regular CPU cores to accelerate parallel computations. For a review of the world's
most powerful parallel computers, we refer the reader to the Top500 List [2].
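As a sketch of how such a hybrid system is often programmed, the fragment below combines MPI between the distributed-memory nodes with OpenMP threads inside each shared-memory node; the MPI+OpenMP combination is one common approach, assumed here for illustration.

/* Hybrid sketch (illustrative): MPI distributes work across
 * nodes, while OpenMP threads share memory within each node.
 * Compile with, e.g., mpicc -fopenmp. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, provided;

    /* request thread support so MPI and OpenMP can coexist */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    #pragma omp parallel
    {
        /* every MPI process spawns threads that share its memory */
        printf("MPI rank %d: OpenMP thread %d of %d\n",
               rank, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}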
Note also that parallel computing does not necessarily involve multiple proces-
sors. Modern microprocessors have long exploited hardware parallelism within a
single processor, in the form of instruction pipelining, multiple execution units, and
so on. The automatic exploitation of such parallelism, by the hardware and by
optimizing compilers, is also part of the reason why single-CPU computing speed
kept pace with Moore's law for several decades. However, the present chapter will
only address parallel computations that are enabled by running appropriate software
on multiple processors.
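As a small illustration of such within-processor parallelism (an assumed example, not taken from the text), the loop below keeps two independent partial sums; since the two addition chains do not depend on each other, a pipelined processor with multiple execution units can overlap them, even though the program itself is serial.

/* Two independent accumulators expose instruction-level
 * parallelism: the s0 and s1 addition chains can proceed
 * concurrently in the hardware. */
double array_sum(const double *a, int n)
{
    double s0 = 0.0, s1 = 0.0;
    int i;
    for (i = 0; i + 1 < n; i += 2) {
        s0 += a[i];      /* independent of the s1 chain */
        s1 += a[i + 1];  /* can overlap with the s0 addition */
    }
    if (i < n)           /* handle odd n */
        s0 += a[i];
    return s0 + s1;
}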
10.2 More About Parallel Computing
A parallel computer provides the technical possibility, but whether or not parallel
computing can be applied to a particular computational problem depends on the
existence of parallelism and how it can be exploited in a form suitable for the
parallel hardware.
Fig. 10.1 A schematic layout of the shared-memory (left) and distributed-memory
(right) parallel architectures. Left: several CPUs attached through a memory bus to
one global memory. Right: CPUs, each with its own private memory, connected by
a network.
 