Parallelism can be introduced at various levels. At the lowest level, it can be
added to the CPU chip, through pipelining and superscalar designs with multiple
functional units. It can also be added by having very long instruction words with
implicit parallelism. Special features can be added to a CPU to allow it to handle
multiple threads of control at once. Finally, multiple CPUs can be put together on
the same chip. Together, these features can pick up perhaps a factor of 10 in
performance over purely sequential designs.
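To make the instruction-level part of this picture concrete, the C sketch below sums an array in two ways. It is only an illustration, not code from the text: the first loop forms one long chain of dependent additions, while the second keeps four independent partial sums, giving a pipelined, superscalar CPU with multiple functional units independent work to issue in each cycle. Any actual gain depends on the compiler and the particular microarchitecture.

#include <stddef.h>

/* One long dependency chain: each addition must wait for the previous one,
   so pipelining and extra functional units help little. */
double sum_serial(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Four independent partial sums: the four additions in the loop body do not
   depend on one another, so a superscalar CPU can issue them to different
   functional units in the same cycle. */
double sum_unrolled(const double *a, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i;
    for (i = 0; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)          /* leftover elements */
        s0 += a[i];
    return (s0 + s1) + (s2 + s3);
}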
At the next level, extra CPU boards with additional processing capacity can be
added to a system. Usually, these plug-in CPUs have specialized functions, such
as network packet processing, multimedia processing, or cryptography. For
specialized applications, they can also gain a factor of perhaps 5 to 10.
However, to win a factor of a hundred or a thousand or a million, it is
necessary to replicate entire CPUs and to make them all work together efficiently.
This idea leads to large multiprocessors and multicomputers (cluster computers).
Needless to say, hooking up thousands of processors into a big system leads to its own
problems that need to be solved.
Finally, it is now possible to lash together entire organizations over the Internet
to form very loosely coupled compute grids. These systems are only starting to
emerge, but have interesting potential for the future.
When two CPUs or processing elements are close together, have a high
bandwidth and low delay between them, and are computationally intimate, they
are said to be tightly coupled. Conversely, when they are far apart, have a low
bandwidth and high delay and are computationally remote, they are said to be
loosely coupled. In this chapter we will look at the design principles for these various forms
of parallelism and study a variety of examples. We will start with the most tightly
coupled systems, those that use on-chip parallelism, and gradually move to more
and more loosely coupled systems, ending with a few words on grid computing.
This spectrum is crudely illustrated in Fig. 8-1.
The whole issue of parallelism, from one end of the spectrum to the other, is a
hot topic of research. Accordingly, many references are given in this chapter,
primarily to recent papers on the subject. Many conferences and journals publish
papers on the subject as well, and the literature is growing rapidly.
8.1 ON-CHIP PARALLELISM
One way to increase the throughput of a chip is to have it do more things at the
same time. In other words, exploit parallelism. In this section, we will look at
some of the ways of achieving speed-up through parallelism at the chip level,
including instruction-level parallelism, multithreading, and putting more than one
CPU on the chip. These techniques are quite different, but each helps in its own
way. In all cases the idea is to get more activity going at the same time.
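As an illustration of the thread-level and multiple-CPU forms of this idea (a minimal sketch, not taken from the text), the POSIX-threads program below splits the summation of an array over two threads. A chip with two CPUs, or one CPU that can hold two threads of control at once, then has two independent streams of work to run simultaneously; the array size and thread count are arbitrary choices for the example.

#include <pthread.h>
#include <stdio.h>

#define N        1000000   /* arbitrary problem size for the example */
#define NTHREADS 2         /* assume two CPUs or hardware threads on the chip */

static double a[N];

struct task { int lo, hi; double sum; };   /* one slice of the array per thread */

static void *partial_sum(void *arg)
{
    struct task *t = arg;
    t->sum = 0.0;
    for (int i = t->lo; i < t->hi; i++)
        t->sum += a[i];
    return NULL;
}

int main(void)
{
    pthread_t tid[NTHREADS];
    struct task tasks[NTHREADS];
    double total = 0.0;

    for (int i = 0; i < N; i++)
        a[i] = 1.0;

    /* Start one worker thread per slice; the operating system can schedule
       them on different CPUs of the chip so both run at the same time. */
    for (int i = 0; i < NTHREADS; i++) {
        tasks[i].lo = i * (N / NTHREADS);
        tasks[i].hi = (i + 1) * (N / NTHREADS);
        pthread_create(&tid[i], NULL, partial_sum, &tasks[i]);
    }

    /* Wait for the workers and combine their partial results. */
    for (int i = 0; i < NTHREADS; i++) {
        pthread_join(tid[i], NULL);
        total += tasks[i].sum;
    }
    printf("total = %g\n", total);
    return 0;
}

Compiled with something like cc -pthread sum.c, the same program also runs correctly on a single-CPU chip; it simply gains nothing there, since only one thread can make progress at a time.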
 
 