Parallelism can be introduced at various levels. At the lowest level, it can be
added to the CPU chip, through pipelining and superscalar designs with multiple
functional units. It can also be added by having very long instruction words with
implicit parallelism. Special features can be added to a CPU to allow it to handle
multiple threads of control at once. Finally, multiple CPUs can be put together on
the same chip. Together, these features can pick up perhaps a factor of 10 in
performance over purely sequential designs.
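To make the instruction-level part of this picture concrete, the C sketch below sums an array in two ways. It is only an illustration, not code from the text: the first loop forms one long chain of dependent additions, while the second keeps four independent partial sums, giving a pipelined, superscalar CPU with multiple functional units independent work to issue in each cycle. Any actual gain depends on the compiler and the particular microarchitecture.

#include <stddef.h>

/* One long dependency chain: each addition must wait for the previous one,
   so pipelining and extra functional units help little. */
double sum_serial(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Four independent partial sums: the four additions in the loop body do not
   depend on one another, so a superscalar CPU can issue them to different
   functional units in the same cycle. */
double sum_unrolled(const double *a, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i;
    for (i = 0; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)          /* leftover elements */
        s0 += a[i];
    return (s0 + s1) + (s2 + s3);
}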
At the next level, extra CPU boards with additional processing capacity can be
added to a system. Usually, these plug-in CPUs have specialized functions, such
as network packet processing, multimedia processing, or cryptography. For
specialized applications, they can also gain a factor of perhaps 5 to 10.
However, to win a factor of a hundred or a thousand or a million, it is
necessary to replicate entire CPUs and to make them all work together efficiently.
This idea leads to large multiprocessors and multicomputers (cluster computers).
Needless to say, hooking up thousands of processors into a big system leads to its own
problems that need to be solved.
Finally, it is now possible to lash together entire organizations over the Internet
to form very loosely coupled compute grids. These systems are only starting to
emerge, but have interesting potential for the future.
When two CPUs or processing elements are close together, have a high
bandwidth and low delay between them, and are computationally intimate, they
are said to be tightly coupled. Conversely, when they are far apart, have a low
bandwidth and high delay and are computationally remote, they are said to be
loosely coupled. In this chapter we will look at the design principles for these various forms
of parallelism and study a variety of examples. We will start with the most tightly
coupled systems, those that use on-chip parallelism, and gradually move to more
and more loosely coupled systems, ending with a few words on grid computing.
This spectrum is crudely illustrated in Fig. 8-1.
The whole issue of parallelism, from one end of the spectrum to the other, is a
hot topic of research. Accordingly, many references are given in this chapter,
primarily to recent papers on the subject. Many conferences and journals publish
papers on the subject as well, and the literature is growing rapidly.
8.1 ON-CHIP PARALLELISM
One way to increase the throughput of a chip is to have it do more things at the
same time. In other words, exploit parallelism. In this section, we will look at
some of the ways of achieving speed-up through parallelism at the chip level,
including instruction-level parallelism, multithreading, and putting more than one
CPU on the chip. These techniques are quite different, but each helps in its own
way. In all cases the idea is to get more activity going at the same time.
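As an illustration of the thread-level and multiple-CPU forms of this idea (a minimal sketch, not taken from the text), the POSIX-threads program below splits the summation of an array over two threads. A chip with two CPUs, or one CPU that can hold two threads of control at once, then has two independent streams of work to run simultaneously; the array size and thread count are arbitrary choices for the example.

#include <pthread.h>
#include <stdio.h>

#define N        1000000   /* arbitrary problem size for the example */
#define NTHREADS 2         /* assume two CPUs or hardware threads on the chip */

static double a[N];

struct task { int lo, hi; double sum; };   /* one slice of the array per thread */

static void *partial_sum(void *arg)
{
    struct task *t = arg;
    t->sum = 0.0;
    for (int i = t->lo; i < t->hi; i++)
        t->sum += a[i];
    return NULL;
}

int main(void)
{
    pthread_t tid[NTHREADS];
    struct task tasks[NTHREADS];
    double total = 0.0;

    for (int i = 0; i < N; i++)
        a[i] = 1.0;

    /* Start one worker thread per slice; the operating system can schedule
       them on different CPUs of the chip so both run at the same time. */
    for (int i = 0; i < NTHREADS; i++) {
        tasks[i].lo = i * (N / NTHREADS);
        tasks[i].hi = (i + 1) * (N / NTHREADS);
        pthread_create(&tid[i], NULL, partial_sum, &tasks[i]);
    }

    /* Wait for the workers and combine their partial results. */
    for (int i = 0; i < NTHREADS; i++) {
        pthread_join(tid[i], NULL);
        total += tasks[i].sum;
    }
    printf("total = %g\n", total);
    return 0;
}

Compiled with something like cc -pthread sum.c, the same program also runs correctly on a single-CPU chip; it simply gains nothing there, since only one thread can make progress at a time.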
 
 