PARALLEL COMPUTER ARCHITECTURES - Structured Computer Organization


Memory consistency is not a done deal. Researchers are still proposing new

models (Naeem et al., 2011, Sorin et al., 2011, and Tu et al., 2010).

8.3.3 UMA Symmetric Multiprocessor Architectures

The simplest multiprocessors are based on a single bus, as illustrated in

Fig. 8-26(a). Two or more CPUs and one or more memory modules all use the

same bus for communication. When a CPU wants to read a memory word, it first

checks to see whether the bus is busy. If the bus is idle, the CPU puts the address

of the word it wants on the bus, asserts a few control signals, and waits until the

memory puts the desired word on the bus.
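
As a rough illustration of the read sequence just described, the following Python sketch models the bus as an object with a busy flag. The class Bus, the method try_read, and the use of a dictionary as the memory are hypothetical names chosen for this example; real bus arbitration is done in hardware and is considerably more involved.

```python
class Bus:
    """Toy model of a single shared bus in front of one memory (illustrative only)."""

    def __init__(self, memory):
        self.memory = memory   # maps address -> word
        self.busy = False      # True while some CPU is using the bus

    def try_read(self, address):
        """Return the word at address, or None if the bus is busy."""
        if self.busy:
            return None        # the CPU must wait and try again later
        self.busy = True       # claim the bus: address and control signals go out
        try:
            return self.memory[address]   # memory puts the word on the bus
        finally:
            self.busy = False  # release the bus for the next CPU


bus = Bus({0x10: 42})
word = bus.try_read(0x10)
```

A caller that gets None back simply retries later, which corresponds to the CPU waiting for the bus to become idle.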

[Figure 8-26 diagram omitted. Panels (a), (b), (c); component labels: CPU, M, Cache, Bus, Shared memory, Private memory.]

Figure 8-26. Three bus-based multiprocessors. (a) Without caching. (b) With

caching. (c) With caching and private memories.

If the bus is busy when a CPU wants to read or write memory, the CPU just

waits until the bus becomes idle. Herein lies the problem with this design. With

two or three CPUs, contention for the bus will be manageable; with 32 or 64 it will

be unbearable. The system will be totally limited by the bandwidth of the bus, and

most of the CPUs will be idle most of the time.
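
A back-of-the-envelope model makes the scaling problem concrete. Assume, purely for illustration, that each CPU running at full speed would need the bus for a fixed fraction of its time. The function name and the 25 percent figure below are assumptions for this sketch, not measurements.

```python
def cpu_utilization(num_cpus: int, bus_fraction: float) -> float:
    """Fraction of time each CPU does useful work in a simple saturating-bus model.

    bus_fraction is the (assumed) share of its time one CPU would spend on the
    bus if it never had to wait.
    """
    demand = num_cpus * bus_fraction      # total bus time requested
    if demand <= 1.0:
        return 1.0                        # the bus can keep up; nobody waits
    return 1.0 / demand                   # bus saturated; CPUs share it equally


few = cpu_utilization(2, 0.25)
many = cpu_utilization(64, 0.25)
```

Under these assumptions, 2 CPUs run unhindered, while with 64 CPUs each one does useful work only about 6 percent of the time.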

The solution is to add a cache to each CPU, as depicted in Fig. 8-26(b). The

cache can be inside the CPU chip, next to the CPU chip, on the processor board, or

some combination of all three. Since many reads can now be satisfied out of the

local cache, there will be much less bus traffic, and the system can support more

CPUs. Thus caching is a big win here. However, as we shall see in a moment,

keeping the caches consistent with one another is not trivial.
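
To get a feel for how much caching helps, we can estimate how many CPUs a bus supports before it saturates. The sketch below assumes that only cache misses reach the bus and ignores writes and the traffic needed to keep caches consistent; the function name and the numbers are illustrative assumptions.

```python
from fractions import Fraction
import math


def max_cpus_before_saturation(bus_fraction: Fraction, miss_rate: Fraction) -> int:
    """Largest number of CPUs whose combined bus demand does not exceed 100%.

    bus_fraction: share of time an uncached CPU would occupy the bus (assumed).
    miss_rate:    fraction of references that miss the cache; must be > 0.
    """
    per_cpu_load = bus_fraction * miss_rate   # only misses go out on the bus
    return math.floor(1 / per_cpu_load)


no_cache = max_cpus_before_saturation(Fraction(1, 4), Fraction(1))
with_cache = max_cpus_before_saturation(Fraction(1, 4), Fraction(1, 10))
```

With these figures, a bus that saturates at 4 uncached CPUs could serve 40 CPUs whose caches miss on 10 percent of references.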

Yet another possibility is the design of Fig. 8-26(c), in which each CPU has not

only a cache but also a local, private memory which it accesses over a dedicated

(private) bus. To use this configuration optimally, the compiler should place all the

program text, strings, constants and other read-only data, stacks, and local

variables in the private memories. The shared memory is then used only for writable

shared variables. In most cases, this careful placement will greatly reduce bus

traffic, but it does require active cooperation from the compiler.
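
The placement rule can be stated in a few lines of code. The following is a minimal sketch, assuming the compiler knows for each item whether it is writable and whether it is shared among CPUs; the function choose_memory and the example items are hypothetical.

```python
def choose_memory(writable: bool, shared: bool) -> str:
    """Only writable shared variables go in shared memory; all else is private."""
    return "shared" if (writable and shared) else "private"


placement = {
    "program text":   choose_memory(writable=False, shared=True),
    "string literal": choose_memory(writable=False, shared=False),
    "stack":          choose_memory(writable=True,  shared=False),
    "local variable": choose_memory(writable=True,  shared=False),
    "shared counter": choose_memory(writable=True,  shared=True),
}
```

In this example only the shared counter ends up in shared memory, so it is the only item whose accesses use the common bus.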
