Memory consistency is not a done deal. Researchers are still proposing new
models (Naeem et al., 2011; Sorin et al., 2011; Tu et al., 2010).
8.3.3 UMA Symmetric Multiprocessor Architectures
The simplest multiprocessors are based on a single bus, as illustrated in
Fig. 8-26(a). Two or more CPUs and one or more memory modules all use the
same bus for communication. When a CPU wants to read a memory word, it first
checks to see whether the bus is busy. If the bus is idle, the CPU puts the address
of the word it wants on the bus, asserts a few control signals, and waits until the
memory puts the desired word on the bus.
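The read transaction can be sketched as a small cycle-counting model. This is only an illustration under stated assumptions: the class name Bus, the free_at field, and the three-cycle MEMORY_LATENCY are hypothetical and do not come from the text.

```python
MEMORY_LATENCY = 3  # hypothetical: cycles from address on the bus to word on the bus


class Bus:
    """A single shared bus that carries one memory transaction at a time."""

    def __init__(self, memory):
        self.memory = memory   # maps address -> word
        self.free_at = 0       # first cycle at which the bus is idle again

    def read(self, address, now):
        """Read one word; return (word, cycle at which the word arrives)."""
        start = max(now, self.free_at)   # if the bus is busy, wait until it is idle
        done = start + MEMORY_LATENCY    # address and control signals out, word back
        self.free_at = done              # the bus is occupied for the whole transaction
        return self.memory[address], done


bus = Bus({0x10: 42, 0x14: 7})
print(bus.read(0x10, now=0))  # (42, 3)
print(bus.read(0x14, now=1))  # (7, 6)
```

The second read is issued at cycle 1 but finds the bus occupied until cycle 3, so its word does not arrive until cycle 6.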
[Figure 8-26 diagram: panels (a), (b), and (c); labels: CPU, M, Cache, Bus, Shared memory, Private memory]
Figure 8-26. Three bus-based multiprocessors. (a) Without caching. (b) With
caching. (c) With caching and private memories.
If the bus is busy when a CPU wants to read or write memory, the CPU just
waits until the bus becomes idle. Herein lies the problem with this design. With
two or three CPUs, contention for the bus will be manageable; with 32 or 64 it will
be unbearable. The system will be totally limited by the bandwidth of the bus, and
most of the CPUs will be idle most of the time.
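A rough saturation model makes the scaling problem concrete. It is a crude upper bound, not a measurement: the function name cpu_utilization and the figure of 25 percent bus occupancy per CPU are assumptions chosen for illustration, and queueing delays below saturation are ignored.

```python
def cpu_utilization(n_cpus, bus_fraction):
    """Upper bound on the fraction of time each CPU can do useful work.

    bus_fraction is the assumed fraction of time one CPU would occupy the
    bus if it ran alone. The bus serves one CPU at a time, so once total
    demand exceeds 1.0 the excess shows up as waiting.
    """
    demand = n_cpus * bus_fraction
    return 1.0 if demand <= 1.0 else 1.0 / demand


for n in (2, 3, 32, 64):
    print(n, cpu_utilization(n, 0.25))
```

Under these assumptions, 2 or 3 CPUs stay below saturation, while each of 32 CPUs can be busy at most 12.5 percent of the time and each of 64 CPUs at most 6.25 percent.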
The solution is to add a cache to each CPU, as depicted in Fig. 8-26(b). The
cache can be inside the CPU chip, next to the CPU chip, on the processor board, or
some combination of all three. Since many reads can now be satisfied out of the
local cache, there will be much less bus traffic, and the system can support more
CPUs. Thus caching is a big win here. However, as we shall see in a moment,
keeping the caches consistent with one another is not trivial.
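The effect of a cache on bus traffic can be illustrated by counting how many reads in a trace have to use the bus. This is a minimal sketch with assumptions of its own: a direct-mapped cache holding one word per line, a made-up trace, and no writes, so the consistency problem does not arise here.

```python
def count_bus_reads(trace, cache_lines=0):
    """Return how many reads in `trace` must go over the shared bus.

    cache_lines == 0 models a CPU with no cache: every read uses the bus.
    cache_lines > 0 models a private direct-mapped cache, one word per line.
    """
    cache = [None] * cache_lines      # cache[i] is the address held in line i
    bus_reads = 0
    for addr in trace:
        if cache_lines:
            line = addr % cache_lines
            if cache[line] == addr:
                continue              # hit: satisfied locally, no bus traffic
            cache[line] = addr        # miss: fill the line from memory
        bus_reads += 1                # this read goes over the bus
    return bus_reads


trace = list(range(8)) * 100          # hypothetical loop re-reading 8 words
print(count_bus_reads(trace))         # 800
print(count_bus_reads(trace, 16))     # 8
```

For this trace only the first pass over the eight words misses, so bus reads drop from 800 to 8.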
Yet another possibility is the design of Fig. 8-26(c), in which each CPU has not
only a cache but also a local, private memory which it accesses over a dedicated
(private) bus. To use this configuration optimally, the compiler should place all the
program text, strings, constants and other read-only data, stacks, and local
variables in the private memories. The shared memory is then used only for writable
shared variables. In most cases, this careful placement will greatly reduce bus
traffic, but it does require active cooperation from the compiler.
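The placement rule can be written down as a one-line decision. The sketch below is hypothetical: the Segment type and the place function are invented for illustration, and it assumes each CPU can keep its own copy of read-only items.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    writable: bool   # can the program modify it?
    shared: bool     # is it visible to more than one CPU?


def place(segment):
    """Only writable shared variables go to shared memory; the rest stays private."""
    return "shared" if (segment.writable and segment.shared) else "private"


segments = [
    Segment("program text", writable=False, shared=True),
    Segment("constants", writable=False, shared=True),
    Segment("stack", writable=True, shared=False),
    Segment("shared counter", writable=True, shared=True),
]
for s in segments:
    print(s.name, "->", place(s))
```

Only the shared counter ends up in shared memory, so it is the only item in this list whose accesses use the common bus.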
 
 