has the same access time to every memory module. In other words, every memory
word can be read as fast as every other memory word. If this is technically
impossible, the fastest references are slowed down to match the slowest ones, so
programmers do not see the difference. This is what "uniform" means here. This
uniformity makes the performance predictable, an important factor for writing
efficient code.
In contrast, in a NUMA multiprocessor, this property does not hold. Often
there is a memory module close to each CPU and accessing that memory module is
much faster than accessing distant ones. The result is that for performance
reasons, it matters where code and data are placed. COMA machines are also
nonuniform, but in a different way. We will study each of these types and their
subcategories in detail later.
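As a rough illustration of the distinction, the following Python sketch models access time in the two designs. The latency figures, the function names, and the assumption that module i sits next to CPU i are all invented for the example; they are not measurements of any real machine.

```python
# Toy model of memory access time. All latencies are hypothetical
# values in nanoseconds, chosen only for illustration.

def uma_access_time(module_latencies):
    """In a UMA design every reference takes the same time. If the modules
    differ, the fast ones are slowed down to match the slowest."""
    return max(module_latencies)

def numa_access_time(cpu, module, local_ns, remote_ns):
    """In a NUMA design the module close to the CPU is faster than the
    distant ones (assumption: module i is the one close to CPU i)."""
    return local_ns if cpu == module else remote_ns

def numa_average(cpu, accesses, local_ns, remote_ns):
    """Average time for a list of module numbers referenced by one CPU."""
    total = sum(numa_access_time(cpu, m, local_ns, remote_ns) for m in accesses)
    return total / len(accesses)

# UMA: every reference costs the same, whichever module it touches.
uma_ns = uma_access_time([80, 90, 100])

# NUMA: CPU 0 with its data mostly in module 0, then mostly elsewhere.
good_placement = numa_average(0, [0, 0, 0, 1], local_ns=80, remote_ns=300)
poor_placement = numa_average(0, [1, 2, 3, 0], local_ns=80, remote_ns=300)
print(uma_ns, good_placement, poor_placement)
```

With these made-up numbers the UMA time is the same for every reference, while the NUMA average depends entirely on how many references land in the local module, which is why placement of code and data matters.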
The other main category of MIMD machines consists of the multicomputers,
which, unlike the multiprocessors, do not have shared primary memory at the
architectural level. In other words, the operating system on a multicomputer CPU
cannot access memory attached to a different CPU by just executing a LOAD
instruction. It has to send an explicit message and wait for an answer. The ability of
the operating system to read a distant word by just doing a LOAD is what
distinguishes multiprocessors from multicomputers. As we mentioned before, even on a
multicomputer, user programs may have the ability to access remote memory by
using LOAD and STORE instructions, but this illusion is supported by the operating
system, not the hardware. This difference is subtle, but very important. Because
multicomputers do not have direct access to remote memory, they are sometimes
called NORMA (NO Remote Memory Access) machines.
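The contrast between a direct LOAD and an explicit message can be sketched in Python. This is only an analogy under stated assumptions: the class names, method names, and the use of queues and a thread to stand in for the interconnect are hypothetical, and real machines implement the distinction in hardware.

```python
# Toy contrast between the two MIMD categories. The classes and method
# names are hypothetical and exist only for this illustration.
from queue import Queue
from threading import Thread

class Multiprocessor:
    """All CPUs share one primary memory, so any CPU can read any word
    directly. This models executing a LOAD instruction."""
    def __init__(self, size):
        self.memory = [0] * size

    def load(self, address):
        return self.memory[address]

class Node:
    """One CPU of a multicomputer with its own private memory. A remote
    word is reachable only by sending a message and waiting for an answer."""
    def __init__(self, size):
        self.memory = [0] * size
        self.requests = Queue()

    def serve_one(self):
        # Answer a single read request sent by another node.
        address, reply = self.requests.get()
        reply.put(self.memory[address])

    def remote_read(self, other, address):
        reply = Queue()
        other.requests.put((address, reply))  # send an explicit message
        return reply.get()                    # block until the answer arrives

# Multiprocessor: one shared memory, read with a plain load.
mp = Multiprocessor(16)
mp.memory[5] = 42
shared_value = mp.load(5)

# Multicomputer: node a must ask node b for the word.
a, b = Node(16), Node(16)
b.memory[5] = 42
server = Thread(target=b.serve_one)
server.start()
remote_value = a.remote_read(b, 5)
server.join()
print(shared_value, remote_value)
```

Both reads return the same value, but only the second involves a request, a reply, and a wait, which is the cost the NORMA name refers to.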
Multicomputers can be roughly divided into two categories. The first contains
the MPPs (Massively Parallel Processors), which are expensive supercomputers
consisting of many CPUs tightly coupled by a high-speed proprietary
interconnection network. The IBM SP/3 is a well-known commercial example.
The other category consists of regular PCs, workstations, or servers, possibly
rack mounted, and connected by commercial off-the-shelf interconnection
technology. Logically, there is not much difference, but huge supercomputers costing
many millions of dollars are used differently than networks of PCs assembled by
the users for a fraction of the price of an MPP. These home-brew machines go by
various names, including NOW (Network of Workstations), COW (Cluster of
Workstations), or sometimes just cluster.
8.3.2 Memory Semantics
Even though all multiprocessors present the CPUs with the image of a single
shared address space, often many memory modules are present, each holding some
portion of the physical memory. The CPUs and memories are often connected by a
complex interconnection network, as discussed in Sec. 8.1.2. Several CPUs may
be attempting to read a memory word at the same time several other CPUs are
 
 