In the exotic world of national laboratory computing, the bottom line is
performance. In this respect, BlueGene/P wins, 1000 TF/sec to 124 TF/sec, but Red
Storm was designed to be expandable, so by throwing more Opterons at the problem,
Sandia could probably up its performance significantly. IBM could respond
by cranking the clock up a bit (850 MHz is not really pushing the state of the art
very hard). In short, MPP supercomputers have not even come close to any physical
limits yet and will continue growing for years to come.
8.4.3 Cluster Computing
The other style of multicomputer is the cluster computer (Anderson et al.,
1995, and Martin et al., 1997). It typically consists of hundreds or thousands of
PCs or workstations connected by a commercially available network board. The
difference between an MPP and a cluster is analogous to the difference between a
mainframe and a PC. Both have a CPU, both have RAM, both have disks, both
have an operating system, and so on. The mainframe just has faster ones (except
maybe the operating system). Yet qualitatively they feel different and are used and
managed differently. This same difference holds for MPPs vs. clusters.
Historically, the key element that made MPPs special was their high-speed
interconnect, but the recent arrival of commercial, off-the-shelf, high-speed
interconnects has begun to close the gap. All in all, clusters are likely to drive
MPPs into ever tinier niches, just as PCs have turned mainframes into esoteric
specialty items. The main niche for MPPs is high-budget supercomputers, where peak
performance is everything and if you have to ask the price you cannot afford one.
While many kinds of clusters exist, two species dominate: centralized and
decentralized. A centralized cluster is a cluster of workstations or PCs mounted in
a big rack in a single room. Sometimes they are packaged in a much more compact
way than usual to reduce physical size and cable length. Typically, the machines
are homogeneous and have no peripherals other than network cards and
possibly disks. Gordon Bell, the designer of the PDP-11 and VAX, has called such
machines headless workstations (because they have no owners). We were
tempted to call them headless COWs, but feared such a term would gore too many
holy cows, so we refrained.
Decentralized clusters consist of workstations or PCs spread around a
building or campus. Most of them are idle many hours a day, especially at night.
Usually, these are connected by a LAN. Typically, they are heterogeneous and
have a full complement of peripherals, although having a cluster with 1024 mice is
really not much better than a cluster with 0 mice. Most importantly, many of them
have owners who have emotional attachments to their machines and tend to frown
upon some astronomer trying to simulate the big bang on theirs. Using idle
workstations to form a cluster invariably means having some way to migrate jobs
off machines when their owners want to reclaim them. Job migration is possible
but adds software complexity.
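
To make the bookkeeping behind job migration concrete, here is a minimal
sketch in Python of a scheduler for a decentralized cluster. It is a toy
simulation under stated assumptions, not a description of any real system:
the names Workstation, IdleCluster, submit, owner_returns, and owner_leaves
are hypothetical, a "job" is just a string, and "migration" only moves that
string to another idle machine. A real system would also have to checkpoint
and transfer the job's state, which is where most of the added software
complexity lies.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Workstation:
    """One machine in the decentralized cluster (hypothetical model)."""
    name: str
    owner_active: bool = False   # True while the owner is using the machine
    job: Optional[str] = None    # guest job currently placed here, if any


@dataclass
class IdleCluster:
    """Places guest jobs on idle workstations and migrates them on demand."""
    machines: list[Workstation]
    waiting: list[str] = field(default_factory=list)  # jobs with no host yet

    def _by_name(self, name: str) -> Workstation:
        for m in self.machines:
            if m.name == name:
                return m
        raise KeyError(name)

    def _find_idle(self) -> Optional[Workstation]:
        # A machine is usable only if its owner is away and it has no guest job.
        for m in self.machines:
            if not m.owner_active and m.job is None:
                return m
        return None

    def submit(self, job: str) -> None:
        host = self._find_idle()
        if host is None:
            self.waiting.append(job)   # no idle machine: queue the job
        else:
            host.job = job

    def owner_returns(self, name: str) -> None:
        # The owner reclaims the machine: evict the guest job and try to
        # place it elsewhere (this stands in for real job migration).
        m = self._by_name(name)
        m.owner_active = True
        if m.job is not None:
            evicted, m.job = m.job, None
            self.submit(evicted)

    def owner_leaves(self, name: str) -> None:
        # The machine becomes idle again: hand it the oldest waiting job.
        m = self._by_name(name)
        m.owner_active = False
        if m.job is None and self.waiting:
            m.job = self.waiting.pop(0)


if __name__ == "__main__":
    cluster = IdleCluster([Workstation("a"), Workstation("b")])
    cluster.submit("bigbang")          # placed on the first idle machine
    cluster.owner_returns("a")         # job is moved off machine a
    for m in cluster.machines:
        print(m.name, m.owner_active, m.job)
```

In this sketch a job that cannot be placed simply waits in a queue until some
owner leaves. That is one possible policy among several; a design could
instead suspend the job in place or hand it to a centralized cluster.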