archive in order to understand bandwidth requirements. If parallel file sys-
tems built from disk technology alone are not economical due to the massive
burst bandwidth needed, will parallel archives, which are based on tape, be
economical? HPC sites depend on striping across tapes just like parallel file
systems depend on striping across disks. Just as disks are not economically
priced for bandwidth, neither are tape solutions. Since bandwidth in a tape
environment does not come from the tape media or cartridges (it comes from
the tape drives) and since tape drives are quite expensive compared to disks
(30-100 times the price), there is a distinct possibility that total cost of
ownership for parallel tape solutions will be driven more by tape drive costs
than by tape media costs (that is, by bandwidth rather than by capacity).
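The bandwidth-versus-capacity argument above can be made concrete with a rough cost split. The following is a minimal sketch, not a real total-cost-of-ownership model: it counts only hardware purchase cost, and the function name, parameters, and all numeric values in the usage note are hypothetical and chosen purely for illustration.

```python
import math


def tape_archive_cost(required_bw_mb_s, required_capacity_tb,
                      drive_bw_mb_s, drive_price,
                      cartridge_capacity_tb, cartridge_price):
    """Split a tape archive's hardware cost into a bandwidth part and a capacity part.

    Bandwidth comes only from drives, capacity only from cartridges, so the two
    requirements are sized independently. All inputs are assumptions supplied
    by the caller; power, floor space, robotics, and staffing are ignored.
    """
    # Number of drives needed to stripe up to the required aggregate bandwidth.
    drives = math.ceil(required_bw_mb_s / drive_bw_mb_s)
    # Number of cartridges needed to hold the required capacity.
    cartridges = math.ceil(required_capacity_tb / cartridge_capacity_tb)

    drive_cost = drives * drive_price
    media_cost = cartridges * cartridge_price
    total = drive_cost + media_cost

    return {
        "drives": drives,
        "cartridges": cartridges,
        "drive_cost": drive_cost,
        "media_cost": media_cost,
        "drive_fraction": drive_cost / total,
    }
```

As a usage example with made-up figures, calling `tape_archive_cost(100_000, 100_000, 250, 10_000, 5, 50)` sizes an archive for 100 GB/s and 100 PB at 400 drives and 20,000 cartridges, with drives accounting for 80% of the hardware cost. Raising the bandwidth requirement while holding capacity fixed pushes that fraction higher, which is the effect the text describes.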
6.3 Workloads and Applications
6.3.1 Applications and Their Use of Storage
Since one of the primary missions of LANL is science-based nuclear
weapons stockpile stewardship, it is no surprise that the bulk of the com-
putation done utilizing the HPC environments at LANL is in support of that
mission. There are two basic categories of applications run at LANL. One
category is single physics science applications, like molecular dynamics and
particle-in-cell materials or plasma-type codes, which are typically a few hun-
dred thousand lines of code and run at all scales. The second category is multi-
physics integrated weapons applications, which are complicated multi-physics
and multi-package applications that are often many million lines of code and are
again run at all scales. The bulk of the cycles at LANL are used by the large
multi-physics codes providing science-based decision support for important
nuclear weapon stockpile issues. Because of the dominance of the integrated
weapons codes, LANL is perhaps more of a production-computing site and
less of a computational science experimental site. Further, LANL is heavily
engaged in validation and verification runs that quantify potential errors in
calculations, in support of the decision-support role of the science work being done.
Many of the capacity runs, ranging from 100 to 10,000 cores, are parameter
sweeps to gain statistical validity for an ensemble of calculations to understand
a weapon phenomenon. Typical large runs can be on 1/3 to 2/3 of the
largest capability machines, lasting for weeks to many months. Typical runs
on capacity systems can last for hours or days, but there can be thousands of
these runs in a logical set of decision support calculations. Because of the na-
ture of its mission, LANL sees users from Los Alamos, Sandia, and Lawrence
Livermore National Laboratories running weapons stockpile stewardship
calculations. The number of large-cycle users is small, about 20-30 people. The
number of codes is also small, around 10-20 codes, but the bulk of the cycles
are used by 4-5 codes.
 