FIGURE 15.1: Parallel-NetCDF sits just below applications in the I/O
software stack. It provides a more application-oriented interface to the more
general and complex MPI-IO library. Applications can describe I/O needs in
terms of multi-dimensional arrays. Parallel-NetCDF turns these requests into
MPI-IO collective I/O operations.
simulations predict and reality bears out shrinks, the amount of data produced
by these simulations grows. In 2012, DOE INCITE applications for time on
high-end computing platforms routinely predicted needing terabytes of data.
Second, hard drives double in capacity every 18 months, but the performance
of a hard drive does not match that pace. In order to achieve high storage
rates, many devices must be harnessed in parallel. The computational scientist
looking to produce or analyze terabytes of data needs some way to manage
parallelism in the I/O layer.
Parallelism in the I/O layer brings its own challenges. In order to provide
high performance, storage systems deploy a large number of disks, servers, and
storage links. Applications could operate directly upon these parallel storage
systems. For the sake of developer productivity and I/O performance, however,
I/O libraries exist to provide the abstractions and optimizations
computational scientists need.
Stepping back briefly from the topic of performance, consider the broader
role of high-level I/O libraries in the life of a modern computational
scientist. Scientists operate in collaborations, exchanging datasets with
other scientists who conduct experiments on machines with different
architectural characteristics. Raw binary files or custom file formats
complicate the collaboration story. Standard file formats, by contrast, mean
every scientist can read data produced by any other scientist. Portability in
file formats means that no matter the byte-endianness or word size of a
machine, the data will always look the same. One can also imagine
"portability over time": these I/O libraries provide routines to annotate
the stored data, or the entire file itself.
These annotations help put the data in context, describing when the data was
produced, by what application, or something as mundane yet important as
what units the data are in.
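
The portability point can be made concrete with a small sketch. The record
layout below is hypothetical and chosen only for illustration; it is not the
netCDF file format. The function names encode_values and decode_values are
likewise assumptions. The idea is that when the byte order and the width of
every field are fixed in the format, the bytes on disk are the same no matter
which machine wrote them.

```python
import struct

# Hypothetical layout (an assumption for illustration, not a real file format):
# a 4-byte signed count, then that many 8-byte IEEE 754 doubles.
# The ">" prefix forces big-endian byte order and fixed field sizes,
# regardless of the byte order or word size of the host machine.


def encode_values(values):
    """Serialize a sequence of floats in the fixed big-endian layout."""
    header = struct.pack(">i", len(values))
    body = struct.pack(">%dd" % len(values), *values)
    return header + body


def decode_values(buf):
    """Read the count, then that many doubles, from the fixed layout."""
    (count,) = struct.unpack_from(">i", buf, 0)
    return list(struct.unpack_from(">%dd" % count, buf, 4))


if __name__ == "__main__":
    record = encode_values([1.5, -2.25])
    print(record.hex())
    print(decode_values(record))
```

A reader on any architecture can decode such a record without knowing where
it was written. A high-level I/O library takes on this kind of bookkeeping,
so the application works with arrays and annotations rather than byte layouts.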
 