Parallel-NetCDF - High Performance Parallel I/O

FIGURE 15.1: Parallel-NetCDF sits just below applications in the I/O software stack. It provides a more application-oriented interface to the more general and complex MPI-IO library. Applications can describe I/O needs in terms of multi-dimensional arrays. Parallel-NetCDF turns these requests into MPI-IO collective I/O operations.
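
To make "describing I/O needs in terms of multi-dimensional arrays" concrete, here is a minimal sketch of how each process might work out which slab of a global array it owns. The function name `block_decompose` and the row-block decomposition are illustrative assumptions, not part of Parallel-NetCDF. In the C API, a `start`/`count` pair of this kind is what a collective call such as `ncmpi_put_vara_float_all` takes.

```python
def block_decompose(global_shape, nprocs, rank):
    """Return (start, count) for one rank.

    Hypothetical helper: splits the first dimension of a global array
    into contiguous blocks, one per process. Other dimensions are kept whole.
    """
    if not 0 <= rank < nprocs:
        raise ValueError("rank must be in [0, nprocs)")
    rows = global_shape[0]
    base, extra = divmod(rows, nprocs)
    # The first `extra` ranks each take one additional row.
    my_rows = base + (1 if rank < extra else 0)
    first_row = rank * base + min(rank, extra)
    start = (first_row,) + (0,) * (len(global_shape) - 1)
    count = (my_rows,) + tuple(global_shape[1:])
    return start, count


# Each rank describes its piece of a 10 x 4 array shared by 3 processes.
for r in range(3):
    print(r, block_decompose((10, 4), 3, r))
```

Every process supplies only its own `start` and `count`. Combining those per-process requests into efficient file accesses is the job the caption assigns to the library and to MPI-IO.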

simulations predict and reality bears out shrinks, the amount of data produced by these simulations grows. In 2012, DOE INCITE applications for time on high-end computing platforms routinely predicted needing terabytes of data. Second, hard drives double in capacity every 18 months, but the performance of a hard drive does not match that pace. In order to achieve high storage rates, many devices must be harnessed in parallel. The computational scientist looking to produce or analyze terabytes of data needs some way to manage parallelism in the I/O layer.

Parallelism in the I/O layer brings its own challenges. In order to provide high performance, storage systems deploy a large number of disks, servers, and storage links. Applications could operate directly upon these parallel storage systems. For the sake of developer productivity and I/O performance, however, I/O libraries exist to provide the abstractions and optimizations computational scientists need.

Stepping back briefly from the topic of performance, consider the broader role of high-level I/O libraries in the life of a modern computational scientist. Scientists operate in collaborations, exchanging datasets with other scientists who conduct experiments on machines with different architectural characteristics. Raw binary files or custom file formats complicate the collaboration story. Standard file formats, by contrast, mean every scientist can read data produced by any other scientist. Portability in file formats means that no matter the byte-endianness or word size of a machine, the data will always look the same. One can also imagine "portability over time": these I/O libraries provide routines to annotate the stored data, or the entire file itself. These annotations help put the data in context, describing when the data were produced, by what application, or something as mundane yet important as what units the data are in.
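
The byte-order point can be illustrated with a small sketch. The classic netCDF file format stores numeric values in big-endian order, so a file looks the same whatever machine wrote it. The helper names below are hypothetical and use only Python's standard `struct` module. This is a toy illustration of the idea, not Parallel-NetCDF's implementation.

```python
import struct


def encode_int32_portable(values):
    """Pack integers as 4-byte big-endian words, whatever the host byte order.

    The ">" prefix forces big-endian layout and the standard 4-byte size
    for "i", so the result does not depend on the machine's word size.
    """
    return struct.pack(">%di" % len(values), *values)


def decode_int32_portable(raw):
    """Inverse of encode_int32_portable."""
    n = len(raw) // 4
    return list(struct.unpack(">%di" % n, raw))


data = [1, 258, -1]
raw = encode_int32_portable(data)
print(raw.hex())                   # same bytes on any host
print(decode_int32_portable(raw))  # round-trips to the original values
```

Because the on-disk layout is fixed by the format rather than by the writing machine, a reader on a little-endian system and a reader on a big-endian system decode the same values from the same bytes.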
