the new technology. Spinning disks also offered new opportunities for direct
access, and B-trees and other structures were invented that made searching,
sampling, and subsetting much easier and faster.
As systems evolved, people employed combinations of these techniques,
mixing and matching particular methods in ways that worked best for the
architecture on which a system resided, and for the applications most likely to
run on the system. In many cases these combinations were provided as different
software layers. One layer might aggregate a collection of data elements into
a buffer, another might compress the data, and another might write it out in
a certain format. The same package might use the same steps in reverse to
read the data. Thus, perhaps, emerged the idea of the I/O stack.
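The layering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the design of any particular I/O library: the function names, the `MAGIC` tag, and the tiny header layout are all hypothetical, invented for this example, and zlib stands in for whatever compression a real stack might use.

```python
import io
import struct
import zlib

MAGIC = b"IOS1"  # hypothetical 4-byte format tag, invented for this sketch


def aggregate(values):
    """Layer 1: pack a collection of floats into one contiguous buffer."""
    return struct.pack(f"<{len(values)}d", *values)


def disaggregate(buffer):
    """Layer 1 in reverse: unpack the buffer back into a list of floats."""
    count = len(buffer) // 8  # each little-endian double occupies 8 bytes
    return list(struct.unpack(f"<{count}d", buffer))


def write_formatted(stream, payload):
    """Layer 3: write a small header (tag + payload length), then the payload."""
    stream.write(MAGIC)
    stream.write(struct.pack("<Q", len(payload)))
    stream.write(payload)


def read_formatted(stream):
    """Layer 3 in reverse: check the header and return the raw payload."""
    if stream.read(4) != MAGIC:
        raise ValueError("unrecognized format")
    (length,) = struct.unpack("<Q", stream.read(8))
    return stream.read(length)


def write_stack(stream, values):
    """Writing: aggregate, then compress (layer 2), then format."""
    write_formatted(stream, zlib.compress(aggregate(values)))


def read_stack(stream):
    """Reading: the same three steps applied in reverse order."""
    return disaggregate(zlib.decompress(read_formatted(stream)))


if __name__ == "__main__":
    stream = io.BytesIO()  # in-memory stand-in for a file on disk
    write_stack(stream, [1.0, 2.5, -3.25])
    stream.seek(0)
    print(read_stack(stream))  # [1.0, 2.5, -3.25]
```

Because each layer only sees the output of the layer beneath it, any one of them (the compressor, say) could be swapped out without touching the others, which is the practical appeal of organizing I/O as a stack.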
Over time, software packages were created that included some of these
combinations. As often as not, they were motivated by application developers
who didn't want to be bothered with such details, but did want to get as
good performance as they could on their systems. These solutions were made
available in the form of I/O libraries.
12.2 A Recent History of I/O Libraries, by Example
What you will read in the next six chapters is a snapshot covering the last
two decades in the history of I/O and high-end computing systems. These I/O
libraries employ pretty much the same basic principles we have seen for more
than half a century, but in new and clever ways that address today's
architectures and applications, and do their best to anticipate the next generation
of architectures and applications.
MPI-IO anticipated the rapid growth of highly parallel systems and parallel
file systems. The MPI-IO library is a middle layer that provides options to
support the different kinds of I/O that applications perform, and hides the details
from applications or their I/O libraries. Although the other I/O libraries in
this section can read and write without using MPI-IO, most of them provide
options that enable applications to take advantage of this powerful tool.
The Parallel Log-structured File System (PLFS) was designed for massively
concurrent checkpointing, but ultimately became an effective way to
adapt a variety of workloads and I/O patterns to a single storage system,
hiding the details from applications. Moreover, PLFS is positioned to support
high performance I/O in the future by showing it can accommodate GPU
processing in the data pipeline, take advantage of burst buffers, and play well
with cloud file systems.
Parallel-NetCDF is perhaps the most user-friendly of the parallel I/O
libraries because it adapts an existing serial I/O library to parallel I/O without
asking the user to adapt their data model or their format in any way. Because
netCDF is one of the most popular of all scientific data formats, the availabil-
 