studies. Previous system-wide workload studies [9] were very influential in
HPC I/O research but no longer reflected the scale, architecture, and scien-
tific application diversity of present-day systems. Collecting data on large-
scale present-day systems required the development of efficient, non-intrusive
instrumentation methods. This led directly to the following core design goals
for Darshan: transparent integration with the user environment, negligible
impact on application performance, and reliability.
Darshan operates in user space as an interposition library in order to
collect per-application statistics without source code modifications. As with
many other HPC profiling tools, Darshan leverages the MPI profiling interface
in conjunction with either link-time wrappers for statically linked executables
or preloaded libraries for dynamically linked executables. Static instrumen-
tation can be enabled system-wide using MPI compiler script functionality,
while dynamic instrumentation can be enabled system-wide using environ-
ment variables. End users do not need to change their workflow in either
case.
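As a concrete illustration of this interposition mechanism, the sketch below shows how a preloaded shared library can intercept a POSIX call such as write without any source changes. This is illustrative code following the same general approach, not Darshan's actual wrapper implementation:

/* Hypothetical sketch of user-space interposition via LD_PRELOAD;
 * not Darshan's actual wrapper code. */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <unistd.h>

/* Counter updated on every intercepted write(); a real tool keeps a
 * richer per-process record in a bounded memory region. */
static long write_count = 0;

ssize_t write(int fd, const void *buf, size_t count)
{
    /* Look up the real write() the first time we are called. */
    static ssize_t (*real_write)(int, const void *, size_t) = NULL;
    if (real_write == NULL)
        real_write = (ssize_t (*)(int, const void *, size_t))
                         dlsym(RTLD_NEXT, "write");

    write_count++;                     /* record the operation       */
    return real_write(fd, buf, count); /* forward to the real symbol */
}

A library of this kind would be compiled as a shared object and activated for a dynamically linked executable by setting LD_PRELOAD, which is the same activation path described above for dynamic instrumentation.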
The Darshan function call wrappers intercept POSIX and MPI-IO func-
tions, as well as a few key HDF5 and PNetCDF functions. The wrappers are
used to gather information such as operation counters (e.g., open, read,
write, stat, and mmap); datatype and hint usage; access patterns in terms of
alignment, sequentiality, and access size; and timing information, including
cumulative I/O time and intervals of I/O activity. (A full description of counters
can be found in the Darshan documentation [5].) Darshan does not issue any
communication or storage operations to manage characterization data while
the application is running. Each process is instrumented independently using
a bounded amount of memory. When the application shuts down, the re-
sults are then aggregated, compressed, and stored persistently. Darshan uses
a combination of MPI reduction operations, collective I/O, and parallel Zlib
compression to reduce overhead and minimize log file size.
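A minimal sketch of this shutdown-time aggregation pattern is shown below. The counter layout and function name are hypothetical, chosen only to illustrate how a single MPI reduction can combine independently collected per-process records without any communication during the run itself:

/* Hypothetical sketch of shutdown-time counter aggregation;
 * the record layout is not Darshan's log format. */
#include <mpi.h>
#include <stdio.h>

#define NCOUNTERS 4   /* e.g., opens, reads, writes, bytes written */

static long local_counters[NCOUNTERS];

void aggregate_and_report(void)
{
    long global_counters[NCOUNTERS];
    int rank;

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* One collective reduction at shutdown; no communication or
     * storage operations are needed while the application runs. */
    MPI_Reduce(local_counters, global_counters, NCOUNTERS, MPI_LONG,
               MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("total writes across all ranks: %ld\n", global_counters[2]);
}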
The command line utility component of Darshan includes tools to parse
and analyze log files produced by the runtime library. Figure 27.1 shows an
example of output produced by darshan-job-summary , a utility that sum-
marizes the I/O behavior of a job. This example was chosen from production
logs captured on the Mira IBM Blue Gene/Q system operated by the Argonne
Leadership Computing Facility (ALCF). The "I/O Operation Counts" graph
in the upper right corner indicates that the MPI-IO collective buffering opti-
mization [10] was enabled; there is a large discrepancy between the number
of MPI-IO collective write calls and the number of POSIX write calls. The
"Most Common Access Sizes" table confirms that the majority of the POSIX
write operations were 16 MB in size, which corresponds to the collective buffer
size used by MPI-IO. The bottom graph indicates that the application was
divided into subsets of processes that wrote data in different time intervals.
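For readers unfamiliar with collective buffering, the following sketch shows how an MPI-IO application (or site configuration) might request this behavior through hints. The hint names are ROMIO conventions, and the 16 MB value simply mirrors the access size observed in this example rather than a setting taken from that job:

/* Hedged sketch of requesting MPI-IO collective buffering via hints;
 * hint names follow ROMIO conventions. */
#include <mpi.h>

void open_with_collective_buffering(const char *path, MPI_File *fh)
{
    MPI_Info info;
    MPI_Info_create(&info);

    MPI_Info_set(info, "romio_cb_write", "enable");   /* enable collective buffering */
    MPI_Info_set(info, "cb_buffer_size", "16777216"); /* 16 MB collective buffer     */

    MPI_File_open(MPI_COMM_WORLD, path,
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, fh);

    MPI_Info_free(&info);
}

With hints like these in effect, many small application-level writes issued through MPI-IO collectives are staged and flushed as large, aligned POSIX writes, which is consistent with the discrepancy between MPI-IO and POSIX write counts noted above.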
The Darshan command line utilities also include tools to anonymize iden-
tifying information, such as file names and executable names, within log files.
This capability makes it possible to release Darshan characterization data to