tracing of I/O operations in large-scale parallel applications, such as LANL-
Trace [3], HPCT-IO [18], IOT [17], etc. Some of the general-purpose tools can
also trace I/O information in addition to MPI calls and CPU activities, such
as FPMPI [2], Jumpshot [22, 15], TAU [20] (see Chapter 25), IPM [16] (see
Chapter 26), STAT [12], etc. Most of these tools produce a log containing the
time series of I/O operations. The logs are useful for understanding the I/O
behavior of each process. However, because the logs are large, the overhead
of these tools is usually high, so they are best suited to debugging jobs. Darshan
(see Chapter 27) is also an I/O tracing profiler. Darshan records only aggregate
information over time, so its log files are much smaller than those of other tools and its
overhead is very low. For this reason, Darshan can be enabled for a compute
system by default to capture the I/O of every job. However, Darshan does not
provide I/O behavior over time, which limits the depth of the I/O knowledge.
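The trade-off described above can be illustrated with a minimal Python sketch. It is not Darshan's log format or that of any tool named here; the classes `TraceLog` and `AggregateCounters` and the function `replay` are hypothetical names introduced only for illustration. The sketch shows that a time-series log grows with the number of operations, while an aggregate profile stays at a fixed size but discards the timing of each operation.

```python
from dataclasses import dataclass, field


@dataclass
class TraceLog:
    """Time-series log: one record per I/O operation (hypothetical)."""
    records: list = field(default_factory=list)

    def record(self, timestamp, op, nbytes):
        # Every operation is kept, so the log grows with the operation count.
        self.records.append((timestamp, op, nbytes))


@dataclass
class AggregateCounters:
    """Aggregate profile: a fixed set of counters (hypothetical)."""
    ops: int = 0
    bytes_moved: int = 0

    def record(self, timestamp, op, nbytes):
        # The timestamp is discarded: only running totals survive,
        # so behavior over time cannot be reconstructed afterwards.
        self.ops += 1
        self.bytes_moved += nbytes


def replay(n_ops, nbytes=4096):
    """Feed the same synthetic write stream to both recorders."""
    trace, agg = TraceLog(), AggregateCounters()
    for i in range(n_ops):
        timestamp = i * 0.001  # synthetic timestamps, one per millisecond
        trace.record(timestamp, "write", nbytes)
        agg.record(timestamp, "write", nbytes)
    return trace, agg


trace, agg = replay(10_000)
print(len(trace.records))         # one record per operation
print(agg.ops, agg.bytes_moved)   # two numbers, however many operations ran
```

In this sketch the trace holds 10,000 records while the aggregate profile holds two integers, which mirrors why an aggregate profiler can be left on for every job.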
24.5 I/O Profiling at NERSC
At NERSC the Darshan tool is used to characterize I/O on the Hopper
Cray XE6 system [10]. Darshan captures MPI-IO calls using the PMPI
interface [9] and captures POSIX-I/O calls using the GNU linker's wrap
argument [14]. NERSC staff and Darshan developers have worked together to
make Darshan usable by a large number of users by loading the Darshan
module by default. A user simply needs to relink his or her application in order
for Darshan to instrument the application. The primary purpose of profiling
by default is to give users feedback on their I/O behavior, so that they can
improve their I/O efficiency and achieve more meaningful calculation within a
given allocation. Darshan output is collected and put on a website for the
user to see. The Darshan output displayed on the website includes:
I/O size in megabytes read and written,
I/O rate in megabytes per second,
percentage of application time spent in I/O, and
distribution of write and read sizes.
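As a rough illustration of how the four quantities listed above relate to one another, the following Python sketch derives them from a list of per-operation records for a single process. This is an assumption-laden example, not Darshan's implementation: the record layout, the function names `size_bin` and `summarize`, the histogram bin edges, the choice of 1 MB = 2^20 bytes, and the definition of I/O rate as bytes moved divided by time spent in I/O are all hypothetical.

```python
from collections import Counter

MB = 1024 * 1024  # assumption: 1 MB = 2**20 bytes


def size_bin(nbytes):
    """Map an access size to a coarse histogram bin (hypothetical bin edges)."""
    if nbytes < 1024:
        return "<1KB"
    if nbytes < MB:
        return "1KB-1MB"
    return ">=1MB"


def summarize(ops, runtime_s):
    """Summarize a list of (kind, nbytes, seconds) records.

    kind is "read" or "write"; runtime_s is the total application run time.
    """
    read_bytes = sum(n for kind, n, _ in ops if kind == "read")
    write_bytes = sum(n for kind, n, _ in ops if kind == "write")
    io_seconds = sum(s for _, _, s in ops)
    histogram = Counter((kind, size_bin(n)) for kind, n, _ in ops)
    total_mb = (read_bytes + write_bytes) / MB
    return {
        "read_mb": read_bytes / MB,
        "write_mb": write_bytes / MB,
        # Assumed definition: data moved per second of time spent in I/O.
        "rate_mb_per_s": total_mb / io_seconds if io_seconds > 0 else 0.0,
        "pct_time_in_io": 100.0 * io_seconds / runtime_s,
        "size_histogram": dict(histogram),
    }


# Synthetic example: two large writes and one small read in a 20-second run.
ops = [("write", 4 * MB, 0.5), ("write", 4 * MB, 0.5), ("read", 512, 1.0)]
summary = summarize(ops, runtime_s=20.0)
for name, value in summary.items():
    print(name, value)
```

A summary of this kind makes patterns such as many small reads, or a high percentage of run time spent in I/O, visible at a glance, which is the kind of feedback the website is meant to give users.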
24.5.1 Application Profiling Case Studies
NERSC has used the Darshan tool to identify and help users who may be
performing I/O in a less than optimal way. This section selects three of the
most representative case studies to show how Darshan has been used to find
possible I/O application problems.
 