Iota offers two alternative methods for limiting the footprint during
tracing, while still collecting complete traces:
subsetting, which restricts tracing to a subset of files specified by a
wildcard pattern in an environment variable;
flushing, which flushes the trace buffer to file at 1 MB intervals, but
requires writing one such file per MPI task. On Lustre file systems, the
trace file is opened with stripe count 1 and stripe size 1 MB. Flushing
has the added benefit that some tracing information may be available
even if the program aborts in the middle of a run.
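The flushing option described above can be illustrated with a small C sketch. It is not Iota's implementation: the sketch_trace type and the sketch_trace_* functions are hypothetical names chosen for illustration, and only the 1 MB threshold and the one-file-per-task layout come from the text. Lustre striping is not shown.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* 1 MB flush interval, as stated in the text. */
#define SKETCH_FLUSH_BYTES ((size_t)1 << 20)

/* Hypothetical per-task trace buffer; not Iota's actual data structure. */
typedef struct {
    char         *data;     /* in-memory buffer of SKETCH_FLUSH_BYTES bytes */
    size_t        used;     /* bytes currently buffered */
    FILE         *out;      /* this task's trace file */
    unsigned long flushes;  /* number of flushes performed so far */
} sketch_trace;

/* Open one trace file for this task and allocate the buffer.
   Returns 0 on success, -1 on failure. */
int sketch_trace_open(sketch_trace *t, const char *path)
{
    t->used = 0;
    t->flushes = 0;
    t->out = NULL;
    t->data = malloc(SKETCH_FLUSH_BYTES);
    if (t->data == NULL)
        return -1;
    t->out = fopen(path, "w");
    if (t->out == NULL) {
        free(t->data);
        t->data = NULL;
        return -1;
    }
    return 0;
}

/* Write the buffered records to the file and empty the buffer. */
int sketch_trace_flush(sketch_trace *t)
{
    if (t->used == 0)
        return 0;
    if (fwrite(t->data, 1, t->used, t->out) != t->used)
        return -1;
    if (fflush(t->out) != 0)
        return -1;
    t->used = 0;
    t->flushes++;
    return 0;
}

/* Append one record, flushing first if it would overflow the buffer. */
int sketch_trace_append(sketch_trace *t, const char *record)
{
    size_t len = strlen(record);
    if (len > SKETCH_FLUSH_BYTES)
        return -1;  /* a record larger than the whole buffer is rejected */
    if (t->used + len > SKETCH_FLUSH_BYTES && sketch_trace_flush(t) != 0)
        return -1;
    memcpy(t->data + t->used, record, len);
    t->used += len;
    return 0;
}

/* Flush whatever remains and release the file and the buffer. */
int sketch_trace_close(sketch_trace *t)
{
    int rc = sketch_trace_flush(t);
    if (fclose(t->out) != 0)
        rc = -1;
    free(t->data);
    t->data = NULL;
    t->out = NULL;
    return rc;
}
```

Because each flush pushes the buffered records out to the file, records written before an abort remain on disk, which is the partial-trace benefit the text mentions. Memory use per task stays bounded by the 1 MB buffer regardless of how many I/O calls are traced.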
Unlike Darshan, Iota is not intended for use as an automated, center-wide
collection tool (cf. Carns et al. [2]). Runs with pathological I/O could
create significant buffering overhead and cause the run to fail; for instance,
an I/O pattern with one billion 1 KB writes from a single MPI task would
create about 32 GB of buffered string data for that task, or roughly 32 bytes
of trace text per call. However, even this pathological case could be handled
by Iota if the flushing mechanism is used. Thus, Iota is best used for
targeted profiling tasks with careful selection of tracing parameters (for
examples of targeted I/O profiling, see Uselton et al. [5]).
Iota supports both runtime and link-time interposition of POSIX I/O functions
in the GNU C library. For runtime interposition of dynamically linked
executables, Iota redefines each function and calls the dlsym function with
RTLD_NEXT to locate the next (e.g., system) symbol for that function name.
For link-time interposition of dynamically or statically linked executables,
Iota uses the GNU linker's --wrap feature and defines a __wrap_* variant for
each function.
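The runtime interposition technique described above can be sketched in C for a single function, write. This is a minimal illustration under stated assumptions, not Iota's source: the counters sketch_write_calls and sketch_write_bytes are hypothetical stand-ins for real trace records, and a complete tool would wrap every POSIX I/O function of interest.

```c
#define _GNU_SOURCE   /* must precede all includes so <dlfcn.h> exposes RTLD_NEXT */
#include <dlfcn.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical counters standing in for trace records. */
unsigned long      sketch_write_calls = 0;
unsigned long long sketch_write_bytes = 0;

typedef ssize_t (*sketch_write_fn)(int, const void *, size_t);

/* Same name and signature as the C library function, so the program's
   calls to write resolve to this definition first. */
ssize_t write(int fd, const void *buf, size_t count)
{
    static sketch_write_fn real_write = NULL;

    if (real_write == NULL) {
        /* RTLD_NEXT: look up the symbol in the objects that follow this
           one in the search order, which normally finds libc's write. */
        real_write = (sketch_write_fn)dlsym(RTLD_NEXT, "write");
        if (real_write == NULL) {
            errno = ENOSYS;
            return -1;
        }
    }

    ssize_t n = real_write(fd, buf, count);  /* forward to the real call */

    sketch_write_calls++;                    /* record the event */
    if (n > 0)
        sketch_write_bytes += (unsigned long long)n;
    return n;
}
```

A wrapper like this is typically compiled into a shared library and injected with the LD_PRELOAD environment variable; on older glibc versions the link step also needs -ldl. For the link-time variant, the same body would instead be named __wrap_write and would call __real_write, with --wrap=write passed to the GNU linker.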
Iota supports both MPI and non-MPI executables. In MPI mode, initialization
and finalization are accomplished by redefining the MPI_Init and
MPI_Finalize functions and calling into the standard MPI profiling
interface (PMPI_Init and PMPI_Finalize). In non-MPI mode, the GNU compiler's
constructor and destructor function attributes provide similar hooks. In
both modes, Iota measures the elapsed time of POSIX I/O calls with the
microsecond-resolution gettimeofday timer.
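The non-MPI hooks and the timing approach described above can be sketched as follows. This is an illustration only: sketch_init, sketch_fini, sketch_elapsed, and sketch_initialized are hypothetical names, and the bodies merely mark where a tracing tool would set up and tear down its state. In MPI mode the same work would instead sit in wrappers around MPI_Init and MPI_Finalize that forward to PMPI_Init and PMPI_Finalize.

```c
#include <stdio.h>
#include <sys/time.h>

static struct timeval sketch_t_start;  /* timestamp taken at process start */
int sketch_initialized = 0;            /* set once the constructor has run */

/* Elapsed time t1 - t0 in seconds, computed from the seconds and
   microseconds fields that gettimeofday fills in. */
double sketch_elapsed(const struct timeval *t0, const struct timeval *t1)
{
    return (double)(t1->tv_sec - t0->tv_sec)
         + (double)(t1->tv_usec - t0->tv_usec) * 1e-6;
}

/* Runs before main(): stands in for trace initialization. */
__attribute__((constructor))
static void sketch_init(void)
{
    gettimeofday(&sketch_t_start, NULL);
    sketch_initialized = 1;
}

/* Runs at normal process exit: stands in for trace finalization. */
__attribute__((destructor))
static void sketch_fini(void)
{
    struct timeval now;
    gettimeofday(&now, NULL);
    fprintf(stderr, "sketch: process ran for %.6f s\n",
            sketch_elapsed(&sketch_t_start, &now));
}
```

An I/O wrapper would call gettimeofday immediately before and after forwarding to the real function and pass both timestamps to a helper like sketch_elapsed. Note that gettimeofday reports wall-clock time, so a measured interval can be distorted if the system clock is adjusted during the call.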
28.2 Success Stories
We have tested Iota at scale with stand-alone I/O kernels from three
scientific applications: a climate simulation, the Global Cloud Resolving Model
(GCRM), and a plasma simulation, VORPAL, both described previously by
Howison et al. [4]; and a second plasma simulation, VPIC, described by Byna
et al. [1]. File sizes ranged from 15 GB to 1.5 TB.
 