FIGURE 26.5: Steps required to improve performance of writes in HDF-based
MPI codes.
The ftruncate optimization removed a POSIX ftruncate call from the HDF
code; this call caused negligible problems at smaller concurrencies but quickly
became a bottleneck in the neighborhood of 1024 tasks. In general, the nature
of the optimizations is to re-organize the I/O into forms that match the file
system's optimal block size and to align file offsets along those blocks.
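A minimal sketch of such block alignment is given below; the 4 MiB block size and the helper names are assumptions made for illustration, not values taken from the HDF code or from any particular file system configuration.

/* Sketch: align an I/O request to an assumed optimal block size. */
#include <stdio.h>
#include <stdint.h>

#define FS_BLOCK_SIZE (4UL * 1024 * 1024)  /* assumed optimal block size */

/* Round an offset down to the enclosing block boundary. */
static uint64_t align_down(uint64_t offset)
{
    return offset - (offset % FS_BLOCK_SIZE);
}

/* Round a length up to a whole number of blocks. */
static uint64_t align_up(uint64_t length)
{
    return ((length + FS_BLOCK_SIZE - 1) / FS_BLOCK_SIZE) * FS_BLOCK_SIZE;
}

int main(void)
{
    uint64_t offset = 9000000;   /* arbitrary unaligned request */
    uint64_t length = 3000000;
    uint64_t start  = align_down(offset);
    uint64_t span   = align_up((offset - start) + length);
    printf("request [%llu, %llu) -> aligned access [%llu, %llu)\n",
           (unsigned long long)offset, (unsigned long long)(offset + length),
           (unsigned long long)start,  (unsigned long long)(start + span));
    return 0;
}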
26.2.2 MADBENCH and File System Health
I/O performance is a characteristic of applications and the shared resources
on which the applications run. In some cases, "poor" performance is not
specific to the application itself but instead originates at the level of
the HPC system's file system. In March of 2009, reports of file system performance
loss on Franklin (a Cray XT4 at NERSC) were investigated using IPM. This
was not a controlled experiment, but I/O performance profiles were measured
as steps were taken to improve the I/O rates by adaptation of the file system
layout (number of OSS I/O nodes) and upgrades to the file system software
versions. Figure 26.6 looks inside the I/O performance before and after those
changes and shows, in particular, the improvement in I/O read rates in the
out-of-core MADBENCH solver code. In this case, monitoring I/O does not
fully explain the cause of performance loss but does point to the source of per-
formance loss in the application (the read step) and also improves confidence
in understanding the impact of steps to improve it.
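A minimal sketch of the kind of per-transaction timing such monitoring relies on is given below; the timed_read wrapper, the use of MPI_Wtime, and the /dev/zero stand-in file are assumptions for illustration and do not reproduce IPM's actual instrumentation.

/* Sketch: time individual POSIX read transactions, in the spirit of
 * (but not identical to) the I/O profiling that IPM performs. */
#include <mpi.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

/* Hypothetical wrapper: perform one read and report its duration and rate. */
static ssize_t timed_read(int fd, void *buf, size_t count)
{
    double  t0 = MPI_Wtime();
    ssize_t n  = read(fd, buf, count);
    double  dt = MPI_Wtime() - t0;
    if (n > 0 && dt > 0.0)
        printf("read of %zd bytes took %.6f s (%.2f MB/s)\n",
               n, dt, (double)n / dt / 1.0e6);
    return n;
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    char buf[1 << 16];
    /* /dev/zero keeps the example self-contained; an out-of-core solver
     * such as MADBENCH would read from its scratch files instead. */
    int fd = open("/dev/zero", O_RDONLY);
    if (fd >= 0) {
        timed_read(fd, buf, sizeof buf);
        close(fd);
    }
    MPI_Finalize();
    return 0;
}

Collecting such timings per transaction, rather than only in aggregate, is what makes it possible to attribute a slowdown to a specific phase such as the MADBENCH read step.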
There is a great deal to be learned from the statistical properties of I/O
transaction times. Viewing the file system as a black box, one can develop
strategies around concurrency and I/O middleware solely on the basis of the
shifts in performance. A detailed mechanistic understanding of the file sys-
tem is also a useful approach to engineering better I/O performance, but we
 