[FIGURE 26.6: Histograms of MADBENCH parallel reads (same transaction size in all cases) before and after tuning the parallel file system. Slow reads (long timings) have been mitigated.]
We leave HPC system monitoring questions largely aside here. By profiling the
application's POSIX layer alone, a great deal of file system health data can
be gathered.
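To make this concrete, the following is a minimal sketch of how a POSIX-layer profiler can gather such timing data: an LD_PRELOAD interposer that times each write(2) call. This illustrates the general interposition technique, not IPM's actual implementation; the library name, compile line, and log format are illustrative assumptions.

    /* iotrace.c: time POSIX write() calls via LD_PRELOAD interposition.
       Build:  gcc -shared -fPIC -o libiotrace.so iotrace.c -ldl
       Run:    LD_PRELOAD=./libiotrace.so ./app                        */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>

    static ssize_t (*real_write)(int, const void *, size_t);
    static __thread int in_hook;   /* fprintf itself calls write(); guard
                                      against infinite recursion          */

    ssize_t write(int fd, const void *buf, size_t count)
    {
        struct timespec t0, t1;
        ssize_t n;

        if (!real_write)           /* locate the libc write() once */
            real_write = (ssize_t (*)(int, const void *, size_t))
                             dlsym(RTLD_NEXT, "write");
        if (in_hook)               /* our own logging: pass straight through */
            return real_write(fd, buf, count);

        clock_gettime(CLOCK_MONOTONIC, &t0);
        n = real_write(fd, buf, count);
        clock_gettime(CLOCK_MONOTONIC, &t1);

        in_hook = 1;               /* record transaction size and duration */
        fprintf(stderr, "write fd=%d bytes=%zu time=%.6f s\n",
                fd, count,
                (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9);
        in_hook = 0;
        return n;
    }

A real profiler would aggregate these per-transaction records into the histograms and per-buffer-size statistics discussed below rather than logging each call.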
26.2.3 Buffer Size
The choice of transaction (buffer) size for I/O in many codes is a compli-
cated function of concurrency, input/problem specifics, and file system tun-
ables. In general, larger I/Os show better performance, but collective I/O
(MPI-IO collectives), I/O middleware (such as netCDF and HDF5), and other
application specifics can obscure the actual transaction sizes behind higher-
level I/O operations. Obtaining a transaction size profile from IPM is straight-
forward and in many cases reveals the overall I/O picture. A standard
performance engineering practice is to aggregate I/O where possible, moving
transactions toward the larger block sizes that HPC file systems are often
tuned for, as sketched below.
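As an illustration of this aggregation pattern, the following sketch stages small records in a buffer and issues a single large write() when the buffer fills. The 4 MiB staging size and the record interface are assumptions for the example, not values taken from the text.

    /* Combine many small records into one large write() so that each
       transaction approaches the block size the file system prefers.  */
    #include <string.h>
    #include <unistd.h>

    #define AGG_SIZE (4 * 1024 * 1024)   /* assumed 4 MiB staging buffer */

    static char   agg_buf[AGG_SIZE];
    static size_t agg_used;

    /* Flush the staged data with one large write(). */
    static void agg_flush(int fd)
    {
        size_t off = 0;
        while (off < agg_used) {
            ssize_t n = write(fd, agg_buf + off, agg_used - off);
            if (n <= 0)                  /* real code would handle errors */
                break;
            off += (size_t)n;
        }
        agg_used = 0;
    }

    /* Queue a small record; issue I/O only when the buffer fills.
       Assumes len <= AGG_SIZE.                                     */
    static void agg_write(int fd, const void *rec, size_t len)
    {
        if (agg_used + len > AGG_SIZE)
            agg_flush(fd);
        memcpy(agg_buf + agg_used, rec, len);
        agg_used += len;
    }

With this pattern, thousands of small logical writes reach the file system as a handful of multi-megabyte transactions, which is exactly the shift a buffer-size profile would make visible.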
IPM produces an inventory of the buffer sizes used by I/O calls during
execution, along with the number of transactions and their minimum, maximum,
and average times to completion. These buffer sizes may easily be compared to
the file system block size to determine whether the I/O is latency-limited or
making full use of the file system bandwidth.
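Such a comparison might look like the following sketch, which queries the block size with statvfs(3). The mount point, the example transaction size, and the threshold rule are assumptions; on parallel file systems such as Lustre, the stripe size is often the more relevant quantity.

    /* Compare an observed transaction size (e.g., from an IPM
       buffer-size inventory) against the file system block size. */
    #include <stdio.h>
    #include <sys/statvfs.h>

    int main(void)
    {
        struct statvfs vfs;
        unsigned long tx_size = 65536;      /* example transaction size, bytes */

        if (statvfs("/scratch", &vfs) != 0) {   /* mount point is an assumption */
            perror("statvfs");
            return 1;
        }
        printf("file system block size: %lu bytes\n", vfs.f_bsize);
        if (tx_size < vfs.f_bsize)
            printf("transactions smaller than a block: likely latency-limited\n");
        else
            printf("transactions span %lu blocks: bandwidth-oriented\n",
                   tx_size / vfs.f_bsize);
        return 0;
    }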
While we have stressed the importance of buffer size in modern HPC work-
loads, evaluating it from a single application run is difficult. For this
reason we suggest that the IPM approach be applied in the broader context of
full HPC workloads.