[FIGURE 26.6: Histograms of MADBENCH parallel reads (same transaction size in all cases) before and after tuning the parallel file system. Slow reads (long timings) have been mitigated.]
We leave HPC system monitoring questions largely aside here. By profiling the
application's POSIX layer alone, a great deal of file system health data can
be gathered.
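To make this concrete, the following is a minimal sketch of how a POSIX-layer profiler can gather such timing data: an LD_PRELOAD interposer that times each write(2) call. This illustrates the general interposition technique, not IPM's actual implementation; the library name, compile line, and log format are illustrative assumptions.

    /* iotrace.c: time POSIX write() calls via LD_PRELOAD interposition.
       Build:  gcc -shared -fPIC -o libiotrace.so iotrace.c -ldl
       Run:    LD_PRELOAD=./libiotrace.so ./app                        */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>

    static ssize_t (*real_write)(int, const void *, size_t);
    static __thread int in_hook;   /* fprintf itself calls write(); guard
                                      against infinite recursion          */

    ssize_t write(int fd, const void *buf, size_t count)
    {
        struct timespec t0, t1;
        ssize_t n;

        if (!real_write)           /* locate the libc write() once */
            real_write = (ssize_t (*)(int, const void *, size_t))
                             dlsym(RTLD_NEXT, "write");
        if (in_hook)               /* our own logging: pass straight through */
            return real_write(fd, buf, count);

        clock_gettime(CLOCK_MONOTONIC, &t0);
        n = real_write(fd, buf, count);
        clock_gettime(CLOCK_MONOTONIC, &t1);

        in_hook = 1;               /* record transaction size and duration */
        fprintf(stderr, "write fd=%d bytes=%zu time=%.6f s\n",
                fd, count,
                (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9);
        in_hook = 0;
        return n;
    }

A real profiler would aggregate these per-transaction records into the histograms and per-buffer-size statistics discussed below rather than logging each call.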
26.2.3 Buffer Size
The choice of transaction (buffer) size for I/O in many codes is a compli-
cated function of concurrency, input/problem specifics, and file system tun-
ables. In general, larger I/Os show better performance, but collective I/O
(MPI-IO collectives), I/O middleware (such as netCDF and HDF5), and other
application specifics can obscure the actual transaction sizes behind higher-
level I/O operations. Obtaining a transaction size profile from IPM is straight-
forward and in many cases reveals the overall I/O picture. A standard
performance engineering practice is to aggregate I/O where possible, moving
transactions toward the larger block sizes that HPC file systems are often
tuned for, as sketched below.
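As an illustration of this aggregation pattern, the following sketch stages small records in a buffer and issues a single large write() when the buffer fills. The 4 MiB staging size and the record interface are assumptions for the example, not values taken from the text.

    /* Combine many small records into one large write() so that each
       transaction approaches the block size the file system prefers.  */
    #include <string.h>
    #include <unistd.h>

    #define AGG_SIZE (4 * 1024 * 1024)   /* assumed 4 MiB staging buffer */

    static char   agg_buf[AGG_SIZE];
    static size_t agg_used;

    /* Flush the staged data with one large write(). */
    static void agg_flush(int fd)
    {
        size_t off = 0;
        while (off < agg_used) {
            ssize_t n = write(fd, agg_buf + off, agg_used - off);
            if (n <= 0)                  /* real code would handle errors */
                break;
            off += (size_t)n;
        }
        agg_used = 0;
    }

    /* Queue a small record; issue I/O only when the buffer fills.
       Assumes len <= AGG_SIZE.                                     */
    static void agg_write(int fd, const void *rec, size_t len)
    {
        if (agg_used + len > AGG_SIZE)
            agg_flush(fd);
        memcpy(agg_buf + agg_used, rec, len);
        agg_used += len;
    }

With this pattern, thousands of small logical writes reach the file system as a handful of multi-megabyte transactions, which is exactly the shift a buffer-size profile would make visible.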
IPM produces an inventory of the buffer sizes used by I/O calls during
execution, along with the number of transactions and their minimum, maximum,
and average times to completion. These buffer sizes may easily be compared to
the file system block size to determine whether the I/O is latency-limited or
making full use of the file system bandwidth.
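Such a comparison might look like the following sketch, which queries the block size with statvfs(3). The mount point, the example transaction size, and the threshold rule are assumptions; on parallel file systems such as Lustre, the stripe size is often the more relevant quantity.

    /* Compare an observed transaction size (e.g., from an IPM
       buffer-size inventory) against the file system block size. */
    #include <stdio.h>
    #include <sys/statvfs.h>

    int main(void)
    {
        struct statvfs vfs;
        unsigned long tx_size = 65536;      /* example transaction size, bytes */

        if (statvfs("/scratch", &vfs) != 0) {   /* mount point is an assumption */
            perror("statvfs");
            return 1;
        }
        printf("file system block size: %lu bytes\n", vfs.f_bsize);
        if (tx_size < vfs.f_bsize)
            printf("transactions smaller than a block: likely latency-limited\n");
        else
            printf("transactions span %lu blocks: bandwidth-oriented\n",
                   tx_size / vfs.f_bsize);
        return 0;
    }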
While we have stressed the importance of buffer size in modern HPC work-
loads, evaluating it from a single application run is difficult. For this
reason we suggest that the IPM approach be applied in the broader context of
full HPC workloads.