FIGURE 4.5: The cumulative distribution of file sizes on ALCF project file
systems.
to support large files well, although a workload with a large amount of simultaneous file
I/O will generally still work because all LUNs remain utilized.
4.5.1 Case Studies
In 2010, ALCF deployed an I/O profiling tool named Darshan (see Chapter 27), which is lightweight enough to be used in continuous operation. Applications compiled after the tool was deployed automatically picked up the Darshan intercept libraries and began generating log files with statistical information about each application's I/O. About three months of the collected data were analyzed to examine what types of I/O were being performed at the time, and the findings were published in Carns et al. [4]. In this study, about 30% of the total core hours run during those three months were instrumented.
During that study, Carns et al. examined common I/O profiles for various applications and found that the file-per-rank model is the most popular I/O method. This model utilizes all of the I/O node resources but only scales well to about 8192 nodes; beyond that point, the metadata overhead of creating files becomes significant. The overhead can be partly mitigated by pre-creating a unique directory for each file before the application run, which reduces serialization at the directory level when the files are created (see the sketch below).
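As a rough illustration of this pattern (a minimal sketch, not code from the study), the example below lets rank 0 stand in for a pre-run job script that creates one directory per rank under a hypothetical out/ directory; each rank then creates and writes its own file inside its own directory, so file creation no longer serializes on a single parent directory.

```c
/* Sketch of the file-per-rank pattern with per-rank directories.
 * In practice the directories would be pre-created by a job script
 * before the run; here rank 0 creates them so the example is
 * self-contained. Paths such as "out/rank_%d" are illustrative. */
#include <mpi.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

int main(int argc, char **argv)
{
    int rank, nranks;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    if (rank == 0) {                       /* stand-in for a pre-run script */
        mkdir("out", 0755);
        char dir[64];
        for (int r = 0; r < nranks; r++) {
            snprintf(dir, sizeof(dir), "out/rank_%d", r);
            mkdir(dir, 0755);
        }
    }
    MPI_Barrier(MPI_COMM_WORLD);           /* directories exist before writes */

    /* Each rank creates and writes its own file in its own directory,
     * so file creation does not serialize on one parent directory. */
    char path[96];
    snprintf(path, sizeof(path), "out/rank_%d/data.bin", rank);
    double buf[1024];
    for (int i = 0; i < 1024; i++) buf[i] = rank + i * 1e-3;

    FILE *fp = fopen(path, "wb");
    if (fp) {
        fwrite(buf, sizeof(double), 1024, fp);
        fclose(fp);
    }

    MPI_Finalize();
    return 0;
}
```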
Scaling beyond eight racks typically requires an application to start using shared files, either a single shared file or a number of them. Shared files are best handled with some form of aggregation and coordination, such as that provided by MPI-IO (or libraries built on top of it). The MPI ranks
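As a rough illustration of the single-shared-file approach with MPI-IO aggregation (again a minimal sketch, with the file name shared.dat and block size chosen only for illustration), all ranks below open one file collectively and write disjoint blocks at rank-dependent offsets; the collective MPI_File_write_at_all call gives the MPI-IO layer the opportunity to aggregate and coordinate the requests.

```c
/* Sketch of a single shared file written collectively with MPI-IO.
 * Each rank writes COUNT doubles at an offset determined by its rank;
 * MPI_File_write_at_all lets the MPI-IO layer aggregate the requests.
 * The file name "shared.dat" and COUNT are illustrative assumptions. */
#include <mpi.h>

#define COUNT 1024

int main(int argc, char **argv)
{
    int rank;
    double buf[COUNT];
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (int i = 0; i < COUNT; i++)
        buf[i] = rank + i * 1e-3;          /* sample data */

    /* All ranks open the same file; rank-based offsets keep blocks disjoint. */
    MPI_File_open(MPI_COMM_WORLD, "shared.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY,
                  MPI_INFO_NULL, &fh);

    MPI_Offset offset = (MPI_Offset)rank * COUNT * sizeof(double);
    MPI_File_write_at_all(fh, offset, buf, COUNT, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
```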
 