in a block-based file system, while an application that tries to open many files
concurrently may suffer from reduced metadata performance.
There are a number of techniques that application developers and HPC
system administrators can use to assess the performance of an application or
the I/O subsystem. This chapter introduces I/O benchmarking, file system
monitoring, and I/O profiling for these purposes.
24.2 I/O Benchmarking
To assess a system's I/O capabilities, the HPC community has long used
I/O benchmarks. Benchmarks vary in complexity and purpose, but in general
the goal is either to mimic a workload in a simple way or to test and stress
certain aspects of the file system [19].
At HPC centers, I/O benchmarks are used to test and accept new systems.
They are also used to investigate reports of low performance and to find
bottlenecks in the file system.
I/O workloads run at HPC centers include a variety of access patterns
and interfaces [19, 13]. Within the DOE Office of Science user community,
applications use the POSIX, MPI-IO, HDF5, and netCDF interfaces. Some
of these applications write to and read from single shared files while others
use a file-per-processor format. Furthermore, applications display a variety of
access patterns, from large-block, append-only writes, to small-block, bursty
I/O, to reading large input files [21].
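As a minimal sketch of these two output styles (assuming MPI is available;
the file names out.NNNNN and shared.out and the tiny payloads are invented
for illustration), each process below first writes its own POSIX file and
then writes at a disjoint offset of one shared file through MPI-IO:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* File-per-process: each rank writes its own POSIX file. */
        char name[64];
        snprintf(name, sizeof(name), "out.%05d", rank);
        FILE *fp = fopen(name, "w");
        fprintf(fp, "data from rank %d\n", rank);
        fclose(fp);

        /* Single shared file: all ranks open one file collectively
         * and each writes one int at a disjoint offset. */
        MPI_File fh;
        int val = rank;
        MPI_File_open(MPI_COMM_WORLD, "shared.out",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY,
                      MPI_INFO_NULL, &fh);
        MPI_File_write_at(fh, (MPI_Offset)rank * sizeof(int), &val,
                          1, MPI_INT, MPI_STATUS_IGNORE);
        MPI_File_close(&fh);

        MPI_Finalize();
        return 0;
    }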
Figure 24.1 shows some of the common access patterns that result from
moving data between memory and a file. In the simplest case, the data is
contiguous in both memory and the file. Alternatively, the data could be
contiguous in memory, but not in the file; contiguous in the file, but not in
memory; or contiguous in neither. The reasons for different I/O access
patterns vary widely across scientific applications. A developer may choose
a specific in-memory data structure because it best suits a particular
computational algorithm, but may choose a different data layout in the file
to facilitate post-processing and analysis.
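To make the second of these cases concrete, the following minimal C/MPI
sketch (the sizes, the file name strided.out, and the interleaving scheme
are illustrative assumptions) writes a buffer that is contiguous in memory
into a strided, noncontiguous layout in a shared file by installing a
derived-datatype file view:

    #include <mpi.h>

    #define BLOCK  4   /* ints per block (illustrative)  */
    #define NBLOCK 8   /* blocks per rank (illustrative) */

    int main(int argc, char **argv) {
        int rank, nprocs;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        int buf[BLOCK * NBLOCK];            /* contiguous in memory */
        for (int i = 0; i < BLOCK * NBLOCK; i++)
            buf[i] = rank;

        /* File view: this rank owns one BLOCK-int block out of
         * every nprocs blocks, so ranks interleave in the file. */
        MPI_Datatype filetype;
        MPI_Type_vector(NBLOCK, BLOCK, BLOCK * nprocs, MPI_INT,
                        &filetype);
        MPI_Type_commit(&filetype);

        MPI_File fh;
        MPI_File_open(MPI_COMM_WORLD, "strided.out",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY,
                      MPI_INFO_NULL, &fh);
        MPI_File_set_view(fh, (MPI_Offset)rank * BLOCK * sizeof(int),
                          MPI_INT, filetype, "native", MPI_INFO_NULL);
        MPI_File_write_all(fh, buf, BLOCK * NBLOCK, MPI_INT,
                           MPI_STATUS_IGNORE);
        MPI_File_close(&fh);

        MPI_Type_free(&filetype);
        MPI_Finalize();
        return 0;
    }

With the view in place, the collective MPI_File_write_all call allows the
MPI-IO layer to aggregate the noncontiguous accesses, which is typically
much faster than issuing many small independent writes.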
With such a variety of I/O patterns, it is difficult for a single I/O
benchmark to represent a complex multi-user environment. Instead, synthetic
I/O benchmarks are used to test and isolate specific I/O components in a
storage subsystem or to measure a distinct I/O pattern [19]. Many existing
I/O benchmarks do not reflect a complex HPC workload, either because they
only test POSIX APIs or because they only measure serial I/O performance.
The IOR [7] benchmark is one of the most flexible because simple input
parameters select among I/O APIs such as POSIX, MPI-IO, HDF5, and
Parallel-NetCDF. Parameters also control whether output goes to
a single shared file or to one file per process. Additionally, a user can
control the size of individual transfers and the total amount of data each
process moves.
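As a sketch of how these parameters look in practice (exact option names
may vary with the IOR release; the process count and scratch path are
illustrative), one might run:

    mpirun -np 64 ./ior -a MPIIO -w -r -t 4m -b 1g -F -o /scratch/ior.test

Here -a selects the MPI-IO API, -w and -r request write and read phases,
-t and -b set the transfer and block sizes, -F requests file-per-process
output (omitting it yields a single shared file), and -o names the test
file.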