The number of files, N, is typically chosen to match the number of independent
I/O pathways available from compute nodes to the file system. In practice, this
number falls between 8 and 1024, depending on problem size and the compute and
file system resources involved.
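As a rough illustration of this choice, the sketch below clamps an assumed,
site-specific count of independent I/O pathways to the 8-to-1024 range cited
above and to the number of MPI tasks; the function name and heuristic are
illustrative, not part of Silo or ALE3D.

#include <mpi.h>

/* Illustrative heuristic only: clamp an assumed count of independent
 * I/O pathways to the 8..1024 range mentioned in the text, and never
 * use more files than there are MPI tasks. */
static int pick_file_count(int num_io_pathways, MPI_Comm comm)
{
    int ntasks, n = num_io_pathways;
    MPI_Comm_size(comm, &ntasks);
    if (n < 8)      n = 8;
    if (n > 1024)   n = 1024;
    if (n > ntasks) n = ntasks;
    return n;
}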
ALE3D groups MPI tasks into N groups and each group is responsible for
creating one of the N files. At any one moment, only one MPI task from each
group has exclusive access to the file. Hence, I/O is serial within a group.
However, because one task in each group is writing to its group's own file
simultaneously, I/O is parallel across groups. Within a group, access to the
group's file is handled in a round-robin fashion. The first MPI task in the group
creates the file and then iterates over all domains it has. For each domain, it
creates a sub-directory within the file (i.e., a separate namespace for Silo
objects) and writes all the Silo objects (the main mesh domain, the material
composition of the domain, the mesh variables defined on the domain) to that
directory. It repeats this process for each domain. Then, the first MPI task
closes the Silo file and hands off exclusive access to the next task in the group.
That MPI task opens the file and iterates over all domains in the same way.
Exclusive access to the file is then handed off to the next task. This process,
shown in Figure 21.2, continues until all processors in the group have written
their domains to unique sub-directories in the file.
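The following is a minimal sketch of this baton-passing pattern, using plain MPI
point-to-point messages and basic Silo calls (DBCreate, DBOpen, DBMkDir,
DBSetDir, DBClose). The contiguous group assignment, the file and directory
names, and the write_my_domain placeholder are illustrative assumptions rather
than ALE3D's actual code, and each task is shown writing a single domain for
brevity.

#include <mpi.h>
#include <silo.h>
#include <stdio.h>

/* Placeholder for the application-specific writes (the mesh object, the
 * material composition, and the mesh variables described above); real
 * code would call DBPutUcdmesh, DBPutMaterial, DBPutUcdvar, etc. here. */
static void write_my_domain(DBfile *dbfile, int rank)
{
    (void)dbfile; (void)rank;
}

/* Round-robin (baton-passing) write of one group's Silo file.
 * Assumes ntasks is divisible by nfiles and one domain per task. */
static void write_group_file(int rank, int ntasks, int nfiles, MPI_Comm comm)
{
    int tasks_per_group = ntasks / nfiles;
    int group       = rank / tasks_per_group;   /* which file this task shares */
    int rank_in_grp = rank % tasks_per_group;   /* position in the relay       */
    char fname[64], dirname[64];
    DBfile *dbfile;
    int token = 0;

    snprintf(fname, sizeof fname, "plot_%03d.silo", group);

    if (rank_in_grp == 0) {
        /* The first task in the group creates the file. */
        dbfile = DBCreate(fname, DB_CLOBBER, DB_LOCAL, "MIF sketch", DB_HDF5);
    } else {
        /* Every other task waits for the baton from its predecessor,
         * then opens the existing file for appending. */
        MPI_Recv(&token, 1, MPI_INT, rank - 1, 0, comm, MPI_STATUS_IGNORE);
        dbfile = DBOpen(fname, DB_HDF5, DB_APPEND);
    }

    /* Write this task's domain into its own sub-directory, which acts
     * as a separate namespace for its Silo objects. */
    snprintf(dirname, sizeof dirname, "domain_%06d", rank);
    DBMkDir(dbfile, dirname);
    DBSetDir(dbfile, dirname);
    write_my_domain(dbfile, rank);
    DBClose(dbfile);

    /* Hand exclusive access to the next task in the group, if any. */
    if (rank_in_grp < tasks_per_group - 1)
        MPI_Send(&token, 1, MPI_INT, rank + 1, 0, comm);
}

Silo also provides a small convenience layer, PMPIO, that packages this
baton-passing logic behind create/open/close callbacks; the hand-rolled version
above is shown only to make the handoff explicit.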
After all groups have finished writing their Silo files, a final step involves
creating a master Silo file which contains special Silo objects (called multi-
block objects) that point at all the pieces of mesh (domains) scattered about
in the N files.
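A hedged sketch of that final step might look as follows: a single task writes a
master file containing a multi-block mesh object whose entries use Silo's
"file:path" naming to reference the domain meshes in the N group files. The
file, directory, and object names, and the assumption of unstructured domain
meshes, follow the previous sketch and are illustrative.

#include <silo.h>
#include <stdio.h>
#include <stdlib.h>

/* One task writes a master file whose multi-block mesh object points at
 * every domain mesh scattered across the N group files.  Names follow
 * the (assumed) conventions of the previous sketch. */
static void write_master_file(int ntasks, int nfiles)
{
    int tasks_per_group = ntasks / nfiles;
    char **names = (char **) malloc(ntasks * sizeof(char *));
    int   *types = (int *)   malloc(ntasks * sizeof(int));
    DBfile *master = DBCreate("plot_master.silo", DB_CLOBBER, DB_LOCAL,
                              "MIF master sketch", DB_HDF5);
    int r;

    for (r = 0; r < ntasks; r++) {
        int group = r / tasks_per_group;
        names[r] = (char *) malloc(256);
        /* "<file>:<path within file>" is how Silo multi-block entries
         * reference objects stored in other files. */
        snprintf(names[r], 256, "plot_%03d.silo:/domain_%06d/mesh", group, r);
        types[r] = DB_UCDMESH;   /* assumes unstructured domain meshes */
    }

    /* The multimesh object presents the scattered domains as one
     * logical mesh to tools that read Silo multi-block objects. */
    DBPutMultimesh(master, "mesh", ntasks, names, types, NULL);
    DBClose(master);

    for (r = 0; r < ntasks; r++)
        free(names[r]);
    free(names);
    free(types);
}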
Setting N equal to the number of MPI tasks results in a file-per-
process configuration, which is typically not recommended for users. However,
some applications do indeed choose to run this way with good results. Alter-
natively, setting N equal to 1 results in effectively serializing the I/O and is
certainly not recommended. For large, parallel runs, there is a sweet spot in
the selection of N which results in peak I/O performance rates. If N is too
large, the I/O subsystem will likely be overwhelmed; setting it too small will
likely underutilize the system resources. This is illustrated in Figure 21.3 for
different numbers of files and MPI task counts.
21.3 MIF and SSF Scalable I/O Paradigms
This approach to using Silo for scalable, parallel I/O was originally de-
veloped in the late 1990s by Rob Neely, a lead software architect on ALE3D
at the time. This approach is sometimes called "Poor Man's Parallel I/O." It
and variations thereof have since been adopted and used productively through
several order-of-magnitude increases in MPI task counts, from hundreds then to
hundreds of thousands today.
 