FIGURE 6.4: PLFS internal data flow diagram. The diagram shows that the
user thinks data is being written into a single file, but the actual data is being
striped across many files across many storage resources in parallel. [Image
courtesy of John Bent (LANL).]
hundreds of thousands of small files. PLFS can map N-to-1 patterns to N-to-N
or N-to-M files where M<N. Furthermore, because PLFS is creating lots of
smaller files unbeknownst to the user, it has the ability to place the small
container files into different directories or even different file systems, giving
PLFS the ability to scale namespace operations and data operations across
multiple metadata servers, as well as across multiple file systems. PLFS has
a mode that allows user-initiated N-to-N traffic to be spread across multiple
metadata servers or file systems, and a mode that allows multiple small files
to be put into a single container as well. This remapping of traffic and breaking
of the dependence between processes allows the N-to-1 dominant code to get
good performance on Panasas and Lustre.
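The container idea described above can be sketched in a few lines. The class and method names below are hypothetical illustrations, not the actual PLFS API: each writer appends sequentially to its own per-process data log (so an N-to-1 logical file becomes N physical files), and an index records where each logical extent landed so reads can reassemble the file.

```python
class ContainerFile:
    """Hypothetical sketch of a PLFS-style container: one logical file
    backed by per-writer append-only data logs plus an index."""

    def __init__(self):
        self.logs = {}    # writer id -> bytearray (stands in for data.<id>)
        self.index = []   # (logical_off, length, writer, physical_off)

    def write(self, writer, logical_off, data):
        # Every write is a sequential append to the writer's own log,
        # regardless of its offset in the logical file.
        log = self.logs.setdefault(writer, bytearray())
        physical_off = len(log)
        log.extend(data)
        self.index.append((logical_off, len(data), writer, physical_off))

    def read(self, logical_off, length):
        # Reassemble the logical extent from the index; later entries
        # win, so overwrites behave as expected.
        out = bytearray(length)
        for l_off, l_len, writer, p_off in self.index:
            lo = max(l_off, logical_off)
            hi = min(l_off + l_len, logical_off + length)
            if lo < hi:
                src = self.logs[writer]
                out[lo - logical_off:hi - logical_off] = \
                    src[p_off + (lo - l_off):p_off + (hi - l_off)]
        return bytes(out)


# Two writers share one logical file without sharing any physical file:
f = ContainerFile()
f.write(0, 0, b"AAAA")   # writer 0 owns logical bytes 0-3
f.write(1, 4, b"BBBB")   # writer 1 owns logical bytes 4-7
print(f.read(0, 8))      # reassembled logical view: b'AAAABBBB'
```

Because the data logs never interleave writers, this sketch shows why the N-to-1 lock contention disappears; in a real deployment the logs and index would be files that PLFS can place in different directories or file systems.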
The PLFS speed-up chart in Figure 6.5 depicts speed-ups for various N-to-1
applications using PLFS. This ability to remap I/O patterns is a useful
mechanism, so useful that PLFS technology is being used in the DOE Exascale
Storage Fast Forward Effort, which started in 2012. This effort is prototyping
a next-generation I/O stack for exascale-class computing later this decade.