PLFS offers three main modes of operation (shared, flat, and small file), essentially comparable to mount-time options, and there are several interesting use cases for each mode.
14.2 Design/Architecture
PLFS is mainly designed to run as middleware on the compute nodes
themselves. It can run in the user space of applications using MPI-IO and a
patched MPI library, or in the user space of applications which are ported to
link directly into the PLFS API. The PLFS API closely mirrors the standard
POSIX API; thus porting several applications and synthetic benchmarks has
been straightforward. PLFS is also available as a FUSE file system [1] for
unmodified applications which do not use MPI-IO. Since the FUSE approach
can incur high overhead, there is also an LD_PRELOAD interface that brings
PLFS into the user space of unmodified applications [8].
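Because the PLFS API closely mirrors POSIX, ordinary POSIX I/O code needs no changes to use PLFS through the FUSE mount or the LD_PRELOAD interface. The fragment below is a minimal illustration rather than PLFS-specific code: the path is an example, and under the FUSE mount these calls reach PLFS through the kernel, while with LD_PRELOAD the same calls are intercepted in user space.

    /* Plain POSIX I/O; nothing here is PLFS-specific. Under a PLFS FUSE
     * mount (example path below) or with the PLFS LD_PRELOAD interposer,
     * these calls are transparently handled by PLFS. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        const char *path = "/mnt/plfs/sharedfile";  /* example PLFS mount path */
        int fd = open(path, O_WRONLY | O_CREAT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        const char buf[] = "checkpoint data\n";
        if (write(fd, buf, sizeof(buf) - 1) < 0)
            perror("write");

        close(fd);
        return 0;
    }

With the LD_PRELOAD interface, the same unmodified binary is launched with the PLFS interposition library named in the LD_PRELOAD environment variable, which avoids the FUSE overhead.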
There are three main modes of PLFS, which are set in a PLFS configuration file and defined on a per-path basis. The options for each path define the path that the user will use (e.g., /mnt/plfs/sharedfile), the mode of operation,
and the underlying storage system(s) that PLFS will use for the actual storage
of the user data as well as its own metadata. Typically, the underlying storage
system is a globally visible storage system, and the PLFS configuration file is
shared across a set of compute nodes such that each compute node can write
to the same PLFS file(s) and each compute node can read PLFS files written
from a different compute node.
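As a concrete illustration, a per-path entry in the PLFS configuration file (commonly called plfsrc) might look like the following sketch. The paths are hypothetical and the exact keyword names and accepted values can differ between PLFS versions, so this should be read as a structural example rather than definitive syntax: a user-visible mount point, a mode of operation, and one or more backend storage systems.

    # Illustrative plfsrc-style entry; paths and keyword spellings are examples.
    mount_point /mnt/plfs
    workload n-1                  # shared file mode (assumed keyword/value)
    backends /panfs/vol1/.plfs_store,/panfs/vol2/.plfs_store

Listing more than one backend is how the umbrella behavior described below is configured: PLFS spreads its data across all of the listed storage systems so that their bandwidth and metadata servers are used in aggregate.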
The three main configurations of PLFS are shared file, small file, and
flat file, each of which is intended for different application I/O workloads.
Additionally, there is a burst buffer configuration (which currently works only
in shared file mode) to transparently gain performance benefits from a smaller,
faster storage tier such as flash memory. All three modes support the ability
to use PLFS as an umbrella file system that can distribute workloads across
multiple underlying storage systems to aggregate their bandwidth and utilize
all available metadata servers. Finally, there is support in PLFS to run with
all three modes on top of cloud file systems such as Hadoop.
14.2.1 PLFS Shared File Mode
Shared file mode is the original PLFS configuration [3] and is designed
for highly concurrent writes to a shared file, such as a checkpoint file which
is simultaneously written by all processes within a large parallel application.
The architecture of PLFS shared file mode is shown in Figure 14.1. Note that
the figure shows the PLFS layer as a separate layer; this is accurate from the
perspective of the application but in fact the PLFS software runs on each
compute node. This mode was motivated by the well-known observation that N-1 checkpoint workloads, in which many processes write concurrently to a single shared file, perform poorly on most parallel file systems.
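The target workload can be sketched with a short MPI-IO fragment: every rank of a parallel job writes its own block of checkpoint state into one shared file at a rank-dependent offset. The file path and block size below are arbitrary examples; the point is the highly concurrent N-1 access pattern that shared file mode is designed to absorb.

    /* Sketch of an N-1 checkpoint: each rank writes one block of a
     * single shared file. Path and block size are examples only. */
    #include <mpi.h>
    #include <stdlib.h>
    #include <string.h>

    #define BLOCK_SIZE (1 << 20)              /* 1 MiB of state per rank */

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        char *buf = malloc(BLOCK_SIZE);       /* dummy checkpoint state */
        memset(buf, rank, BLOCK_SIZE);

        MPI_File fh;
        MPI_File_open(MPI_COMM_WORLD, "/mnt/plfs/checkpoint.000",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

        /* Every rank targets a disjoint, rank-dependent region of the file. */
        MPI_Offset offset = (MPI_Offset)rank * BLOCK_SIZE;
        MPI_File_write_at(fh, offset, buf, BLOCK_SIZE, MPI_BYTE,
                          MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        free(buf);
        MPI_Finalize();
        return 0;
    }

Without an interposition layer such as PLFS, this pattern forces the underlying parallel file system to coordinate many interleaved writes to one shared file; PLFS instead logs each process's writes separately within a container and reassembles the logical file on read [3].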
 