FIGURE 14.1 (See color insert): PLFS shared file mode transforms multiple
writes to a shared file into streams of data sent to multiple subfiles on the
underlying storage system(s). Not shown is the internal PLFS metadata used
to reconstruct the file. [Image courtesy of John Bent (EMC).]
many applications naturally have partitions of a large distributed data
structure which are poorly matched to the block alignment of many storage
systems and therefore lose performance to various locks and serialization
bottlenecks inherent in parallel file systems.
By decoupling the concurrent writes, PLFS sends data streams to the
underlying storage systems, which avoids these locks and bottlenecks. The
basic mechanism is that PLFS first creates a PLFS container and then stores
all the individual subfiles in this container as well as the metadata
necessary to re-create the logical file. Functionally, the container is very
similar to how inodes are used in almost all file systems since the Berkeley
Fast File System [7]. When the user requests data from the file, PLFS
consults the metadata within the container to resolve which subfile(s)
contain the requested data and then reads from the subfile(s) to return the
data to the reading application.
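
The container mechanism described above can be illustrated with a small
in-memory sketch. This is not PLFS code: the names Container, IndexEntry,
write, and read are hypothetical, the "subfiles" are plain byte buffers
rather than files on a storage system, and the policy that a later write
wins on overlap is an assumption made to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    # Hypothetical metadata record; real PLFS index entries differ.
    logical_offset: int   # position of the bytes in the logical shared file
    length: int           # number of bytes written
    writer: int           # which subfile holds the bytes
    physical_offset: int  # position of the bytes inside that subfile


@dataclass
class Container:
    """Toy stand-in for a container: per-writer subfiles plus index metadata."""
    subfiles: dict = field(default_factory=dict)  # writer id -> bytearray
    index: list = field(default_factory=list)     # IndexEntry records, in write order

    def write(self, writer: int, logical_offset: int, data: bytes) -> None:
        # Each writer appends to its own subfile, so concurrent writers
        # never touch the same region of a shared file.
        log = self.subfiles.setdefault(writer, bytearray())
        self.index.append(IndexEntry(logical_offset, len(data), writer, len(log)))
        log.extend(data)

    def read(self, offset: int, length: int) -> bytes:
        # Regions that were never written read back as zero bytes here.
        out = bytearray(length)
        # Replay entries in write order so that a later write wins on overlap.
        for e in self.index:
            start = max(offset, e.logical_offset)
            end = min(offset + length, e.logical_offset + e.length)
            if start >= end:
                continue  # this entry does not intersect the requested range
            src = e.physical_offset + (start - e.logical_offset)
            out[start - offset:end - offset] = self.subfiles[e.writer][src:src + (end - start)]
        return bytes(out)


container = Container()
container.write(writer=0, logical_offset=0, data=b"AAAA")
container.write(writer=1, logical_offset=4, data=b"BBBB")
container.write(writer=0, logical_offset=8, data=b"CCCC")
whole_file = container.read(0, 12)
```

In this sketch writer 0's subfile holds its two writes back to back, even
though they are not adjacent in the logical file; only the index knows where
each piece belongs, which is why the reader must consult it first.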
Note that at no point is the application aware of this transformation:
all operations on the shared file behave exactly as they would if PLFS
were not present. One concern in PLFS, however, is that the amount of
PLFS metadata can grow to challenging sizes; this concern is addressed by
discovering hidden structure within seemingly unstructured I/O [6].
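
To give a feel for why structure helps with metadata size, the sketch below
collapses runs of index records that share a length and a constant offset
stride into a single pattern record. It illustrates one simple kind of
regularity only and is not the technique of [6]; the functions
compress_strided and expand, and the tuple layouts they use, are assumptions
made for this example.

```python
def compress_strided(entries):
    """Collapse runs of (logical_offset, length) pairs that share a fixed stride.

    Returns a list of (first_offset, length, stride, count) tuples.
    """
    patterns = []
    i = 0
    while i < len(entries):
        start, length = entries[i]
        stride, count = 0, 1
        # Extend the run while the length and the offset step stay constant.
        while i + count < len(entries):
            next_offset, next_length = entries[i + count]
            step = next_offset - entries[i + count - 1][0]
            if next_length != length or (count > 1 and step != stride):
                break
            stride = step
            count += 1
        patterns.append((start, length, stride, count))
        i += count
    return patterns


def expand(patterns):
    """Inverse of compress_strided: regenerate the original pairs."""
    return [(first + k * stride, length)
            for first, length, stride, count in patterns
            for k in range(count)]


# One writer emitting 4-byte records every 16 bytes, then one irregular write.
records = [(0, 4), (16, 4), (32, 4), (48, 4), (100, 8)]
compressed = compress_strided(records)
```

Here five index records shrink to two pattern records, and expand recovers
the original list exactly, so nothing needed to reconstruct the file is lost.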
 