30.2 The Current HPC Storage Model
Considering the high cost (during development as well as at runtime) of
implementing full POSIX I/O compliance in a distributed environment, it
should not come as a surprise that some file systems instead aim to be
mostly compliant. For example, the NFS client emulates the common
unlink-after-open approach to creating temporary files by renaming the
file, with the goal of removing it later (something that does not always
succeed). Attribute consistency is another area where NFS is not fully
compliant. In the default mode, client-side caching is used to improve
performance and reduce server load, which means that full POSIX attribute
consistency is not provided. In addition, there is no guaranteed, portable
method to enforce consistency. Some of these issues are being addressed in
newer versions (v4) of the NFS protocol.
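The unlink-after-open idiom mentioned above can be illustrated with a
minimal Python sketch. It assumes a local POSIX file system; the function
name scratch_roundtrip is hypothetical and chosen only for this example.
On an NFS mount the same calls would typically trigger the rename-based
emulation instead of a true unlink.

```python
import os
import tempfile
from typing import Tuple


def scratch_roundtrip(payload: bytes, directory: str) -> Tuple[bytes, bool]:
    """Write payload to a nameless temporary file and read it back.

    Returns the bytes read and whether the name was still visible
    immediately after the unlink call.
    """
    # mkstemp creates and opens a uniquely named file in `directory`.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Remove the name while the descriptor is still open. On a local
        # POSIX file system the data stays reachable through `fd`.
        os.unlink(path)
        name_still_visible = os.path.exists(path)

        os.write(fd, payload)
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, len(payload))
    finally:
        # Closing the last open descriptor lets the storage be reclaimed.
        os.close(fd)
    return data, name_still_visible
```

The appeal of the idiom is that the temporary file disappears
automatically when the process exits or crashes, because no directory
entry remains to clean up. That guarantee is what a mostly compliant
file system can only approximate.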
In many cases, the POSIX semantics are unnecessarily strict, and
consequently most applications continue to function correctly even on these
mostly POSIX-compliant file systems. Thus the current HPC storage model aims
to work around the POSIX shortcomings by adjusting or breaking POSIX where
necessary and by leveraging new software layers to optimize I/O before it
reaches the POSIX interface. Simultaneously, new storage concepts are being
deployed below the POSIX API that have the potential for larger benefits to
HPC storage.
30.2.1 The POSIX HPC I/O Extensions
The POSIX HPC I/O extensions [9] were designed to improve the performance
of POSIX I/O in large-scale HPC environments. Software running on these
systems differs from most other software in that many processes,
distributed over many nodes, work collectively on a problem. For I/O
operations specifically, this means that many processes on many nodes
open the same file(s) concurrently. Since HPC applications also tend to be
more synchronized, often all of the operations performed on these files
(such as open) are issued within a short time interval, leading to very
high and bursty metadata workloads on the file system.
Since POSIX file handles are only valid on the local node, in a distributed
environment each node is required to open the file itself, even though all
nodes access the same file. This causes each individual node to traverse
the directory hierarchy to locate the requested file, causing high metadata
overhead at the (remote) file system. The POSIX HPC extensions seek to
reduce this load by allowing a single node to open the file and then export
some representation of the resulting file handle (for example, containing a
direct pointer to the enclosing directory) to other nodes (the openg
function), which then convert the exported handle directly to a file handle
(the sutoc function) without having to perform a full open call.
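The effect on metadata load can be sketched with a toy model. Everything
below is an assumption made for illustration: ToyMetadataServer, the
dictionary used as a handle, and the Python signatures of openg and sutoc
are hypothetical stand-ins, not the interface defined by the extensions.
The model counts one lookup per path component and ignores any server
contact that a real handle conversion might still require.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMetadataServer:
    """Counts path-component lookups; stands in for a remote file system."""
    lookups: int = 0
    _next_id: int = 0
    _ids: dict = field(default_factory=dict)

    def resolve(self, path: str) -> int:
        # One lookup per path component, as in a full directory traversal.
        self.lookups += len([p for p in path.split("/") if p])
        if path not in self._ids:
            self._ids[path] = self._next_id
            self._next_id += 1
        return self._ids[path]


def open_everywhere(server: ToyMetadataServer, path: str, nodes: int) -> list:
    """Baseline: every node performs its own full open."""
    return [server.resolve(path) for _ in range(nodes)]


def openg(server: ToyMetadataServer, path: str) -> dict:
    """Hypothetical stand-in: one node opens and gets an exportable handle."""
    return {"file_id": server.resolve(path)}


def sutoc(group_handle: dict) -> int:
    """Hypothetical stand-in: convert a received handle without traversal."""
    return group_handle["file_id"]


def open_with_group_handle(server: ToyMetadataServer, path: str, nodes: int) -> list:
    """One node traverses the path; all nodes convert the shared handle."""
    handle = openg(server, path)  # the only directory traversal
    # Distributing `handle` to the other nodes (e.g. a broadcast) is elided.
    return [sutoc(handle) for _ in range(nodes)]
```

In this model, opening a three-component path such as /scratch/run/out.dat
from 1000 nodes costs 3000 lookups in the baseline and 3 lookups with the
shared handle, which is the kind of reduction the extensions aim for.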
 