30.1 The POSIX Era
The POSIX I/O standard was developed by the IEEE and is currently defined
as part of "IEEE Std 1003.1-1988" [13] (often referred to as "POSIX.1-1988").
There have been numerous updates and corrections since; the latest
version (at the time of this publication) was IEEE Std 1003.1-2013 [1]. The
first version of the POSIX standard was developed when a single computer
operating system managed its own (local) file system, and issues of concurrent
access were limited to the processes running on that operating system. Since
all file accesses went through a single operating system on a single machine,
enforcing strict consistency semantics was relatively easy. Likewise, data (and
metadata such as current file size or last access time) could easily be cached,
lowering the cost of accurately tracking last access, update time, and file size.
This is reflected in the design of the API. For example, when retrieving the
list of files in a directory, the readdir function only retrieves file names; to
obtain extra information (such as file size), a call to stat (or fstat on an open descriptor) needs to be made
for each file found. In a time when disks were local and uniquely accessed
by a single computer and metadata could easily be cached, the cost of performing
these extra calls was minimal. In modern environments, however, file systems
are often remote (i.e., exported by file servers) and shared between multiple
client computers, so each call requires a round trip to a
remote server. Likewise, caching is no longer straightforward as remote inval-
idation is required to keep cache contents consistent. In this environment, the
cost of providing a single global, consistent view of the file system becomes
exceedingly large.
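The access pattern described above can be sketched as follows. This is a minimal illustration, not code from the standard: it assumes Python's os.listdir and os.stat as stand-ins for readdir and stat, and the names list_with_sizes and demo are hypothetical. The counter tracks library-level calls rather than actual system calls or network messages.

```python
import os
import tempfile


def list_with_sizes(path):
    """Return ({name: size}, calls) using one listing plus one stat per entry."""
    calls = 1  # the directory listing itself; it yields names only
    names = os.listdir(path)
    sizes = {}
    for name in names:
        # Each size lookup is a separate metadata request for that file.
        st = os.stat(os.path.join(path, name))
        calls += 1
        sizes[name] = st.st_size
    return sizes, calls


def demo():
    """Create three small files in a temporary directory and list them."""
    with tempfile.TemporaryDirectory() as d:
        for i in range(3):
            with open(os.path.join(d, f"f{i}.dat"), "wb") as fh:
                fh.write(b"x" * (i + 1))
        return list_with_sizes(d)


if __name__ == "__main__":
    sizes, calls = demo()
    print(sizes, calls)
```

For a directory with N entries the sketch issues N + 1 requests. On a local disk with cached metadata this is cheap; on a remote file system each request may translate into a separate round trip to the server.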
Another problem with the POSIX model is that it forces a single, high-level
data storage model for all applications, with associated costs, regardless of
whether the semantics of the model are appropriate or not for the application
at hand. For example, data can only be stored in a file. Each file has metadata,
such as file size and last access time, that is globally visible and consistent
across all clients, whether an application requires this information or not.
Likewise, each file needs to be in a directory. Creating a file in a directory is an
atomic operation, with immediate global visibility. Because of this, file creation
in a distributed file system can be highly synchronizing and consequently
fundamentally unscalable.
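The atomicity requirement can be observed even on a single machine. The following is a small sketch under stated assumptions: it uses Python's os.open with the O_CREAT and O_EXCL flags on a local file system, and threads stand in for the competing clients of a distributed file system. The names try_create and race_to_create are hypothetical.

```python
import os
import tempfile
import threading


def try_create(path, results, idx):
    """Attempt an exclusive create; record whether this caller won."""
    try:
        # O_CREAT | O_EXCL fails if the name already exists, so the
        # existence check and the creation happen as one atomic step.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    except FileExistsError:
        results[idx] = False
    else:
        os.close(fd)
        results[idx] = True


def race_to_create(n_threads=8):
    """Let several threads race to create the same file name."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "shared.dat")
        results = [None] * n_threads
        threads = [
            threading.Thread(target=try_create, args=(target, results, i))
            for i in range(n_threads)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return results


if __name__ == "__main__":
    outcome = race_to_create()
    print(outcome.count(True), "creator succeeded out of", len(outcome))
```

Exactly one creator succeeds, and every other caller immediately observes the new name. A local kernel can arbitrate this with an in-memory lock; in a distributed file system the same guarantee requires all clients creating files in a directory to coordinate, which is the synchronization cost described above.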
Thus, while many distributed applications use more scalable methods in-
ternally for both I/O and data organization, POSIX offers no possibility of
relaxing its strict rules, needlessly limiting application scalability. To mitigate
this, numerous groups have developed additional layers that provide new
organizational models and reorganize access prior to interacting with POSIX
storage; many of these have been discussed previously. Enhancements to POSIX
that address its scalability limitations have also been proposed.
 