policy will grow the granted lock to cover the largest non-conflicting extent
that covers the requested extent. This minimizes the number of DLM requests
needed in common use cases.
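
The extent-growing policy described above can be illustrated with a minimal sketch. It assumes inclusive byte ranges and a 64-bit maximum offset; the names `grow_extent` and `EOF` are hypothetical and are not taken from the Lustre source.

```python
from typing import List, Tuple

Extent = Tuple[int, int]   # inclusive (start, end) byte range
EOF = 2**64 - 1            # assumed largest lockable offset


def grow_extent(requested: Extent, conflicting: List[Extent]) -> Extent:
    """Return the largest extent that covers `requested` without
    overlapping any extent in `conflicting`."""
    req_start, req_end = requested
    start, end = 0, EOF
    for c_start, c_end in conflicting:
        if c_end < req_start:
            # Conflicting lock lies entirely below the request:
            # it limits how far the grant can grow downward.
            start = max(start, c_end + 1)
        elif c_start > req_end:
            # Conflicting lock lies entirely above the request:
            # it limits how far the grant can grow upward.
            end = min(end, c_start - 1)
        else:
            # The request itself overlaps a conflicting lock,
            # so nothing can be granted until that lock is released.
            raise ValueError("requested extent conflicts with a granted lock")
    return (start, end)
```

With no conflicting locks the whole object is granted, so later reads and writes by the same client need no further lock requests.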
Intent lock requests were added to avoid lock ping-pong on highly contended
resources. The initial lock request for a resource also contains sufficient
information (i.e., the client's reason for enqueuing the lock) to execute the
request on the server if the lock is contended. If the server chooses to execute
the client's intent on its behalf, it may return a completion status without
any lock, or a lock on a different resource than was originally requested.
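
As a rough illustration of the intent mechanism, the sketch below models a server that either grants the lock or runs the client's intent itself. The `Reply` type, the `enqueue_with_intent` function, and the `contended` flag are all hypothetical simplifications; the case where a lock on a different resource is returned is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Reply:
    status: str           # completion status of the request
    lock: Optional[str]   # name of the locked resource, or None


def enqueue_with_intent(resource: str, contended: bool,
                        intent: Callable[[], str]) -> Reply:
    """Handle a lock enqueue that carries the client's intent."""
    if contended:
        # The server executes the operation on the client's behalf
        # and returns only its completion status, with no lock.
        return Reply(status=intent(), lock=None)
    # Uncontended: grant the lock and let the client do the work.
    return Reply(status="granted", lock=resource)
```

For example, a client creating a file in a busy directory would receive the result of the create without ever holding the directory lock.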
Glimpse (trylock) requests and associated lock callbacks were added to
provide a means of taking a transient snapshot of the object state (i.e., fetch-
ing the LVB) if it is currently locked by another client without forcing lock
revocation. If the lock has not recently been used by the holder it will be sched-
uled for cancellation, and if the resource has no users, the lock is granted and
its LVB is returned immediately.
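
The glimpse behavior can be sketched as follows. This is a simplified model that follows the description above; the `Resource` fields and the `glimpse` function are hypothetical, and the callback to the lock holder is reduced to reading a stored LVB.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class Resource:
    lvb: dict = field(default_factory=dict)  # lock value block, e.g. size
    holder: Optional[str] = None             # client holding the lock
    recently_used: bool = False              # has the holder used it lately?
    cancel_scheduled: bool = False


def glimpse(res: Resource, client: str) -> Tuple[bool, dict]:
    """Return (lock_granted, LVB snapshot) without revoking any lock."""
    if res.holder is None:
        # No users: grant the lock and return the LVB immediately.
        res.holder = client
        return True, dict(res.lvb)
    if not res.recently_used:
        # An idle lock is scheduled for cancellation, not revoked now.
        res.cancel_scheduled = True
    # The holder keeps its lock; the caller only gets a snapshot.
    return False, dict(res.lvb)
```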
Inode Bit (IBITs) locks were added to allow a single resource to have
multiple attributes for the same lock. This allows finer-grained locking without
increasing DLM RPC traffic. The attribute bits may be enqueued together or
separately, though typically they are granted in a single lock for efficiency.
Only in the face of contention are the lock bits granted separately.
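
A bitmask is a natural way to picture inode bit locks. In the sketch below the bit names and values are illustrative only, and the `grant` function is a hypothetical simplification of the server's decision.

```python
from enum import IntFlag
from typing import List


class InodeBits(IntFlag):
    # Illustrative attribute bits; names and values are assumptions.
    LOOKUP = 0x1
    UPDATE = 0x2
    OPEN = 0x4


def grant(requested: InodeBits, held_by_others: InodeBits) -> List[InodeBits]:
    """Return the locks granted for `requested` on one resource."""
    if not (requested & held_by_others):
        # No contention: all requested bits go into a single lock.
        return [requested]
    # Contention: grant each non-conflicting bit as a separate lock.
    return [bit for bit in InodeBits
            if bit & requested and not bit & held_by_others]
```

In the uncontended case one lock, and therefore one RPC, covers every requested attribute.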
The DLM is also used for locking the file system configuration data, block
and inode quotas, as well as POSIX flock locking of files by applications.
8.2.4 Back-end Storage
The Lustre OSD is a Lustre-specific abstraction that provides object stor-
age services using different types of persistent storage. The OSD has exclusive
control over its underlying storage and handles all allocation and block meta-
data for it. This aids scalability by containing and distributing this overhead
across the Lustre servers and also provides a layer of security against malicious
or badly behaving clients compared to a shared-disk file system. Current Lustre
OSDs are based either on ldiskfs, a modified version of ext4 [10, 11], or on ZFS [3].
The OSD implements a transactional data store for two types of objects.
Data objects store byte-range extents, and index objects allow efficient access
to key-value pairs. Objects are accessed by a file system-unique 128-bit File
IDentifier (FID) that is composed of a 64-bit sequence number to locate the
storage target, a 32-bit Object ID within the sequence, and a 32-bit version.
Both types of objects also have efficient storage for typical file attributes
such as timestamps, ownership, size, and blocks, as well as named extended
attributes for storage of more complex fields, though management of most
of these attributes is done by layers above the OSD. Only block usage is
managed directly by the OSD, since it allocates the space to store object data
and attributes.
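
The 128-bit FID layout described above (64-bit sequence, 32-bit object ID, 32-bit version) can be sketched as a small record type. The field names and the big-endian byte order used for packing are assumptions made for illustration, not the on-disk or on-wire format.

```python
import struct
from typing import NamedTuple


class FID(NamedTuple):
    seq: int   # 64-bit sequence number, locates the storage target
    oid: int   # 32-bit object ID within the sequence
    ver: int   # 32-bit version

    def pack(self) -> bytes:
        """Serialize to 16 bytes (byte order is an assumption)."""
        return struct.pack(">QII", self.seq, self.oid, self.ver)

    @classmethod
    def unpack(cls, raw: bytes) -> "FID":
        """Rebuild a FID from the 16 bytes produced by pack()."""
        return cls(*struct.unpack(">QII", raw))
```

Packing any FID yields exactly 16 bytes, matching the 128-bit size given in the text.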
The OSD is responsible for the transactional consistency of its local stor-
age, but it is free to cache object updates and aggregate these as they are
 