Any requests that modify back-end file system data or metadata must be
made idempotent to ensure that retry attempts due to lost RPC replies are
harmless and that clients receive the correct replies. Clients therefore include a
unique XID in each request so that repeated requests can be identified, and the
server retains a copy of the RPC reply buffer until it has positive confirmation
that the reply was received by the client.
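The duplicate-detection scheme described above can be sketched as follows. This is an illustrative model, not Lustre's actual implementation; the class and method names are hypothetical:

```python
# Hypothetical sketch of XID-based duplicate-request handling: a retried
# request carries the same XID, so the server resends the saved reply
# instead of re-executing a non-idempotent operation.

class Server:
    def __init__(self):
        self.seen = {}       # xid -> cached reply buffer
        self.executions = 0  # how many times operations actually ran

    def handle(self, xid, request):
        if xid in self.seen:
            return self.seen[xid]       # duplicate: resend cached reply
        self.executions += 1
        reply = f"result-of-{request}"
        self.seen[xid] = reply          # retained until client confirms receipt
        return reply

    def ack(self, xid):
        # Positive confirmation from the client: the reply buffer can go.
        self.seen.pop(xid, None)

s = Server()
first = s.handle(42, "mkdir /a")
retry = s.handle(42, "mkdir /a")  # the reply was lost, so the client retried
assert first == retry and s.executions == 1  # executed exactly once
s.ack(42)
```

The key point is that the reply buffer must outlive the original send: it can only be discarded once the client has acknowledged receipt.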
The server also includes both the OSD transaction number in which the re-
quest was executed and the OSD's last committed transaction number in RPC
replies. Clients retain the RPC request and reply buffers until the OSD trans-
action associated with the request has been committed to persistent storage
so that the request can be replayed in the event of server failure.
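The client-side bookkeeping this implies can be sketched as a small model (names and structure are assumptions for illustration, not Lustre code): each reply carries the transaction number assigned to the request plus the server's last committed transaction number, and the client drops saved buffers only once their transaction is known to be on persistent storage.

```python
# Illustrative sketch of transaction-number-based replay on the client.

class Client:
    def __init__(self):
        self.replay = {}  # transno -> saved request buffer

    def on_reply(self, transno, last_committed, request):
        self.replay[transno] = request
        # Discard every request the server has already committed to disk;
        # those can never need replaying.
        for t in [t for t in self.replay if t <= last_committed]:
            del self.replay[t]

    def requests_to_replay(self):
        # After a server failure, these uncommitted requests are resent
        # in transaction order.
        return [self.replay[t] for t in sorted(self.replay)]

c = Client()
c.on_reply(transno=7, last_committed=5, request="setattr a")
c.on_reply(transno=8, last_committed=7, request="setattr b")
# Transaction 7 is now committed, so only transaction 8 needs replay.
assert c.requests_to_replay() == ["setattr b"]
```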
8.2.3 Distributed Lock Manager
The Lustre Distributed Lock Manager (LDLM) enables Lustre clients and
servers to serialize conflicting file system operations and ensure client caches
remain coherent. Every Lustre storage target provides this service so that locks
are co-located with the objects they protect and total locking concurrency
scales with the number of storage targets. Lustre clients hold locks on behalf of
the entire kernel, effectively aggregating lock requests by all client processes,
which are serialized in the client VFS.
The design of the LDLM was based on the VAX/VMS Distributed Lock
Manager [14] in which named abstract resources can be locked in a variety of
modes to control how corresponding physical resources may be accessed. For
example, protected read locks are mutually compatible and permit read-only
access to a shared resource, while protected write locks allow clients to cache
dirty data. Exclusive locks are incompatible with all lock modes other than
null and are therefore used by servers that wish to modify a resource and
invalidate the caches of all other lock holders.
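The mode relationships described above form a standard DLM compatibility matrix. The sketch below restricts it to the four modes the text mentions (null, protected read, protected write, exclusive); it is illustrative, not the LDLM's internal representation:

```python
# Lock-mode compatibility in the VAX/VMS DLM style, limited to the modes
# discussed in the text.

NL, PR, PW, EX = "NL", "PR", "PW", "EX"

COMPATIBLE = {
    NL: {NL, PR, PW, EX},  # null is compatible with every mode
    PR: {NL, PR},          # protected read: multiple readers may share
    PW: {NL},              # protected write: cache dirty data, no sharers
    EX: {NL},              # exclusive: incompatible with all but null
}

def compatible(requested, held):
    """True if a lock in mode `requested` can coexist with one in `held`."""
    return held in COMPATIBLE[requested]

assert compatible(PR, PR)      # two readers may share the resource
assert not compatible(PW, PR)  # a caching writer conflicts with a reader
assert not compatible(EX, PW)  # exclusive conflicts with everything but NL
```

Note that the relation is symmetric: whether the PR lock or the PW lock arrives first, the pair conflicts either way.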
The VAX/VMS DLM also provided so-called Asynchronous System Traps
(ASTs), which notify the caller when a lock that could not be granted
immediately is eventually granted, or when a granted lock blocks other lock
requests. The LDLM implements these using a callback service on the client.
They enable a lazy locking scheme that minimizes unnecessary communication
by allowing Lustre clients to continue to hold locks protecting their caches
until a conflict actually occurs.
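The lazy-locking behaviour can be sketched with a minimal model, in which the client retains a granted lock (and the cache it protects) until a server-initiated blocking callback announces a conflict. All names here are hypothetical:

```python
# Minimal sketch of lazy locking driven by blocking callbacks.

class LazyClient:
    def __init__(self):
        self.locks = set()
        self.cache = {}

    def acquire(self, resource):
        # Lock granted by the server; the client may now cache data.
        self.locks.add(resource)
        self.cache[resource] = "cached data"

    def blocking_ast(self, resource):
        # Server-initiated callback: a conflicting request has arrived,
        # so the client must invalidate its cache and cancel the lock.
        self.cache.pop(resource, None)
        self.locks.discard(resource)

client = LazyClient()
client.acquire("obj-1")
# While no conflict arises, repeated local accesses need no further RPCs.
assert "obj-1" in client.locks
client.blocking_ast("obj-1")  # a conflict finally occurs
assert "obj-1" not in client.locks and "obj-1" not in client.cache
```

The saving is that no lock-release traffic occurs at all on uncontended resources; communication happens only when a conflict materializes.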
Each lock may also have a Lock Value Block (LVB) that contains infor-
mation about the resource being locked (e.g., size, blocks, and timestamps)
that is passed along with the granted lock to avoid an extra RPC to fetch
commonly used state.
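The idea of piggybacking resource attributes on the grant can be illustrated as follows; the field names are assumptions chosen to match the examples in the text (size, blocks, timestamps), not the actual LVB layout:

```python
# Illustrative sketch of a Lock Value Block travelling with a lock grant,
# so the client need not issue a separate RPC to fetch common attributes.

from dataclasses import dataclass

@dataclass
class LockValueBlock:
    size: int     # object size in bytes
    blocks: int   # allocated blocks
    mtime: float  # modification timestamp

def grant_lock(resource_attrs):
    # The server returns the lock handle together with the LVB.
    lvb = LockValueBlock(**resource_attrs)
    return {"handle": 0xABC, "lvb": lvb}

reply = grant_lock({"size": 4096, "blocks": 8, "mtime": 1700000000.0})
assert reply["lvb"].size == 4096  # attributes arrived with the grant
```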
The LDLM added some new features to the original VAX/VMS functionality
to further increase efficiency. Extent locks were added to enable Lustre
clients to lock byte ranges within OST objects to support shared file I/O
by allowing different clients to cache different extents of the same object. In
the case of an uncontended or lightly contended resource, the extent locking
 