GPFS records all updates that affect file system consistency in a recovery log. There is a separate log for each node, stored on the shared disks. When GPFS detects a node failure through its internal heartbeat mechanism, a different cluster node reads and re-applies the updates recorded in the failed node's log before the locks held by the failed node are released. This guarantees that any metadata updated by the failed node is quickly restored to a consistent state and can then be accessed again by other nodes.
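As a rough illustration, the recovery sequence can be sketched as follows; all names here (LogRecord, recovery_logs, metadata_store, lock_table) are hypothetical stand-ins, not GPFS internals:

# Minimal sketch of per-node log recovery; illustrative only,
# not the actual GPFS implementation.
from dataclasses import dataclass

@dataclass
class LogRecord:
    object_id: int       # which metadata object the update touched
    new_metadata: bytes  # the logged after-image of that object

def recover_failed_node(failed_node, recovery_logs, metadata_store, lock_table):
    # Each node has its own log on the shared disks, so a surviving
    # node only needs to read the failed node's log.
    for record in recovery_logs[failed_node]:
        # Re-apply (redo) the logged update to the shared metadata.
        metadata_store[record.object_id] = record.new_metadata
    # Release the failed node's locks only after replay, so other
    # nodes never observe partially applied metadata updates.
    lock_table.release_all(failed_node)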
To protect against data loss or the unavailability of data due to failures in
the disk subsystem, GPFS provides two options: use of RAID-based storage
controllers together with redundant paths to disk, or replication at the file
system level. As an alternative to traditional RAID controllers, GPFS also
offers an advanced software RAID implementation integrated into the NSD
server called GPFS Native RAID (GNR) [7, 8]. If file system replication is
chosen, GPFS allocates and writes two or more copies of each data block
and/or metadata object.
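A sketch of the replication option, assuming copies are placed on disks that do not share a failure domain (GPFS calls these failure groups); the helper names are invented for illustration:

# Write 'replicas' copies of a block, one per failure group, so the
# loss of any single disk or enclosure leaves a readable copy.
def write_replicated(block_id, data, disks_by_failure_group, replicas=2):
    groups = list(disks_by_failure_group)
    if len(groups) < replicas:
        raise RuntimeError("need at least %d failure groups" % replicas)
    for group in groups[:replicas]:
        # pick_disk() and write() stand in for the real block
        # allocator and I/O path.
        disk = disks_by_failure_group[group].pick_disk()
        disk.write(block_id, data)

A read may then be served from any surviving copy.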
To avoid data becoming unavailable due to maintenance, GPFS supports on-line system management. This includes the ability to grow or shrink a file system by adding or removing disks and optionally rebalancing data and metadata in response to disk configuration changes, all while the file system is mounted. System software, including GPFS, can also be upgraded one node at a time without ever taking down the whole cluster.
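The rebalancing step amounts to migrating blocks from over-full disks to under-full ones while the file system stays mounted. A toy sketch of the idea (in a real system the block-pointer switch is atomic, which is what keeps concurrent readers safe):

# Toy rebalancer: allocation_map maps each disk name to the list of
# block IDs it currently holds; blocks are respread evenly in place.
def rebalance(allocation_map: dict) -> None:
    disks = list(allocation_map)
    total = sum(len(blocks) for blocks in allocation_map.values())
    target = total // len(disks)
    for donor in [d for d in disks if len(allocation_map[d]) > target]:
        while len(allocation_map[donor]) > target:
            dest = min(disks, key=lambda d: len(allocation_map[d]))
            if dest == donor:
                break
            # A real system copies the block first and then switches
            # its pointer atomically, so no reader is ever blocked.
            allocation_map[dest].append(allocation_map[donor].pop())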
9.2.3 Distributed Locking and Metadata Management
9.2.3.1 The Distributed Lock Manager
The GPFS distributed lock manager uses a collection of global lock managers running on a designated subset of nodes in the cluster, in conjunction with local lock managers on each file system node. For each file, directory, or
other file system object, a hash of the object ID is used to select one of the
global lock manager nodes to coordinate distributed locks for that object by
handing out lock tokens. Once a node has obtained a token from the global
lock manager responsible for the object, subsequent operations accessing the
object on that node can lock the object locally, without requiring additional
network communication. Additional network communication is only necessary
when an operation on another node requires a conflicting lock on the same
object.
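A minimal sketch of this token protocol, with a hypothetical request_token RPC standing in for the actual exchange with the global lock manager:

import hashlib

def global_manager_for(object_id: str, manager_nodes: list) -> str:
    # Hash the object ID to pick the node that coordinates its locks.
    h = int(hashlib.sha1(object_id.encode()).hexdigest(), 16)
    return manager_nodes[h % len(manager_nodes)]

def request_token(manager_node, object_id, mode):
    # Placeholder for the RPC to the global lock manager; the real
    # protocol may first revoke a conflicting token on another node.
    return mode

class LocalLockManager:
    def __init__(self, manager_nodes):
        self.manager_nodes = manager_nodes
        self.tokens = {}  # object_id -> token mode held on this node

    def lock(self, object_id, mode):
        held = self.tokens.get(object_id)
        # Fast path: a previously obtained token lets us lock the
        # object locally, with no network traffic at all.
        if held == mode or held == "write":
            return
        # Slow path: one round trip to the object's global manager.
        manager = global_manager_for(object_id, self.manager_nodes)
        self.tokens[object_id] = request_token(manager, object_id, mode)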
Lock tokens also serve as the mechanism for maintaining cache consistency between nodes. A "read-only" token may be shared among nodes and allows each token holder to cache objects it has read from disk. An "exclusive-write" token may only be held by one node at a time and allows that node to modify the object in its cache. When a write token is revoked or downgraded to read-only mode, GPFS first waits for local locks to be released and commits local changes to disk before allowing the token to be granted to another node. This serializes reads and writes to support the POSIX semantic that ensures a read sees the data from the most recently completed write.
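A sketch of this revocation path under the same assumptions as above (wait_for_unlock and the disk object are illustrative placeholders):

import time

def wait_for_unlock(object_id):
    # Placeholder: a real implementation would block on a condition
    # variable signalled when the local lock is dropped.
    time.sleep(0.01)

class TokenHolder:
    def __init__(self, disk):
        self.disk = disk
        self.dirty = {}           # object_id -> modified cached data
        self.local_locks = set()  # objects locked by local operations

    def on_revoke(self, object_id):
        # 1. Wait for local operations using the object to finish.
        while object_id in self.local_locks:
            wait_for_unlock(object_id)
        # 2. Commit cached changes so the next token holder reads
        #    current data: this is the read-after-write step.
        if object_id in self.dirty:
            self.disk.write(object_id, self.dirty.pop(object_id))
        # 3. Only now may the global manager grant the token elsewhere.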
 