2.1.2 Fault Tolerance
Distributing data over a large number of servers and disks creates a wide variety of possible failure scenarios, including failure of disks or disk enclosures, loss of servers, loss or partitioning of the network between clients and servers, and loss of clients during file system operation.
In the preceding chapter we discussed RAID and its use in tolerating disk failures. RAID is typically used with a collection of drives attached to a single server, to tolerate one or more disk failures in that collection. With drive failures covered, we can consider how to handle server failures.
Servers can, and do, fail. Two approaches can be taken to handle server failures: providing an alternative path to the storage that the server manages, or maintaining a replica of, or a means of reconstructing, the data managed by that server. IBM's GPFS [1] is typically configured to use the first approach: multiple servers are connected to a single storage unit, often over Fibre Channel or InfiniBand links. When a server becomes inoperable, another server attached to the same storage takes its place. In an active-passive configuration, an additional server is attached and remains idle (passive) until a failure occurs; this configuration often shows no performance degradation after the new server has taken over (the service has failed over). In an active-active configuration, another already-active server takes over the failed server's responsibilities. No resources ever sit idle in this configuration, but performance is likely to degrade while running after a failure.
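The failover behavior described above can be sketched with a simple heartbeat monitor: the standby watches the active server's heartbeats and takes over when they stop. This is a minimal illustration of the idea, not GPFS's actual failover protocol; the class and server names here are hypothetical.

```python
import time

class FailoverManager:
    """Minimal active-passive failover model: the first server in the
    list is active; a standby takes over when heartbeats stop arriving."""

    def __init__(self, servers, heartbeat_timeout=3.0):
        self.servers = list(servers)  # preference order; first is primary
        self.heartbeat_timeout = heartbeat_timeout
        now = time.monotonic()
        self.last_heartbeat = {s: now for s in servers}

    def record_heartbeat(self, server, now=None):
        """Note that `server` is alive (time may be injected for testing)."""
        self.last_heartbeat[server] = now if now is not None else time.monotonic()

    def active_server(self, now=None):
        """Return the first server whose heartbeat is still fresh.
        Skipping a stale primary models failover to the standby."""
        now = now if now is not None else time.monotonic()
        for s in self.servers:
            if now - self.last_heartbeat[s] < self.heartbeat_timeout:
                return s
        raise RuntimeError("no live server available")

# Example: serverA is primary until its heartbeats stop.
mgr = FailoverManager(["serverA", "serverB"], heartbeat_timeout=3.0)
mgr.record_heartbeat("serverA", now=0.0)
mgr.record_heartbeat("serverB", now=0.0)
assert mgr.active_server(now=1.0) == "serverA"  # primary healthy
mgr.record_heartbeat("serverB", now=5.0)        # serverA has gone silent
assert mgr.active_server(now=6.0) == "serverB"  # service has failed over
```

In a real deployment the monitor would also need fencing (ensuring the old primary cannot keep writing to the shared storage), which this sketch omits.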
The Google file system (GFS) [3] takes the second approach. In GFS, data is replicated on multiple servers. When one server dies, another takes over responsibility for coordinating changes to the data for which the dead server was previously responsible. A server overseeing the storage as a whole, on detecting the death, begins re-replicating that server's data from the remaining copies, so that future failures cannot cause permanent data loss. This approach can be implemented on commodity hardware, allowing for much lower cost. Additionally, the system can be configured to make more or fewer replicas of particular data, depending on user needs, which allows fine-grained control over the trade-off between performance (more copies cost more time to write) and failure tolerance. On the downside, implementing such a system in software is complicated, and few enterprise solutions exist.
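The re-replication step described above can be sketched as follows: each chunk of data is placed on a configurable number of servers, and when a server dies, any chunk that falls below its target replica count is copied onto a surviving server. This is a toy model of the idea, not GFS's actual chunk-placement logic; all names are hypothetical, and `replication` is the per-data tunable knob mentioned above.

```python
import random

class ReplicaManager:
    """Toy model of replication with re-replication on server death."""

    def __init__(self, servers, replication=3):
        self.servers = set(servers)
        self.replication = replication
        self.placement = {}  # chunk id -> set of servers holding a copy

    def store(self, chunk):
        """Place `chunk` on up to `replication` distinct servers."""
        n = min(self.replication, len(self.servers))
        self.placement[chunk] = set(random.sample(sorted(self.servers), n))

    def server_died(self, server):
        """Drop the dead server and restore each chunk's replica count
        by copying from one of the remaining replicas."""
        self.servers.discard(server)
        for chunk, holders in self.placement.items():
            holders.discard(server)
            candidates = self.servers - holders
            target = min(self.replication, len(self.servers))
            while len(holders) < target and candidates:
                holders.add(candidates.pop())  # stand-in for a real copy

# Example: a chunk keeps three replicas even after losing a server.
mgr = ReplicaManager(["s1", "s2", "s3", "s4"], replication=3)
mgr.store("chunk1")
mgr.server_died("s1")
assert "s1" not in mgr.placement["chunk1"]
assert len(mgr.placement["chunk1"]) == 3
```

A real system would prioritize the most under-replicated chunks and spread copies across failure domains (racks, power circuits); the sketch keeps only the core invariant that the replica count is restored from surviving copies.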
Clients can also fail. In fact, most systems have many more clients than servers and are therefore more likely to see client failures than server failures. Client failures can affect file systems in different ways. In a system using PVFS (Parallel Virtual File System) or NFSv3, a client failure has no impact on the file system, because clients do not maintain state necessary for correct file system operation. In a system such as Lustre or GPFS, where locks may be cached on the client, those locks must be reclaimed by the servers before all file system resources become accessible again. If a file system