2.1.2 Fault Tolerance
Distributing data over a large number of servers and disks creates a wide variety of possible failure scenarios, including failure of disks or disk enclosures, loss of servers, loss or partitioning of the network between clients and servers, and loss of clients during file system operation.
In the preceding chapter we discussed RAID and its use in tolerating disk failures. RAID is typically used with a collection of drives attached to a single server, to tolerate one or more disk failures in that collection. With drive failures covered, we can consider how to handle server failures.
Servers can, and do, fail. Two approaches can be taken to handle server failures: providing an alternative path to the storage that the server manages, or maintaining a replica of, or a means of reconstructing, the data managed by that server. IBM's GPFS [1] is typically configured to use the first approach: multiple servers are connected to a single storage unit, often over Fibre Channel or InfiniBand links. When a server becomes inoperable, another server attached to the same storage takes its place. In an active-passive configuration, an additional server is attached and remains idle (passive) until a failure occurs; this configuration often shows no performance degradation after the new server has taken over (the service has failed over). In an active-active configuration, another already-active server takes over the failed server's responsibilities. No resources ever sit idle in this configuration, but performance is likely to degrade while running after a failure.
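The failover behavior described above can be sketched with a simple heartbeat monitor: the standby watches the active server's heartbeats and takes over when they stop. This is a minimal illustration of the idea, not GPFS's actual failover protocol; the class and server names here are hypothetical.

```python
import time

class FailoverManager:
    """Minimal active-passive failover model: the first server in the
    list is active; a standby takes over when heartbeats stop arriving."""

    def __init__(self, servers, heartbeat_timeout=3.0):
        self.servers = list(servers)  # preference order; first is primary
        self.heartbeat_timeout = heartbeat_timeout
        now = time.monotonic()
        self.last_heartbeat = {s: now for s in servers}

    def record_heartbeat(self, server, now=None):
        """Note that `server` is alive (time may be injected for testing)."""
        self.last_heartbeat[server] = now if now is not None else time.monotonic()

    def active_server(self, now=None):
        """Return the first server whose heartbeat is still fresh.
        Skipping a stale primary models failover to the standby."""
        now = now if now is not None else time.monotonic()
        for s in self.servers:
            if now - self.last_heartbeat[s] < self.heartbeat_timeout:
                return s
        raise RuntimeError("no live server available")

# Example: serverA is primary until its heartbeats stop.
mgr = FailoverManager(["serverA", "serverB"], heartbeat_timeout=3.0)
mgr.record_heartbeat("serverA", now=0.0)
mgr.record_heartbeat("serverB", now=0.0)
assert mgr.active_server(now=1.0) == "serverA"  # primary healthy
mgr.record_heartbeat("serverB", now=5.0)        # serverA has gone silent
assert mgr.active_server(now=6.0) == "serverB"  # service has failed over
```

In a real deployment the monitor would also need fencing (ensuring the old primary cannot keep writing to the shared storage), which this sketch omits.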
The Google file system (GFS) [3] takes the second approach. In GFS, data is replicated on multiple servers. When one server dies, another takes over responsibility for coordinating changes to the data for which the dead server was previously responsible. A server overseeing the storage as a whole, on detecting the death, begins re-replicating that server's data from the remaining copies, so that future failures cannot cause permanent data loss. This approach can be implemented on commodity hardware, allowing for much lower cost. Additionally, the system can be configured to make more or fewer replicas of particular data, depending on user needs, which allows fine-grained control over the trade-off between performance (more copies cost more time to write) and failure tolerance. On the downside, implementing such a system in software is complicated, and few enterprise solutions exist.
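The re-replication step described above can be sketched as follows: each chunk of data is placed on a configurable number of servers, and when a server dies, any chunk that falls below its target replica count is copied onto a surviving server. This is a toy model of the idea, not GFS's actual chunk-placement logic; all names are hypothetical, and `replication` is the per-data tunable knob mentioned above.

```python
import random

class ReplicaManager:
    """Toy model of replication with re-replication on server death."""

    def __init__(self, servers, replication=3):
        self.servers = set(servers)
        self.replication = replication
        self.placement = {}  # chunk id -> set of servers holding a copy

    def store(self, chunk):
        """Place `chunk` on up to `replication` distinct servers."""
        n = min(self.replication, len(self.servers))
        self.placement[chunk] = set(random.sample(sorted(self.servers), n))

    def server_died(self, server):
        """Drop the dead server and restore each chunk's replica count
        by copying from one of the remaining replicas."""
        self.servers.discard(server)
        for chunk, holders in self.placement.items():
            holders.discard(server)
            candidates = self.servers - holders
            target = min(self.replication, len(self.servers))
            while len(holders) < target and candidates:
                holders.add(candidates.pop())  # stand-in for a real copy

# Example: a chunk keeps three replicas even after losing a server.
mgr = ReplicaManager(["s1", "s2", "s3", "s4"], replication=3)
mgr.store("chunk1")
mgr.server_died("s1")
assert "s1" not in mgr.placement["chunk1"]
assert len(mgr.placement["chunk1"]) == 3
```

A real system would prioritize the most under-replicated chunks and spread copies across failure domains (racks, power circuits); the sketch keeps only the core invariant that the replica count is restored from surviving copies.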
Clients can also fail. In fact, most systems have many more clients than servers and are therefore more likely to see client failures than server failures. Client failures can affect file systems in different ways. In a system using PVFS (Parallel Virtual File System) or NFSv3, a client failure has no impact on the file system, because clients do not maintain state necessary for correct file system operation. In a system such as Lustre or GPFS, where locks may be cached on the client, those locks must be reclaimed by the servers before all file system resources become accessible again. If a file system