failures decreases. This has an enormous efficiency advantage in maintaining
the resiliency of clusters as their size increases.
11.5.3 Virtual Hot Spare
Most traditional storage systems based on RAID require the provisioning
of one or more "hot spare" drives to allow independent recovery of failed
drives. The hot spare drive replaces the failed drive in a RAID set. If these
hot spares are not themselves replaced before more failures appear, the system
risks a catastrophic data loss. OneFS avoids the use of hot spare drives, and
simply borrows from the available free space in the system in order to recover
from failures; this technique is called virtual hot spare. In doing so, OneFS
allows the cluster to be fully self-healing, without human intervention. The
administrator can create a virtual hot spare reserve, allowing for a guarantee
that the system can self-heal despite ongoing writes by users.
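To make the idea concrete, the following is a minimal Python sketch (not OneFS code; the function name and the sizing policy of reserving the largest drives are illustrative assumptions) of the check behind a virtual hot spare reserve: rather than idling a physical spare drive, the cluster keeps enough free space to re-protect the contents of a failed drive.

def vhs_reserve_ok(free_bytes: int, drive_sizes: list, reserved_drives: int = 1) -> bool:
    """Return True if current free space still covers the reserve needed
    to re-protect the `reserved_drives` largest drives after a failure."""
    reserve = sum(sorted(drive_sizes, reverse=True)[:reserved_drives])
    return free_bytes >= reserve

# Example: a cluster with 12 TB free can absorb the loss of one 4 TB drive
# and rebuild its contents from free space alone.
print(vhs_reserve_ok(free_bytes=12 * 10**12, drive_sizes=[4 * 10**12] * 6))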
11.5.4 N + M Data Protection
The Isilon cluster is designed to tolerate one or more simultaneous com-
ponent failures without preventing the cluster from serving data. The Isilon
system can use either Reed-Solomon error correction (N + M protection) or
mirroring to protect files. Data protection is applied at the file
level, and not the system level, enabling the system to focus on recovering
only those files that are compromised by a failure rather than having to check
and repair the entire file set. Metadata and inodes are protected at least
at the same level as the data they reference, and are always protected by
mirroring rather than Reed-Solomon coding.
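The following is a toy illustration (not OneFS code) of file-level N + M protection for the simplest case M = 1, where the erasure code reduces to XOR parity; OneFS itself uses Reed-Solomon codes, which generalize this so that any M simultaneous failures are recoverable.

from functools import reduce

def xor_blocks(blocks):
    """XOR equal-sized byte blocks together; with M = 1 this is the parity."""
    return bytes(reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)),
                        blocks))

def rebuild_missing(blocks, parity):
    """Recover a single lost block: XOR of parity and all surviving blocks."""
    missing = blocks.index(None)
    survivors = [b for b in blocks if b is not None]
    blocks[missing] = xor_blocks(survivors + [parity])
    return blocks

data = [b"AAAA", b"BBBB", b"CCCC"]    # N = 3 data blocks of one file stripe
parity = xor_blocks(data)             # M = 1 erasure-code block
data[1] = None                        # simulate losing one drive's block
print(rebuild_missing(data, parity))  # -> [b'AAAA', b'BBBB', b'CCCC']

Because protection is per file, only the stripes belonging to files touched by the failure need this rebuild, not the whole drive population.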
Because all data, metadata, and parity information are distributed across
the nodes of the cluster, the Isilon cluster does not require a dedicated parity
node or drive, or a dedicated device or set of devices to manage metadata.
This ensures that no one node can become a single point of failure. All nodes
share equally in the tasks to be performed, providing perfect symmetry and
load balancing in a peer-to-peer architecture.
The Isilon system provides several levels of configurable data protection
settings, which can be modified at any time without needing to take the cluster
or file system offline.
For a file protected with erasure codes, each of its protection groups is
protected at a level of N + M/b, where N > M and M ≥ b. The values N and
M represent, respectively, the number of drives used for data and for erasure
codes within the protection group. The value of b relates to the number of data
stripes used to lay out that protection group, and is covered below. A common
and easily understood case is where b = 1, implying that a protection group
incorporates N drives worth of data; M drives worth of redundancy, stored in
erasure codes; and that the protection group should be laid out over exactly
one stripe across a set of nodes. This implies that M members of the protection
group can fail simultaneously without any loss of data availability.
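To make the arithmetic concrete, here is a small hypothetical sketch of what an N + M/b layout implies, under the assumption (consistent with M ≥ b above, though the layout details are covered below) that the N + M blocks of a protection group are spread over b stripes, so each participating node holds b of its blocks; the helper name and sample numbers are illustrative.

def layout(n: int, m: int, b: int) -> dict:
    """Summarize an N + M/b protection group under the stated assumption."""
    assert n > m >= b, "requires N > M and M >= b"
    assert (n + m) % b == 0, "blocks must divide evenly into b stripes"
    return {
        "nodes_spanned": (n + m) // b,  # width of each of the b stripes
        "drive_failures_ok": m,         # any M lost blocks are recoverable
        "node_failures_ok": m // b,     # one node failure loses b blocks
        "space_overhead": m / (n + m),  # fraction of raw space spent on codes
    }

print(layout(n=9, m=1, b=1))   # classic single-stripe N + 1
print(layout(n=14, m=2, b=2))  # two stripes: survives 2 drives or 1 node

Note how b > 1 narrows each stripe without raising the space overhead, at the cost of tolerating fewer whole-node failures for the same M.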
 