For example, the Unix fast file system (FFS) would carefully control the
order in which its updates were sent to disk, so that if a crash occurred in the
middle of a group of updates, a scan of the disk during recovery could identify
and repair inconsistent data structures. When creating a new file, for instance,
FFS would first update the free-inode bitmap to indicate which previously free
inode was now in use; after making sure this update was on disk, it would
initialize the new file's inode, clearing all of the direct, indirect, double-indirect,
and other pointers, setting the file length to 0, setting the file's ownership and
access control list, and so on. Finally, once the inode update was safely on disk,
the file system would update the directory to contain an entry for the newly
created file, mapping the file's name to its inode.
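This three-step protocol can be made concrete with a short sketch. The code
below is not real FFS source: the on-disk layout (BITMAP_OFF, INODE_OFF,
DIR_OFF), the simplified inode and directory-entry structures, and create_file
are all illustrative assumptions, with a regular file standing in for the disk and
fsync(2) standing in for the write barriers between the ordered updates.

    /* Sketch of FFS-style ordered file creation; hypothetical layout,
     * not the real FFS on-disk format. */
    #include <fcntl.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/types.h>
    #include <unistd.h>

    #define BITMAP_OFF 0       /* free-inode bitmap block  */
    #define INODE_OFF  4096    /* start of the inode table */
    #define DIR_OFF    8192    /* directory contents block */

    struct inode {             /* simplified on-disk inode */
        uint32_t length;
        uint32_t owner;
        uint32_t direct[12];   /* direct block pointers */
        uint32_t indirect;
        uint32_t double_indirect;
    };

    struct dirent {            /* simplified directory entry */
        char     name[28];
        uint32_t inum;
    };

    /* Create a file with the three ordered, barrier-separated updates
     * described in the text: bitmap, then inode, then directory entry. */
    int create_file(int fd, const char *name, uint32_t inum)
    {
        /* Step 1: mark the chosen inode as in use in the bitmap. */
        uint8_t bitmap[512] = {0};
        pread(fd, bitmap, sizeof bitmap, BITMAP_OFF);
        bitmap[inum / 8] |= 1u << (inum % 8);
        pwrite(fd, bitmap, sizeof bitmap, BITMAP_OFF);
        fsync(fd);             /* barrier: bitmap durable before inode */

        /* Step 2: initialize the inode; memset clears the length and
         * all of the direct and indirect pointers. */
        struct inode ino;
        memset(&ino, 0, sizeof ino);
        pwrite(fd, &ino, sizeof ino, INODE_OFF + inum * sizeof ino);
        fsync(fd);             /* barrier: inode durable before dirent */

        /* Step 3: only now publish the name -> inode mapping. */
        struct dirent de;
        memset(&de, 0, sizeof de);
        strncpy(de.name, name, sizeof de.name - 1);
        de.inum = inum;
        pwrite(fd, &de, sizeof de, DIR_OFF);
        fsync(fd);
        return 0;
    }

    int main(void)
    {
        int fd = open("disk.img", O_RDWR | O_CREAT, 0644);
        if (fd < 0)
            return 1;
        create_file(fd, "newfile", 3);
        close(fd);
        return 0;
    }

The two interior fsync calls are what enforce the ordering that the recovery
scan depends on; without them, the operating system or the drive would be free
to reorder the three writes arbitrarily.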
If a system running FFS crashed, then when it rebooted it would use a
program called fsck to scan all of the file system's metadata (e.g., all inodes,
all directories, and all free-space bitmaps) to make sure that all metadata items
were consistent. For example, if fsck discovered an inode that was marked as
allocated in the free-inode bitmap but that did not appear in any directory entry,
it could infer that the inode was part of a file in the process of being created
(or deleted) when the crash occurred; since the create had not finished (or the
delete had already begun), fsck could mark the inode as free, undoing the
partially completed create (or completing the partially completed delete).
Similar logic was used for other file system operations.
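To make the recovery side concrete as well, here is a sketch of the orphan-inode
check just described. It is hypothetical rather than actual fsck code, and it
reuses the simplified bitmap and directory-entry layout assumed in the creation
sketch above.

    /* Sketch of fsck's orphan-inode check (hypothetical, not real fsck):
     * an inode marked allocated in the bitmap but named by no directory
     * entry is treated as a half-finished create or delete and freed. */
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define MAX_INODES 4096

    struct dirent {            /* same simplified entry as above */
        char     name[28];
        uint32_t inum;
    };

    static bool bitmap_test(const uint8_t *bm, uint32_t i)
    {
        return bm[i / 8] & (1u << (i % 8));
    }

    static void bitmap_clear(uint8_t *bm, uint32_t i)
    {
        bm[i / 8] &= ~(1u << (i % 8));
    }

    void fsck_orphan_inodes(uint8_t *bitmap,
                            const struct dirent *ents, size_t nents)
    {
        bool referenced[MAX_INODES] = { false };

        /* Pass 1: note every inode some directory entry points to. */
        for (size_t i = 0; i < nents; i++)
            if (ents[i].name[0] != '\0' && ents[i].inum < MAX_INODES)
                referenced[ents[i].inum] = true;

        /* Pass 2: an allocated but unreferenced inode must belong to an
         * interrupted create (or delete), so undo it by freeing it. */
        for (uint32_t i = 0; i < MAX_INODES; i++)
            if (bitmap_test(bitmap, i) && !referenced[i])
                bitmap_clear(bitmap, i);
    }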
This approach of careful ordering of operations with scanning and repair of
on-disk data structures was widespread up until the 1990s, when it was largely
abandoned. In particular, this approach has three significant problems:
1. Complex reasoning. Similar to trying to solve the multi-threaded
synchronization problem with just atomic loads and stores, this approach
requires reasoning carefully about all possible operations and all possible
failure scenarios to make sure that it is always possible to recover the
system to a consistent state.
2. Slow updates. To ensure that updates are stored in an order that allows
the system's state to be analyzed, file systems are forced to insert sync
operations or barriers between dependent operations, reducing the amount
of pipelining or parallelism in the stream of requests to storage devices.
For example, in the file creation example above, to ensure that the in-
dividual updates hit disk in the required order, the system might suffer
three full rotations of the disk to update three on-disk data structures,
even though those data structures may be quite near each other (see the
rough arithmetic after this list).
3. Extremely slow recovery. When a machine reboots after a crash, it
has to scan all of its disks for inconsistent metadata structures.
In the 1970s and 1980s, it was possible to scan the data structures on a
disk in a few seconds or a few minutes, but by the 1990s, as disk capacities
grew much faster than disk bandwidth, this scanning could take many
minutes or even hours.
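As a rough, hypothetical illustration of the barrier cost in problem 2: on a
7,200 RPM disk, one rotation takes 60 s / 7200 ≈ 8.3 ms, so three writes that
must each wait for the platter to come back around cost roughly 3 × 8.3 ≈ 25 ms,
while the same three writes issued together could often be serviced within a
single rotation.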