Reliable Storage - Operating Systems: Principles and Practice

(a) In the program above, just before the system crashes, what is the value of b3 and what is the value of b4?

(b) Suppose that the program above runs and crashes at the indicated point. After the system restarts and completes recovery and all write-backs, what are the values stored in each of blocks 1, 2, 3, 4, and 5 of the tStore?

PROBLEM: [[compare performance of 1000 updates in place v. transaction -- fall 2011 exam]]

14.2 Error detection and correction

Because data storage hardware is imperfect, storage systems must be designed to detect and correct errors. Storage systems take a layered approach:

- Storage hardware detects many failures with checksums and device-level checks, and it corrects small corruptions with error correcting codes.

- Storage systems include redundancy using RAID architectures to reconstruct data lost by individual devices.

- Many recent file systems include additional end-to-end correctness checks.
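The first two layers can be illustrated with a minimal sketch. It assumes three hypothetical data blocks, a CRC-32 checksum per block for detection, and a single XOR parity block for reconstruction. The block contents and the names xor_blocks, is_intact, and reconstruct are illustrative assumptions, not part of any particular device or RAID implementation.

```python
import zlib
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical data blocks, one per device (contents are made up).
data = [b"AAAA", b"BBBB", b"CCCC"]

# Detection: one checksum per block, computed when the block is written.
checksums = [zlib.crc32(block) for block in data]

# Redundancy: a parity block, stored on an additional device.
parity = xor_blocks(data)


def is_intact(index, block):
    """Detect corruption by recomputing the checksum of the block as read."""
    return zlib.crc32(block) == checksums[index]


def reconstruct(index, blocks):
    """Rebuild the block at `index` from the other blocks plus parity."""
    survivors = [b for i, b in enumerate(blocks) if i != index]
    return xor_blocks(survivors + [parity])


# Simulate a corrupted read of block 1, then repair it from redundancy.
as_read = [data[0], b"BxBB", data[2]]
if not is_intact(1, as_read[1]):
    as_read[1] = reconstruct(1, as_read)
```

Note that the checksum only says which block is bad. Repairing it requires the redundant copy of the information held in the parity block.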

These techniques are essential. Virtually all persistent storage devices include internal redundancy to achieve high storage densities with acceptable error rates. But the limits of this internal redundancy are significant enough that it is difficult to imagine designing a storage system for important data without additional redundancy for error correction, and it is hard to think of a significant file system developed in the last decade that does not include higher-level checksums.
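As a rough illustration of a higher-level checksum, the sketch below keeps a checksum for each block apart from the block itself and verifies it on every read. The class ChecksummedStore, its two dictionaries, and the choice of SHA-256 are assumptions made for this example. Real file systems differ in where they keep checksums and which function they use.

```python
import hashlib


class ChecksummedStore:
    """Hypothetical block store that verifies each block on read."""

    def __init__(self):
        self.blocks = {}     # stands in for the storage device
        self.checksums = {}  # stands in for file system metadata

    def write(self, block_num, payload):
        self.blocks[block_num] = payload
        self.checksums[block_num] = hashlib.sha256(payload).digest()

    def read(self, block_num):
        payload = self.blocks[block_num]
        if hashlib.sha256(payload).digest() != self.checksums[block_num]:
            # The device returned data without reporting an error,
            # but the data does not match what was written.
            raise IOError(f"checksum mismatch on block {block_num}")
        return payload
```

A caller that catches the error can then fall back on whatever redundancy the layers below provide.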

Though these techniques are essential and widespread, there are significant pitfalls in designing and using them. In our discussions, we will point out issues that, if not handled carefully, can drastically reduce reliability.

The rest of this section examines error detection and correction for persistent storage, starting with the individual storage devices, then examining how RAID redundancy helps tolerate failures of individual storage devices, and finally looking at end-to-end error detection in many recent file systems.
