(a) In the program above, just before the system crashes, what is the
value of b3 and what is the value of b4?
(b) Suppose that the program above runs and crashes at the indicated
point. After the system restarts and completes recovery and all
write-backs, what are the values stored in each of blocks 1, 2, 3,
4, and 5 of the tStore?
PROBLEM:
[[compare performance of 1000 updates in place v. transaction -- fall 2011 exam]]
14.2 Error detection and correction
Because data storage hardware is imperfect, storage systems must be designed
to detect and correct errors. Storage systems take a layered approach:
- Storage hardware detects many failures with checksums and device-level
  checks, and it corrects small corruptions with error correcting codes.
- Storage systems include redundancy using RAID architectures to
  reconstruct data lost by individual devices.
- Many recent file systems include additional end-to-end correctness checks.
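
As a rough illustration of how these layers can cooperate, the following minimal sketch detects a corrupted block with a per-block checksum and then rebuilds it from the surviving blocks plus a parity block. It makes several assumptions that are not from the text: CRC32 stands in for whatever checksum a real device or file system uses, simple XOR parity stands in for a RAID layout, and the helper names (checksum, xor_blocks, find_corrupt, reconstruct) are hypothetical.

```python
import zlib
from functools import reduce


def checksum(block: bytes) -> int:
    # CRC32 is an illustrative stand-in for a device or file system checksum.
    return zlib.crc32(block)


def xor_blocks(blocks):
    # Byte-wise XOR of equal-length blocks (simple parity, assumed for illustration).
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def find_corrupt(stored, expected_sums):
    # Detection: report the index of every block whose checksum no longer matches.
    return [i for i, block in enumerate(stored) if checksum(block) != expected_sums[i]]


def reconstruct(stored, parity, bad_index):
    # Correction: XOR the parity block with every surviving block.
    survivors = [b for i, b in enumerate(stored) if i != bad_index]
    return xor_blocks(survivors + [parity])


# Three equal-sized data blocks, their checksums, and one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
sums = [checksum(b) for b in data]
parity = xor_blocks(data)

# Simulate one device silently corrupting block 1.
stored = list(data)
stored[1] = b"BxBB"

bad = find_corrupt(stored, sums)
for i in bad:
    stored[i] = reconstruct(stored, parity, i)
```

In this sketch the checksum only says which block is wrong; the repair comes from the redundancy. Single parity can rebuild at most one lost block per stripe, so this toy version cannot recover if two blocks are corrupted at once.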
These techniques are essential. Essentially all persistent storage devices
include internal redundancy to achieve high storage densities with acceptable
error rates, but the limits of this internal redundancy are significant enough
that it is difficult to imagine designing a storage system for important data
without additional redundancy for error correction, and it is hard to think of a
significant file system developed in the last decade that does not include higher-
level checksums.
Though these techniques are essential and widespread, designing and using
them involves significant pitfalls. In our discussions, we will point out
issues that, if not handled carefully, can drastically reduce reliability.
The rest of this section examines error detection and correction for
persistent storage, starting with the individual storage devices, then
examining how RAID replication helps tolerate failures by individual storage
devices, and finally looking at the end-to-end error detection in many recent
file systems.