Atomic update of data and parity
A challenge in implementing RAID is atomically updating both the data and the parity
(or both copies of a data block in a RAID with mirroring).
Consider what would happen if the RAID system in Figure 14.6 crashes in the middle
of updating block 21, after updating the data block on disk 2 but before updating the parity
block on disk 1. Now, if disk 2 fails, the system will reconstruct the wrong (old) data for
block 21.
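The failure above can be reproduced in a few lines. The following is a minimal
sketch, assuming simple XOR parity over one stripe; the block contents and the
helper name xor_blocks are hypothetical and are not taken from Figure 14.6.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical stripe: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Torn write: the new data reaches its disk, then the system crashes
# before the matching parity update is written.
data[1] = b"XXXX"
# (the parity update that should follow never happens)

# The disk holding data[1] now fails, so rebuild its block from the
# surviving data blocks and the stale parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt)  # the old contents, not the b"XXXX" that was written
```

Because the parity still encodes the old block, reconstruction silently
returns the old contents even though the write had reached the data disk.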
The situation may be even worse if a write to a mirrored RAID is interrupted. Because
reads can be serviced by either disk, reads of the inconsistent block may sometimes
return the new value and sometimes return the old one.
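The mirrored case can be illustrated the same way. This is a toy model, assuming
a two-way mirror that alternates reads between replicas; the Mirror class and
its round-robin policy are assumptions made for illustration, since real
implementations choose a replica in various ways.

```python
import itertools

class Mirror:
    """Toy two-way mirror that alternates reads between its replicas."""

    def __init__(self, value):
        self.replicas = [value, value]
        self._next = itertools.cycle([0, 1])

    def read(self):
        # Either replica may service a read; here they simply take turns.
        return self.replicas[next(self._next)]

mirror = Mirror(b"old")

# Interrupted write: replica 0 is updated, then the system crashes
# before replica 1 receives the same update.
mirror.replicas[0] = b"new"

reads = [mirror.read() for _ in range(4)]
print(reads)
```

Successive reads of the same block alternate between the new and the old
value, so an application can see a value appear and then disappear.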
Solutions. Three solutions and one non-solution are commonly used to solve (or
not) the atomic update problem.
Nonvolatile write buffer. Hardware RAID systems often include a battery-
backed write buffer. An update is removed from the write buffer only once it is
safely on disk. The RAID's startup procedures ensure that any data in the write
buffer is written to disk after a crash or power outage.
Transactional update. RAID systems can use transactions to atomically update
both the data block and the parity block. For example, Oracle's RAID-Z integrates
RAID striping with the ZFS file system to avoid overwriting data in place and to
atomically update data and parity.
Recovery scan. After a crash, the system can scan all of the blocks in the
system and update any inconsistent parity blocks. Note that until that scan is
complete, some parity blocks may be inconsistent, and incorrect data may be
reconstructed if a disk fails. The Linux md (multiple device) software RAID driver
uses this approach.
Cross your fingers. Some software and hardware RAID implementations do not
ensure that the data and parity blocks are in sync after a crash. Caveat emptor.
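The recovery scan described above can be sketched as follows. This is an
illustrative model only, assuming XOR parity and an in-memory list of stripes;
the names resync and xor_blocks and the dictionary layout are hypothetical and
do not reflect the actual code of the Linux md driver.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def resync(stripes):
    """Recompute parity for every stripe and return the indices repaired.

    The scan treats the data blocks as authoritative: it cannot tell
    which block of an inconsistent stripe is stale, it only makes the
    parity agree with whatever data is currently on disk.
    """
    repaired = []
    for index, stripe in enumerate(stripes):
        expected = xor_blocks(stripe["data"])
        if stripe["parity"] != expected:
            stripe["parity"] = expected
            repaired.append(index)
    return repaired

# Two hypothetical stripes; the second has stale parity after a crash.
stripes = [
    {"data": [b"AAAA", b"BBBB"], "parity": xor_blocks([b"AAAA", b"BBBB"])},
    {"data": [b"CCCC", b"XXXX"], "parity": xor_blocks([b"CCCC", b"DDDD"])},
]

repaired = resync(stripes)
print(repaired)
```

Until the loop has visited a stripe, a disk failure would still reconstruct
that stripe from stale parity, which is the window of vulnerability noted in
the text.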