Reliable Storage - Operating Systems: Principles and Practice

In this arrangement, each data block is protected by two parity blocks:
one interdisk parity block on a different disk and one intradisk parity
block on the same disk.

This approach may reduce a disk's effective nonrecoverable read error
rate because if one block in an extent is lost, it can be recovered from the
remaining sectors and parity on the disk. Of course, if multiple blocks in
the same extent are lost, the system must rely on redundancy from other
disks.

(a) Assuming that a disk's nonrecoverable read errors are independent
and occur at a rate of one lost 512-byte sector per 10^15 bits read,
what is the effective nonrecoverable read error rate if the operating
system stores one parity block per seven data blocks on the disk?
Hint: You may find the bc or dc arbitrary-precision calculators useful.
These programs are standard in many Unix, Linux, and OS X
distributions. See the man pages for instructions.

(b) Why is the above likely to significantly overstate the impact of
intradisk redundancy?
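As a rough illustration of the kind of arithmetic part (a) calls for, the following Python sketch uses exact rational arithmetic in place of bc or dc. It is one possible reading of the problem, not a definitive solution: it assumes that a block is one 512-byte sector, that an extent holds seven data sectors plus one parity sector, and that an extent cannot be repaired locally only when two or more of its eight sectors are lost. The names P_SECTOR, EXTENT_SECTORS, and prob_unrecoverable_extent are illustrative and do not come from the text.

```python
from fractions import Fraction

SECTOR_BITS = 512 * 8                      # bits in one 512-byte sector
P_SECTOR = Fraction(SECTOR_BITS, 10**15)   # chance a given sector read is lost
EXTENT_SECTORS = 8                         # assumption: 7 data + 1 parity


def prob_unrecoverable_extent(p, n):
    """Probability that two or more of n sectors are lost.

    Assumes independent losses; one lost sector is repaired from the
    intradisk parity, so only two or more losses defeat it.
    """
    p_none = (1 - p) ** n                  # no sector lost
    p_one = n * p * (1 - p) ** (n - 1)     # exactly one sector lost
    return 1 - p_none - p_one


p_extent = prob_unrecoverable_extent(P_SECTOR, EXTENT_SECTORS)

# One way to express the result: bits read per locally unrecoverable extent.
bits_per_loss = (EXTENT_SECTORS * SECTOR_BITS) / p_extent

print(f"per-extent loss probability: {float(p_extent):.3e}")
print(f"bits read per unrecoverable extent: {float(bits_per_loss):.3e}")
```

Exact fractions matter here because the result is on the order of the square of the per-sector probability, far below the rounding error of double-precision arithmetic near 1, which is presumably why the hint suggests an arbitrary-precision tool.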

9. Many RAID implementations allow on-line repair, in which the system
continues to operate after a disk failure while a new, empty disk is inserted
to replace the failed disk and while data is regenerated and copied to the
new disk.

Sketch a design for a 2-disk, mirrored RAID that allows the system to
remain on-line during reconstruction, while still ensuring that when the
data copying is done, the new disk is properly reconstructed (i.e., it is an
exact copy of the other disk).

In particular, specify (1) what is done by a recovery thread, (2) what is
done on a read during recovery, and (3) what is done on a write during
recovery. Also explain why your system will operate correctly even if a
crash occurs in the middle of reconstruction.
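To make the three roles concrete, here is a toy in-memory model of one possible design; it is a sketch under stated assumptions, not the intended answer. It assumes a single lock serializes copy steps against writes, that reads are always served by the surviving disk, that writes go to both disks, and that a watermark recording copy progress is persisted only after the corresponding block has been copied. The class MirroredPair and all of its method names are hypothetical.

```python
import threading


class MirroredPair:
    """Toy model of a 2-disk mirror being rebuilt while on-line."""

    def __init__(self, good_blocks):
        self.good = list(good_blocks)          # surviving disk
        self.new = [None] * len(self.good)     # empty replacement disk
        self.watermark = 0                     # blocks [0, watermark) are copied
        self.lock = threading.Lock()           # orders copy steps and writes

    def recovery_step(self):
        """Recovery thread: copy one block; return False when finished."""
        with self.lock:
            if self.watermark >= len(self.good):
                return False
            i = self.watermark
            self.new[i] = self.good[i]
            # A real system would persist the watermark only after the copy.
            self.watermark = i + 1
            return True

    def read(self, i):
        """Read during recovery: the surviving disk is always valid."""
        with self.lock:
            return self.good[i]

    def write(self, i, data):
        """Write during recovery: update both disks.

        Above the watermark the copy on the new disk is redundant but
        harmless, since the recovery thread later rewrites the same data.
        """
        with self.lock:
            self.good[i] = data
            self.new[i] = data

    def restart_after_crash(self, persisted_watermark):
        """Resume from the last persisted watermark.

        Re-copying already copied blocks is idempotent, so a stale
        (smaller) watermark costs time but not correctness.
        """
        with self.lock:
            self.watermark = min(self.watermark, persisted_watermark)
```

In this sketch, a recovery thread would simply call recovery_step() in a loop until it returns False. The model does not address a crash that interrupts a single write between the two disks, which is a consistency problem for any mirror and would need separate handling.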

10. Suppose you are willing to sacrifice no more than 1% of a disk's bandwidth
to scrubbing. What is the maximum frequency at which you could scrub a
1 TB disk with 100 MB/s bandwidth?
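The calculation behind this exercise can be sketched in a few lines of Python. The sketch assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and that a scrub pass reads every byte of the disk once at the scrubbing share of the sequential bandwidth; the function name min_scrub_interval_s is illustrative.

```python
TB = 10**12   # assumption: decimal terabyte
MB = 10**6    # assumption: decimal megabyte


def min_scrub_interval_s(capacity_bytes, bandwidth_bytes_per_s, scrub_percent):
    """Seconds needed for one full scrub pass at the given bandwidth share."""
    scrub_rate = bandwidth_bytes_per_s * scrub_percent / 100   # bytes/s
    return capacity_bytes / scrub_rate


interval_s = min_scrub_interval_s(1 * TB, 100 * MB, 1)
interval_days = interval_s / 86_400   # seconds per day

print(f"one scrub pass every {interval_s:.0f} s (about {interval_days:.1f} days)")
```

The maximum scrub frequency is the reciprocal of this interval; with binary units the figure would shift slightly.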

11. Suppose a 3 TB disk in a mirrored RAID system crashes. Assuming the
disks used in the system can sustain 100 MB/s sequential bandwidth, what
is the minimum mean time to repair that can be achieved? Why might a
system be configured to perform recovery more slowly than this?
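A small Python sketch can show how the repair time scales with the share of bandwidth devoted to recovery. It assumes decimal units, that the whole 3 TB must be copied sequentially, and that no time is spent detecting the failure or obtaining a replacement disk; the function rebuild_time_hours and its rebuild_share parameter are hypothetical.

```python
def rebuild_time_hours(capacity_bytes, bandwidth_bytes_per_s, rebuild_share=1.0):
    """Hours to copy the whole disk using rebuild_share of the bandwidth."""
    if not 0 < rebuild_share <= 1:
        raise ValueError("rebuild_share must be in (0, 1]")
    seconds = capacity_bytes / (bandwidth_bytes_per_s * rebuild_share)
    return seconds / 3600


CAPACITY = 3 * 10**12     # 3 TB, decimal units assumed
BANDWIDTH = 100 * 10**6   # 100 MB/s

for share in (1.0, 0.5, 0.1):
    hours = rebuild_time_hours(CAPACITY, BANDWIDTH, share)
    print(f"rebuild share {share:.0%}: {hours:.1f} hours")
```

Running recovery at less than the full share lengthens the repair window but leaves bandwidth for foreground requests, which is one commonly cited reason a system might throttle reconstruction.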
