Database Reference
In-Depth Information
devices have improved tremendously over the past decade, still they have two
inherent problems. First, despite major advances, disk storage is still slow for data
movement compared to the other hardware components such as main memory
and the processor. Second, disk storage devices have more mechanical parts
compared to the other components. Therefore, disk storage devices are more prone to
failures.
RAID (redundant array of inexpensive disks) technology addresses these two
problems as follows:
Improve performance through data striping. Stripe or spread data across
several disks so that storage or retrieval of required data may be accomplished
by parallel reads across many disks. When data is stored on a single disk unit,
you may need sequential reads, each sequential read waiting for the previous
one to finish.
Improve reliability by storing redundant data to be used only for
reconstructing data on failed units.
We will now examine how RAID technology provides these improvements. In
today's database environment, the use of RAID technology has become essential.
Performance Improvement
Data striping enables an array of disks to be
abstracted and viewed as one large logical disk. While striping data across disks,
you take a specific unit of data and store it on the first disk in the array, then
take the next unit of data and store it on the next disk in the array, and so on.
These units of data are called striping units. A striping unit may be one full disk
block, or it may even be one single bit. The striping units are spread across the array
of disks in a round-robin fashion. Let us say that the data you want to stripe
across 6 disks in an array consists of 24 striping units. These 24 units of data are
striped across the disks beginning from the first disk, going to the last disk, and then
wrapping around. Figure 12-10 illustrates the storage of the 24 striping units on the
6 disks.
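The round-robin placement described above can be sketched in a few lines of Python. This is only an illustration: the function name stripe_layout is hypothetical, and units and disks are numbered from 0 here, whereas the text counts from the first disk.

```python
def stripe_layout(num_units, num_disks):
    """Return one list per disk holding the striping-unit numbers stored on it."""
    disks = [[] for _ in range(num_disks)]
    for unit in range(num_units):
        # Round-robin: unit i goes to disk i mod N, wrapping after the last disk.
        disks[unit % num_disks].append(unit)
    return disks


# The example from the text: 24 striping units spread across 6 disks.
layout = stripe_layout(24, 6)
for disk_number, units in enumerate(layout):
    print(f"disk {disk_number}: units {units}")
```

Each disk ends up with four units, and consecutive units always sit on different disks, which is what makes parallel reads possible.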
Now assume that you want to retrieve the first five striping units in a certain
query to the database. These units can be read in parallel from the first five disks
in about the time taken to read one unit of data. Striping units of data across the disk
array can therefore reduce data access times substantially.
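A rough way to see the benefit is to count how many of the requested units fall on the busiest disk. The sketch below assumes an idealized model that the text does not spell out: every disk reads one unit in the same fixed time, all disks work fully in parallel, and seek and controller overheads are ignored. The name read_time and the unit_time parameter are hypothetical.

```python
from collections import Counter


def read_time(units, num_disks, unit_time=1.0):
    """Estimate the time to read the given striping units under the idealized model."""
    # Count how many requested units live on each disk (round-robin placement).
    per_disk = Counter(unit % num_disks for unit in units)
    # Disks work in parallel, so the busiest disk determines the total time.
    return max(per_disk.values(), default=0) * unit_time


striped = read_time(range(5), num_disks=6)  # first five units on a 6-disk array
single = read_time(range(5), num_disks=1)   # the same five units on one disk
print(striped, single)
```

Under these assumptions the striped read takes one unit of time and the single-disk read takes five. Real arrays will not reach this ideal, but the ratio shows where the gain comes from.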
Reliability Improvement
The greater the number of disks in an array, the better the
access performance is likely to be. This is because you can retrieve from more disks
in parallel. However, having more disks also increases the risk of failure. In RAID
technology, redundant parity data written on check disks enable recovery from disk
failures. Therefore, in the RAID system, over and above the data disks, you need
additional check disks to store parity data. Let us see how the parity data stored in
the check disks are used for recovery from disk failures.
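As a preview of the idea, the sketch below assumes the common single-parity scheme in which the check disk holds the bitwise XOR of the corresponding data on all data disks. The text has not yet named a specific scheme at this point, so treat that as an assumption. The sample byte values and the function name parity are made up for illustration.

```python
from functools import reduce
from operator import xor


def parity(blocks):
    """Return the bytewise XOR of equal-length blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# Three hypothetical data disks, each holding a two-byte block.
data = [b"\x0f\xa0", b"\x33\x01", b"\xf0\x5c"]
check = parity(data)  # contents written to the check disk

# Suppose the second data disk fails: XOR the survivors with the check block.
survivors = [data[0], data[2], check]
rebuilt = parity(survivors)
print(rebuilt == data[1])
```

Because XOR is its own inverse, combining the surviving blocks with the parity block yields exactly the missing block. A single check disk of this kind can recover from the loss of one disk, not two.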
How Parity Works
Assume that a disk array consists of eight disks. Now, consider
the first data bit on each of these eight disks. Review the 1s and 0s written as
the first bit on each disk. Let us say that five of these first bits are 1s as follows: