To generate the parity information, a RAID controller reads the relevant data and performs an XOR
calculation. Take a small RAID 5 set of four disks as an example, and assume the stripe already
holds two data chunks and we are writing the third. The controller reads the two existing data
chunks, XORs them together with the new data to produce the parity, and then writes both the new
data block and the resultant parity. One logical write therefore generates two reads and two writes.
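The XOR arithmetic itself is simple enough to sketch in a few lines. Here is a minimal illustration
in Python (the chunk contents are invented purely for demonstration) that generates the parity for
our four-disk stripe and confirms the I/O count:

# Minimal sketch: parity generation for one RAID 5 stripe (3 data + 1 parity).
# The chunk values are illustrative, not from any real controller.
chunk_a = bytes([0x0F, 0xA0, 0x3C, 0x55])    # existing data chunk (read 1)
chunk_b = bytes([0xF0, 0x0A, 0xC3, 0xAA])    # existing data chunk (read 2)
new_chunk = bytes([0x11, 0x22, 0x33, 0x44])  # the chunk being written

reads = [chunk_a, chunk_b]

# Parity is the byte-wise XOR of every data chunk in the stripe.
parity = bytes(a ^ b ^ c for a, b, c in zip(chunk_a, chunk_b, new_chunk))

# Writes: the new data chunk plus the recalculated parity.
writes = [new_chunk, parity]

print(f"{len(reads)} reads, {len(writes)} writes")  # -> 2 reads, 2 writes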
RAID controllers vary greatly in design, but generally speaking, they use their internal cache to
assist in the generation of parity information. Because each small random write costs four physical
operations (two reads and two writes), RAID 5 typically enables N reads but only N ÷ 4 writes
across an N-disk set.
RAID 6 protects against double disk failure and therefore generates double the parity. An 8-disk
RAID set consists of two parity chunks and six data chunks. You need to write the new data chunk
and both parity chunks, so you know that you have three writes. You also need to read the other five
data chunks to recalculate the parity, so you are looking at eight operations to complete the RAID 6
write. Luckily, most RAID controllers can optimize this process into three reads and three writes,
by reading the old data chunk and the two old parity chunks rather than the five untouched data
chunks.
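The difference between the two strategies is easy to verify with a little arithmetic. The following
sketch (plain Python, using the disk counts from the example above) tallies the physical I/Os for
each approach; the read-modify-write variant is the standard optimization described above:

# Minimal sketch: physical I/O cost of one small RAID 6 write on an
# 8-disk set (6 data chunks + 2 parity chunks per stripe).
disks = 8
parity_chunks = 2
data_chunks = disks - parity_chunks  # 6

# Reconstruct-write: read every other data chunk, then write the new
# data chunk plus both parity chunks.
rw_reads = data_chunks - 1            # 5 reads
rw_writes = 1 + parity_chunks         # 3 writes
print("reconstruct-write:", rw_reads + rw_writes, "operations")   # 8

# Read-modify-write: read the old data chunk and both old parity
# chunks, then write the new data chunk and both new parity chunks.
rmw_reads = 1 + parity_chunks         # 3 reads
rmw_writes = 1 + parity_chunks        # 3 writes
print("read-modify-write:", rmw_reads + rmw_writes, "operations")  # 6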
Table 4-2 provides a guide to calculating common RAID overhead. Please remember that each
system is different and your mileage may vary.
TABLE 4-2: RAID Overhead

RAID TYPE    READ    WRITE
0            N       N
1+0          N       N ÷ 2
5            N       N ÷ 4
6            N       N ÷ 6
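Table 4-2 translates directly into a quick capacity estimate. The helper below is a hypothetical
sketch, not a vendor formula; the 180 per-disk IOPS default is the 15,000-RPM figure used later in
this section:

# Minimal sketch of Table 4-2: estimated read/write IOPS for an N-disk
# RAID set. The write divisors come straight from the table.
WRITE_DIVISOR = {"0": 1, "1+0": 2, "5": 4, "6": 6}

def raid_iops(raid_type: str, disks: int, per_disk_iops: int = 180):
    reads = disks * per_disk_iops               # READ column: N
    writes = reads // WRITE_DIVISOR[raid_type]  # WRITE column: N ÷ divisor
    return reads, writes

for raid in ("0", "1+0", "5", "6"):
    r, w = raid_iops(raid, disks=8)
    print(f"RAID {raid:>3}: {r} read IOPS, {w} write IOPS")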
Sequential Disk Access
Microsoft SQL Server and various hardware manufacturers partner to provide guidance for data
warehouse systems. This program is called SQL Server Fast Track Data Warehouse. A data
warehouse system is designed to hold a massive amount of data. The Fast Track program takes great
care to design storage hardware that is correctly sized for a specific server platform.
The data warehouse is architected such that data is sequentially loaded and sequentially accessed.
Typically, data is first loaded into a staging database. Then it is bulk loaded and ordered so that
queries generate a sequential table access pattern. This is important because sequential disk access
is far more efficient than random disk access. Our 15,000-RPM disk drive will perform 180 random
operations or 1,400 sequential reads per second. Sequential operations are so much more efficient
than random access that SQL Server is specifically designed to optimize for sequential disk access.
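Side by side, those two figures tell the story; a quick back-of-the-envelope calculation using the
numbers just quoted:

# Minimal sketch: sequential vs. random throughput on the example
# 15,000-RPM drive, using the figures quoted in the text.
random_iops = 180
sequential_iops = 1_400
print(f"sequential is ~{sequential_iops / random_iops:.1f}x faster")  # ~7.8x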
In a worst-case scenario, SQL Server will read 64KB data extents and write 8KB data pages. When
SQL Server detects sequential access, it dynamically increases the request size to a maximum of
512KB. This has the effect of making the storage more efficient.
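Seen as bandwidth, the benefit of larger requests is dramatic. The sketch below uses the IOPS
figures quoted above; the 125 MB/s media-bandwidth cap is an assumed sustained transfer rate for
the drive, not a figure from the text:

# Minimal sketch: how request size changes delivered bandwidth.
# The IOPS figures come from the text; the 125 MB/s cap is assumed.
MEDIA_LIMIT_MB_S = 125  # hypothetical sustained transfer rate of the drive

def throughput_mb_s(iops: int, request_kb: int) -> float:
    mb_s = iops * request_kb / 1024
    return min(mb_s, MEDIA_LIMIT_MB_S)  # a disk cannot exceed its media rate

print(f"random 8KB:       {throughput_mb_s(180, 8):6.1f} MB/s")    # ~1.4
print(f"sequential 8KB:   {throughput_mb_s(1400, 8):6.1f} MB/s")   # ~10.9
print(f"sequential 512KB: {throughput_mb_s(1400, 512):6.1f} MB/s") # capped at 125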
Designing an application to generate sequential disk access is a powerful cost-saving tool. Blending
sequential operations with larger I/O is even more powerful. If our 15,000-RPM disk performs
 