Database Reference
In-Depth Information
Most storage arrays in use today use RAID (Redundant Array of Independent Disks) as
a way to protect data from physical disk failures. Even though many newer storage
arrays are starting to use different techniques for data protection, it's still important to
understand RAID. Using the right RAID levels can have a big impact on performance
and also on cost of your SQL environments and virtual infrastructures. This section
more than any other will clearly demonstrate how designing for performance will take
care of capacity, at least where using RAID is involved, especially as you reach for
even higher performance from your SQL databases. Before we discuss RAID penalties,
we will cover some IO characteristics that have a direct performance impact when used
with RAID.
Randomness of IO Pattern
The randomness of IO is a very important consideration in storage design and has a
direct impact on IO latency and throughput when using spinning disks. Most
virtualization environments will generate a completely random IO pattern, even with
sequential IO from individual VMs, as we covered previously in “ The IO Blender
Effect .” This is because the underlying VMFS data stores are shared between mult iple
VMs in most cases. With SQL Server, you will have cases where VMs should still
share some common VMFS data stores, as we have covered, in order to get maximum
performance utilization from your arrays.
The reason that random IOs have such an impact is because the disk heads have to move
between different sectors and the disk has to spin around to the correct location for a
block to be read from or written to. For this reason, the average seek time and rotational
speed of the disks are very important. On average, the disk heads will need to wait for
50% of the disk to spin past it prior to performing a read or write operation. Each
operation is then multiplied by the RAID penalties of that operation.
The impact of randomness on reads can be worse than the randomness for writes. In
most storage systems, writes will be cached (backed by battery or some other persistent
form), ordered, and then written to physical disk in a way that reduces the overall
impact. For reads, however, the chances of getting a cache hit in your array when the
randomness increases are very low; therefore, most reads may have to come from
spinning disks. The alternative would be to assign very large read cache on the array,
but that is not efficient or cost effective in most cases, and still may not result in
significant cache hits. The end result is that many more disks may be needed to get the
best read latency and throughput for your database.
Fortunately, SQL is very good at caching, and this is why the buffer pool in a SQL
Database is so big. This is also the reason there is a direct tradeoff between assigning
SQL Server RAM and using it in the buffer pool and read IO from disk. This becomes
especially important when things fail, such as disks in your RAID groups, which causes
 
Search WWH ::




Custom Search