The RAID stripe chunk size
The optimal stripe chunk size is workload- and hardware-specific. In theory, it's good
to have a large chunk size for random I/O, because it means more reads can be satisfied
from a single drive.
To see why this is so, consider the size of a typical random I/O operation for your
workload. If the chunk size is at least that large, and the data doesn't span the border
between chunks, only a single drive needs to participate in the read. But if the chunk
size is smaller than the amount of data to be read, there's no way to avoid involving
more than one drive in the read.
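To make the reasoning above concrete, here is a minimal sketch that computes which drives take part in a single read. It assumes a plain striped (RAID 0 style) layout with no parity, where chunk number n lives on drive n modulo the number of drives. The function name drives_touched and the sizes in the examples are illustrative, not taken from any controller.

```python
KB = 1024


def drives_touched(offset, length, chunk_size, num_drives):
    """Return the sorted drive indexes a read touches.

    Assumes simple striping: chunk n is stored on drive n % num_drives.
    All sizes and offsets are in bytes.
    """
    if length <= 0:
        return []
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    drives = {chunk % num_drives for chunk in range(first_chunk, last_chunk + 1)}
    return sorted(drives)


# A 16 KB read that fits inside one 64 KB chunk involves one drive.
print(drives_touched(0, 16 * KB, 64 * KB, 4))        # [0]
# The same read with 4 KB chunks must involve four drives.
print(drives_touched(0, 16 * KB, 4 * KB, 4))         # [0, 1, 2, 3]
# A large chunk does not help if the read spans a chunk border.
print(drives_touched(56 * KB, 16 * KB, 64 * KB, 4))  # [0, 1]
```

The last call shows the caveat in the text: even with a chunk larger than the read, a read that crosses a chunk border still involves two drives.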
So much for theory. In practice, many RAID controllers don't work well with large
chunks. For example, the controller might use the chunk size as its cache unit,
which could be wasteful. The controller might also match the chunk size, cache
size, and read-unit size (the amount of data it reads in a single operation). If the read
unit is too large, its cache might be less effective, and it might end up reading a lot more
data than it really needs, even for tiny requests.
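The cost of a large read unit can be estimated with simple arithmetic. The sketch below assumes a controller that always reads whole, aligned units and caches data in units of the same size. Both function names and all of the sizes are hypothetical and chosen only for illustration.

```python
KB = 1024
MB = 1024 * KB


def read_amplification(request_bytes, read_unit_bytes):
    """Bytes physically read divided by bytes requested.

    Assumes the request is aligned and the controller reads whole units.
    """
    units = -(-request_bytes // read_unit_bytes)  # ceiling division
    return units * read_unit_bytes / request_bytes


def cache_entries(cache_bytes, unit_bytes):
    """Number of distinct units that fit in the controller's cache."""
    return cache_bytes // unit_bytes


# A 4 KB request served with a 256 KB read unit reads 64 times too much data.
print(read_amplification(4 * KB, 256 * KB))   # 64.0
# A 16 KB request with a 16 KB read unit reads exactly what it needs.
print(read_amplification(16 * KB, 16 * KB))   # 1.0
# A 256 MB cache holds far fewer distinct entries when the unit is large.
print(cache_entries(256 * MB, 256 * KB))      # 1024
print(cache_entries(256 * MB, 16 * KB))       # 16384
```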
Also, in practice it's hard to know whether any given piece of data will span multiple
drives. Even if the chunk size is 16 KB, which matches InnoDB's page size, you can't
be certain all of the reads will be aligned on 16 KB boundaries. The filesystem might
fragment the file, and it will typically align the fragments on the filesystem block size,
which is often 4 KB. Some filesystems might be smarter, but you shouldn't count on it.
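A rough way to see how much this matters is to count the possible starting offsets at which a page-sized read crosses a chunk border. The sketch below assumes start offsets are spread evenly over the multiples of the alignment and that the alignment divides the chunk size evenly. Real filesystems will not distribute data this uniformly, so treat the result as an illustration rather than a prediction. The function name crossing_fraction is made up for this example.

```python
KB = 1024


def crossing_fraction(page_size, chunk_size, alignment):
    """Fraction of aligned start offsets at which a read spans two chunks.

    Considers every start offset within one chunk that is a multiple of
    `alignment`, and assumes each is equally likely.
    """
    offsets = range(0, chunk_size, alignment)
    crossing = sum(1 for off in offsets if off + page_size > chunk_size)
    return crossing / len(offsets)


# 16 KB pages on 16 KB chunks, but aligned only on 4 KB boundaries.
print(crossing_fraction(16 * KB, 16 * KB, 4 * KB))   # 0.75
# The same pages aligned on 16 KB boundaries never cross a border.
print(crossing_fraction(16 * KB, 16 * KB, 16 * KB))  # 0.0
# A larger chunk lowers the fraction, but does not eliminate it.
print(crossing_fraction(16 * KB, 64 * KB, 4 * KB))   # 0.1875
```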
You can configure the system so that blocks are aligned all the way from the application
down to the underlying storage: InnoDB's blocks, the filesystem's blocks, LVM, the
partition offset, the RAID stripe, and disk sectors. Our benchmarks showed that when
everything is aligned, performance can improve on the order of 15% for random reads
and 23% for random writes. The exact techniques for aligning everything are too specific
to cover here, but there's a lot of good information on them elsewhere, including our
blog, http://www.mysqlperformanceblog.com.
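As a simplified illustration of what "aligned all the way down" means, the sketch below checks that each layer's size or offset is a whole multiple of the layer beneath it. It is not a procedure for aligning a real system: it ignores LVM, and the function name and default sizes (16 KB InnoDB pages, 4 KB filesystem blocks, 512-byte sectors) are assumptions made for the example.

```python
KB = 1024


def alignment_problems(partition_offset, chunk_size,
                       page_size=16 * KB, fs_block=4 * KB, sector=512):
    """Return a list of misalignments found between layers (sizes in bytes)."""
    problems = []
    if partition_offset % chunk_size:
        problems.append("partition offset is not a multiple of the chunk size")
    if chunk_size % page_size:
        problems.append("chunk size is not a multiple of the InnoDB page size")
    if page_size % fs_block:
        problems.append("InnoDB page size is not a multiple of the filesystem block size")
    if fs_block % sector:
        problems.append("filesystem block size is not a multiple of the sector size")
    return problems


# A partition starting at sector 63, a common legacy default, is misaligned
# with respect to 64 KB chunks.
print(alignment_problems(63 * 512, 64 * KB))
# A partition starting at sector 2048 (1 MB) lines up with 64 KB chunks.
print(alignment_problems(2048 * 512, 64 * KB))  # []
```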
The RAID cache
The RAID cache is a (relatively) small amount of memory that is physically installed
on the RAID controller. It can be used to buffer data as it travels between the disks and
the host system. Here are some of the reasons a RAID card might use the cache:
Caching reads
After the controller reads some data from the disks and sends it to the host system,
it can store the data; this will enable it to satisfy future requests for the same data
without having to go to disk again.
This is usually a very poor use of the RAID cache. Why? Because the operating
system and the database server have their own, much larger, caches. If there's a
cache hit in one of these caches, the data in the RAID cache won't be used.
Conversely, if there's a miss in one of the higher-level caches, the chance that there'll
 