of memory!) can avoid many random I/O operations, or transform them into sequential
ones. However, you should be aware that such a system can reach a slightly delicate
balance over a period of time—one that's easy to perturb with the introduction of a
new query, a schema change, or an infrequent operation.
For example, one SAN user we know was quite happy with its day-to-day performance
until he wanted to purge a lot of rows from an old table that had grown very large. This
resulted in a long-running DELETE statement that was deleting only a couple of hundred
rows per second, because each row required random I/O that the SAN couldn't perform
quickly. There was no way to accelerate the operation; it was simply going to take a
very long time to complete. Another surprise for the same user came when an ALTER
on a large table slowed down to a similar pace.
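To see why such a purge "was simply going to take a very long time," a rough back-of-the-envelope estimate helps. The sketch below is only illustrative: the table size is a hypothetical figure, the rate of 200 rows per second stands in for "a couple of hundred," and the function name is our own.

```python
def purge_duration_hours(total_rows: int, rows_per_second: float) -> float:
    """Estimate wall-clock hours needed to delete total_rows at a steady rate."""
    if rows_per_second <= 0:
        raise ValueError("rows_per_second must be positive")
    return total_rows / rows_per_second / 3600


# Hypothetical table size; 200 rows/s approximates "a couple of hundred".
hours = purge_duration_hours(total_rows=500_000_000, rows_per_second=200)
print(f"{hours:.0f} hours, about {hours / 24:.0f} days")
```

Under these assumed numbers the purge would run for roughly four weeks, which is why a workload like this can come as an unpleasant surprise.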
Those are typical examples of what doesn't work well on a SAN: single-threaded tasks
that perform lots of random I/O. Replication is another single-threaded task in current
versions of MySQL; as a result, replicas whose data is stored on a SAN might be more
likely to lag behind the master. Batch jobs might also run more slowly. You might be
able to perform one-off latency-sensitive operations at off-peak hours or on the
weekend, but always-on parts of the server such as replication, binary logs, and InnoDB's
transaction logs need good performance on small and/or random I/O operations at all
times.
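The reason single-threaded random I/O suffers can be sketched with simple arithmetic: a task that issues one request at a time cannot complete more operations per second than the reciprocal of the per-request latency. The latency figures below are assumptions chosen for illustration, not measurements of any particular SAN or local disk, and the function name is hypothetical.

```python
def max_ops_per_second(latency_ms: float, concurrency: int = 1) -> float:
    """Upper bound on I/O operations per second when at most `concurrency`
    requests are in flight and each one takes latency_ms to complete."""
    if latency_ms <= 0 or concurrency < 1:
        raise ValueError("latency_ms must be positive and concurrency >= 1")
    return concurrency * 1000.0 / latency_ms


# Assumed round-trip latencies, for illustration only.
for label, latency_ms in [("5 ms per request", 5.0), ("0.5 ms per request", 0.5)]:
    single = max_ops_per_second(latency_ms)
    parallel = max_ops_per_second(latency_ms, concurrency=16)
    print(f"{label}: 1 thread <= {single:.0f} ops/s, 16 threads <= {parallel:.0f} ops/s")
```

With an assumed 5 ms round trip, a single thread tops out at about 200 operations per second no matter how much aggregate throughput the storage offers, while concurrent workloads can keep many requests in flight and hide the latency.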
Should You Use a SAN?
Ah, that's the perennial question—in some cases, the million-dollar question. There
are many factors to consider, and we'll list a few of them:
Backups
Centralized storage can make backups easier to manage. When everything is stored
in one place, you can just back up the SAN, and you know that you've accounted
for all of your data. This simplifies questions such as “Are you sure we're backing
up all of our data?” In addition, some devices have features such as continuous
data protection (CDP), and powerful snapshot capabilities that make backups
much easier and more flexible.
Simplified capacity planning
Not sure how much capacity you need? A SAN gives you the ability to buy storage
in bulk, share it, and resize and redistribute it on demand.
Storage consolidation versus server consolidation
Some CIOs take stock of what's running in their data centers and conclude that
there is a lot of wasted I/O capacity, in terms of storage space as well as I/O
operations. No arguments there, but if you centralize your storage to make sure it's
better utilized, how will that impact the systems that use the storage? The difference
in performance for typical database operations can literally be orders of magnitude,
and as a result you might find that you need to run 10 times as many servers (or
more) to handle your workload. And although the data center's I/O capacity might
 