Some people prefer to consolidate with virtualization, which has its benefits. But
virtualization often carries a hefty performance cost of its own. The cost depends on
the technology, but it is usually noticeable, and the overhead is especially pronounced
when I/O is very fast. As an alternative, you can run multiple MySQL instances on one
server, each listening on a different network port or bound to a different IP address.
We've been able to achieve a consolidation factor of up to 10x or 15x on powerful
hardware. You'll have to weigh the added administrative complexity against the benefit
of better performance to determine what's best for you.
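As a rough illustration of what "multiple instances" means in practice, the sketch below
plans the settings that must differ between instances sharing one host: the port, the
socket file, the data directory, and the PID file. The option names (port, socket,
datadir, pid-file) are standard MySQL options; the base directory, the group names, and
the helper functions plan_instances and render_option_group are assumptions made up for
this example, not part of any MySQL tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceConfig:
    """Settings that must be unique per MySQL instance on a shared host."""
    name: str
    port: int
    socket: str
    datadir: str
    pid_file: str


def plan_instances(count, base_port=3306, base_dir="/var/lib/mysql-instances"):
    """Return one InstanceConfig per instance, each with its own port and paths.

    base_dir is a hypothetical location chosen for this example.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    configs = []
    for i in range(count):
        name = f"mysqld{i + 1}"
        root = f"{base_dir}/{name}"
        configs.append(InstanceConfig(
            name=name,
            port=base_port + i,          # consecutive ports starting at base_port
            socket=f"{root}/mysql.sock",
            datadir=f"{root}/data",
            pid_file=f"{root}/mysqld.pid",
        ))
    return configs


def render_option_group(cfg):
    """Render one instance as an option-file group (a [name] section)."""
    lines = [
        f"[{cfg.name}]",
        f"port = {cfg.port}",
        f"socket = {cfg.socket}",
        f"datadir = {cfg.datadir}",
        f"pid-file = {cfg.pid_file}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    for cfg in plan_instances(3):
        print(render_option_group(cfg))
        print()
```

Generating the groups from one function keeps ports and paths from colliding, which is
the main administrative risk when many instances share a machine. Binding to different
IP addresses instead of ports would follow the same pattern with a per-instance
bind-address value.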
At this point, the network is likely to become the bottleneck, a problem most MySQL
users don't run into very often. You can address it by using multiple NICs and bonding
them. Depending on the version, the Linux kernel isn't ideal for this, because older
kernels can use only one CPU for network interrupts per bonded device. As a result, you
shouldn't bond too many cables into too few virtual devices, or you'll run into a
different network bottleneck inside the kernel. Newer kernels should help with this, so
check your distribution to find out what your options are.
Another way to get more out of this strategy is to bind each MySQL instance to specific
cores. This helps for two reasons: first, MySQL gets more performance per core at lower
core counts because of its internal scalability limitations; and second, when an
instance runs threads on many cores, there's more overhead from synchronizing shared
data between the cores. Limiting MySQL to only some cores reduces this crosstalk between
CPU cores and helps avoid the scalability limitations of the hardware itself. Notice the
recurring theme? Pin the process to cores that are on the same physical socket for the
best results.
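The core-binding idea can be sketched as follows, assuming you already know which core
IDs belong to which physical socket. The topology dictionary and the helpers
assign_cores and pin_process are hypothetical names invented for this example; the only
system call used is Python's os.sched_setaffinity, which is available on Linux.

```python
import os


def assign_cores(socket_cores, instances_per_socket):
    """Split each socket's cores among its instances; no instance spans sockets.

    socket_cores maps a socket id to the list of core ids on that socket.
    Returns a list of (socket_id, cores) tuples, one per instance.
    """
    if instances_per_socket < 1:
        raise ValueError("instances_per_socket must be at least 1")
    plan = []
    for socket_id in sorted(socket_cores):
        cores = sorted(socket_cores[socket_id])
        if len(cores) < instances_per_socket:
            raise ValueError(f"socket {socket_id} has too few cores")
        share = len(cores) // instances_per_socket
        for i in range(instances_per_socket):
            # If the division is uneven, the leftover cores stay unassigned.
            plan.append((socket_id, cores[i * share:(i + 1) * share]))
    return plan


def pin_process(pid, cores):
    """Restrict the thread identified by pid to the given cores (Linux only)."""
    if not hasattr(os, "sched_setaffinity"):
        raise OSError("os.sched_setaffinity is not available on this platform")
    os.sched_setaffinity(pid, set(cores))


if __name__ == "__main__":
    # Hypothetical topology: two sockets with four cores each.
    topology = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}
    for socket_id, cores in assign_cores(topology, instances_per_socket=2):
        print(f"socket {socket_id}: cores {cores}")
```

On Linux, an affinity mask applies to the thread it is set on and is inherited by
threads created afterward, so pinning is most reliable when it is applied before the
server starts its worker threads, for example by launching the instance under a tool
such as taskset rather than by pinning an already running server.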
Scaling by Clustering
The dream scenario for scaling is a single logical database that can hold as much data,
serve as many queries, and grow as large as you need it to. Many people's first thought
is to create a “cluster” or “grid” that handles this seamlessly, so the application doesn't
need to do any dirty work or know that the data really lives in many servers instead of
just one. With the rise of the cloud, autoscaling—dynamically adding servers to or
removing them from the cluster in response to changes in workload or data size—is
also becoming interesting.
In the second edition of this book, we expressed our regret that the available technology
wasn't really up to the task. Since then, a lot of the buzz has centered around so-called
NoSQL technologies. Many NoSQL proponents made strange and unsubstantiated
claims such as “the relational model can't scale,” or “SQL can't scale.” New concepts
emerged, and new catchphrases were on everyone's lips. Who hasn't heard of eventual
consistency, BASE, vector clocks, or the CAP theorem these days?
But as time has passed, sanity has been at least partially restored. Experience is
beginning to reveal that many of the NoSQL databases are primitive in their own ways and
 