Scaling MySQL - High Performance MySQL

Some people like to achieve consolidation with virtualization, which has its benefits.

But virtualization also has a pretty hefty performance cost itself in many cases. It

depends on the technology, but it's usually noticeable, and the overhead is especially

exaggerated when I/O is very fast. As an alternative, you can run multiple MySQL

instances, each listening on different network ports or binding to different IP addresses.

We've been able to achieve a consolidation factor of up to 10x or 15x on powerful

hardware. You'll have to balance the cost of the administrative complexity with the

benefit of better performance to determine what's best for you.
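
As a rough illustration of running several instances side by side, the sketch below generates an option file in which every instance gets its own port, socket, PID file, and data directory. The section names follow the [mysqldN] convention that mysqld_multi reads; the base port, the directory layout under /var/lib/mysql-instances, and the helper names are assumptions made up for this example, not values from the text. To separate instances by IP address instead of by port, a per-instance bind-address setting would take the place of the distinct port.

```python
def instance_options(n_instances, base_port=3307,
                     base_dir="/var/lib/mysql-instances"):
    """Return one dict of mysqld options per instance.

    base_port and base_dir are hypothetical defaults for illustration.
    """
    plans = []
    for i in range(1, n_instances + 1):
        inst_dir = f"{base_dir}/instance{i}"
        plans.append({
            "group": f"mysqld{i}",               # [mysqldN] section name
            "port": base_port + i - 1,           # unique TCP port per instance
            "socket": f"{inst_dir}/mysql.sock",  # unique Unix socket
            "pid-file": f"{inst_dir}/mysql.pid",
            "datadir": f"{inst_dir}/data",       # instances must not share data
        })
    return plans


def render_option_file(plans):
    """Render the per-instance plans as option-file text."""
    lines = []
    for plan in plans:
        lines.append(f"[{plan['group']}]")
        for key in ("port", "socket", "pid-file", "datadir"):
            lines.append(f"{key} = {plan[key]}")
        lines.append("")  # blank line between sections
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_option_file(instance_options(3)))
```

The point of the sketch is only that nothing is shared between instances; memory settings such as buffer pool sizes would also need to be divided among them, which is left out here.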

At this point, the network is likely to become the bottleneck—a problem most MySQL

users don't run into very often. You can address the problem by using multiple NICs

and bonding them. The Linux kernel isn't ideal for this, depending on the version,

because older kernels can use only one CPU for network interrupts per bonded device.

As a result, you shouldn't bond too many cables into too few virtual devices, or you'll

run into a different network bottleneck inside the kernel. Newer kernels should help

with this, so check your distribution to find out what your options are.

Another way you can get more out of this strategy is to bind each MySQL instance to

specific cores. This helps for two reasons: first, because MySQL gets more performance

per core at lower core counts due to its internal scalability limitations, and second, because when

an instance is running threads on many cores, there's more overhead due to

synchronizing shared data between the cores. Pinning also helps avoid the scalability limitations of the

hardware itself. Limiting MySQL to only some cores can reduce the crosstalk between

CPU cores. Notice the recurring theme? Pin the process to cores that are on the same

physical socket for the best results.
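
The pinning idea can be sketched as a small planning step followed by an affinity call. The example below assumes a Linux host, where Python exposes os.sched_setaffinity; the socket-to-core map, the function names, and the policy of splitting each socket's cores evenly are illustrative assumptions, and the real topology would have to come from the machine itself (for example, from lscpu output).

```python
import os


def plan_core_assignment(cores_by_socket, n_instances):
    """Give each instance a set of cores taken from a single socket.

    cores_by_socket maps a socket id to the core ids on that socket.
    Instances are spread round-robin across sockets, then each socket's
    cores are split evenly among the instances placed on it. Cores left
    over after an uneven split stay unassigned.
    """
    sockets = sorted(cores_by_socket)
    placed = {s: [] for s in sockets}
    for inst in range(n_instances):
        placed[sockets[inst % len(sockets)]].append(inst)

    assignment = {}
    for s in sockets:
        cores = sorted(cores_by_socket[s])
        insts = placed[s]
        if not insts:
            continue
        if len(insts) > len(cores):
            raise ValueError(
                f"socket {s} has too few cores for {len(insts)} instances")
        share = len(cores) // len(insts)
        for k, inst in enumerate(insts):
            assignment[inst] = set(cores[k * share:(k + 1) * share])
    return assignment


def pin(pid, cores):
    """Restrict an already-running process to the given cores (Linux only)."""
    os.sched_setaffinity(pid, cores)
```

With a hypothetical two-socket machine described as {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]} and four instances, the plan keeps every instance on one socket; pin() would then be called with each mysqld process ID. Tools such as taskset or numactl achieve the same effect from the shell.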

Scaling by Clustering

The dream scenario for scaling is a single logical database that can hold as much data,

serve as many queries, and grow as large as you need it to. Many people's first thought

is to create a “cluster” or “grid” that handles this seamlessly, so the application doesn't

need to do any dirty work or know that the data really lives in many servers instead of

just one. With the rise of the cloud, autoscaling—dynamically adding servers to or

removing them from the cluster in response to changes in workload or data size—is

also becoming interesting.

In the second edition of this book, we expressed our regret that the available technology

wasn't really up to the task. Since then, a lot of the buzz has centered around so-called

NoSQL technologies. Many NoSQL proponents made strange and unsubstantiated

claims such as “the relational model can't scale,” or “SQL can't scale.” New concepts

emerged, and new catchphrases were on everyone's lips. Who hasn't heard of eventual

consistency, BASE, vector clocks, or the CAP theorem these days?

But as time has passed, sanity has been at least partially restored. Experience is

beginning to reveal that many of the NoSQL databases are primitive in their own ways and
