half the servers fail, the pool should be able to handle the same number of connections
as a whole.
Load Balancing with a Master and Multiple Replicas
The most common replication topology is a single master with multiple replicas. It can
be difficult to move away from this architecture. Many applications assume there's a
single destination for all writes, or that all data will always be available on a single
server. Though this is not the most scalable architecture, there are ways you can use it
to good effect with load balancing. This section examines some of those techniques:
Functional partitioning
You can stretch capacity quite a bit by configuring replicas or groups of replicas
for particular purposes, as discussed previously. Common functions you might
consider separating are reporting and analytics, data warehousing, and full-text
searching. You can find more ideas in Chapter 10.
Filtering and data partitioning
You can partition data among otherwise similar replicas with replication filters
(see Chapter 10). This strategy can work well as long as your data is already
separated into different databases or tables on the master. Unfortunately, there's no
built-in way to filter replication at the level of individual rows. You'd have to do
something creative (read: hackish) to accomplish this, perhaps with triggers and a
bunch of different tables.
Even if you don't partition the data among the replicas, you can improve cache
efficiency by partitioning reads instead of distributing them randomly. For
instance, you might direct all reads for users whose names begin with the letters
A-M to a given replica, and all reads for users whose names begin with N-Z to
another replica. This helps use each machine's cache more fully, because repeated
reads are more likely to find the relevant data in the cache. In the best case, where
there are no writes, this strategy effectively gives you a total cache size equal to
the two machines' caches combined. In comparison, if you distribute the reads
randomly among the replicas, every machine's cache essentially duplicates the
data, and your total effective cache size is only as big as a single replica's cache, no
matter how many replicas you have.
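This kind of read routing can be sketched in a few lines. The replica hostnames below are hypothetical, and a real application would route through its connection pool rather than bare hostname strings:

```python
# Hypothetical replica hostnames; substitute your own servers.
REPLICA_A_M = "replica1.example.com"
REPLICA_N_Z = "replica2.example.com"

def replica_for_user(username: str) -> str:
    """Route reads for a user to the replica that caches that user's data,
    partitioned by the first letter of the username."""
    first = username[:1].upper()
    # Users A-M go to one replica; N-Z (and anything else) to the other.
    if "A" <= first <= "M":
        return REPLICA_A_M
    return REPLICA_N_Z
```

Because every read for a given user lands on the same replica, each replica's cache holds only its half of the working set instead of a full copy.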
Moving parts of writes to a replica
The master doesn't always have to do all the work involved in writes. You can save
a significant amount of redundant work for the master and the replicas by
decomposing write queries and running parts of them on replicas. See Chapter 10 for
more on this topic.
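As one hedged illustration of decomposing a write, an expensive INSERT ... SELECT can be split: run the SELECT on a replica, fetch the result rows, and replay only the literal values on the master. The helper below (a hypothetical name; connection handling is omitted, and real code should prefer parameterized queries over string building) constructs the master-side statement:

```python
def master_insert(table: str, columns: list, rows: list) -> str:
    """Build a literal INSERT to run on the master from rows that were
    already computed by running the expensive SELECT on a replica."""
    cols = ", ".join(columns)
    # repr() is a rough stand-in for proper SQL quoting/escaping.
    values = ", ".join(
        "(" + ", ".join(repr(v) for v in row) + ")" for row in rows
    )
    return f"INSERT INTO {table} ({cols}) VALUES {values}"
```

The master then executes a cheap insert of precomputed values instead of repeating the heavy SELECT, and the replicas replay only that small statement as well.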
Guaranteeing a replica is caught up
If you want to run a certain process on the replica, and it needs to know that its
data is current as of a certain point in time—even if it has to wait a while for that
to happen—you can use the MASTER_POS_WAIT() function to block until the replica
has caught up to the specified point in the master's binary log.
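A minimal sketch of the pattern: read the master's binary log coordinates with SHOW MASTER STATUS, then call MASTER_POS_WAIT() on the replica with those coordinates. The helpers below only build the SQL text (the function names and timeout default are illustrative):

```python
def master_status_sql() -> str:
    """Run on the master to learn its current binary log coordinates."""
    return "SHOW MASTER STATUS"

def wait_sql(log_file: str, log_pos: int, timeout_secs: int = 60) -> str:
    """Run on the replica; MASTER_POS_WAIT() blocks until the replica has
    executed past the given master coordinates, or returns -1 on timeout."""
    return f"SELECT MASTER_POS_WAIT('{log_file}', {log_pos}, {timeout_secs})"
```

Once the call returns a non-negative value, the replica is guaranteed to be current as of the coordinates you captured, and the process can safely read from it.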