Replication - High Performance MySQL

asynchronous; that is, the replica's copy of the data isn't guaranteed to be up-to-date at any given instant. There are no guarantees as to how large the latency on the replica might be. Large queries can make the replica fall seconds, minutes, or even hours behind the master.
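
To make the idea of replica lag concrete, here is a minimal Python sketch that classifies the lag a replica reports. It assumes you have already fetched one row of SHOW SLAVE STATUS output into a dictionary; the function name, the dictionary, and the 60-second threshold are all hypothetical choices for illustration, and only the Seconds_Behind_Master column name comes from MySQL itself.

```python
from typing import Mapping, Optional


def classify_lag(status: Mapping[str, Optional[int]], warn_after: int = 60) -> str:
    """Classify replica lag from one row of SHOW SLAVE STATUS output.

    `status` is a hypothetical dict holding that row; `warn_after` is an
    arbitrary threshold in seconds chosen for this sketch.
    """
    lag = status.get("Seconds_Behind_Master")
    if lag is None:
        # MySQL reports NULL here when replication is not running,
        # so the lag is unknown rather than zero.
        return "unknown"
    if lag == 0:
        return "caught up"
    if lag < warn_after:
        return "slightly behind"
    return "far behind"


print(classify_lag({"Seconds_Behind_Master": 0}))
print(classify_lag({"Seconds_Behind_Master": 7200}))
print(classify_lag({"Seconds_Behind_Master": None}))
```

Treating a missing value as "unknown" rather than "caught up" matters: a stopped replica is not a current one.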

MySQL's replication is mostly backward-compatible. That is, a newer server can usually be a replica of an older server without trouble. However, older versions of the server are often unable to serve as replicas of newer versions: they might not understand new features or SQL syntax the newer server uses, and there might be differences in the file formats replication uses. For example, you can't replicate from a MySQL 5.1 master to a MySQL 4.0 replica. It's a good idea to test your replication setup before upgrading from one major or minor version to another, such as from 4.1 to 5.0, or 5.1 to 5.5. Upgrades within a minor version, such as from 5.1.51 to 5.1.58, are usually compatible; read the changelog to find out exactly what changed from version to version.
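
The rule of thumb above can be sketched as a small Python check. This is only an illustration of the heuristic, not a substitute for testing: the function names are hypothetical, it assumes plain dotted version strings such as "5.1.58" (no suffixes), and it compares only the major and minor numbers.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a plain dotted version string such as '5.1.58' into (5, 1, 58)."""
    return tuple(int(part) for part in version.split("."))


def replica_is_new_enough(master: str, replica: str) -> bool:
    """Heuristic only: the replica's major.minor should not be older
    than the master's. Differences within a minor version are ignored."""
    return parse_version(replica)[:2] >= parse_version(master)[:2]


print(replica_is_new_enough("4.1", "5.0"))        # newer replica
print(replica_is_new_enough("5.1", "4.0"))        # older replica
print(replica_is_new_enough("5.1.58", "5.1.51"))  # same minor version
```

A passing check means only that the pairing is worth testing; the changelog still decides whether a particular upgrade is safe.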

Replication generally doesn't add much overhead on the master. It requires binary logging to be enabled on the master, which can have significant overhead, but you need that for proper backups and point-in-time recovery anyway. Aside from binary logging, each attached replica also adds a little load (mostly network I/O) on the master during normal operation. If replicas are reading old binary logs from the master, rather than just following along with the newest events, the overhead can be a lot higher due to the I/O required to read the old logs. This process can also cause some mutex contention that hinders transaction commits. Finally, if you are replicating a very high-throughput workload (say, 5,000 or more transactions per second) to many replicas, the overhead of waking up all the replica threads to send them the events can add up.

Replication is relatively good for scaling reads, which you can direct to a replica, but it's not a good way to scale writes unless you design it right. Attaching many replicas to a master simply causes the writes to be done many times, once on each replica. The entire system is limited to the number of writes the weakest part can perform.
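
A back-of-the-envelope model shows why adding replicas helps reads but not writes. The Python sketch below rests on simplifying assumptions that are ours, not measurements: every server can handle the same number of queries per second, a write costs the same as a read, and every server (master and replicas alike) must apply every write. The function name and the numbers are hypothetical.

```python
def read_capacity(servers: int, per_server_capacity: float,
                  writes_per_sec: float) -> float:
    """Total reads/sec left over once every server has applied all writes."""
    # Each server spends part of its capacity repeating the same writes.
    spare_per_server = max(float(per_server_capacity - writes_per_sec), 0.0)
    return servers * spare_per_server


# Light write load: adding servers adds read capacity almost linearly.
print(read_capacity(1, 1000, 200))
print(read_capacity(11, 1000, 200))

# Heavy write load: 11 servers offer little more than one idle server would.
print(read_capacity(11, 1000, 900))

# Writes at the per-server limit: no number of replicas helps.
print(read_capacity(11, 1000, 1000))
```

Under these assumptions the write rate is a fixed tax on every server, so the spare capacity per server, not the number of servers, is what shrinks as writes grow.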

Replication is also wasteful with more than a few replicas, because it essentially duplicates a lot of data needlessly. For example, a single master with 10 replicas has 11 copies of the same data and duplicates most of the same data in 11 different caches. This is analogous to 11-way RAID 1 at the server level. This is not an economical use of hardware, yet it's surprisingly common to see this type of replication setup. We discuss ways to alleviate this problem throughout the chapter.
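
The duplication is easy to quantify. The short Python sketch below counts the copies for a given number of replicas and the total storage they consume; the function names and the 100 GB dataset size are hypothetical, chosen only to illustrate the arithmetic.

```python
def total_copies(replicas: int) -> int:
    """One copy on the master plus one on each replica."""
    return replicas + 1


def total_storage_gb(dataset_gb: float, replicas: int) -> float:
    """Storage consumed across the master and all of its replicas."""
    return dataset_gb * total_copies(replicas)


print(total_copies(10))           # master plus 10 replicas
print(total_storage_gb(100, 10))  # hypothetical 100 GB dataset
```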

Problems Solved by Replication

Here are some of the more common uses for replication:

Data distribution

MySQL's replication is usually not very bandwidth-intensive, although, as we'll see later, the row-based replication introduced in MySQL 5.1 can use much more
