• A very long transaction can cause the reported lag to fluctuate. For example, if you
have a transaction that updates data, stays open for an hour, and then commits,
the update will go into the binary log an hour after it actually happened. When the
replica processes the statement, it will temporarily report that it is an hour behind
the master, and then it will jump back to zero seconds behind.
• If a distribution master is falling behind and has replicas of its own that are caught
up with it, the replicas will report that they are zero seconds behind, even if there
is lag relative to the ultimate master.
The solution to these problems is to ignore Seconds_behind_master and monitor replica
lag with something you can observe and measure directly. The best solution is a heartbeat
record, which is a timestamp that you update once per second on the master. To
calculate the lag, you can simply subtract the heartbeat from the current timestamp on
the replica. This method is immune to all the problems we just mentioned, and it has
the added benefit of creating a handy timestamp that shows to what point in time the
replica's data is current. The pt-heartbeat script, included in Percona Toolkit, is the
most popular implementation of a replication heartbeat.
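As a rough sketch of the idea (this is not pt-heartbeat's actual schema; the heartbeat table and column names here are illustrative), the mechanism can be as simple as:

    -- On the master: a one-row table whose timestamp is refreshed once per
    -- second, for example from cron or a scheduled event.
    CREATE TABLE heartbeat (
      id INT UNSIGNED NOT NULL PRIMARY KEY,
      ts DATETIME NOT NULL
    );
    INSERT INTO heartbeat (id, ts) VALUES (1, NOW());

    -- Run this on the master once per second:
    UPDATE heartbeat SET ts = NOW() WHERE id = 1;

    -- On the replica: lag is the current time minus the most recently
    -- replicated heartbeat timestamp.
    SELECT TIMESTAMPDIFF(SECOND, ts, NOW()) AS lag_seconds
    FROM heartbeat
    WHERE id = 1;

Because the replica's NOW() is compared against a timestamp written on the master, this simple version assumes the two servers' clocks are kept reasonably synchronized (for example, with NTP).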
A heartbeat has other benefits, too. The replication heartbeat records in the binary log
are useful for many purposes, such as disaster recovery in otherwise hard-to-solve
scenarios.
None of the lag metrics we just mentioned gives a sense of how long it will take for a
replica to actually catch up to the master. This depends upon many factors, such as
how powerful the replica is and how many write queries the master continues to process.
See the section “When Will Replicas Begin to Lag?” on page 484 for more on that
topic.
Determining Whether Replicas Are Consistent with the Master
In a perfect world, a replica would always be an exact copy of its master. But in the real
world, errors in replication can cause the replica's data to “drift” out of sync with the
master's. Even if there are apparently no errors, replicas can still get out of sync because
of MySQL features that don't replicate correctly, bugs in MySQL, network corruption,
crashes, ungraceful shutdowns, or other failures.16
Our experience is that this is the rule, not the exception, which means checking your
replicas for consistency with their masters should probably be a routine task. This is
especially important if you use replication for backups, because you don't want to take
backups from a corrupted replica.
MySQL has no built-in method of determining whether one server has the same data
as another server. It does provide some building blocks for checksumming tables and
16. If you're using a nontransactional storage engine, shutting down the server without first running STOP
SLAVE is ungraceful.
 