Databases Reference
In-Depth Information
data, such as CHECKSUM TABLE . However, it's nontrivial to compare a replica to its master
while replication is working.
Percona Toolkit has a tool called pt-table-checksum that solves this and several other
problems. The tool's main feature is that it can verify that a replica's data is in sync
with its master's data. It works by running INSERT ... SELECT queries on the master.
These queries checksum the data and insert the results into a table. The statements
flow through replication and execute again on the replica. You can then compare the
results on the master to the results on the replica and see whether the data differs.
Because this process works through replication, it gives consistent results without the
need to lock tables on both servers simultaneously.
A typical way to use the tool is to run it on the master, with parameters similar to the
following:
$ pt-table-checksum --replicate=test.checksum <master_host>
This command checksums all tables and inserts the results into the test.checksum table.
After the queries have executed on the replicas, a simple query can check each replica
for differences from the master. pt-table-checksum can discover the server's replicas,
run the query on each replica, and output the results automatically. At the time of this
writing, pt-table-checksum is the only tool that can reliably compare a replica's data to
its master's.
Resyncing a Replica from the Master
You'll probably have to deal with an out-of-sync replica more than once in your career.
Perhaps you used the checksum technique and found differences; perhaps you know
that the replica skipped a query or that someone changed the data on the replica.
The traditional advice for fixing an out-of-sync replica is to stop it and reclone it from
the master. If an inconsistent replica is a critical problem, you should probably stop it
and remove it from production as soon as you find it. You can then reclone the replica
or restore it from a backup.
The drawback to this approach is the inconvenience factor, especially if you have a lot
of data. If you can find out which data is different, you can probably do it more effi-
ciently than by recloning the entire server. And if the inconsistency you discovered isn't
critical, you might be able to leave the replica online and resync only the affected data.
The simplest fix is to dump and reload only the affected data with mysqldump . This
can work very well if your data isn't changing while you do it. You can simply lock the
table on the master, dump the table, wait for the replica to catch up to the master, and
then import the table on the replica. (You need to wait for the replica to catch up so
you don't introduce more inconsistencies in other tables, such as those that might be
updated in joins against the out-of-sync table.)
 
Search WWH ::




Custom Search