Databases Reference
In-Depth Information
We're assuming you have
log_bin
and
log_slave_updates
enabled on all your replicas,
as we advised you to do in the beginning of this chapter. Enabling this logging lets you
recover all replicas to a consistent point in time, which you can't reliably do otherwise.
Locating the desired log positions
If any replica isn't at the same position as the new master, you'll have to find the position
in the new master's binary logs corresponding to the last event that replica executed,
and use it for
CHANGE MASTER TO
. You can use the
mysqlbinlog
tool to examine the last
query the replica executed and find that same query in the new master's binary log. A
little math can often help, too.
To illustrate this, let's assume that log events have increasing ID numbers and that the
most up-to-date replica—the new master—had just retrieved event 100 when the old
master crashed. Now let's assume that there are two more replicas,
replica2
and
rep
lica3
;
replica2
had retrieved event 99, and
replica3
had retrieved event 98. If you
point both replicas at the new master's current binary log position, they will begin
replicating event 101, so they'll be out of sync. However, as long as the new master's
binary log was enabled with
log_slave_updates
, you can find events 99 and 100 in the
new master's binary log, so you can bring the replicas back to a consistent state.
Because of server restarts, different configurations, log rotations, or
FLUSH LOGS
com-
mands, the same events can exist at different byte offsets in different servers. Finding
the events can be slow and tedious, but it's usually not hard. Just examine the last event
executed on each replica by running
mysqlbinlog
on the replica's binary log or relay
log. Then find the same query in the new master's binary log, also with
mysqlbinlog
; it
will print the byte offset of the query, and you can use this offset in the
CHANGE MASTER
TO
query.
18
You can make the process faster by subtracting the byte offsets at which the new master
and the replica stopped, which tells you the difference in their byte positions. If you
then subtract this value from the new master's current binary log position, chances are
the desired query will be at that position. You just need to verify that it is, and you've
found the position at which you need to start the replica.
Let's look at a concrete example. Suppose
server1
is the master of
server2
and
server3
, and it crashes. According to
Master_Log_File/Read_Master_Log_Pos
in
SHOW
SLAVE STATUS
,
server2
has managed to replicate all the events that were in
server1
's
binary log, but
server3
isn't as up-to-date.
Figure 10-15
illustrates this scenario (the
log events and byte offsets are for demonstration purposes only).
As
Figure 10-15
illustrates, we can be sure that
server2
has replicated all the events in
the master's binary log because its
Master_Log_File
and
Read_Master_Log_Pos
match
18. As mentioned earlier, heartbeat records from
pt-heartbeat
can be a great help in figuring out approximately
where in a binary log you should be looking for your event.