Databases Reference
In-Depth Information
Row-based logging is not backward-compatible. The mysqlbinlog utility
distributed with MySQL 5.1 can read binary logs that contain events
logged in row-based format (they are not human-readable, but the
MySQL server can interpret them). However, versions of mysqlbinlog
from earlier MySQL distributions will fail to recognize such log events
and will exit with an error upon encountering them.
MySQL can replicate some changes more efficiently using row-based replication, be-
cause the replica doesn't have to replay the queries that changed the rows on the master.
Replaying some queries can be very expensive. For example, here's a query that sum-
marizes data from a very large table into a smaller table:
mysql> INSERT INTO summary_table(col1, col2, sum_col3)
-> SELECT col1, col2, sum(col3)
-> FROM enormous_table
-> GROUP BY col1, col2;
Imagine that there are only three unique combinations of col1 and col2 in the
enormous_table table. This query will scan many rows in the source table but will result
in only three rows in the destination table. Replicating this event as a statement will
make the replica repeat all that work just to generate a few rows, but replicating it with
row-based replication will be trivially cheap on the replica. In this case, row-based
replication is much more efficient.
On the other hand, the following event is much cheaper to replicate with statement-
based replication:
mysql> UPDATE enormous_table SET col1 = 0;
Using row-based replication for this query would be very expensive because it changes
every row: every row would have to be written to the binary log, making the binary log
event extremely large. This would place more load on the master during both logging
and replication, and the slower logging might reduce concurrency.
Because neither format is perfect for every situation, MySQL can switch between
statement-based and row-based replication dynamically. By default, it uses statement-
based replication, but when it detects an event that cannot be replicated correctly with
a statement, it switches to row-based replication. You can also control the format as
needed by setting the binlog_format session variable.
It's harder to do point-in-time recovery with a binary log that has events in row-based
format, but not impossible. A log server can be helpful—more on that later.
Statement-Based or Row-Based: Which Is Better?
We've mentioned advantages and disadvantages for both replication formats. Which
is better in practice?
 
Search WWH ::




Custom Search