Databases Reference
In-Depth Information
server-side prepared statements, and the compressed protocol. You can also use
MySQL Proxy with a logging script, but in practice we rarely do this.
Analyzing the query log
We suggest that at least every now and then you should use the slow query log to
capture all queries executing on your server, and analyze them. Log the queries for some
representative period of time, such as an hour during your peak traffic time. If your
workload is very homogeneous, a minute or less might even be enough to find bad
queries that need to be optimized.
Don't just open up the log and start looking at it directly—it's a waste of time and
money. Generate a profile first, and if you need to, then you can go look at specific
samples in the log. It's best to work from a high-level view down to the low level, or
you could de-optimize the business, as mentioned earlier.
Generating a profile from the slow query log requires a good log analysis tool. We
suggest pt-query-digest , which is arguably the most powerful tool available for MySQL
query log analysis. It supports a large variety of functionality, including the ability to
save query reports to a database and track changes in workload over time.
By default, you simply execute it and pass it the slow query log file as an argument, and
it just does the right thing. It prints out a profile of the queries in the log, and then
selects “important” classes of queries and prints out a detailed report on each one. The
report has dozens of little niceties to make your life easier. We continue to develop this
tool actively, so you should read the documentation for the most recent version to learn
about its current functionality.
We'll give you a brief tour of the report pt-query-digest prints out, beginning with the
profile. Here is an uncensored version of the profile we showed earlier in this chapter:
# Profile
# Rank Query ID Response time Calls R/Call V/M Item
# ==== ================== ================ ===== ====== ===== =======
# 1 0xBFCF8E3F293F6466 11256.3618 68.1% 78069 0.1442 0.21 SELECT InvitesNew?
# 2 0x620B8CAB2B1C76EC 2029.4730 12.3% 14415 0.1408 0.21 SELECT StatusUpdate?
# 3 0xB90978440CC11CC7 1345.3445 8.1% 3520 0.3822 0.00 SHOW STATUS
# 4 0xCB73D6B5B031B4CF 1341.6432 8.1% 3509 0.3823 0.00 SHOW STATUS
# MISC 0xMISC 560.7556 3.4% 23930 0.0234 0.0 <17 ITEMS>
There's a little more detail here than we saw previously. First, each query has an ID,
which is a hash of its “fingerprint.” A fingerprint is the normalized, canonical version
of the query with literal values removed, whitespace collapsed, and everything lower-
cased (notice that queries 3 and 4 appear to be the same, but they have different fin-
gerprints). The tool also merges tables with similar names into a canonical form. The
question mark at the end of the InvitesNew table name signifies that there is a shard
identifier appended to the table name, and the tool has removed that so that queries
against tables with a similar purpose are aggregated together. This report is from a
heavily sharded Facebook application.
 
Search WWH ::




Custom Search