If you have only the standard MySQL slow query log, with no execution plan or detailed
timing information, you are limited to the knowledge that the query ran badly at the
point that it was logged—you can't see why that was. Perhaps something else was
consuming resources on the system, such as a backup, or perhaps some kind of locking
or contention blocked the query's progress. Intermittent problems are a special case
that we'll cover in the next section.
Diagnosing Intermittent Problems
Intermittent problems such as occasional server stalls or slow queries can be frustrating
to diagnose, and the most egregious wastes of time we've seen have resulted from
phantom problems that happen only when you're not looking, or aren't reliably
reproducible. We've seen people spend literally months fighting with such problems. In
the process, some of them reverted to a trial-and-error troubleshooting approach, and
sometimes made things dramatically worse by trying to change things such as server
settings at random, hoping to stumble upon something that would help.
Try to avoid trial and error if you can. Trial-and-error troubleshooting is risky, because
the changes themselves can make things worse, and the process is frustrating and
inefficient. If you can't figure out
what the problem is, you might not be measuring correctly, you might be measuring
in the wrong place, or you might not know the necessary tools to use. (Or the tools
might not exist—we've developed a number of tools specifically to address the lack
of transparency in various system components, from the operating system to MySQL
itself.)
To illustrate the importance of trying to avoid trial and error, here are some sample
resolutions we've found to some of the intermittent database performance problems
we've been called in to solve:
• The application was executing curl to fetch exchange rate quotes from an external
service, which was running very slowly at times.
• Important cache entries were expiring from memcached, causing the application
to flood MySQL with requests to regenerate the cached items.
• DNS lookups were timing out randomly.
• The query cache was freezing MySQL periodically due to mutex contention or
inefficient internal algorithms for deleting cached queries.
• InnoDB scalability limitations were causing query plan optimization to take too
long when concurrency was over some threshold.
As you can see, some of these problems were in the database, and some of them weren't.
Only by beginning at the place where the misbehavior could be observed and working
through the resources it used, measuring as completely as possible, can you avoid
hunting in the wrong place for problems that don't exist there.
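The "measure as completely as possible" advice can be sketched as a small sampling loop: capture lightweight system and server snapshots at a regular interval so that an intermittent stall leaves evidence behind even when nobody is watching. The interval, sample count, and output path below are illustrative choices, not values from the text; tools such as Percona Toolkit's pt-stalk automate a more sophisticated version of this idea.

```shell
#!/bin/sh
# Minimal diagnostic sampler: append timestamped snapshots to a log file
# so an intermittent problem can be correlated with system state later.
# OUT, INTERVAL, and COUNT are hypothetical defaults for illustration.
OUT=${OUT:-/tmp/stall-samples.log}
INTERVAL=${INTERVAL:-1}   # seconds between samples
COUNT=${COUNT:-5}         # number of samples to take

i=0
while [ "$i" -lt "$COUNT" ]; do
    {
        date '+%Y-%m-%dT%H:%M:%S'   # timestamp for correlation
        uptime                       # load averages
        # If a MySQL client and credentials are available, capture server
        # state too; errors are suppressed so the sampler keeps running.
        mysql -e 'SHOW FULL PROCESSLIST' 2>/dev/null
    } >> "$OUT"
    i=$((i + 1))
    sleep "$INTERVAL"
done
echo "wrote $COUNT samples to $OUT"
```

In practice you would trigger such a collector on a symptom (high load, a spike in running threads) rather than run it continuously, and capture richer data such as SHOW GLOBAL STATUS, SHOW ENGINE INNODB STATUS, and disk I/O statistics.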