Databases Reference
In-Depth Information
bread-and-butter SELECT , UPDATE , INSERT , and so on. 1 A database server's performance
is measured by query response time, and the unit of measurement is time per query.
Now for another rhetorical question: what is optimization? We'll return to this later,
but for now let's agree that performance optimization is the practice of reducing re-
sponse time as much as possible 2 for a given workload.
We find that many people are very confused about this. If you think performance op-
timization requires you to reduce CPU utilization, for example, you're thinking about
reducing resource consumption. But this is a trap. Resources are there to be consumed.
Sometimes making things faster requires that you increase resource consumption.
We've upgraded many times from an old version of MySQL with an ancient version of
InnoDB, and witnessed a dramatic increase in CPU utilization as a result. This is usually
nothing to be concerned about. It usually means that the newer version of InnoDB is
spending more time doing useful work and less time fighting with itself. Looking at
query response time is the best way to know whether the upgrade was an improvement.
Sometimes an upgrade introduces a bug such as not using an index, which can also
manifest as increased CPU utilization. CPU utilization is a symptom, not a goal, and
it's best to measure the goal, or you could get derailed.
Similarly, if you thought that performance optimization was about improving queries
per second, then you were thinking about throughput optimization. Increased through-
put can be considered as a side effect of performance optimization. 3 Optimizing queries
makes it possible for the server to execute more queries per second, because each one
requires less time to execute when the server is optimized. (The unit of throughput is
queries per time, which is the inverse of our definition of performance.)
So if the goal is to reduce response time, we need to understand why the server requires
a certain amount of time to respond to a query, and reduce or eliminate whatever
unnecessary work it's doing to achieve the result. In other words, we need to measure
where the time goes. This leads to our second important principle of optimization: you
cannot reliably optimize what you cannot measure . Your first job is therefore to measure
where time is spent.
1. We don't distinguish between queries and statements, DDL and DML, and so on. If you send a command
to the server, no matter what it is, you just care about how quickly it executes. We tend to use “query”
as a catch-all phrase for any command you send.
2. We'll mostly avoid philosophical discussions about performance optimization, but we have two
suggestions for further reading. There is a white paper called Goal-Driven Performance Optimization on
Percona's website (http://www.percona.com), which is a compact quick-reference sheet. It is also very
worthwhile to read Cary Millsap's book Optimizing Oracle Performance (O'Reilly). Cary's performance
optimization method, Method R, is the gold standard in the Oracle world.
3. Some people define performance in terms of throughput, which is okay, but it's not the definition we use
here. We think response time is more useful, although throughput is often easier to measure in
benchmarks.
 
Search WWH ::




Custom Search