has its limitations, such as confusing units of measurement, sampling at intervals that
don't correspond to when the operating system updates the statistics, and the inability
to see all of the metrics at once. If these tools don't meet your needs, you might be
interested in dstat (http://dag.wieers.com/home-made/dstat/) or collectl
(http://collectl.sourceforge.net).
We also like to use mpstat to watch CPU statistics; it provides a much better idea of
how the CPUs are behaving individually, instead of grouping them all together. Sometimes
this is very important when you're diagnosing a problem. You might find
blktrace to be helpful when you're examining disk I/O usage, too.
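To illustrate why per-CPU figures matter, here is a minimal Python sketch that computes the busy percentage of each CPU from two snapshots of /proc/stat on Linux. It is not how mpstat is implemented; the function names are hypothetical, and counting both idle and iowait time as "not busy" is an assumption made for this example.

```python
def parse_cpu_lines(stat_text: str) -> dict:
    """Map each per-CPU label (cpu0, cpu1, ...) to (busy, total) jiffies."""
    result = {}
    for line in stat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        label = fields[0]
        # Keep only per-CPU lines; skip the aggregate "cpu" line and
        # unrelated lines such as "intr" or "ctxt".
        if not (label.startswith("cpu") and label[3:].isdigit()):
            continue
        # user, nice, system, idle, iowait, irq, softirq, steal
        values = [int(v) for v in fields[1:9]]
        idle = sum(values[3:5])  # assumption: idle + iowait count as not busy
        total = sum(values)
        result[label] = (total - idle, total)
    return result


def per_cpu_busy(before: str, after: str) -> dict:
    """Return the busy percentage of each CPU between two snapshots."""
    first, second = parse_cpu_lines(before), parse_cpu_lines(after)
    busy = {}
    for cpu, (busy_after, total_after) in second.items():
        busy_before, total_before = first.get(cpu, (0, 0))
        elapsed = total_after - total_before
        delta = busy_after - busy_before
        busy[cpu] = 100.0 * delta / elapsed if elapsed > 0 else 0.0
    return busy
```

With two CPUs where one is fully busy and the other is idle, the averaged figure would be 50%, which hides the saturated CPU; the per-CPU breakdown makes it visible. In practice the two snapshots would come from reading /proc/stat a few seconds apart.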
We wrote our own replacement for iostat, called pt-diskstats. It's part of Percona Toolkit.
It addresses some of our complaints about iostat, such as the way that it presents
reads and writes in aggregate, and the lack of visibility into concurrency. It is also
interactive and keystroke-driven, so you can zoom in and out, change the aggregation,
filter out devices, and show and hide columns. It is a great way to slice and dice a sample
of disk statistics, which you can gather with a simple shell script even if you don't have
the tool installed. You can capture samples of disk activity and email or save them for
later analysis. In fact, the pt-stalk, pt-collect, and pt-sift trio of tools that we introduced
in Chapter 3 are designed to work well with pt-diskstats.
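The sample-gathering idea can be sketched in a few lines. The text mentions a shell script; the version below is a Python stand-in that appends timestamped snapshots of /proc/diskstats to a file. The function names, the output filename, and the "TS <epoch seconds>" header line are assumptions made for illustration, so check the pt-diskstats documentation for the exact layout it expects before feeding it such a file.

```python
import time
from pathlib import Path


def format_sample(raw_stats: str, now: float) -> str:
    """Prefix one raw disk-statistics snapshot with a timestamp line."""
    # The "TS <epoch seconds>" header is an illustrative layout (assumption).
    body = raw_stats if raw_stats.endswith("\n") else raw_stats + "\n"
    return f"TS {now:.3f}\n{body}"


def collect(source: Path, dest: Path, interval: float, count: int) -> None:
    """Append `count` timestamped snapshots of `source` to `dest`."""
    with dest.open("a") as out:
        for i in range(count):
            out.write(format_sample(source.read_text(), time.time()))
            out.flush()  # keep the file usable if the run is interrupted
            if i < count - 1:
                time.sleep(interval)
```

On a Linux host, a call such as `collect(Path("/proc/diskstats"), Path("diskstats-samples.txt"), 5.0, 12)` would gather one minute of samples at five-second intervals, and the resulting file can be saved or emailed for later analysis.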
A CPU-Bound Machine
The vmstat output for a CPU-bound server usually shows a high value in the us column,
which reports time spent running non-kernel code. There can also be a high value in
the sy column, which is the system CPU usage; a value over 20% here is worrisome.
In most cases, there will also be several processes queued up for CPU time (reported
in the r column). Here's a sample:
$ vmstat 5
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd  free  buff    cache   si   so    bi    bo   in    cs us sy id wa
10  2 740880 19256 46068 13719952    0    0  2788 11047 1423 14508 89  4  4  3
11  0 740880 19692 46144 13702944    0    0  2907 14073 1504 23045 90  5  2  3
 7  1 740880 20460 46264 13683852    0    0  3554 15567 1513 24182 88  5  3  3
10  2 740880 22292 46324 13670396    0    0  2640 16351 1520 17436 88  4  4  3
Notice that there are also a reasonable number of context switches (the cs column),
although we won't worry much about this unless there are 100,000 or more per second.
A context switch is when the operating system stops one process from running and
replaces it with another.
For example, a query that performs a noncovering index scan on a MyISAM table will
read an entry from the index, then read the row from a page on disk. If the page isn't
in the operating system cache, there will be a physical read to the disk, which will cause
a context switch to suspend the process until the I/O completes. Such a query can cause
lots of context switches.
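The rules of thumb in this section can be turned into a small check over captured vmstat output. The following Python sketch is only an illustration: the function names are hypothetical, and the cutoffs for "high" user CPU (80%) and "several" queued processes (2 or more) are assumptions, because the text gives exact figures only for the sy column (over 20%) and for context switches (100,000 or more per second).

```python
def parse_vmstat(text: str) -> list:
    """Turn vmstat output into one dict of integer columns per sample row."""
    names, rows = None, []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":  # the column-name header line
            names = fields
        elif names and fields[0].isdigit() and len(fields) >= len(names):
            rows.append(dict(zip(names, map(int, fields))))
    return rows


def cpu_bound_signs(row: dict, user_pct: int = 80, min_queued: int = 2) -> list:
    """List the warning signs described in the text that a sample exhibits."""
    signs = []
    if row["us"] >= user_pct:    # assumed cutoff for "a high value"
        signs.append("high user CPU")
    if row["sy"] > 20:           # from the text: over 20% is worrisome
        signs.append("system CPU over 20%")
    if row["r"] >= min_queued:   # assumed cutoff for "several processes"
        signs.append("processes queued for CPU")
    if row["cs"] >= 100_000:     # from the text: 100,000 or more per second
        signs.append("100,000+ context switches")
    return signs
```

Applied to the first row of the sample above (us 89, sy 4, r 10, cs 14508), the check reports high user CPU and queued processes, but neither the system CPU nor the context switch warning.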
 