• Each file is named after the date and hour when the benchmark is run. When
benchmarks last for days and the files grow large, you can move completed files
off the server to free disk space and get a head start on analyzing the full
results. When you're looking for data about a specific point in time, it's also
much easier to find it in a file named after the hour than to search through a
single file that has grown to gigabytes.
• Each sample begins with a distinctive timestamp line, so you can search through
the files for samples related to specific times, and you can easily write little
awk and sed scripts to extract them.
• The script doesn't preprocess or filter anything it gathers. It's a good idea to gather
everything in its raw form, and process and filter it later. If you preprocess it, you'll
surely find yourself wishing for the raw data later when you find an anomaly and
need more data to understand it.
• You can make the script exit when the benchmark is done by removing the /home/
benchmarks/running file from within the script that executes your benchmark (see
the sketch after this list).
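To make these points concrete, here is a minimal sketch of such a gathering loop. It is not the book's actual script; the interval, file-name prefix, and output names are illustrative assumptions:

    #!/bin/sh
    # A minimal sketch of the gathering loop described above; the
    # interval, prefix, and file names are illustrative assumptions.
    INTERVAL=5
    PREFIX=$INTERVAL-sec-status
    RUNFILE=/home/benchmarks/running
    while test -e $RUNFILE; do
        # One output file per date and hour keeps long runs manageable.
        file=$PREFIX-$(date +%F_%H)
        # Every sample begins with a distinctive timestamp line.
        echo "TS $(date +%s.%N) $(date '+%F %T')" >> $file
        # Gather raw, unfiltered output; process and filter it later.
        mysql -e 'SHOW GLOBAL STATUS' >> $file
        sleep $INTERVAL
    done
    echo Exiting because $RUNFILE does not exist

Because every sample begins with TS, pulling out the samples near a given moment is a one-liner; for example (the file name and timestamp prefix are made up):

    # Print every sample whose Unix timestamp begins with 1363.
    awk '/^TS/ { p = ($2 ~ /^1363/) } p' 5-sec-status-2013-03-13_15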
This is just a short code snippet, and probably won't meet your needs as-is, but it's an
illustration of a good general approach to capturing performance and status data. As
shown, the script captures only a few kinds of data on MySQL, but you can easily add
more things to it. You can capture /proc/diskstats to record disk I/O for later analysis
with the pt-diskstats tool,5 for example.
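In the sketch above, that would be two more lines inside the loop; the output file name is again an assumption:

    # Save a timestamp and the raw disk I/O counters alongside each
    # sample; pt-diskstats can process the saved samples offline later.
    echo "TS $(date +%s.%N)" >> $file-diskstats
    cat /proc/diskstats >> $file-diskstats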
Getting Accurate Results
The best way to get accurate results is to design your benchmark to answer the question
you want to answer. Have you chosen the right benchmark? Are you capturing the data
you need to answer the question? Are you benchmarking by the wrong criteria? For
example, are you running a CPU-bound benchmark to predict the performance of an
application you know will be I/O-bound?
Next, make sure your benchmark results will be repeatable. Try to ensure that the
system is in the same state at the beginning of each run. If the benchmark is important,
you should reboot between runs. If you need to benchmark on a warmed-up server,
which is the norm, you should also make sure that your warmup is long enough (see
the previous section on how long to run a benchmark), and that it's repeatable. If the
warmup consists of random queries, for example, your benchmark results will not be
repeatable.
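For example, replaying a fixed query file is one simple way to make the warmup deterministic; in this sketch, the file and database names are made up:

    # Replay the same fixed set of queries before every run, so the
    # server's caches reach a comparable state each time; --force
    # continues past individual query errors.
    mysql --force bench < warmup-queries.sql > /dev/null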
If the benchmark changes data or schema, reset it with a fresh snapshot between runs.
Inserting into a table with a thousand rows will not give the same results as inserting
into a table with a million rows! The data fragmentation and layout on disk can also
make your results nonrepeatable.
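A simple way to reset between runs is to drop and reload the test database from a pristine dump; a minimal sketch, with made-up names:

    # Recreate the test database from a known-good dump so every run
    # starts from exactly the same data and schema (names are made up).
    mysql -e 'DROP DATABASE IF EXISTS bench; CREATE DATABASE bench'
    mysql bench < pristine-dump.sql

For large datasets, restoring a filesystem or LVM snapshot achieves the same effect much faster than reloading a dump.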
5. See Chapter 9 for more on the pt-diskstats tool.