Indexing for High Performance - High Performance MySQL

Databases Reference

In-Depth Information

Summary

As you can see, indexing is a complex topic! The way MySQL and the storage engines

access data, combined with the properties of indexes, make indexes a very powerful

and flexible tool for influencing data access, both on disk and in memory.

Most of the time you'll use B-Tree indexes with MySQL. The other types of indexes

are rather more suitable for special purposes, and it will generally be obvious when you

ought to use them and how they can improve query response times. We'll say no more

about them in this chapter, but it's worth wrapping up with a review of the properties

and uses of B-Tree indexes.

Here are three principles to keep in mind as you choose indexes and write queries to

take advantage of them:

1. Single-row access is slow, especially on spindle-based storage. (Solid-state disks

are faster at random I/O, but this point remains true.) If the server reads a block

of data from storage and then accesses only one row in it, it wastes a lot of work.

It's much better to read in a block that contains lots of rows you need. Use indexes

to create locality of reference for improved efficiency.

2. Accessing ranges of rows in order is fast, for two reasons. First, sequential I/O

doesn't require disk seeks, so it is faster than random I/O, especially on spindle-

based storage. Secondly, if the server can read the data in the order you need it, it

doesn't need to perform any follow-up work to sort it, and GROUP BY queries don't

need to sort and group rows together to compute aggregates over them.

3. Index-only access is fast. If an index contains all the columns that the query needs,

the storage engine doesn't need to find the other columns by looking up rows in

the table. This avoids lots of single-row access, which as we know from point 1

above is slow.

In sum, try to choose indexes and write queries so that you can avoid single-row look-

ups, use the inherent ordering of the data to avoid sorting operations, and exploit index-

only access. This corresponds to the three-star ranking system set out in Lahdenmaki

and Leach's topic, mentioned at the beginning of this chapter.

It would be great to be able to create perfect indexes for every query against your tables.

Unfortunately, sometimes this would require an impractically large number of indexes,

and at other times there simply is no way to create a three-star index for a given query

(for example, if the query orders by two columns, one ascending and the other de-

scending). In these cases you have to settle for the best you can do, or pursue alternative

strategies, such as denormalization or summary tables.

It's very important to be able to reason through how indexes work, and to choose them

based on that understanding, not on rules of thumb or heuristics such as “place the

most selective columns first in multicolumn indexes” or “you should index all of the

columns that appear in the WHERE clause.”

Search WWH ::

Custom Search

Home