Databases Reference
In-Depth Information
The moral of this little story is that rules of thumb and heuristics can be useful, but you
have to be careful not to assume that average-case performance is representative of
special-case performance. Special cases can wreck performance for the whole
application.
In the end, although the rule of thumb about selectivity and cardinality is interesting
to explore, other factors—such as sorting, grouping, and the presence of range condi-
tions in the query's WHERE clause—can make a much bigger difference to query
performance.
Clustered Indexes
Clustered indexes 6 aren't a separate type of index. Rather, they're an approach to data
storage. The exact details vary between implementations, but InnoDB's clustered in-
dexes actually store a B-Tree index and the rows together in the same structure.
When a table has a clustered index, its rows are actually stored in the index's leaf pages.
The term “clustered” refers to the fact that rows with adjacent key values are stored
close to each other. 7 You can have only one clustered index per table, because you can't
store the rows in two places at once. (However, covering indexes let you emulate mul-
tiple clustered indexes; more on this later.)
Because storage engines are responsible for implementing indexes, not all storage en-
gines support clustered indexes. We focus on InnoDB in this section, but the principles
we discuss are likely to be at least partially true for any storage engine that supports
clustered indexes now or in the future.
Figure 5-3 shows how records are laid out in a clustered index. Notice that the leaf
pages contain full rows but the node pages contain only the indexed columns. In this
case, the indexed column contains integer values.
Some database servers let you choose which index to cluster, but none of MySQL's
built-in storage engines does at the time of this writing. InnoDB clusters the data by
the primary key. That means that the “indexed column” in Figure 5-3 is the primary
key column.
If you don't define a primary key, InnoDB will try to use a unique nonnullable index
instead. If there's no such index, InnoDB will define a hidden primary key for you and
then cluster on that. InnoDB clusters records together only within a page. Pages with
adjacent key values might be distant from each other.
A clustering primary key can help performance, but it can also cause serious perfor-
mance problems. Thus, you should think carefully about clustering, especially when
you change a table's storage engine from InnoDB to something else (or vice versa).
6. Oracle users will be familiar with the term “index-organized table,” which means the same thing.
7. This isn't always true, as you'll see in a moment.
 
Search WWH ::




Custom Search