Database Reference
In-Depth Information
value of the leading column. The following query shows such an example. Notice the index_ss hint and the
INDEX SKIP SCAN operation:
SELECT /*+ index_ss (t i_n123) */ * FROM t WHERE n2 = 42 AND n3 = 11
--------------------------------------------------------------------------
| Id | Operation | Name | Starts | A-Rows | Buffers |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 | 33 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 2 | 33 |
|* 2 | INDEX SKIP SCAN | I_N123 | 1 | 2 | 31 |
--------------------------------------------------------------------------
2 - access("N2"=42 AND "N3"=11)
filter(("N2"=42 AND "N3"=11))
Because descending index skip scans are supported for “regular” index scans (using the INDEX SKIP SCAN
DESCENDING operation), you can use the two hints index_ss_asc and index_ss_desc to control the order of the scan.
Speaking of composite indexes, I believe it's necessary to mention the most common mistake I come across
when dealing with them, as well as the most frequently asked question. The mistake is related to overindexation. The
misconception is that the database engine is able to take advantage of an index only when all columns comprising the
index key are used in the WHERE clause. As you've just seen in several examples, that's not the case. This misconception
usually leads to the creation of several indexes on the same table, with the same leading column—for instance, one
index with the n1 , n2 , and n3 columns and another with the n1 and n3 columns. The second is generally superfluous.
Note that superfluous indexes are a problem because not only do they slow down SQL statements that modify the
indexed data, but also because they waste space unnecessarily.
The most frequently asked question is: how do I choose the order of the columns? For example, if an index key
is composed of the n1 , n2 , and n3 columns, what is the best order? When all indexed columns are present in a WHERE
clause, the efficiency of the index is independent of the order of the columns in the index. Therefore, the best order
is the one that maximizes the chances of using the index as frequently as possible when not all indexed columns
are present in WHERE clauses. In other words, it should be possible to use an index for the greatest number of SQL
statements. To make sure this is the case, the columns should be ordered according to their frequency of utilization.
Especially the leading column should be the one that's more frequently (ideally speaking, of course) specified in
WHERE clauses. Whenever several columns are used with equal frequency, there are two opposing approaches you
can follow:
The leading column should be the one that's expected to provide the best selectivity. If only
equalities are used, this is the column with the highest number of distinct values. If range
conditions are used, the number of distinct values itself might not be relevant. For example,
think about the case of a timestamp: it's likely that the number of distinct values is high.
But because a timestamp is frequently used in range predicates, what matters is the actual
selectivity and not the number of distinct values. Having a leading column that can provide
a strong selectivity is useful if a restriction is applied only on that particular column in future
SQL statements. In other words, you maximize the chances that the index could be selected by
the query optimizer.
The leading column should be the one with the lowest number of distinct values. This could
be useful to achieve a better compression ratio for the index.
 
Search WWH ::




Custom Search