Gathering Statistics During a Rebuild
You can gather statistics on an index while rebuilding the index, as the following example shows:
SQL> alter index hr.emp_emp_id_pk rebuild compute statistics;
Index altered.
You save time by having the database gather statistics while it's rebuilding the index.
The METHOD_OPT Parameter
The METHOD_OPT parameter of the DBMS_STATS package gives you control over statistics collection as it relates to columns. You specify the METHOD_OPT parameter to tell the database two things:
• The columns for which it should collect statistics
• Whether it should collect a histogram for any column and, if so, how many buckets should be in the histogram
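For instance, the following call collects statistics on a table with an explicit METHOD_OPT setting (for all columns size auto shown here is the default value; the HR.EMPLOYEES table is used purely for illustration):

SQL> exec dbms_stats.gather_table_stats(ownname=>'HR', tabname=>'EMPLOYEES', method_opt=>'for all columns size auto')
PL/SQL procedure successfully completed.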
Often, DBAs specify the for all indexed columns size auto value for the METHOD_OPT parameter when collecting table statistics. Greg Rahn shows why this is often a bad idea in his article on the use of this parameter (http://structureddata.org/2008/10/14/dbms-stats-method_opt-and-for-all-indexed_columns/).
Specifying the value for all indexed columns size auto for the METHOD_OPT parameter has the
following implications:
• It tells the database to collect statistics only for those columns that are indexed.
• It lets the database determine whether it should collect histograms, as well as the number of buckets it should allocate for a histogram.
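For example, a call such as the following (again using HR.EMPLOYEES purely for illustration) restricts column statistics collection to the indexed columns:

SQL> exec dbms_stats.gather_table_stats(ownname=>'HR', tabname=>'EMPLOYEES', method_opt=>'for all indexed columns size auto')
PL/SQL procedure successfully completed.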
When you specify the for all indexed columns size auto option, the database collects no statistics on the unindexed columns; it simply uses default values for the number of distinct values and the density of those columns. As a result, the optimizer may often end up with a vastly underestimated number of rows. When you get an execution plan with the help of the DBMS_XPLAN.DISPLAY_CURSOR function (run with the ALLSTATS LAST option), it'll often show that the number of estimated rows (E-Rows) is underestimated when compared to the number of actual rows (A-Rows). Wrong cardinality estimates are very likely to play havoc with the selection of access paths, join methods, and join order, leading to poor SQL execution times. The whole idea behind collecting optimizer statistics is to collect representative statistics. The database doesn't really know your data; you do! (If you really want to understand how to collect good statistics, please read the excellent article on the use of the DBMS_STATS package by Karen Morton titled “Managing Statistics for Optimal Query Performance,” available on the method-r.com web site.)
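As a sketch of how to compare estimates with actuals, you can run a query with the gather_plan_statistics hint (which makes the database collect row-level execution statistics) and then display the plan for the most recently executed statement; the query against HR.EMPLOYEES is purely illustrative:

SQL> select /*+ gather_plan_statistics */ count(*) from hr.employees where department_id = 50;
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

The resulting plan output shows the E-Rows and A-Rows columns side by side for each step, so you can spot the steps where the optimizer's cardinality estimates diverge from reality.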
The cost-based optimizer is very likely to produce suboptimal execution plans when it isn't fed accurate statistics. Inaccurate cardinality estimates are a good example of the consequences of collecting nonrepresentative statistics. For example, a wrong cardinality estimate can lead the optimizer to select the wrong driving table. Alternatively, the optimizer may decide that a nested loops join is better when a hash join would be more appropriate, especially when dealing with large data sets.