Object Statistics - Troubleshooting Oracle Performance

Database Reference

In-Depth Information

12 AND column_name = 'VAL1'

13 /

LOW_VALUE HIGH_VALUE

--------- ----------

39 63

density is a decimal number between 0 and 1. Values close to 0 indicate that a restriction on

that column filters out the majority of the rows. Values close to 1 indicate that a restriction on

that column filters almost no rows. If no histogram is present, density is 1/num_distinct . If

a histogram is present, the computation differs and depends on the type of histogram. In any

case, as of version 10.2.0.4, for columns with histograms, this value is only used for backward

compatibility when the optimizer_features_enable initialization parameter is set to an older

release.

•

num_nulls is the number of NULL values stored in the column.

•

avg_col_len is the average column size in bytes.

•

histogram indicates whether a histogram is available for the column and, if it's available,

which type it is. Valid values are NONE (meaning no histogram), FREQUENCY , HEIGHT BALANCED ,

and, as of version 12.1, TOP-FREQUENCY and HYBRID .

•

num_buckets is the number of buckets in the histogram. A bucket, or category as it's called in

statistics, is a group of values of the same kind. As described in the next section, histograms

are composed of at least one bucket. If no histogram is available, it's set to 1. The maximum

number of buckets is 254 up to version 11.2, and 2,048 as of version 12.1.

Histograms

The query optimizer starts from the principle that data is uniformly distributed. An example of a uniformly distributed

set of data is the one stored in the id column in the test table used throughout the previous sections. In fact, it stores all

integers from 1 up to 1,000 exactly once. In such a case, to produce a good estimate of the number of rows filtered out

by a predicate based on that column (for example, id BETWEEN 6 AND 19 ), the query optimizer requires only the object

statistics described in the preceding section: the minimum value, the maximum value, and the number of distinct values.

If data isn't uniformly distributed, the query optimizer can't compute acceptable estimations without additional

information. For example, given the data set stored in the val2 column (see the output of the following query), how

could the query optimizer make a meaningful estimation for a predicate like val2=105 ? It can't, because it has no clue

that about 50 percent of the rows fulfill that predicate:

SQL> SELECT val2, count(*)

2 FROM t

3 GROUP BY val2

4 ORDER BY val2;

VAL2 COUNT(*)

---------- ----------

101 8

102 25

103 68

104 185

Troubleshooting Oracle Performance

Search WWH ::

Custom Search

Home