Object Statistics - Troubleshooting Oracle Performance

Database Reference

In-Depth Information

Extended Statistics

The column statistics and histograms described in the previous sections are helpful only when column values are

used without being modified in predicates. For example, if the predicate country='Switzerland' is used, with

column statistics and a histogram in place for the country column, the query optimizer should be able to correctly

estimate its selectivity. This is because column statistics and the histogram describe the values of the country

column itself. On the other hand, if the predicate upper(country)='SWITZERLAND' is used, the query optimizer

is no longer able to directly infer the selectivity from the object statistics and the histogram. A similar problem

occurs when a predicate references several columns. For example, if I apply the predicate country='Denmark' AND

language='Danish' to a table containing people from all over the world, it's likely that the two restrictions apply

to the same rows for most rows in such a table. In fact, most people speaking Danish live in Denmark, and most

people living in Denmark speak Danish. In other words, the two restrictions are almost redundant. Such columns

are commonly called correlated columns and challenge the query optimizer. This is because no object statistics or

histograms describe such a dependency between data—or put another way, the query optimizer actually assumes

that data stored in different columns isn't interdependent.

As of version 11.1, it's possible to gather object statistics and histograms on expressions or on groups of columns

to solve these kinds of problems. These new statistics are called extended statistics . Basically, what happens is that a

hidden column, called an extension , is created, based on either an expression or a group of columns. Then regular

object statistics and histograms are gathered on it.

The definition is carried out with the create_extended_stats function of the dbms_stats package. For example,

two extensions are created with the following query. The first one is on upper(pad) , and the second one is a column

group made up of the columns val2 and val3 . In the test table, these contain exactly the same values; in other words,

the columns are highly (actually, perfectly) correlated. For the definition, as shown next, the expression or group

of columns must be enclosed in parentheses. Notice that the function returns a system-generated name for the

extension (a 30-byte name starting with SYS_STU ):

SQL> SELECT dbms_stats.create_extended_stats(ownname => user,

2 tabname => 'T',

3 extension => '(upper(pad))') AS ext1,

4 dbms_stats.create_extended_stats(ownname => user,

5 tabname => 'T',

6 extension => '(val2,val3)') AS ext2

7 FROM dual;

EXT1 EXT2

------------------------------ ------------------------------

SYS_STU0KSQX64#I01CKJ5FPGFK3W9 SYS_STUPS77EFBJCOTDFMHM8CHP7Q1

■

Note

A group of columns can't reference expressions or virtual columns.

Obviously, once the extensions are created, the data dictionary provides information about them. The following

query, based on the user_stat_extensions view, shows the existing extensions of the test table. There are dba , all ,

and, in a 12.1 multitenant environment, cdb versions as well:

SQL> SELECT extension_name, extension

2 FROM user_stat_extensions

3 WHERE table_name = 'T';

Troubleshooting Oracle Performance

Search WWH ::

Custom Search

Home