Database Reference
In-Depth Information
There are two key differences in these approaches:
•
The time and resources required to gather the object statistics with the first approach are,
in general, much higher. In fact, when gathering the object statistics at the table/index level,
all segments have to be accessed. The same thing happens with partition level statistics
for subpartitioned objects. For example, think about the case of a weekly partitioned table
containing data for several years. If a single partition is changed, then all partitions must be
accessed to update the table/index statistics. Even though only the data of one partition was
modified, all must be accessed.
•
The second approach consumes much fewer resources, but it's able to produce accurate
statistics only at the physical level. That's because it's not possible to derive the number of
distinct values and the histograms from the underlying partitions and subpartitions. By the
way, all other statistics
can
be derived.
Object statistics gathered with the first approach are called
global statistics
. Those gathered with the second
approach are called
derived statistics
(or sometimes
aggregated statistics
). To recognize which type is in place, you
can check whether the
global_stats
column in the data dictionary views listed in Table
8-2
(except for the views
providing detailed information about histograms) is set to
YES
or
NO
. Whenever possible, the
dbms_stats
package
gathers global statistics. It gathers derived statistics only when, for example, the gathering granularity is explicitly
limited to the subpartition level and no object statistics are available at the partition and table/index level.
The following example, based on the output generated by the
global_stats.sql
script, shows a case where, for
a table partitioned by range and subpartitioned by hash, derived statistics aren't accurate. Notice that not only are the
number of distinct values at the table and partition level wrong, but the
global_stats
column is set to
NO
:
•
A gathering at the subpartition level is performed on a table without object statistics (notice
that no sampling is involved):
SQL> BEGIN
2 dbms_stats.
delete_table_stats
(ownname => user,
3 tabname => 't');
4 dbms_stats.
gather_table_stats
(ownname => user,
5 tabname => 't',
6 estimate_percent =>
100
,
6
granularity
=> '
subpartition
');
7 END;
8 /
•
The number of distinct values at the table level), because they were gathered as derived
statistics, is wrong:
SQL> SELECT count(DISTINCT sp)
2 FROM t;
COUNT(DISTINCTSP)
--------------------
100
SQL>
SQL> SELECT num_distinct, global_stats
2 FROM user_tab_col_statistics
3 WHERE table_name = 'T'
4 AND column_name = 'SP';