Database Reference
In-Depth Information
Notice that for values 101 and 102, the frequency is the lowest frequency of all values represented in the
histogram divided by two. Be aware that it is 32, and not 34 (68/2) as you might expect, because not all values are
represented in the histogram.
Let's have a look at what happens when values not represented in the histogram are used. Simply put, it's the
same as for the frequency histograms. The following example illustrates:
SQL> EXPLAIN PLAN SET STATEMENT_ID '096' FOR SELECT * FROM t WHERE val3 = 96;
SQL> EXPLAIN PLAN SET STATEMENT_ID '098' FOR SELECT * FROM t WHERE val3 = 98;
SQL> EXPLAIN PLAN SET STATEMENT_ID '100' FOR SELECT * FROM t WHERE val3 = 100;
SQL> EXPLAIN PLAN SET STATEMENT_ID '103.5' FOR SELECT * FROM t WHERE val3 = 103.5;
SQL> EXPLAIN PLAN SET STATEMENT_ID '107' FOR SELECT * FROM t WHERE val3 = 107;
SQL> EXPLAIN PLAN SET STATEMENT_ID '109' FOR SELECT * FROM t WHERE val3 = 109;
SQL> EXPLAIN PLAN SET STATEMENT_ID '111' FOR SELECT * FROM t WHERE val3 = 111;
SQL> SELECT statement_id, cardinality
2 FROM plan_table
3 WHERE id = 0
4 ORDER BY statement_id;
STATEMENT_ID CARDINALITY
------------ -----------
096 1
098 13
100 26
103.5 32
107 26
109 13
111 1
If the rule based on Formula 8-1 isn't fulfilled, and therefore, neither a frequency histogram nor a top frequency
histogram can be built, the database engine creates a hybrid histogram.
Hybrid Histograms
Hybrid histograms combine some of the characteristics of both frequency and height-balanced histograms. The
process to build them starts in the same way as for height-balanced histograms. Then two important improvements
take place:
Every distinct value is associated to a single bucket (in other words, the concept of popular
value as defined for height-balanced histograms no longer exists). For that purpose, the
bucket's limits are shifted. As a result, every bucket may be based on a different number of rows.
A frequency is added to the endpoint value of every bucket. Hence, for the endpoint values,
and only for the endpoint values, a kind of frequency histogram is available.
The test table has two hybrid histograms. As an example, let's have a look at the one created for the val1 column
(note that it has 22 distinct values). The output of the following query shows the data set it contains:
SQL> SELECT val1, count(*), ratio_to_report(count(*)) OVER ()*100 AS percent
2 FROM t
3 GROUP BY val1
 
Search WWH ::




Custom Search