Databases Reference
In-Depth Information
Going back to the histogram,
EQ_ROWS
is the estimated number of rows whose column
value equals
RANGE_HI_KEY
. So, in our example, for the
RANGE_HI_KEY
value of 831,
EQ_ROWS
shows 198, which we know is also the actual number of existing records for
ProductID
831.
RANGE_ROWS
is the estimated number of rows whose column value falls inside the range
of the step,
excluding
the upper boundary. In our example, this is the number of records
with values from 827 to 830 (831, the upper boundary or
RANGE_HI_KEY
, is excluded).
The histogram shows 110 records and we could obtain the same value by getting the
sum of 31 records for
ProductID
827, 46 records for
ProductID
828, 0 records for
ProductID
829, and 33 records for
ProductID
830.
DISTINCT_RANGE_ROWS
is the estimated number of rows with a distinct column value
inside this range, once again excluding the upper boundary. In our example, we have
records for three distinct values: 827, 828, and 830, so
DISTINCT_RANGE_ROWS
is 3.
There are no records for
ProductID
829, and 831, which is the upper boundary, is again
excluded.
Finally,
AVG_RANGE_ROWS
is the average number of rows per distinct value,
excluding the upper boundary, and it is simply calculated as
RANGE_ROWS
/
DISTINCT_
RANGE_ROWS
. In our example, we have a total of 110 records for 3
DISTINCT_RANGE_
ROWS
, which gives us 110 / 3 = 36.6667, also shown in the second step of the histogram
shown previously.
Now let's see how the histogram is used to estimate the selectivity of some queries.
Let us see the first query:
SELECT
*
FROM
Sales
.
SalesOrderDetail
WHERE
ProductID
=
831
Listing 3-15.