Databases Reference
In-Depth Information
S(A = a ) = 1/card A (R).
3.1
For example, if the table has 50 records, then the selectivity is 1/50 or 0.02.
On the other hand, if attribute A is not a primary key and has multiple occurrences
for each value a , then we can also use Equation 3.1 to estimate the selectivity, but we
must acknowledge that we are guessing that the distribution of values is uniform. Some-
times this is a poor estimate, but generally it is all we can do without actual distribution
data to draw upon. For example, if there are 25 cities out of 200 suppliers in the sup-
plier table in Example Query 3.1, then the number of records with 'NY' is estimated to
be card city (supplier) = 200/25 = 8. The selectivity of 'NY' is 1/card city (supplier) = 1/8 =
0.125. In reality, the number of records was given in the example to be 10%, so in this
case our estimate is pretty good, but it is not always true.
The selectivity of an attribute A being greater than (or less than) a specific value “ a
also depends on a uniform distribution (random probability) assumption for our esti-
mation:
S(A > a ) = (max A (R) - a )/(max A (R) - min A (R)).
3.2
S(A < a ) = ( a - min A (R))/(max A (R) - min A (R)).
3.3
The selectivity of two intersected selection operations (predicates) on the same
table can be estimated exactly if the individual selectivities are known:
×
S(P and Q) = S(P)
S(Q),
3.4
where P and Q are predicates.
So if we have the query
SELECT city, qty
FROM shipment
WHERE city = 'London'
AND qty = 1000;
where P is the predicate city = 'London' and Q is the predicate qty = 1000, and we
know that
S(city = 'London') = .3, and
S(qty = 1000) = .6, then the selectivity of the entire query,
S(city = 'London' OR 'qty = 1000') = .3
×
.6 = .18.
Search WWH ::




Custom Search