Query Optimization and Plan Selection - Physical Database Design

Databases Reference

In-Depth Information

S(A = a ) = 1/card A (R).

3.1

For example, if the table has 50 records, then the selectivity is 1/50 or 0.02.

On the other hand, if attribute A is not a primary key and has multiple occurrences

for each value a , then we can also use Equation 3.1 to estimate the selectivity, but we

must acknowledge that we are guessing that the distribution of values is uniform. Some-

times this is a poor estimate, but generally it is all we can do without actual distribution

data to draw upon. For example, if there are 25 cities out of 200 suppliers in the sup-

plier table in Example Query 3.1, then the number of records with 'NY' is estimated to

be card city (supplier) = 200/25 = 8. The selectivity of 'NY' is 1/card city (supplier) = 1/8 =

0.125. In reality, the number of records was given in the example to be 10%, so in this

case our estimate is pretty good, but it is not always true.

The selectivity of an attribute A being greater than (or less than) a specific value “ a ”

also depends on a uniform distribution (random probability) assumption for our esti-

mation:

S(A > a ) = (max A (R) - a )/(max A (R) - min A (R)).

3.2

S(A < a ) = ( a - min A (R))/(max A (R) - min A (R)).

3.3

The selectivity of two intersected selection operations (predicates) on the same

table can be estimated exactly if the individual selectivities are known:

×

S(P and Q) = S(P)

S(Q),

3.4

where P and Q are predicates.

So if we have the query

SELECT city, qty

FROM shipment

WHERE city = 'London'

AND qty = 1000;

where P is the predicate city = 'London' and Q is the predicate qty = 1000, and we

know that

S(city = 'London') = .3, and

S(qty = 1000) = .6, then the selectivity of the entire query,

S(city = 'London' OR 'qty = 1000') = .3

×

.6 = .18.

Search WWH ::

Custom Search

Home