from R, filters many tuples after the index seek (using the corresponding predicates in R) and then performs a very small number of lookups. As an example, consider an index request over table T with E = {(a, 100)}, R = {(b, 10), (c, 20)}, P = O = ∅, and A = {d}. A covering index I_C = T(a, b, c, d) scans all tuples satisfying the predicates on a, b, and c without requiring RID lookups. Consider the alternative index I_N = T(a, b, c). In this case, after obtaining tuples from the index we need to fetch column d from the primary index. The scan over the narrower (and therefore smaller) index I_N, however, might compensate for the additional RID lookups, especially if column d is wide.
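The trade-off between the covering index I_C and the narrower I_N can be made concrete with a rough cost sketch. The page size, tuple widths, and per-lookup cost below are illustrative assumptions, not figures from the text:

```python
# Back-of-envelope comparison (all constants are illustrative assumptions).
# Costs are abstract page I/Os; a real optimizer uses a far richer model.

def covering_index_cost(matching_tuples, tuple_width, page_size=8192):
    """Scan of a covering index such as I_C = T(a, b, c, d): a sequential
    read over every tuple matching the seek, with no RID lookups."""
    tuples_per_page = page_size // tuple_width
    return matching_tuples / tuples_per_page

def narrow_index_cost(matching_tuples, qualifying_tuples, narrow_width,
                      page_size=8192, lookup_cost=1.0):
    """Scan of a narrow index such as I_N = T(a, b, c), plus one RID lookup
    (assumed ~1 random I/O) per tuple that survives the b and c filters
    and still needs column d fetched from the primary index."""
    tuples_per_page = page_size // narrow_width
    return matching_tuples / tuples_per_page + qualifying_tuples * lookup_cost

# 100 tuples match the equality predicate on a; 10 of them also satisfy the
# predicates on b and c.  If d is wide (say ~2000 bytes), I_C holds ~2100-byte
# tuples vs. ~100 bytes for I_N, and the narrow index wins despite the lookups:
wide = covering_index_cost(100, tuple_width=2100)      # ~33.3 page I/Os
narrow = narrow_index_cost(100, qualifying_tuples=10,
                           narrow_width=100)           # ~11.2 page I/Os
print(wide > narrow)  # prints True
```

Shrinking d or increasing the number of qualifying tuples flips the comparison, which is exactly the sensitivity described above.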
In this case, assuming independence among predicates, it can be shown that the optimal index is either that of scenario 1 or has the same key columns as in scenario 1 followed by a prefix of the columns in R sorted by increasing cardinality (requiring an optional fetch to obtain the remaining columns). The best index can therefore be efficiently identified by progressively adding new columns from R to the included columns of the index until no further benefit is obtained.
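That search can be sketched directly. The greedy loop below mirrors the procedure just described; toy_cost is a hypothetical stand-in for the optimizer's cost model, not part of the original text:

```python
# Greedy search sketch: start from the scenario-1 key columns and keep
# appending columns from R (sorted by increasing cardinality) to the
# included columns while the estimated cost keeps improving.

def best_index_columns(key_columns, range_columns, cost):
    """range_columns: (column, cardinality) pairs from R.
    cost: pluggable estimate of plan cost for a given index shape."""
    candidates = sorted(range_columns, key=lambda pair: pair[1])
    included = []
    best_cost = cost(key_columns, included)
    for column, _ in candidates:
        trial = included + [column]
        trial_cost = cost(key_columns, trial)
        if trial_cost >= best_cost:   # no further benefit: stop
            break
        included, best_cost = trial, trial_cost
    return included, best_cost

# Hypothetical toy cost: each included range column filters ~10x fewer RID
# lookups, at the price of a slightly wider index scan.
def toy_cost(keys, included):
    lookups = 100
    for _ in included:
        lookups //= 10
    return (1 + len(keys) + len(included)) + lookups

# Running example: E = {(a, 100)}, R = {(b, 10), (c, 20)}.
print(best_index_columns(["a"], [("b", 10), ("c", 20)], toy_cost))
# → (['b', 'c'], 5)
```

The loop stops as soon as a column fails to pay for itself, which is how the text's "until no further benefit is obtained" condition is realized here.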
Scenario 3: O = ∅. If the index request contains non-sargable predicates (i.e., P ≠ ∅), the situation is more complex since there can be interaction between columns. For instance, a predicate a+b>10 can be evaluated when we consider an index for other sargable predicates over columns a and b. While the main ideas remain the same as for scenario 2 (i.e., we obtain the index that results in the best plan using seeks followed by optional RID lookups), the details are more complex. If the number of columns in R and P is not very large (as is commonly the case), an exhaustive approach over subsets of such columns is feasible. If this is not possible, we can use a greedy heuristic leveraging the fact that cardinality estimation routines typically use magic numbers to estimate cardinality values of complex predicates (e.g., the selectivity of predicate d*d = 100 might be guessed as 0.1). Although subsets of columns in P are different, they might all have roughly the same selectivity, so columns can be greedily picked to maximize the number of predicates in P covered by the index. Finally, note that in a very large number of cases, choosing the covering index results in either the optimal strategy or one very close to it, and some techniques heuristically choose such a covering index.
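Under that magic-number assumption, the greedy pick reduces to a small set-cover-style loop. The predicate encoding and the budget parameter below are hypothetical choices for illustration, not part of the original text:

```python
# Greedy column choice for non-sargable predicates: since complex predicates
# all receive similar magic-number selectivities, we simply pick, at each
# step, the column appearing in the most not-yet-covered predicates of P.
# `budget` (max columns to add) is a hypothetical knob.

def greedy_cover(predicates, budget):
    """predicates: one frozenset of referenced columns per predicate in P.
    A predicate is covered once all of its columns are in the index."""
    chosen = []
    while len(chosen) < budget:
        uncovered = [p for p in predicates if not p <= set(chosen)]
        if not uncovered:
            break
        counts = {}
        for pred in uncovered:
            for col in pred - set(chosen):
                counts[col] = counts.get(col, 0) + 1
        chosen.append(max(counts, key=counts.get))
    return chosen

# P = {a + b > 10, d * d = 100, a + d > 0}, encoded by referenced columns:
preds = [frozenset("ab"), frozenset("d"), frozenset("ad")]
print(sorted(greedy_cover(preds, budget=3)))  # → ['a', 'b', 'd']
```

Because all candidate subsets look alike to the cost model, counting covered predicates is a reasonable tie-breaking objective for this greedy step.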
Scenario 4: General case. Consider now the general case of an index request (T, E, R, P, O, A) (i.e., O ≠ ∅). If the index obtained using scenario 3 produces rows in the desired order O, this is the best plan overall. Otherwise, we need to introduce a sort operator at the root of the plan obtained with such an index. However, there might be an alternative plan that does not require sorting and is more efficient. To obtain this alternative plan, we create an index as follows. The key columns are all those in E followed by those in O − E (note that since columns in E are