Database Reference
In-Depth Information
Q2 = SELECT A, B, D
FROM R
WHERE 50<A<100
In this case, the best index for Q1 is I 1 =R(A|B,C) and the best index for Q2
is I 2 =R(A|B,D) . In the absence of storage constraints, obviously the best con-
figuration is
. Now, suppose that the space taken by I 1 and I 2 together
exceeds the storage budget. If the candidate set is
{
I 1 ,
I 2 }
, the only alternative
is to pick either I 1 or I 2 as the final configuration. In such scenario, either Q1
or Q2 would result in optimal performance, but the performance of the other
query would not improve much. For instance, suppose that we pick I 1 . The
best plan for Q2 would be either the original one (which scans R and filters
tuples based on the predicate on A ) or an alternative one that uses I 1 to ob-
tain columns A and B and then performs RID lookups to the primary index
for each tuple in the result to obtain column D . Clearly, if many tuples satisfy
50<A<100 , the index-based alternative would be more expensive than the
original scan, and Q2 would not benefit at all from I 1 .
Consider now index I 3 =R(A|B,C,D) . This index is almost as good as I 1 and
I 2 for queries Q1 and Q2 . Both queries can seek I 3 to obtain tuples that satisfy
predicates on A , and the index covers all columns required by both Q1 and Q2 .
Since I 3 includes one more column than either I 1 or I 2 , seeking I 3 would be
slightly more expensive than doing so for the narrower indexes. However, I 3 is
smaller than I 1 and I 2 combined and therefore could fit in the storage budget,
and the overall performance for both Q1 and Q2 , while not optimal, could be
better than the alternatives discussed before (which would pick either I 1 or
I 2 ). In this case, an index that was not optimal for either Q1 or Q2 is part of
the optimal configuration for the workload
{
I 1 ,
I 2 }
{ Q1,Q2 }
.
4.2.1.2
Updates
Updates make the physical design problem harder. The reason is that indexes
that are very useful in the absence of updates can lose all their benefits when
we additionally consider maintenance costs. Consider the following UPDATE
query:
Q3 = UPDATE R
SETD=D+1
WHEREB<5
Updates are composed of an inner SELECT query, which determines which tu-
ples to update, and an UPDATE shell, which performs the actual update. For
that reason, the techniques in the previous section can be used on the inner
SELECT query to obtain indexes that can improve the performance of the up-
date. For Q3 , index I 4 = R(B) would be optimal to obtain the RIDs of all
tuples in R that satisfy B<5 and thus need to be updated.
Search WWH ::




Custom Search