Database Reference
In-Depth Information
the query must also be updated as a side effect. Therefore, it is not true
anymore that adding more indexes would always reduce the expected cost
of a workload. We next briefly explain how the different components in the
top-down approach change when updates are present in the workload.
Choosing the initial configuration:
When updates are present, the
initial configuration is not optimal anymore because indexes also need
to be updated, raising the overall workload cost. However, the resulting
configuration is still optimal for the
SELECT
component of each update
query, and we can use this fact to obtain a lower bound. Specifically, the
expected cost for the
SELECT
portion of the queries in the workload is
added to the cost of all the
UPDATE
shells under the base configuration
(which contains only indexes that must be present in any configura-
tion). We then obtain a cost that cannot be improved by any configura-
tion. This bound is not tight (i.e., there might be no configuration that
meets the lower bound) but offers valuable additional information to a
database administrator.
Evaluating the current configuration:
Updates can be handled by the
traditional approach of separating an update query into an inner
SELECT
and an
UPDATE
shell. Specifically, we calculate, for each index
I
and up-
date query
q
in the workload, the cost of updating
I
for each update
shell in the workload. Later, when evaluating a configuration, we first
consider the inner
SELECT
portion of an update query
q
and then add,
to the resulting partial cost, the update cost of all indexes in the con-
figuration over the table referenced by
q
.
Ranking candidate configurations:
For workloads with updates, the
upper-bound on the cost of a transformed configuration can be negative
(sometimes the cost of removing an index can be outweighed by the
benefit of not having to update it). In those cases the penalty function
correctly chooses transformations with negative over positive cost upper
bounds but sometimes makes poor decisions comparing two transforma-
tions with negative costs. If
T
t
1
=−
10,
S
t
1
=
10,
T
t
2
=−
20, and
S
t
2
=
30, then
penalty
(
C
t
1
)
=−
1 is smaller than
penalty
(
C
t
2
)
=−
2
/
3.
However, configuration
C
t
2
is clearly better than
C
t
1
in terms of both
space and cost (we say that
t
2
dominates
t
1
). To remedy those situa-
tions, we first obtain the skyline of candidate configurations (i.e., we
consider only candidate configurations that are not dominated by any
other transformation) and then use the original
penalty
definition over
this restricted subset. Also, note that the denominator in the definition
of
penalty
is given by min
. Since now we can relax
a configuration that requires less than
B
storage (see above), the de-
nominator might become negative, which is undesirable. However, in
those situations space is not relevant since the configuration already
fits in
B
. Therefore, when the current configuration requires less than
(
size
(
C
)
−
B
,
S
)
Search WWH ::
Custom Search