Database Reference
In-Depth Information
the query must also be updated as a side effect. Therefore, it is not true
anymore that adding more indexes would always reduce the expected cost
of a workload. We next briefly explain how the different components in the
top-down approach change when updates are present in the workload.
Choosing the initial configuration: When updates are present, the
initial configuration is not optimal anymore because indexes also need
to be updated, raising the overall workload cost. However, the resulting
configuration is still optimal for the SELECT component of each update
query, and we can use this fact to obtain a lower bound. Specifically, the
expected cost for the SELECT portion of the queries in the workload is
added to the cost of all the UPDATE shells under the base configuration
(which contains only indexes that must be present in any configura-
tion). We then obtain a cost that cannot be improved by any configura-
tion. This bound is not tight (i.e., there might be no configuration that
meets the lower bound) but offers valuable additional information to a
database administrator.
Evaluating the current configuration: Updates can be handled by the
traditional approach of separating an update query into an inner SELECT
and an UPDATE shell. Specifically, we calculate, for each index I and up-
date query q in the workload, the cost of updating I for each update
shell in the workload. Later, when evaluating a configuration, we first
consider the inner SELECT portion of an update query q and then add,
to the resulting partial cost, the update cost of all indexes in the con-
figuration over the table referenced by q .
Ranking candidate configurations: For workloads with updates, the
upper-bound on the cost of a transformed configuration can be negative
(sometimes the cost of removing an index can be outweighed by the
benefit of not having to update it). In those cases the penalty function
correctly chooses transformations with negative over positive cost upper
bounds but sometimes makes poor decisions comparing two transforma-
tions with negative costs. If
T t 1 =− 10,
S t 1 = 10,
T t 2 =− 20, and
S t 2 = 30, then penalty ( C t 1 ) =− 1 is smaller than penalty ( C t 2 ) =− 2 / 3.
However, configuration C t 2 is clearly better than C t 1 in terms of both
space and cost (we say that t 2 dominates t 1 ). To remedy those situa-
tions, we first obtain the skyline of candidate configurations (i.e., we
consider only candidate configurations that are not dominated by any
other transformation) and then use the original penalty definition over
this restricted subset. Also, note that the denominator in the definition
of penalty is given by min
. Since now we can relax
a configuration that requires less than B storage (see above), the de-
nominator might become negative, which is undesirable. However, in
those situations space is not relevant since the configuration already
fits in B . Therefore, when the current configuration requires less than
(
size
(
C
)
B
,
S
)
Search WWH ::




Custom Search