Databases Reference
In-Depth Information
were created for the table, would it be used and would it improve the performance of
the workload? To discover where additional indexing will be beneficial the definition
of a new index is added to the physical design of the table within the memory of the
system strictly for exploration purposes.
The index is not actually created, but rather the definition of the index along with
associated statistics is simulated within system memory, without any associated data.
When a query is processed the query optimizer will treat the virtual index as though it
were a real index, and include it as query execution plans are considered and evaluated.
From the point of view of the query compiler's cost-based optimization it does not mat-
ter at all that the index does not really exist. The compiler requires information about
the index (such as its key definitions and statistics) but does not require that the index
actually exists on disk. If the resulting estimated cost (calculated by the query compiler)
of the overall workload is lower than the estimated cost without the virtual index, then
the assumption is that the index has been beneficial to the workload. Naturally this
requires a special mode of execution for the query compiler where such virtual objects
can be attached and evaluated without queries actually being executed (i.e., where query
execution plans are evaluated, but data is not accessed and retrieved) similar to the
Explain processing described in Chapter 11.
In 1997 Chaudhuri and Narasayya published a landmark paper where they sug-
gested the use of the database's query optimizer to help automatically select indexes.
They proposed that the value of candidate indexes to the system workload could be esti-
mated by evaluating their impact on the access path cost, as simulated by the query
optimizer. The candidate indexes need not be fully materialized, but simply created in a
virtual sense with enough detail that the query optimizer would perceive as real within
its modeling of the cost and access path selection. Although their paper does not use the
terminology, in essence they suggested that the query optimizer be used as an “evalua-
tion function” for an optimization search.
The notion of an evaluation function is standard terminology in the artificial intel-
ligence (AI) community, and is used to evaluate the relative virtue of points considered
during the search. 3 The idea of reusing the database optimizer's cost estimations for
evaluating the cost benefit of physical design changes is based on the observation that
the query optimizer's cost modeling is sensitive to both the logical and physical design
of a database. Having a model for workload resource consumption allows us to exploit
3 Very commonly used in AI search schemes such as genetic algorithms, simulated annealing, hill
climbing, etc. In many optimization search algorithms the algorithm must have some way of eval-
uating the benefit to attribute to a current possible solution. This process is commonly referred to
as an “evaluation function.” In hill climbing this can be a measure of the slope between the current
position in the search and each of the candidate next steps. In genetic algorithms the evaluation
function measures relative “fitness” of member elements of the population, etc.
Search WWH ::




Custom Search