Databases Reference
In-Depth Information
fit of exploring a much larger set of indexes was found to outweigh the minor
confusion on the relative contribution of benefit of the given indexes to queries
where more than one virtual index was selected by the optimizer.
3.
A time-constrained random variation algorithm was inserted as a postprocess-
ing phase to allow the system to evaluate variations in the solution, if the user-
specified longer evaluation time was acceptable.
Both the Microsoft and IBM papers suggested using cost estimates provided by the
database optimizer as part of the evaluation engine of an index advisor that recommends
table indexes. The database management system (DBMS) is taught to consider “virtual”
indexes within its query compiler, resulting in an effective costing of query perfor-
mance.
The empirical results for this technique were found to be quite good for index
selection leading to production of the features and continued research in expanding the
methods. Just as virtual indexes can be attached, similarly most other physical design
attributes can be simulated and evaluated based on the query compiler's costing estima-
tions. Materialized views, clustering, multidimensional clustering, and partitioning are
all good examples.
The process is far from perfect. It suffers from a number of issues that researchers
have been working to minimize or solve in recent years, including:
1.
Like any model of a physical system, the model used by query compilers for cost-
based query execution plan selection is necessarily incomplete and imperfect. Just
these gaps can lead to suboptimal query execution plan selection during normal
operational processing of queries, similarly they can lead to suboptimal designs
when used for what-if analysis in physical database design.
2.
Physical database design choices can impact the performance of write operation
very significantly (e.g. secondary indexes have a negative impact on INSERT
performance). The challenge here is trading off the benefit of physical design
choices for SELECT queries in the workload against its negative impact for
write operations. This trade-off can be difficult in particular because the relative
priority of query performance versus INSERT, UPDATE, and DELETE per-
formance is not easily known or discovered by the automated design utility.
3.
Query compilers evaluate the cost for query execution plans for one query at a
time. Physical design choices must be made on a global basis, discovering not
what is good or bad for a single query, but rather what is best overall. Often the
best design choice is suboptimal for any single SQL statement, making it
extremely difficult for an optimization process focused on “best” to detect.
4.
Query optimization is not generally a time-consuming process, but if the what-
if analysis were performed for every query in a large workload for every possible
Search WWH ::




Custom Search