Automated Physical Database Design - Physical Database Design

Databases Reference

In-Depth Information

fit of exploring a much larger set of indexes was found to outweigh the minor

confusion on the relative contribution of benefit of the given indexes to queries

where more than one virtual index was selected by the optimizer.

3.

A time-constrained random variation algorithm was inserted as a postprocess-

ing phase to allow the system to evaluate variations in the solution, if the user-

specified longer evaluation time was acceptable.

Both the Microsoft and IBM papers suggested using cost estimates provided by the

database optimizer as part of the evaluation engine of an index advisor that recommends

table indexes. The database management system (DBMS) is taught to consider “virtual”

indexes within its query compiler, resulting in an effective costing of query perfor-

mance.

The empirical results for this technique were found to be quite good for index

selection leading to production of the features and continued research in expanding the

methods. Just as virtual indexes can be attached, similarly most other physical design

attributes can be simulated and evaluated based on the query compiler's costing estima-

tions. Materialized views, clustering, multidimensional clustering, and partitioning are

all good examples.

The process is far from perfect. It suffers from a number of issues that researchers

have been working to minimize or solve in recent years, including:

1.

Like any model of a physical system, the model used by query compilers for cost-

based query execution plan selection is necessarily incomplete and imperfect. Just

these gaps can lead to suboptimal query execution plan selection during normal

operational processing of queries, similarly they can lead to suboptimal designs

when used for what-if analysis in physical database design.

2.

Physical database design choices can impact the performance of write operation

very significantly (e.g. secondary indexes have a negative impact on INSERT

performance). The challenge here is trading off the benefit of physical design

choices for SELECT queries in the workload against its negative impact for

write operations. This trade-off can be difficult in particular because the relative

priority of query performance versus INSERT, UPDATE, and DELETE per-

formance is not easily known or discovered by the automated design utility.

3.

Query compilers evaluate the cost for query execution plans for one query at a

time. Physical design choices must be made on a global basis, discovering not

what is good or bad for a single query, but rather what is best overall. Often the

best design choice is suboptimal for any single SQL statement, making it

extremely difficult for an optimization process focused on “best” to detect.

4.

Query optimization is not generally a time-consuming process, but if the what-

if analysis were performed for every query in a large workload for every possible

Physical Database Design

Search WWH ::

Custom Search

Home