more slowly! Had the wizard been given both the queries and the inserts to analyze, it might have recommended only one or two indexes, solving the bulk of the problem (speeding the queries by, say, a factor of 80) while incurring a much smaller penalty for insert processing. In fact, the best tools are designed precisely to evaluate these tradeoffs, but they can only do so if given a broad view of the workload that the system will encounter.
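The tradeoff described above can be sketched with a toy cost model; the function, the statement counts, and the per-statement costs below are all illustrative assumptions, not figures from the text (only the 80x query speedup is taken from it):

```python
# A hypothetical cost model: an index speeds up the queries it serves
# but adds overhead to every insert that must maintain it.

def workload_cost(n_queries, n_inserts, query_cost, insert_cost,
                  query_speedup=1.0, insert_penalty=1.0):
    """Total cost of a daily workload under a candidate index design."""
    return (n_queries * query_cost / query_speedup
            + n_inserts * insert_cost * insert_penalty)

# Without any index (assumed unit costs).
base = workload_cost(1_000, 50_000, query_cost=80.0, insert_cost=1.0)

# One well-chosen index: queries 80x faster, inserts 20% slower (assumed).
tuned = workload_cost(1_000, 50_000, query_cost=80.0, insert_cost=1.0,
                      query_speedup=80.0, insert_penalty=1.2)

print(base, tuned)  # the tuned design wins only because the insert
                    # penalty is weighed against the query gain
```

A wizard shown only the queries would maximize `query_speedup` at any `insert_penalty`; seeing both sides of the workload is what lets it minimize the total.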
The need to provide design utilities with a broad view of the workload raises serious issues of its own. Many real-world workloads include thousands, if not hundreds of thousands, of statements executed per day. Some statements may differ only slightly in their syntax, while others are radically different. This poses a serious problem for the utilities, since the time required to analyze the workload and produce recommendations grows proportionally, and sometimes exponentially, with the size of the workload, the number of tables, and the number of predicates and joins. The impact can be dramatic, causing the analysis to consume several days of execution time for even a modest workload of a few dozen statements.
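The exponential growth is easy to see from the size of the search space alone. The sketch below (column names are made up for illustration) enumerates every subset of a table's columns as a candidate index key set; an exhaustive design search over n columns faces 2^n such alternatives, before multiplying by the statements in the workload:

```python
from itertools import combinations

def index_candidates(columns):
    """Enumerate every subset of columns as a candidate index key set.
    An exhaustive search over n columns has 2**n alternatives to cost."""
    cands = []
    for k in range(len(columns) + 1):
        cands.extend(combinations(columns, k))
    return cands

print(len(index_candidates(["region", "qty", "date", "product"])))  # 16 = 2**4
print(len(index_candidates(["c%d" % i for i in range(20)])))        # 1048576 = 2**20
```

Doubling the column count squares the candidate space, which is why the utilities cannot afford to evaluate every alternative naively.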
In an internal test using two different database products,⁵ physical design utilities were compared against the same database data and the same workloads to test their reaction to workload size. The utilities searched for both indexes and materialized views.
The tests used a synthetic workload against a classic MOLAP schema. The fact table
had 8 measures and 16 hierarchical dimensions. In this experiment the design utilities
were initially given 40 statements to analyze, and the workload size was subsequently increased to 60, 80, 100, and 150 statements. Each time the workload was increased, the design utilities were rerun and the time to generate a recommendation was recorded. As shown in Figure 12.17, the utility with workload compression showed a linear increase in recommendation time, scaling from roughly 20 minutes with 40 statements in the workload to over an hour with 150 statements. In contrast, the utility without workload compression performed similarly on the initial 40-statement workload, but as the workload size increased, its recommendation time grew exponentially. With 150 statements, the utility without workload compression required over four days of analysis before returning a result!
There are a number of techniques that vendors use to avoid this explosive growth in the execution time of the wizards. The first is to use an intelligent search algorithm that traverses the domain of possible solutions, trying to find an optimal or near-optimal solution without evaluating every possible alternative. Various techniques are used for this, including random sampling, genetic algorithms, and Z-order evaluation. Another key strategy is to reduce the complexity of the workload being analyzed in a way that does not significantly degrade the quality of the analysis and recommendations. Doing so is called workload compression.
⁵ Due to licensing agreements, the names of the database products are deliberately omitted. The tests were courtesy of Scott J. Fadden, IBM Systems & Technology Group, Development GPFS Performance Engineer.
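One simple way to compress a workload, sketched below under the assumption that statements differing only in their literal constants can share one analysis, is to normalize each statement into a template and keep a single weighted representative per template (the regular expressions and the sample statements are illustrative, not from the text):

```python
import re
from collections import Counter

def template(sql):
    """Replace literals with placeholders so statements that differ
    only in constants collapse to one template (a naive sketch)."""
    t = re.sub(r"'[^']*'", "?", sql)   # string literals
    t = re.sub(r"\b\d+\b", "?", t)     # numeric literals
    return re.sub(r"\s+", " ", t).strip().lower()

def compress(workload):
    """Return one representative per template, weighted by frequency,
    so the design utility analyzes templates instead of raw statements."""
    return Counter(template(s) for s in workload)

workload = [
    "SELECT * FROM sales WHERE region = 'EAST' AND qty > 10",
    "SELECT * FROM sales WHERE region = 'WEST' AND qty > 25",
    "INSERT INTO sales VALUES (1, 'EAST', 30)",
]
print(len(compress(workload)))  # 2 templates instead of 3 statements
```

On a real workload of hundreds of thousands of statements, collapsing near-duplicates this way is what turns an exponential analysis into the near-linear behavior reported for the compressing utility.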