Incorporating Other Physical Structures - Automated Physical Database Design and Tuning

Database Reference

In-Depth Information

partitioning scheme with the highest benefit. We additionally define the

neighbors of a configuration C as all configurations that differ with C in

exactly one partitioning function. The enumeration algorithm proceeds

as follows. We keep a priority queue of unexplored configurations, which

we first populate with the initial configuration already described. Then,

until a time limit is reached, we (1) remove from the priority queue the

highest-ranked configuration, (2) perform a what-if optimization to ob-

tain the true benefit value of the configuration as described in Chapter 5,

(3) keep the current configuration if it is the best so far, and (4) insert

in the priority queue all unexplored configurations that are neighbors

of the current one. While we can use the benefit of a configuration C as

the ranking function in the priority queue, a refined formulation gives

better empirical results. In addition to the benefit of a configuration,

this refined rank of a configuration gives priority to partitions of large

tables and penalizes small incremental changes. Section 9.6 discusses

references that provide additional details on this ranking function as

well as various optimizations to the basic scheme.

9.1.2.2

Integrated Recommendation of Partitioning

Recommending horizontal partitioning in isolation makes sense in a parallel

DBMS, where the biggest performance overhead results from data movement

across servers. In traditional centralized settings, however, it is much more

important to integrate the recommendation of partitioning (both horizontal

and vertical) with that of other physical structures, like indexes and ma-

terialized views. The reason is that different aspects of physical design can

strongly interact. Experimental evaluation in the literature compares the best

recommendation obtained by two alternative procedures. The first one se-

lects the best (nonpartitioned) indexes and subsequently recommends how to

best horizontally partition them. The second alternative considers indexes and

horizontal partitioning together and returns the best recommendation overall.

Even for simple TPC-H queries, it has been observed that the execution times

under the recommendation produced by the integrated alternative can be sig-

nificantly faster than those under the alternative that recommends physical

structures in isolation. We next describe how we can recommend partitioning,

indexes, and materialized views in an integrated way.

Candidate generation. Generating candidates for vertical partitions is

complex because there are many different ways to divide the columns

of each table. To avoid a combinatorial explosion of candidates, we ap-

ply a pruning technique similar to that in Section 8.4.1 in the context

of materialized views. Specifically, the idea is to consider a subset of

columns C for partitioning only if the fraction of the cost of all queries

in the workload that reference C is above some threshold (and there-

fore C is deemed interesting). It has been shown experimentally that

Automated Physical Database Design and Tuning

Search WWH ::

Custom Search

Home