Database Reference
In-Depth Information
even a very conservative cost fraction of 2% results in a large fraction
of column subsets being pruned (because many column subsets are not
referenced often enough or are mentioned only in cheap queries). To
further decrease the set of candidates we rank column subsets by some
effectiveness metric. A possible definition of effectiveness for a column
subset is given by the VPC metric, which captures the fraction of the
scanned data in a partition that would be useful in answering queries
( VPC is short for vertical partitioning confidence , and it is discussed
in detail in the references provided in Section 9.6). We then take the
top- k highest-ranked candidate subsets for each table in the workload
and can generate a candidate vertical partitioning. Specifically, the can-
didate vertical partitioning based on column subset C contains C and
the table key as one vertical partition and the remaining columns in
the table (plus the key) as another vertical partition. Every vertical
partitioning candidate therefore consists of exactly two partitions. We
also consider the case where the table is not vertically partitioned. We
next generate nonpartitioned candidate indexes and views as described
in Chapters 4 and 8 and additionally consider horizontally partitioning
each such candidate index and view. For range partitioning, we use the
specific values from query range predicates as boundary points. For hash
partitioning, we select number of partitions such that it is a multiple of
the number of processors and each partition fits into memory. Finally,
for each vertical partitioning candidate, we consider candidate indexes
and views on such vertical partition candidates if they are well defined
(i.e., if the set of columns of the index or view belongs to a single vertical
partition).
Merging. To explore the space of merged structures, we iterate over the
given set of candidates similarly to what we do in Section 8.4.1 in the
context of materialized views. At each iteration, we generate all candi-
dates that can be obtained by merging a pair of current candidates. We
then pick the best merged structures, replace their parent candidates,
and repeat this process. Thus, we return the maximal merged struc-
tures that can be obtained by repeatedly merging pairs of structures,
which are added to the original set of candidates. We next describe
how to merge a pair of physical design structures in the presence of
vertical and horizontal partitioning. First, we generate interesting ver-
tical partitioning candidates by merging vertical partitions that are the
output of the candidate selection step. Merging two partitioning candi-
dates involves manipulating two subsets of columns. There are different
ways to generate merged partitioning candidates, which involve per-
forming unions and intersections of the column subsets of the parent
candidates. Then, for each single vertical partition (including the new
merged ones), we merge all indexes and materialized views that are rele-
vant for that vertical partition while taking horizontal partitioning into
Search WWH ::




Custom Search