Figure 12.17. Time for analysis with and without workload compression.
“workload compression.” None of the existing techniques actually modifies the
input queries or statements. Rather, each chooses a subset of the given workload for tuning.
Chaudhuri et al. [2002] propose a different strategy that analyzes the workload
to detect classes of statements. A statement can be removed from the workload if
the workload contains another statement that is considered sufficiently similar
according to a “distance” function. Statements are first partitioned by the tables
they access and by their join columns. Statements with different tables or join
columns are considered infinitely distant and reside in different classes. Within a
class, the joint selectivity of the predicates is compared to estimate how similar
the queries in that class are to one another. A predetermined loss constraint is
defined (e.g., 10%), and statements are dropped from the classes with the aim of
removing the maximum number of statements while respecting the loss constraint.
This description is only a summary, and the reader is referred to the original
paper for the details.
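The partition-then-prune idea summarized above can be illustrated with a minimal Python sketch. It is not the algorithm from Chaudhuri et al. [2002]. The `Statement` fields, the `distance` function (the absolute gap in joint selectivity), the `max_distance` threshold, and the treatment of "loss" as the summed cost of dropped statements are all simplifying assumptions made for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    sql: str
    tables: frozenset        # tables the statement accesses
    join_columns: frozenset  # join columns the statement uses
    selectivity: float       # joint selectivity of its predicates, in [0, 1]
    cost: float              # estimated execution cost


def distance(a, b):
    """Hypothetical distance: infinite across classes, selectivity gap within one."""
    if (a.tables, a.join_columns) != (b.tables, b.join_columns):
        return float("inf")
    return abs(a.selectivity - b.selectivity)


def compress(workload, max_distance=0.05, loss_limit=0.10):
    """Greedily drop statements that have a close-enough retained neighbor.

    The total cost of dropped statements is kept within loss_limit of the
    workload's total cost (an assumed, simplified notion of "loss").
    """
    # Step 1: partition statements by accessed tables and join columns.
    classes = defaultdict(list)
    for stmt in workload:
        classes[(stmt.tables, stmt.join_columns)].append(stmt)

    budget = loss_limit * sum(s.cost for s in workload)
    kept = []
    for members in classes.values():
        representatives = []
        # Step 2: visit costly statements first so they become representatives.
        for stmt in sorted(members, key=lambda s: s.cost, reverse=True):
            similar = any(distance(stmt, r) <= max_distance for r in representatives)
            if similar and stmt.cost <= budget:
                budget -= stmt.cost  # drop it and charge its cost to the loss budget
            else:
                representatives.append(stmt)
        kept.extend(representatives)
    return kept
```

Calling `compress(workload)` on a list of `Statement` objects returns the retained subset. A statement is dropped only when a retained statement in the same class lies within `max_distance` and the remaining loss budget covers its cost. The greedy order is a design shortcut and does not guarantee that the maximum number of statements is removed.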
Two major techniques for workload compression are used by modern physical
database design utilities and published in the literature. One technique
[Zilio et al. 2004] focuses on the most complex and costly statements in a
workload and ignores the others. The rationale of this approach is that the gains that