Database Reference
In-Depth Information
(part, supplier, customer)
6 M
(ALL, supplier, customer)
5.5 M
(part, supplier, ALL)
800 K
(part, ALL, customer)
5.8 M
(ALL, ALL, customer)
100 K
(ALL, supplier, ALL)
10 K
(part, ALL, ALL)
200 K
(ALL, ALL, ALL)
1
FIGURE 9.2
Lattice of subcubes.
9.2.1.2
Answering Data Cube Queries
Suppose that we materialize all cells in the result of query (part, supplier,
customer) as a new table. This table might generally be different from the
raw table because all duplicate values of part , supplier , and customer are
aggregated in a single cell. We can answer any query by scanning this re-
sult and performing optional aggregation and filtering. For instance, consider
again query (part, ALL, c) . We can answer this query by scanning the ma-
terialized result of (part, supplier, customer) , keeping only the tuples
that satisfy customer=c and aggregating the resulting sales values grouped
by part .
Suppose now that we materialize the result of query (part, ALL,
customer) . In this case, each cell in this materialized subcube contains the
total amount of sales for each combination of part and customer over all
possible suppliers . Answering the original query (part, ALL, c) can be
done more eciently over the result of (part, ALL, customer) because this
table is potentially much smaller than (part, supplier, customer) due
to preaggregation (additionally, since the table is already preaggregated, the
query involves only a scan but no additional grouping and aggregation).
This example hints at the fact that we can use the result of some
query (materialized as a subcube) to speed up the processing of other
queries. For a cube with n dimensions, we can materialize 2 n differ-
ent subcubes, by replacing each subset of attributes with the value ALL .
Figure 9.2 shows an example of all subcubes for our running example, with the
number of cells in each subcube. The subcubes are laid out as a lattice, show-
ing which subcubes can be completely computed from others. For instance,
consider (ALL, ALL, customer) . We can compute this subcube by scanning
either (ALL, supplier, customer) or (part, ALL, customer) and subse-
quently grouping on customer. Note that if both alternatives are available,
it would be cheaper to do the processing over (ALL, supplier, customer)
Search WWH ::




Custom Search