Physical Data Warehouse Design - Data Warehouse Systems: Design and Implementation

Database Reference

In-Depth Information

table has a quantity value equal to the maximum in the view so the maximum

value remains the same and the counter is increased to 7. The tuple for p4 in

the summary-delta table has a maximum value greater than the maximum

in the view, and thus, the view must be updated with the new maximum and

the new counter.

Now consider the deletions. The tuple for p5 in the summary-delta table

has a quantity value smaller than the maximum in the view so the view is not

modified. The tuple for p6 in the summary-delta table has a quantity value

equal to the maximum in the view but with a greater count value. In this

case, we decrease the counter in the view to 1. The tuple for p7 illustrates

why the MAX function is not self-maintainable with respect to deletions.

The maximum value and the counter in the summary-delta table are equal

to those value in the view. There are two possible cases. If there are other

tuples in the base table with the same combination ( p7 , t7 ), we must obtain

the new maximum value and the new count from the base tables. This case

is depicted in Fig. 7.5 c. Otherwise, if there are no other tuples in the base

table with the same combination ( p7 , t7 ), we must simply delete the tuple

from the view.

The algorithm for refreshing the view DailySalesMax from the summary-

delta table SD DailySalesMax is left as an exercise.

7.4 Computation of a Data Cube

In Chap. 5 , we have explained how the data cube could be computed by

means of an SQL query, where the all value is represented by the null value.

Computing the whole data cube in this way from the base fact and dimension

tables could become extremely hard unless an adequate strategy is applied.

The simplest method, consisting in performing the GROUP BY queries for

each view and then taking their UNION , would be unacceptable in real-life

applications. Thus, several optimization techniques have been proposed for

this. We study next some of them in order to convey the main idea.

The optimization methods start with the notion of data cube lattice .

In this lattice, each node represents a possible aggregation of the fact data,

where there is an edge from node i to node j if j can be computed from i and

the number of grouping attributes of i is the number of attributes of j plus

one. For instance, given an aggregate view by CustomerKey and ProductKey

of the Sales table of the previous section, we can compute the total sales

amount by customer directly from this view, without computing it from the

base table. In what follows, to avoid overloading figures, we will work with the

lattice depicted in Fig. 7.6 , corresponding to a four-dimensional data cube,

with dimensions A , B , C ,and D . In this lattice, an edge from ABC to AB means

that the summary table AB can be computed from ABC . We do not include

in the lattice the transitive edges, for example, edges like ABCD

→

AB .

Data Warehouse Systems: Design and Implementation

Search WWH ::

Custom Search

Home