• Distributive measures are defined by an aggregation function that can
be computed in a distributed way. Suppose that the data are partitioned
into n sets and that the aggregate function is applied to each set,
giving n aggregated values. The function is distributive if the result of
applying it to the whole data set is the same as the result of applying a
function (not necessarily the same) to the n aggregated values. The usual
aggregation functions such as the count, sum, minimum, and maximum are
distributive. However, the distinct count function is not. For instance, if we
partition the set of measure values {3, 3, 4, 5, 8, 4, 7, 3, 8} into the
subsets {3, 3, 4}, {5, 8, 4}, and {7, 3, 8}, summing up the result of the
distinct count function applied to each subset gives us a result of 8, while
the answer over the original set is 5.
• Algebraic measures are defined by an aggregation function that can be
expressed as a scalar function of distributive ones. A typical example of an
algebraic aggregation function is the average, which can be computed by
dividing the sum by the count, the latter two functions being distributive.
• Holistic measures are measures that cannot be computed from other
subaggregates. Typical examples include the median, the mode, and the
rank. Holistic measures are expensive to compute, especially when data
are modified, since they must be computed from scratch.
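The three classes of measures can be checked on the example set with a short Python sketch. The partition is the one used in the text; the variable names and the use of the median as the holistic example are illustrative choices, not part of the original.

```python
from statistics import median

# Measure values and their partition, as in the example from the text.
values = [3, 3, 4, 5, 8, 4, 7, 3, 8]
partitions = [[3, 3, 4], [5, 8, 4], [7, 3, 8]]

# Distributive: the sum over the whole set equals the sum of the partial sums.
partial_sums = [sum(p) for p in partitions]
total_sum = sum(partial_sums)

# Distributive, but combined with a different function:
# the overall count is the SUM of the partial counts.
partial_counts = [len(p) for p in partitions]
total_count = sum(partial_counts)

# Minimum and maximum are combined with themselves.
total_min = min(min(p) for p in partitions)
total_max = max(max(p) for p in partitions)

# Distinct count: adding the partial results overcounts values
# that occur in more than one subset.
summed_distinct = sum(len(set(p)) for p in partitions)
true_distinct = len(set(values))

# Algebraic: the average is a scalar function of two distributive aggregates.
average = total_sum / total_count

# Holistic: combining the partial medians does not, in general,
# give the median of the whole set.
median_of_medians = median(median(p) for p in partitions)
true_median = median(values)

print(summed_distinct, true_distinct)   # distinct count from parts vs. whole
print(average)
print(median_of_medians, true_median)
```

In this sketch the distinct counts differ (8 from the parts, 5 over the whole set), and the median of the partial medians differs from the true median, which is why a holistic measure has to be recomputed from the underlying data.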
3.2 OLAP Operations
As noted earlier, a fundamental characteristic of the multidimensional model
is that it allows one to view data from multiple perspectives and at several
levels of detail. The OLAP operations allow these perspectives and levels of
detail to be materialized by exploiting the dimensions and their hierarchies,
thus providing an interactive data analysis environment.
Figure 3.4 presents a possible scenario that shows how an end user can
operate over a data cube in order to analyze data in different ways. Later
in this section, we present the OLAP operations in detail. Our user starts
from Fig. 3.4a, a cube containing quarterly sales quantities (in thousands) by
product categories and customer cities for the year 2012.
The user first wants to compute the sales quantities by country. For this,
she applies a roll-up operation to the Country level along the Customer
dimension. The result is shown in Fig. 3.4b. While the original cube contained
four values in the Customer dimension, one for each city, the new cube
contains two values, each one corresponding to one country. The remaining
dimensions are not affected. Thus, the values in cells pertaining to Paris and
Lyon in a given quarter and category contribute to the aggregation of the
corresponding values for France. The computation of the cells pertaining to
Germany proceeds analogously.
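The roll-up just described can be sketched in Python as follows. The cube contents are hypothetical: the quantities, the category name, and the two German city names are made up for illustration (the text only states that Paris and Lyon belong to France), and the sketch assumes that sales quantities are aggregated by summation.

```python
from collections import defaultdict

# City -> Country level of the Customer hierarchy.
# Paris and Lyon are from the text; the German cities are assumed.
city_to_country = {
    "Paris": "France",
    "Lyon": "France",
    "Berlin": "Germany",
    "Koln": "Germany",
}

# Hypothetical cube cells: (quarter, category, city) -> quantity in thousands.
cube = {
    ("Q1", "Beverages", "Paris"): 21,
    ("Q1", "Beverages", "Lyon"): 12,
    ("Q1", "Beverages", "Berlin"): 33,
    ("Q1", "Beverages", "Koln"): 24,
    ("Q2", "Beverages", "Paris"): 27,
    ("Q2", "Beverages", "Lyon"): 14,
    ("Q2", "Beverages", "Berlin"): 28,
    ("Q2", "Beverages", "Koln"): 18,
}


def roll_up_customer(cells, hierarchy):
    """Roll up the Customer dimension from city to its parent level.

    Quarter and category are left untouched; quantities of all cities
    mapped to the same parent are summed (assumed aggregation function).
    """
    result = defaultdict(int)
    for (quarter, category, city), quantity in cells.items():
        result[(quarter, category, hierarchy[city])] += quantity
    return dict(result)


by_country = roll_up_customer(cube, city_to_country)
for cell, quantity in sorted(by_country.items()):
    print(cell, quantity)
```

Each (quarter, category) pair now has two cells instead of four, one per country, while the other dimensions keep their original members.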