interests for analysis. Measures are included in
the bottom section of the fact. For instance, each
invoice line is measured by the number of units
sold, the price per unit, the net amount, etc. The
reason why measures should be numerical is that
they are used for computations. A fact may also
have no measures, if the only interesting thing to
be recorded is the occurrence of events; in this case
the fact schema is said to be empty and is typically
queried to count the events that occurred.
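A minimal sketch of how an empty fact might be queried, assuming a hypothetical attendance fact (not part of the invoice example) that records only that a student attended a course on a date. The names attendance and count_events are illustrative assumptions.

```python
from collections import Counter

# Hypothetical event-only fact: each tuple is (student, course, date).
# There are no measures; only the occurrence of the event is recorded.
attendance = [
    ("ann", "databases", "2024-03-01"),
    ("bob", "databases", "2024-03-01"),
    ("ann", "statistics", "2024-03-02"),
]


def count_events(events, position):
    """Count the recorded events, grouped by the value at `position`."""
    return Counter(event[position] for event in events)


# Number of attendance events per course (position 1 in each tuple).
events_per_course = count_events(attendance, 1)
```

Since no measure exists to sum or average, counting is the only aggregation this sketch offers, which matches the way an empty fact schema is typically queried.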
A dimension is a fact property with a finite
domain, and describes one of its analysis coordinates.
The set of dimensions of a fact determines
its finest representation granularity. Graphically,
dimensions are represented as circles attached to
the fact by straight lines. Typical dimensions for
the invoice fact are product, customer, and agent. Usually
one of the dimensions of the fact represents
time (at any granularity), which is necessary to
extract time series from the DW data.
The relationship between measures and dimen-
sions is expressed, at the instance level, by the
concept of event. A primary event is an occurrence
of a fact, and is identified by a tuple of values,
one for each dimension. Each primary event is
described by one value for each measure. Primary
events are the finest-grained information that can
be represented (in the cube metaphor, they correspond
to the cube cells). In the invoice example
they model the invoicing of one product to one
customer made by one agent on one day.
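The relationship between dimensions, measures, and primary events can be sketched as a mapping from coordinate tuples to measure values. This is only an illustration under assumed names: Coordinates, Measures, primary_events, and all sample values are hypothetical, and a date dimension is assumed alongside product, customer, and agent.

```python
from typing import NamedTuple


class Coordinates(NamedTuple):
    """One value per dimension; together they identify a primary event."""
    product: str
    customer: str
    agent: str
    date: str


class Measures(NamedTuple):
    """One value per measure; together they describe a primary event."""
    units_sold: int
    net_amount: float


# Each key is a cube cell; a dictionary guarantees that a tuple of
# dimension values identifies at most one primary event.
primary_events = {
    Coordinates("milk", "acme", "rossi", "2024-03-01"): Measures(10, 12.50),
    Coordinates("apple", "acme", "rossi", "2024-03-01"): Measures(7, 5.60),
}

cell = primary_events[Coordinates("milk", "acme", "rossi", "2024-03-01")]
```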
Aggregation is the basic OLAP operation,
since it allows significant information to be
summarized from large amounts of data. From a
conceptual point of view, aggregation is carried
out on primary events thanks to the definition of
dimension attributes and hierarchies. A dimension
attribute is a property, with a finite domain, of a
dimension. Like dimensions, it is represented by
a circle. For instance, a product is described by
its type, category, and brand; a customer, by its
city and its nation.
The relationships between dimension attributes
are expressed by hierarchies. A hierarchy is a di-
rected graph, rooted in a dimension, whose nodes
are all the dimension attributes that describe that
dimension, and whose arcs model many-to-one
associations between pairs of dimension attributes.
Arcs are graphically represented by straight lines.
Hierarchies should reproduce the pattern of inter-
attribute functional dependencies expressed by the
data source. Hierarchies determine how primary
events can be aggregated into secondary events
and meaningfully selected for the decision-making
process. Given a set of dimension attributes, each
tuple of their values identifies a secondary event
that aggregates all the corresponding primary
events. Each secondary event is described by a
value for each measure, which summarizes the values
taken by the same measure in the corresponding
primary events.
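As a rough sketch of this aggregation, the code below groups primary events by the values of a chosen set of dimension attributes and summarizes one measure. Several things are assumptions made for illustration: the sample data, the names category_of and roll_up, the reduction of the fact to two dimensions (product, customer) with a single measure (units sold), and the choice of sum as the summarizing operator.

```python
from collections import defaultdict

# Many-to-one association from the product dimension to its category
# attribute (hypothetical sample values).
category_of = {"milk": "dairy", "yogurt": "dairy", "apple": "fruit"}

# Primary events: (product, customer) -> units sold.
primary_events = {
    ("milk", "acme"): 10,
    ("yogurt", "acme"): 4,
    ("apple", "acme"): 7,
    ("milk", "zenith"): 3,
}


def roll_up(events, to_group):
    """Aggregate primary events into secondary events.

    `to_group` maps the coordinates of a primary event to the tuple of
    dimension-attribute values that identifies its secondary event.
    The measure is summarized by summation (an assumption).
    """
    secondary = defaultdict(int)
    for coordinates, units in events.items():
        secondary[to_group(coordinates)] += units
    return dict(secondary)


# Secondary events identified by (category, customer).
by_category_customer = roll_up(
    primary_events, lambda c: (category_of[c[0]], c[1])
)
```

Here the two primary events for milk and yogurt sold to the same customer collapse into one secondary event for the dairy category, because both products map to the same category value.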
The dimension in which a hierarchy is rooted
defines its finest aggregation granularity, while
the other dimension attributes progressively define
coarser ones. For instance, thanks to the existence
of a many-to-one association between products
and their categories, the invoicing events may be
grouped according to the category of the products.
When two nodes a1, a2 of a hierarchy share the same
descendant a3 (i.e., when two dimension attributes
within a hierarchy are connected by two or more
alternative paths of many-to-one associations), this
is a case of convergence, meaning that for each
instance of the hierarchy we can have different
values for a1 and a2, but only one value
for a3. For example, in the geographic hierarchy on
dimension customer (Figure 2): customers live in
cities, which are grouped into states belonging to
nations. Suppose that customers are also grouped
into sales districts, and that no inclusion relation-
ships exist between districts and cities/states; on
the other hand, sales districts never cross national
boundaries. In this case, each customer belongs
to exactly one nation whichever of the two paths
is followed (customer→city→state→nation or
customer→sales district→nation).
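The convergence condition can be checked on sample data by following both paths and comparing the nation reached. The sketch below is an illustration only: the mappings, their placeholder values, and the helper names follow and converges are all assumptions, with each dictionary standing for one many-to-one association.

```python
# Hypothetical many-to-one associations, one dictionary per arc.
city_of = {"c1": "city_a", "c2": "city_b"}
state_of = {"city_a": "state_x", "city_b": "state_y"}
nation_of_state = {"state_x": "nation_1", "state_y": "nation_2"}
district_of = {"c1": "district_n", "c2": "district_w"}
nation_of_district = {"district_n": "nation_1", "district_w": "nation_2"}


def follow(path, start):
    """Apply a chain of many-to-one mappings to `start`, in order."""
    value = start
    for mapping in path:
        value = mapping[value]
    return value


def converges(customers, path_a, path_b):
    """True if both paths lead every customer to the same final value."""
    return all(follow(path_a, c) == follow(path_b, c) for c in customers)


geographic_path = [city_of, state_of, nation_of_state]
district_path = [district_of, nation_of_district]
is_convergent = converges(city_of, geographic_path, district_path)
```

If a sales district were assigned to a nation different from the one reached through city and state, converges would return False for this data, signalling that the two paths do not end in the same attribute instance.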
It should be noted that the existence of appar-
ently equal attributes does not always determine