The challenge faced by existing modelling patterns is to address business informational
requirements by adequately representing the interactions between dimensions and facts,
as well as the relationships between levels of aggregation within dimension hierarchies.
It is therefore important to develop appropriate data models to support querying,
exploring, reporting and analysis of business data [3]. In this domain, one relevant
concern is to avoid summarizability (also known as additivity) problems, so that
aggregating data does not produce erroneous results or double counting due to
non-strictness.
The term “non-strict” dimensioning is commonly used to describe many-to-many
fact-dimension relationships. Current approaches [4] try to overcome the problem by
finding ways for each fact row to be related to exactly one row in each dimension
table. The fact table is therefore referred to as a dependent entity as known from stan-
dard relational database theory. In other words, the fact grain is determined by all
dimensions, where the primary key of the fact table is composed of foreign keys of
the dimension tables to which that fact table relates [3]. This means that fact-
dimension relationships can be characterized by the multiplicities of the association
between fact and dimension. In this context, the term “Regular” normally means that
summarizability is ensured while the term “Unusual” denotes situations which violate
summarizability. To guarantee summarizability, a dimension is required to be strict,
that is, every element of the dimension instance must have a unique ancestor in each
of its ancestor categories.
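The strictness condition above can be illustrated with a minimal sketch. The roll-up pairs, member names and measure values below are hypothetical and are not taken from the OOP case study; the sketch only assumes that a hierarchy level is given as (child, parent) pairs. It flags members with more than one parent and shows how summing the parent totals then exceeds the true total.

```python
from collections import defaultdict

def non_strict_members(rollup):
    """Return the child members that roll up to more than one parent."""
    parents = defaultdict(set)
    for child, parent in rollup:
        parents[child].add(parent)
    return {c: sorted(p) for c, p in parents.items() if len(p) > 1}

def total_by_parent(facts, rollup):
    """Aggregate the measure of each child into every parent it rolls up to."""
    totals = defaultdict(float)
    for child, parent in rollup:
        totals[parent] += facts.get(child, 0.0)
    return dict(totals)

# Hypothetical data: member "C" belongs to two parents, so the level is non-strict.
rollup = [("A", "North"), ("B", "North"), ("C", "North"), ("C", "South")]
facts = {"A": 10.0, "B": 20.0, "C": 30.0}

violations = non_strict_members(rollup)
by_region = total_by_parent(facts, rollup)

# Summing the parent totals counts the measure of "C" twice.
grand_total_from_regions = sum(by_region.values())
true_total = sum(facts.values())
```

In this sketch the measure of "C" contributes to both parents, so the total derived from the parent level differs from the total of the base facts.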
To avoid erroneous results, a multidimensional model must have a consistent granularity,
which means that the grain of facts is determined by all dimensions defining the scope
of the measure in the fact. This assumption enforces many-to-one relationships between
a fact and a dimension [5]. However, in many real-world situations, designers must deal
with scenarios in which different granularities are necessary and where fact-dimension
relationships can have different multiplicities, leading to summarizability problems.
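As a rough illustration of the many-to-one assumption, the following sketch checks whether each fact row is linked to exactly one row of a given dimension. The fact identifiers, dimension keys and measure values are hypothetical, as is the representation of the relationship as (fact_id, dim_id) pairs. It also shows how a naive aggregation over a many-to-many relationship inflates the measure.

```python
from collections import Counter

def multiplicity_violations(links):
    """Return the fact ids that are linked to more than one dimension row."""
    counts = Counter(fact_id for fact_id, _ in set(links))
    return sorted(f for f, n in counts.items() if n > 1)

def naive_total(measures, links):
    """Sum the measure once per (fact, dimension) link, as a plain join would."""
    return sum(measures[fact_id] for fact_id, _ in links)

# Hypothetical data: fact 1 relates to two dimension rows (many-to-many).
measures = {1: 100.0, 2: 50.0}
links = [(1, "d1"), (1, "d2"), (2, "d1")]

bad_facts = multiplicity_violations(links)
inflated = naive_total(measures, links)
actual = sum(measures.values())
```

Here fact 1 is counted once per linked dimension row, so the naive total is larger than the sum of the fact measures.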
Existing DW modelling approaches fall into the following basic groups: data-driven,
goal-driven and user-driven. Each of these approaches advocates only a single principle
in the DW design method, causing query readability, performance, and structure to
degrade severely [6]. This paper presents a multi-driven approach that integrates
contributions from other research works [2, 6, 8, 9] with a technique to ensure
summarizability compliance at the physical level, independently of the design pattern
that is adopted. This is illustrated with examples based on our OOP-DW case study.
The remaining sections are organized as follows: Section 2 presents a short overview
of the multidimensional paradigm, outlining the challenges of existing modelling
patterns to avoid multi-valued dimensions and double-counting problems. Section 3
presents the case study of the Public Works Observatory (OOP), an information sys-
tem with information related to the execution of public works contracts. This section
also provides an introduction to the multi-driven approach followed to overcome the
summarizability problem outlined in the OOP case study. Section 4 describes the
innovative technology-driven solution implemented to overcome multi-valued dimen-
sions and double-counting problems without affecting the OOP multidimensional data
model. Finally, Section 5 presents the conclusions of the paper.