2 Related Work
It is widely accepted that a data warehouse (DW) must be structured according to the
multidimensional model to facilitate OLAP analysis. Indeed, the main role of a DW is to
integrate data in a large, time-variant and non-volatile repository that provides
business stakeholders (e.g., analysts, operational decision makers, senior managers
and directors) with a business-oriented view from which to extract relevant
knowledge [3]. The multidimensional model is distinguished by the fact/dimension
dichotomy: a fact contains numeric attributes that represent the measures of a
business, and it relates to a set of dimensions that represent the different
perspectives from which the fact is analysed. The attributes of a dimension either
form a hierarchy or are purely descriptive [7]. Hierarchies provide views of the data
at different granularities, i.e. summarized through roll-up operations or detailed
through drill-down operations.
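
The effect of a hierarchy on granularity can be illustrated with a minimal sketch. The
fact rows, the day/month/year time hierarchy and the names FACTS, LEVELS and aggregate
are all assumptions made up for this illustration; they do not come from the cited work.

```python
from collections import defaultdict

# Hypothetical fact rows: (day, product, sales_amount). All values are invented.
FACTS = [
    ("2024-01-15", "A", 100.0),
    ("2024-01-20", "B", 50.0),
    ("2024-02-03", "A", 70.0),
    ("2025-01-09", "A", 30.0),
]

# Levels of an illustrative time hierarchy, from finest to coarsest.
# Each function maps a day key "YYYY-MM-DD" to its member at that level.
LEVELS = {
    "day": lambda d: d,
    "month": lambda d: d[:7],
    "year": lambda d: d[:4],
}


def aggregate(facts, level):
    """Sum the sales measure grouped by one level of the time hierarchy."""
    to_member = LEVELS[level]
    totals = defaultdict(float)
    for day, _product, amount in facts:
        totals[to_member(day)] += amount
    return dict(totals)


by_month = aggregate(FACTS, "month")  # more detailed view
by_year = aggregate(FACTS, "year")    # roll-up from month to year
```

Moving from by_month to by_year corresponds to a roll-up, and the reverse direction to a
drill-down. Because this hierarchy is strict (each day belongs to exactly one month and
one year), the grand total is the same at every level.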
As such, the quality of a multidimensional model depends on how well the data model
is constructed and mapped onto the subsequent logical and physical models. It is
assumed that an ideal scenario for deriving the DW conceptual schema would embrace a
multi-driven approach, resulting in a multidimensional schema that satisfies the
end-user requirements and captures the analysis potential depicted in the data
sources [8]. Table 1 presents a short description of existing DW modelling
approaches. These approaches are concerned with either i) how to model complex
dimension hierarchies or ii) how to model relationships between facts and
dimensions. Surprisingly, most of the work is concerned only with non-strictness
(many-to-many relationships), thus ignoring incomplete relationships.
The simplest way to deal with non-strictness is to decompose the many-to-many
relationship into two or more one-to-many relationships using a bridge table [3].
Compared to a solution that simplifies the many-to-many relationship, a bridge table
is much more flexible but brings with it the danger of double-counting (see Section 3:
grouping results at any other level of summarization may result in double-counting),
with implications for the ETL process. Worse, the bridge table is likely to grow very
large and may even contain more rows than the fact table. If a bridged solution is
chosen, special attention must also be given to the behaviour of slowly changing
dimensions, and it may be necessary to add another table to satisfy the quirks of some
OLAP tools (as is the case with the Pentaho tool used in the case study).
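
The double-counting risk can be made concrete with a small sketch. The account and
customer data, the table names and the three functions below are hypothetical and serve
only as an illustration. In particular, the distinct-count workaround is an assumption of
this sketch, not the solution proposed in the work discussed here.

```python
from collections import defaultdict

# Hypothetical fact table: one row per account with a balance measure.
FACT_BALANCE = {"acc1": 100.0, "acc2": 40.0}

# Hypothetical bridge table: it resolves the many-to-many relationship between
# accounts and customers into two one-to-many relationships.
BRIDGE = [
    ("acc1", "alice"),
    ("acc1", "bob"),  # acc1 is a joint account
    ("acc2", "bob"),
]

# Hypothetical customer dimension with one higher level (region).
CUSTOMER_REGION = {"alice": "north", "bob": "north"}


def balance_by_customer():
    """Per-customer totals: correct at the level the bridge table connects to."""
    totals = defaultdict(float)
    for account, customer in BRIDGE:
        totals[customer] += FACT_BALANCE[account]
    return dict(totals)


def balance_by_region_naive():
    """Join fact -> bridge -> customer and sum: one count per bridge row."""
    totals = defaultdict(float)
    for account, customer in BRIDGE:
        totals[CUSTOMER_REGION[customer]] += FACT_BALANCE[account]
    return dict(totals)


def balance_by_region_distinct():
    """Count each account at most once per region (illustrative workaround)."""
    seen = set()
    totals = defaultdict(float)
    for account, customer in BRIDGE:
        key = (CUSTOMER_REGION[customer], account)
        if key not in seen:
            seen.add(key)
            totals[key[0]] += FACT_BALANCE[account]
    return dict(totals)
```

In this toy data the joint account acc1 reaches the region level twice through the
bridge, so the naive roll-up reports 240.0 for a region whose accounts hold only 140.0.
The bridge table also already has more rows (three) than the fact table (two).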
Existing DW modelling approaches also claim to fully automate the design task, but
they lack mechanisms for formally matching the data sources with the information
requirements in the early stages of development, which makes it highly complex to
populate the data warehouse properly.