Database Reference
In-Depth Information
Impacts of Data Warehouse
Changes on Cell Data
Not dealing with such changes may again
lead to incorrect data. An example for this
type may be changing the calculation rule
for the unemployment rate.
The different types of changes in data warehouse
structures affect the assigned cell data in different
ways. The following impacts on the cell data can
be identified:
Changing Cell Data: Typically, cell data
contained in a data warehouse is non-vol-
atile, thus not changing any more (Inmon
2005, p. 29). Therefore changing cell data
is typically not considered during of data
warehouse maintenance.
Additional Cell Data: Additional data is
most times expressed by new structure ele-
ments not depending on previously existing
elements, either on the schema (dimension,
categories) or on the instance (dimension
members) level. The problem with addi-
tional cell data is that it is oftentimes not
available for the past thus, leading to miss-
ing data. An example could be new coun-
tries joining the European Union.
Data Warehouse Evolution
and Versioning
When it comes to changing systems, one question
always arising is whether to keep previous ver-
sions available or not. This also applies to data
warehouse maintenance. With Data Warehouse
Evolution every modification (or every set of
modifications applied in one transaction) of the
structure leads to a new version of this structure
and the previous version is lost. The contained cell
data has to be transformed to be consistent with the
new structure. Queries can only be executed based
on the newest structure. With Data Warehouse
Versioning also every modification (or every set
of modifications applied in one transaction) of the
structure leads to a new version of this structure,
but the previous version is kept available and may
be restored. Existing cell data does not need to
be adapted, but can be stored following is origin
structure version. This enables queries spanning
multiple structure versions.
Comparing these two possibilities reveals
advantages and disadvantages for both of them:
When using the evolution approach, one does
not need to keep track of old versions, which
reduces data management effort and as all data
is adapted to the current version, no adaptations
have to be done during runtime, which means a
better query performance, compared to the ver-
sioning approach. On the other hand, evolution
lacks the possibility of multiversion queries and
as the data has to be transformed, change opera-
Removed Cell Data: When dimension
members or categories are completely
removed from the data warehouse struc-
ture, also the assigned cell data should no
longer be part of the cube. For instance,
data from countries leaving the European
Union should not be contained in statistics
any longer.
Restructuring of Cell Data: Restructuring
cell data can happen in various forms: ag-
gregation hierarchies may change, new ag-
gregation levels may be created, dimension
members may be split or merged, the unit
of a fact may change. All these restructur-
ings have in common that the cell data for
the new version can - to a certain extent
- be calculated from the old structure. Not
dealing with such changes may lead to
incorrect data. Examples for this type of
changes may be the split of a country in
the European Union into two independent
countries, or the changing the currency.
Change of Cell Data Calculation: Cell
data may also be calculated from various
external data sources by some given for-
mulae. Such calculation rule may change.
Search WWH ::




Custom Search