Euro would notice a giant retracement in the year
1999. But, of course, someone who knows about
the Euro can divide each value given in ATS by
13.7603 and then compare the values.
Besides such simple “unit changes” there may
also be more complex semantic changes for
dimension members. Consider a query analyzing
the unemployment rate in the European Union.
Not only is it calculated in different ways for
various countries, the calculation mode has also
been changed several times in the last few years,
for instance regarding whether people who are
attending courses offered by federal employment
offices are counted as unemployed or not. Such
calculation methods may be contained in the data
warehouse definition as formulae for a certain member.
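The simple unit change described above (ATS values divided by 13.7603) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the exact changeover date, the function name to_eur and the turnover figures are hypothetical and do not come from any real data warehouse.

```python
from datetime import date

# Fixed conversion rate quoted in the text: 1 EUR = 13.7603 ATS.
ATS_PER_EUR = 13.7603

# Assumption for this sketch: values recorded before this date are in ATS,
# values recorded on or after it are already in EUR.
EURO_CHANGEOVER = date(1999, 1, 1)


def to_eur(value: float, recorded_on: date) -> float:
    """Return the value in EUR, converting it if it was recorded in ATS."""
    if recorded_on < EURO_CHANGEOVER:
        return value / ATS_PER_EUR
    return value


# Hypothetical yearly turnover figures (invented for illustration).
turnover = [
    (date(1997, 12, 31), 1_376_030.0),  # recorded in ATS
    (date(1998, 12, 31), 1_513_633.0),  # recorded in ATS
    (date(1999, 12, 31), 120_000.0),    # recorded in EUR
]

# After conversion, the values can be compared across the unit change.
comparable = [(day.year, round(to_eur(value, day), 2)) for day, value in turnover]
print(comparable)
```

Without the conversion, the raw series would appear to collapse in 1999; with it, the three years become comparable. A semantic change such as a new calculation method for the unemployment rate cannot be handled by a single constant like this and would need a transformation per member and period.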
Generally, the three basic aspects of change
management in data warehouses can be identified
as follows:
1. Being aware of changes: First of all, to be able to manage
   changes, it is necessary to be aware of their existence. Why
   this awareness matters can be seen from the two typical reasons
   for data warehouse structure changes. The first reason is
   changes in the real world that is represented by the data
   warehouse, for instance the creation of a new department, a
   merger of different departments, or new countries joining the
   European Union. The second reason is changing requirements, for
   instance analyzing not only the turnover of a company but also
   the gain, or keeping track of unemployment rates, which were
   not recorded before.
2. Identifying the changes in the system: As today's data
   warehouse systems typically do not support changing structure
   data, they may also be unable to provide information about such
   changes, for instance some sort of change log, even if the
   changes happened, e.g. through an automatic ETL process
   recreating the data warehouse from its sources. The
   administrator, being aware that changes happened, is then in
   charge of finding the modifications and dealing with them
   correctly. Executing this task manually is very time-consuming
   and error-prone, especially if the affected dimensions are
   large and the changes are rather small.
3. Dealing with the influence of structure changes on the cell
   data: Some structural changes may heavily influence the cell
   data. One of the main reasons for data warehouse maintenance is
   comparability of cell data. Comparing cell data stemming from
   before and after a structural change may be very complex or
   even impossible because of this influence. In this context two
   major problems can be identified:
   a. Missing Data: If elements are inserted into or removed from
      the structure, cell data may not be available for the whole
      period of analysis (e.g. missing data for new countries in
      the European Union).
   b. Incorrect Data: If structure elements are changed, their
      semantics may change. This could influence the calculation
      of cell values. Thus, when comparing cell values from before
      and after the change, equal values may have a different
      meaning and vice versa (e.g. different methods for
      calculating the unemployment rate).
These examples illustrate the problems induced
by changing structures on a very simple level.
Froeschl, Yamada and Kudrna (2002) call this the
problem of footnotes in statistics, i.e. many values
have to be tagged with their correct semantics.
When one is aware of such semantic and structural
changes, interpreting “strange” results may be
cumbersome but possible. But if someone does
not even know that there were changes, analyzing
query results may be impossible or, even worse,