Database Reference
In-Depth Information
2006), but due to the static nature of dimensional
schemas, conventional data warehouses may not
be able to store some of the real-world data after
the changes. Consequently, up-to-date data cannot
be provided for decision support, and decision
quality suffers. For this reason, the data warehouse
should be designed so that data produced before
a change and data produced after it can be
stored in it simultaneously.
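The requirement can be made concrete with a minimal sketch. It is an illustration only, not the design proposed in the cited work: the class names, the `store`/`city`/`region`/`district` attributes and the sample rows are all hypothetical. The sketch contrasts an in-place migration, which projects old rows onto the new schema and drops what no longer fits, with a toy versioned store that opens a new schema version and leaves earlier versions and their rows untouched.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaVersion:
    """One schema version together with the rows loaded under it."""
    version: int
    columns: tuple
    rows: list = field(default_factory=list)


class VersionedWarehouse:
    """Toy store (hypothetical) that adds a schema version instead of overwriting."""

    def __init__(self, columns):
        self.versions = [SchemaVersion(1, tuple(columns))]

    @property
    def current(self):
        return self.versions[-1]

    def change_schema(self, columns):
        # Earlier versions and their rows are kept as they are.
        next_version = self.current.version + 1
        self.versions.append(SchemaVersion(next_version, tuple(columns)))

    def load(self, row):
        # Accept only rows that match the structure of the current version.
        if set(row) != set(self.current.columns):
            raise ValueError(
                f"row does not match schema version {self.current.version}"
            )
        self.current.rows.append(dict(row))

    def rows_of(self, version):
        return self.versions[version - 1].rows


def evolve_in_place(rows, columns):
    """In-place migration: project old rows onto the new columns.

    Attributes that are absent from the new schema are dropped, so the
    old information is no longer available afterwards.
    """
    return [{c: r.get(c) for c in columns} for r in rows]


# Hypothetical data: a border change replaces "region" with "district".
old_rows = [{"store": "S1", "city": "A-town", "region": "North"}]
new_columns = ("store", "city", "district")

# In-place migration: the "region" value disappears.
migrated = evolve_in_place(old_rows, new_columns)

# Versioned store: data from before and after the change coexist.
wh = VersionedWarehouse(("store", "city", "region"))
wh.load(old_rows[0])
wh.change_schema(new_columns)
wh.load({"store": "S1", "city": "A-town", "district": "D7"})
```

After the run, `migrated` no longer carries a `region` attribute, while `wh.rows_of(1)` still returns the row loaded before the change and `wh.rows_of(2)` returns the row loaded after it. A real multiversion data warehouse would also have to address queries that span versions, which this sketch does not attempt.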
Wrembel (2005) has identified real-world
events which may lead to changes in the data
warehouse: changes in borders, changes in the
administrative structure of an organization, new
user requirements, new business markets, the
establishment of new departments, the merging of
existing departments, and business reengineering.
Conventional data warehouses do not adapt to
some types of changes in the operational sources,
which motivates the need for creating and
maintaining multiple versions of the data
warehouse. The shortcomings of conventional
data warehouses listed here have been pointed out
in a number of conference, journal and workshop
papers, as well as technical reports. These are:
1.
Adaptation of instances: The content of the data
warehouse depends upon operational sources.
Therefore, as soon as the operational sources
change, the data warehouse should accommodate
those changes. The data warehouse is populated
by "extraction, transformation and loading" (ETL)
components, which transfer data from the
operational sources to the data warehouse. Because
ETL is not dynamic enough to sense structural
changes in the operational sources, data warehouses
cannot adapt to those changes. This problem is
also called the incremental view maintenance
problem (Bebel, 2004; Morzy, 2003).
2.
Information loss: Conventional data warehouses
cannot keep track of changes in their own
structure without losing information. Under the
evolution approach, as soon as a change reaches
the data warehouse, its schema is upgraded and
the data is transferred to the new schema
(Golfarelli, 2006; Wrembel, 2005a). The data is
carried over, but the old information is
overwritten by the new and is no longer available
to inquirers. This is called the information loss
problem. The overwriting takes place because
data warehouse schemas have a static structure
that is not flexible enough to store two instances
of the information.
3.
Schema changes in the data warehouse: Schema
changes in operational sources may lead to
changes in the data warehouse schema
(Golfarelli, 2006). Several real-world changes,
such as changes in borders, new user requirements
and new business markets, may also lead to
changes in the schema of the data warehouse.
However, conventional data warehouses cannot
handle these changes without modifying the
structure that existed before the change. This is
because traditional data warehouses have a static
structure with respect to their schemas and the
relationships between data, and therefore may
not be able to support any dynamics in their
structure and contents (Wrembel, 2005a).
4.
Tracking of evolution operations in metadata:
The operations that result in evolution of versions
cannot be tracked in conventional data
warehouses. As indicated above (in point 3),
conventional data warehouses are static in
nature, therefore they cannot