techniques are normally used for improving system performance: materialized
views, indexing, and data partitioning. In particular, bitmap indexes are used
in the data warehousing context, in contrast to operational databases, where
B-tree indexes are typically used. A large body of research on these topics was
carried out, particularly during the second half of the 1990s. In Chap. 7,
we review and study these efforts.
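
To make the idea of a bitmap index concrete, the following is a minimal sketch in Python, not the implementation of any particular database system. It assumes each bitmap is stored as a plain integer whose bit i is set when row i holds the indexed value; the column names and data are hypothetical. It shows why such indexes suit analytical queries: a conjunction of conditions on low-cardinality columns reduces to a bitwise AND.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Map each distinct value to an integer whose bit i is set when row i holds it."""
    index = defaultdict(int)
    for row_id, value in enumerate(values):
        index[value] |= 1 << row_id
    return dict(index)


def matching_rows(bitmap):
    """Return the ids of the rows whose bits are set in the bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical low-cardinality columns of a fact table (illustrative data only).
region = ["North", "South", "North", "East", "South", "North"]
channel = ["Web", "Store", "Store", "Web", "Web", "Web"]

region_index = build_bitmap_index(region)
channel_index = build_bitmap_index(channel)

# The condition region = 'North' AND channel = 'Web' becomes one bitwise AND.
north_web = region_index["North"] & channel_index["Web"]
rows = matching_rows(north_web)
```

Real systems typically compress the bitmaps; this sketch leaves them uncompressed for clarity.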
Although data warehouses are, in the end, a particular kind of database,
there are significant differences between the development of operational
databases and data warehouses. A key one is the fact that data in a warehouse
are extracted from several source systems. Thus, data must be taken from
these sources, transformed to fit the data warehouse model, and loaded into
the data warehouse. This process is called extraction, transformation,
and loading (ETL), and it has proven crucial for the success of a
data warehousing project. However, in spite of the work carried out on this
topic, once again, there is still no consensus on a methodology for ETL design,
and most problems are solved in an ad hoc manner. There exist, however,
several proposals regarding ETL conceptual design. We study the design and
implementation of ETL processes in Chap. 8.
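
The three ETL steps can be sketched as three small functions. This is only an illustration under stated assumptions, not a design methodology: the source is a hypothetical CSV extract, the target is an in-memory SQLite table named sales, and the cleaning rules (trimming names, rejecting rows with no amount) are invented for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; a real source would be a file, an API,
# or an operational database.
SOURCE_CSV = """order_id,customer,amount,order_date
1, alice ,10.50,2024-01-03
2,BOB,7.25,2024-01-04
3,alice,,2024-01-05
"""


def extract(text):
    """Extraction: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transformation: normalize names, convert types, reject incomplete rows."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # assumed rule: rows without an amount are rejected
        cleaned.append((
            int(row["order_id"]),
            row["customer"].strip().title(),
            float(row["amount"]),
            row["order_date"],
        ))
    return cleaned


def load(conn, rows):
    """Loading: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, "
        "amount REAL, order_date TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
loaded = conn.execute(
    "SELECT customer, amount FROM sales ORDER BY order_id"
).fetchall()
```

Keeping the steps as separate functions makes each one testable on its own, which matters when several heterogeneous sources feed the same warehouse.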
Data analytics is the process of exploiting the contents of a data
warehouse in order to provide essential information to the decision-making
process. Three main tools can be used for this. Data mining consists of a
series of statistical techniques that analyze the data in a warehouse in order
to discover useful knowledge that is not easy to obtain from the original
data. Key performance indicators (KPIs) are measurable organizational
objectives that are used for characterizing how an organization is performing.
Finally, dashboards are interactive reports that present the data in a
warehouse, including the KPIs, in a visual way, providing an overview of
the performance of an organization for decision-support purposes. We study
data analytics in Chap. 9.
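
As a small illustration of a KPI, the sketch below compares an actual value aggregated from warehouse data against a target. The class name, the 90% warning threshold, and the figures are all assumptions made for the example; a dashboard would typically display the resulting status visually.

```python
from dataclasses import dataclass


@dataclass
class KPI:
    """A measurable objective: an actual value compared against a target."""
    name: str
    actual: float
    target: float

    def attainment(self):
        """Fraction of the target achieved so far."""
        return self.actual / self.target

    def status(self, warn_at=0.9):
        """Classify the KPI; the 0.9 warning threshold is an assumption."""
        ratio = self.attainment()
        if ratio >= 1.0:
            return "on target"
        if ratio >= warn_at:
            return "at risk"
        return "off target"


# Hypothetical monthly sales figures aggregated from the warehouse.
monthly_sales = [120.0, 80.0, 150.0]
revenue = KPI("Quarterly revenue", actual=sum(monthly_sales), target=400.0)
```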
Designing a data warehouse is a complex endeavor that needs to be
carefully carried out. As with operational databases, several phases are
needed to design a data warehouse, where each phase addresses specific
considerations that must be taken into account. As mentioned above, these
phases are requirements specification, conceptual design, logical design,
and physical design. There are three different approaches to requirements
specification, which differ in how requirements are collected: from users, by
analyzing source systems, or by combining both. The choice of approach
determines how the subsequent phase of conceptual design is undertaken.
We study a method for data warehouse design in Chap. 10.
By the beginning of this century, the foundational concepts of data
warehouse systems were mature and consolidated. Starting from these
concepts, the field has been steadily growing in many different ways. On the
one hand, new kinds of data and data models have been introduced. Some of
them have been successfully implemented in commercial and open-source
systems. This is the case for spatial data. On the other hand, new architectures