porating data from heterogeneous sources into a
single database in order to provide a consistent and
unified view of data. These early discussions, such
as database snapshots (Adiba & Lindsay, 1980)
and materialized views (Gupta & Mumick, 1995),
motivated a wide variety of subsequent research
tracks such as OLAP (online analytical process-
ing) databases (Chaudhuri & Dayal, 1997), data
cube (Gray, Bosworth, Layman, & Pirahesh, 1996),
multidimensional modeling (Agrawal, Gupta, &
Sarawagi, 1997), multidimensional indexing and
query optimization (Böhm, Berchtold, Kriegel, &
Michel, 2000), and data warehousing for complex
data types (Pedersen & Jensen, 1999). More recently,
research attention has turned to improving the
scalability of data warehouses for complex data types
(Darmont, Boussaid, Ralaivao, & Aouiche, 2005) and
to how a data warehouse can be seamlessly and
efficiently integrated into business intelligence
processes and applications (Furtado, 2006;
Theodoratos, Ligoudistianos, & Sellis, 2001).
Data warehouse architecture is a portfolio of
perspectives on how the different architectural
components of a data warehouse system are connected
and interact with one another. It reflects how
academic research and industry development
influence the data warehousing practices of
different enterprises. For example, from a computing
infrastructure perspective, data warehouse
architecture has moved from mainframe analytics
to client/middleware/server environments, and
now to service-oriented computing and cloud
computing. With the rapid growth of information
volume and the increasing requirements from the
business side, many IT organizations in large
enterprises face the challenge of building an
enterprise-wide data warehouse that integrates and
manages the various types of information coming
from different corners of the enterprise and that
provides solid information for business analysis in
a timely manner. A successful data warehouse
architecture must ensure processing efficiency,
information correctness, and the propagation of
metadata while managing terabytes of data that
grow by gigabytes every day.
Over the past decade, data warehouse architecture
practice has focused on classical issues: data
integration needs, data quality and metadata
control, data modeling requirements, and performance
that is acceptable to both the data management and
the analytical sides. Specifically, the data
extraction, transformation, and loading (ETL)
process has to manage large volumes of data
efficiently and allow hardware configurations to be
scaled up or out quickly and easily. The data
integration process must also extract metadata and
reconcile data quality requirements so that data
lineage can be traced across the whole data
lifecycle in the warehouse. An enterprise-wide data
model provides a unified, consolidated view of the
data, which enables a consistent, logical
representation of business data across the
functional areas of the whole enterprise. Because
the data management side of a data warehouse
focuses on loading data efficiently while analytical
users are more interested in retrieving data quickly
and flexibly, the data warehouse architecture has to
make it easy to balance the two sides.
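
The relationship between the ETL process and data lineage described above can be illustrated with a small sketch. It is only a toy under stated assumptions, not a description of any vendor's tooling: the names `LineageRecord`, `Warehouse`, `run_etl`, the step name `load_sales`, and the sample fields `customer_id` and `amount` are all hypothetical, and the "warehouse" is an in-memory list rather than a database.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class LineageRecord:
    """Hypothetical metadata captured for one ETL run."""
    step: str
    source: str
    rows_in: int
    rows_out: int
    ran_at: str


@dataclass
class Warehouse:
    """In-memory stand-in for a warehouse table plus its lineage log."""
    rows: list = field(default_factory=list)
    lineage: list = field(default_factory=list)


def extract(source_rows):
    # Extraction: copy the rows so later steps cannot mutate the source.
    return [dict(r) for r in source_rows]


def transform(rows):
    # Transformation with a simple, assumed data quality rule:
    # drop rows without a customer id and normalize amount to float.
    clean = []
    for r in rows:
        if r.get("customer_id") is None:
            continue
        clean.append({"customer_id": r["customer_id"],
                      "amount": float(r["amount"])})
    return clean


def run_etl(source_name, source_rows, warehouse):
    # Load the transformed rows and record where they came from,
    # so each load can later be traced back to its source.
    extracted = extract(source_rows)
    transformed = transform(extracted)
    warehouse.rows.extend(transformed)
    warehouse.lineage.append(LineageRecord(
        step="load_sales",
        source=source_name,
        rows_in=len(extracted),
        rows_out=len(transformed),
        ran_at=datetime.now(timezone.utc).isoformat(),
    ))
    return warehouse
```

Calling `run_etl("crm_export", rows, Warehouse())` with a list of dictionaries loads the rows that pass the quality rule and appends one lineage entry recording the source name and the row counts before and after transformation. A real system would persist this metadata alongside the data, but the idea of capturing it inside the integration step is the same.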
Building on the past decade's research, data
warehouse software vendors are delivering tools and
engineering practices for these classical
architecture topics. While vendors are rolling out
more and more parallel-processing database and ETL
engines and enterprise-wide metadata and data
quality tools, and are eagerly expanding their
centers of excellence with a large body of data
warehousing practices, both the data warehouse
industry and the academic world face new challenges
as novel concepts such as SOA, Web 2.0, and cloud
computing spread across the whole IT community.
This chapter is devoted to examining the challenges
these new trends bring to data warehouse
architecture and how academic research can be used to