Data warehouses rely heavily on analysis of
up-to-date information to support decision makers.
The advent of a new class of data management
applications, namely data stream management
systems (DSMS), provides new opportunities for
analysis of timely information. A data stream is
a continuous, rapid, time-varying, and transient
stream of data. There are connections between
DSMS and view management: continuous query
processing is related to view maintenance in
data warehousing, whereas multi-query
optimization for continuous queries is closely
related to view selection in conventional
relational DBMS and data warehouses. In this
chapter, we give an overview of view maintenance
and view selection methods, explain the
fundamental issues of
data stream management, and discuss how view
management techniques from data warehousing
are related to data stream management.
The chapter is structured as follows. Section 2
briefly explains the roles of views in data
warehouses. Section 3 gives an overview of view
maintenance methods and classifies them
according to various criteria. Section 4 explains
the view selection problem and presents a taxonomy
of existing view selection techniques. Section 5
discusses issues and challenges in data stream
management and summarizes recent research
results on data streams. Section 6 discusses
the relationship of view management techniques
to data stream management, covering the
similarities, differences, and possible
connections between the two. Finally, Section 7
summarizes the chapter and points out directions
for future research in view management, data
streams, and data warehousing.
Views in Data Warehousing
A view can select or restructure data in such a way
that an application can use the data more efficiently.
Unlike On-Line Transaction Processing (OLTP)
systems, which focus on managing the
common data operations, data warehouses aim at
supporting data analysis (i.e., On-Line Analytical
Processing, OLAP) and are known for their vast
volume of data and complexity of queries. The
response time of queries, if evaluated from base
tables, is usually too long for users to tolerate
because a huge amount of data has to be processed.
Therefore, it is common practice to pre-compute
summaries of base tables in order to reduce the
query response time. The following example
illustrates the benefit of materializing views.
Example 1 Consider the TPC-D benchmark
(Serlin, 1993), modeling a data cube of sales with
three dimensions: part, supplier, and customer. We
denote the base table as R(part; supp; cust; sales).
The following query is posed by users:
Q: SELECT part, SUM(sales) AS total
   FROM R
   GROUP BY part;
The following two materialized views can
both benefit Q, since Q can be computed from
either of them by aggregating further over part:
V1: SELECT part, cust, SUM(sales) AS total
    FROM R
    GROUP BY part, cust;
V2: SELECT part, supp, SUM(sales) AS total
    FROM R
    GROUP BY part, supp;
Which view is better in terms of query response
time or storage cost depends on the statistics of
the data. For instance, the statistics of the
TPC-D database are as follows:
R: 6M rows
V1: 6M rows
V2: 0.8M rows
Materializing V2 clearly benefits answering Q,
because V2 is much smaller to scan than the base
table. V1, by contrast, is of little use, since
its size is comparable to that of the base table.
Nonetheless, materialization of views comes at
some price. On the one hand, materializing views