Database Reference
In-Depth Information
Contribution
Motivating Example
We have designed a view selection method based
on statistic information. This method provides the
following features:
Let's start with a simple example showing how
materialized views can improve query perfor-
mance time in distributed context. Let us consider
three sites S 1 , S 2 , S 3 which are connected by a
slow wide-area network. S 1 , S 2 and S 2 , S 3 are
connected by an edge with the speed 100 KB/
sec, while the connection speed between S 2 and
S 3 is 3 MB/sec. Let us consider three relations
A, B, C stored at S 1 , S 2 , S 3 respectively and two
queries q 1 and q 2 defined as follows. q 1 =A equijoin
C is posed frequently at S 2 , and q 2 = A equijoin
B is posed frequently at S 3 . Suppose now that
the query optimizer decides to execute all the
queries at S 1 and the results size of q 1 , q 2 are 1
MB, 2 MB respectively. So, the cost to answer
q 1 each time at S 2 is 10 sec and to answer q 2 at S 3
is 20 sec, in addition to the cost to compute both
queries at S 1 . This cost has to be paid every time
these queries are posed. While, if the result of q 1 ,
q 2 are materialized at S 2 , S 3 respectively, in this
case only both queries has to be computed once
at S1 and the results are shipped to S 2 , S 3 to be
materialized with the transfer cost of 10, 20 sec
respectively. Then, subsequent queries could be
performed with zero cost. Obviously, material-
ized views have significantly reduced the query
response time. However, two problems can arise:
First, sometimes the result of the queries is too
large to fit in the available space at a specific site,
so we can't materialize the entire workload and
we have to select only the most beneficial queries
to be materialized. Second, when the base relation
is updated, then the view should be maintained
and the cost of the view maintenance can cause
overhead to the system. Assume in our example
that the relation C is frequently updated, in this
case materializing q 2 can be less attractive because
the cost that we save for answering q 2 is now
involved for maintaining the materialized view.
Therefore, the view is considered as beneficial if
and only if its materialization reduces significantly
It is dynamic: it treats a query as it arrives
It is based on statistic measures for esti-
mating the query processing
It is using a replacement policy to material-
ize the most beneficial views
The algorithm for finding the best views to
materialize is polynomial
Outline
The rest of this chapter is organized as follows. In
Section 2, we present the problem of view selection
and a framework for representing the queries in
order to detect common sub expressions. Section 3
presents our dynamic approach to view selection.
The materialization and replacement policy and
its related algorithms are described in Section 4.
In Section 5, we present a tool based on our view
materialization method for tuning the performance
of the DBMS. We also show that is scalable and
provides good performance. Finally, Section 6
presents related work and Section 7 contains
concluding remarks and future work.
PRELIMINARIES
A view is a derived relation defined by a query in
terms of source relations and/or other views. It is
said to be materialized when its extent is computed
and persistently stored. Otherwise, it is said to be
virtual. We consider Selection-Projection-Join
(SPJ) views that may involve aggregation and a
group by clause as well.
Search WWH ::




Custom Search