Database Reference
In-Depth Information
the query processing cost and without increasing
significantly the view maintenance.
Another point should be taken into account.
Assume in the previous example that q 1 is fre-
quently used at both S 2 , S 3 in this case material-
izing q 1 only at one site, e.g., S 1 can be benefit
enough for the both sites as the communication
cost between these sites are not high. And in this
way the available storage at S 2 is saved for another
benefit materialized view. Finally, this example
demonstrates that the question of selecting appro-
priate views for materialization is not an obvious
issue and should deal with different parameters
like storage space constraint, view maintenance
cost and communication cost.
different queries. The Multi View Materializa-
tion Graph (MVMG) is similar to the AND-OR
DAG representation of queries in multi query
optimization. The MVMG is a bipartite Directed
Acyclic Graph (DAG) composed of two types
of nodes: AND-nodes and OR nodes. Each
AND-node represents an algebraic expression
(Select-Project-Join) with possible aggregate
function. An OR node represents a set of logical
expression that are equivalent (i.e., that yield
the same result). The AND-nodes have only
OR-nodes as children and OR-nodes have only
AND-nodes as children. The MVMG represents
AND-OR DAGs of several queries in a single
DAG. The leaf nodes of the MVMG are equiva-
lence nodes representing the base relations. In
general, for each base relation, there is one leaf
node except in case of a self join. Equivalence
nodes in MVMG correspond to the views that
are candidate to the materialization.
The queries used in this chapter are defined
over the TPC-H schema described in Figure 1 .
The Part table contains information about each
product. The Supplier table describes the sup-
plier of the products. Because a supplier is able
to supply more than one product and a product
may be supplied by several suppliers: there is a
table PartSupp containing information about who
supplies what products, in which quantity and so
on. The Customer table describes the information
concerning the clients like the name, address
and the nation. The sales are represented by two
relations: Orders and Lineitem . The table Orders
contains information about the customer, the price
of ordered items and so on. Finally, the Lineitem
table contains for each ordered item, the ordered
quantity and the corresponding supplier.
Let us consider the query Q 1, which finds
the number of orders of Airbus planes ordered
by different nations.Q 1 : Select N.name, P.brand,
O.orderdate, Sum(L.quantity)From Part P, Cus-
tomer C, Orders O, Nation N, Lineitem LWhere
P.type = 'airplane'and AND P.brand = 'Airbus'
AND P.partkey = L.partkey AND L.orderkey =
VIEW SELECTION PROBLEM
The general problem of view selection is to select
a set of views to materialize that optimizes both
the view maintenance and query processing time
given some constraints like storage space. To find
the optimal solution satisfying all constraints is a
NP-complete problem, therefore it is necessary
to develop heuristics. In this chapter, we present
a dynamic view selection method which includes
a replacement policy of already materialized
views.
More formally, given a set of source relations
R={R 1 ,R 2 ,…,R n } a set of conjunctive queries of
workload Q = {Q 1 ,Q 2 ,…,Q k }, the problem is to
find a set of views to materialize M={V 1 ,V 2 , …,V}
under a storage space constraint. However, in this
paper, the view selection algorithm is applied as
a query arrives. Therefore, the workload is built
incrementally.
Framework for Detecting
Common Views
In this subsection, we present a framework
for representing views to materialize in order
to exhibit common sub-expressions between
Search WWH ::




Custom Search