View Management Techniques and Their Application to Data Stream Management - Evolving Application Domains of Data Warehousing and Mining

Database Reference

In-Depth Information

versions of the database and avoiding locking.

A different paradigm has been proposed by

Zhou et al. (2007). Whereas most approaches trig-

ger the view maintenance process when an update

occurs, in this approach the maintenance can be

deferred until the system has resources available

to process the view maintenance task, or a user

accesses a view which is affected by the update.

By doing so, the burden of view maintenance is

removed from the update transactions.

View selection is handled as an optimiza-

tion problem, which can be described as a tuple

(S;V;M;Q) , where S is the schema together with

some size estimation model, V is the set of all the

possible views to choose from, M is a quantity

denoting the space available for materializing

views, and Q is a set of queries. The problem is

to find a subset of V that optimizes the queries

against Q while the storage requirement is no larger

than M . The overall evaluation cost is formulated

as a weighted sum over Q :

vIeW SelectIon

E ( V ) = Σ q ∈ Q E ( q , V ) × f q

where f q denotes the frequency of query q , and

E(q,V) is the minimal possible evaluation cost of

q using the selected views V. Besides the cost of

queries, view maintenance can be considered in

a similar way:

View selection addresses the problem of choosing

materialized views in the design process of data

warehouses. Informally, it can be stated as, given

an estimated query workload and possibly also sets

of candidate views, to select a set of views, such

that some operation goals (e.g., average query

response time, maintenance costs, or both) are op-

timized and meanwhile some resource constraints

are satisfied. View selection, view maintenance,

and answering queries using views (Halevy, 2001)

comprise a relatively complete framework for

query processing in data warehousing.

In this section, we discuss the challenges

of and solutions to the view selection problem.

We first state the formal framework of the view

selection problem and some theoretical aspects

based on the framework (Section 4.1). We then

describe a taxonomy that we use to classify view

selection techniques (Section 4.2). The main body

of view selection approaches in the static setting

is then discussed (Section 4.3), where the query

workload is assumed to be known in advance.

Finally, dynamic view selection approaches are

introduced (Section 4.4).

U ( V ) = Σ v ∈ V U ( v ) × f v

where f v denotes the frequency of updating view

v , and U(v) is the cost of updating the view. Thus,

the task is a multi-goal optimization problem:

C ( V ) = E ( V ) + α U ( V )

where α is a positive weight parameter ad-

justing these two goals. Normally, these two

goals lead to different solutions if considered

separately. For example, given a large enough

storage, minimizing query evaluation requires

all queries in the workload to be materialized as

views, while update cost is minimized when only

base relations are stored.

The view selection problem is known to be

NP-hard , since even the simplest form in a hyper-

cube has a reduction from the set cover problem

(Harinarayan et al., 1996).

theoretical Perspectives

We focus on static view selection in this section

and start by introducing a model treating the view

selection problem as an optimization problem.

Evolving Application Domains of Data Warehousing and Mining

Search WWH ::

Custom Search

Home