Database Reference
ratio (CSR), which measures the percentage of total query costs saved due to hits in the cache. Because of the chunking organization in (Deshpande et al., 1998), a simple metric of coverage of base tables is adopted for measuring the benefit of cached chunks.

Recency metrics of caching are well studied in the literature. One well-known strategy is Least Recently Used (LRU), which discards the least recently accessed cached data. The strategy was extended to LRU-K by O'Neil et al. (1993) to take advantage of recent access patterns. Deshpande et al. (1998) utilize the CLOCK algorithm (Silberschatz et al., 2002), an efficient approximation of LRU.
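As a concrete illustration, the CLOCK approximation can be sketched as follows. This is a minimal sketch, not an implementation from the cited papers: the class name and interface are hypothetical, and the choice to start new entries with a cleared reference bit is one of several common variants.

```python
class ClockCache:
    """Sketch of the CLOCK algorithm: a one-bit approximation of LRU.

    Each cached entry carries a reference bit that is set on every hit.
    A "clock hand" sweeps over the entries on eviction, clearing set bits
    and evicting the first entry whose bit is already clear -- roughly the
    least recently used one, without maintaining a full recency list.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = []   # each slot is [key, value, reference_bit]
        self.index = {}   # key -> slot position
        self.hand = 0     # clock hand position

    def get(self, key):
        pos = self.index.get(key)
        if pos is None:
            return None
        self.slots[pos][2] = 1        # hit: set the reference bit
        return self.slots[pos][1]

    def put(self, key, value):
        if key in self.index:
            pos = self.index[key]
            self.slots[pos][1] = value
            self.slots[pos][2] = 1
            return
        if len(self.slots) < self.capacity:
            # New entries start with the bit clear (variants differ here).
            self.index[key] = len(self.slots)
            self.slots.append([key, value, 0])
            return
        # Evict: sweep the hand, clearing set bits, until an entry with a
        # clear bit is found; that entry has not been touched for one sweep.
        while self.slots[self.hand][2] == 1:
            self.slots[self.hand][2] = 0
            self.hand = (self.hand + 1) % self.capacity
        old_key = self.slots[self.hand][0]
        del self.index[old_key]
        self.index[key] = self.hand
        self.slots[self.hand] = [key, value, 0]
        self.hand = (self.hand + 1) % self.capacity
```

The sweep makes each eviction amortized cheap: no per-access reordering is needed, only a single bit update, which is why CLOCK is attractive as an LRU stand-in.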
There are two ways to use the benefit metric and the recency metric simultaneously. One way is to consider recency and benefit in parallel: first use LRU to select a candidate set, then use benefit to decide which entry to replace (Scheuermann et al., 1996). The other way is to use an aging strategy to obtain the benefit accrued in a recent time window and then use this windowed benefit for both candidate-set selection and the replacement decision (Deshpande et al., 1998).
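The first, parallel combination can be sketched as follows, assuming a cache held in an `OrderedDict` ordered from least to most recently used and a hypothetical precomputed `benefit` map; the function name and both parameters are illustrative, not taken from the cited papers.

```python
from collections import OrderedDict

def evict_lru_then_benefit(cache, benefit, k=4):
    """Sketch of the parallel recency/benefit policy: LRU first narrows
    the choice to the k least recently used entries, then the entry with
    the smallest benefit among those candidates is evicted.

    `cache` is an OrderedDict ordered from least to most recently used;
    `benefit` maps each key to its (assumed precomputed) benefit value.
    Returns the evicted key.
    """
    candidates = list(cache.keys())[:k]            # k least recently used
    victim = min(candidates, key=lambda key: benefit[key])
    del cache[victim]
    return victim
```

For example, with five cached queries ordered oldest-first and a candidate set of size three, the policy evicts the candidate with the lowest benefit even if it is not the single least recently used entry.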
DATA STREAM MANAGEMENT

Whereas in traditional database management systems different queries are posed against static data, in many applications a relatively fixed set of processing tasks must be evaluated against an ever-changing sequence of data tuples. Such monitoring applications (Abadi et al., 2003) evaluate their queries against streams of data. In contrast to a database that resides entirely in a set of (virtual) files, a data stream is a rapidly flowing stream of structured data, so vast in volume that it is usually impossible to store the complete data in persistent storage.

Common examples of monitoring applications are the analysis of financial tickers, web click-stream analysis, traffic monitoring, and network traffic analysis. The characteristics of this family of applications have significant implications for storage and query processing, which make it impossible to use conventional DBMSs for such tasks.

These new requirements gave rise to a new class of data management systems, so-called data stream management systems (DSMSs) (Babcock et al., 2002). Although there are similarities between data stream management systems and conventional database management systems, the requirements of data stream analysis necessitate new types of queries and new query evaluation techniques.

Several issues arise in a DSMS. Special care must be taken in the incremental computation of stateful operators such as joins and aggregations, because they could block a query. Queries are usually evaluated only over a window of the most recent data, since otherwise the amount of data to be considered would grow without bound. All operators must process incoming tuples incrementally. Furthermore, continuous query evaluation must take into account common subexpressions of the queries registered with the streams, so that the same operators are executed only once and their results are streamed to the subsequent operators of concurrent queries.

This section surveys different aspects of DSMSs and queries against data streams. After giving a formal definition of the notion of a data stream, we describe different possibilities for defining the portions of streams to be used for query answering. These methods, called window models, are a particularly characteristic feature of DSMSs and a means of computing approximate query answers. Due to the amount of incoming data, storage of streaming data is often possible only in an aggregated form. Therefore, we conclude the discussion with a survey of different techniques for producing such synopses or digests of streaming data.

For the remainder of this section we adopt some definitions by Arasu et al. (2006), since most other models can be reduced to their model. We