View Management Techniques and Their Application to Data Stream Management - Evolving Application Domains of Data Warehousing and Mining

Window Models

on a subset of relational algebra shown to have

tractable complexity of maintaining views. Al-

though the authors do not mention the term data

stream, the requirements that have motivated

the development of the chronicle data model are

the same as for DSMSs. Their system is used

for maintenance of materialized views. Its query

language is basically a restriction of SQL with

some special extensions. It allows for queries

including relations, views, and chronicles. In

this scenario, a chronicle is a sequence of tuple

insertions. It is modeled as a relation with an extra

sequencing attribute, which can be regarded as a

timestamp. Based on the chronicle, materialized

views are maintained. Thus, like a data stream

query, a materialized view is a result set being

continually updated.
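
The chronicle idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the chronicle system's actual design: the class names `Chronicle` and `SumByKeyView`, the attribute name `seq`, and the call-record data are all hypothetical. Each insertion receives a sequencing attribute and is pushed to the registered views, which update their result incrementally.

```python
from collections import defaultdict


class Chronicle:
    """Sequence of tuple insertions with a sequencing attribute (hypothetical sketch)."""

    def __init__(self):
        self._next_seq = 0
        self._views = []

    def register_view(self, view):
        self._views.append(view)

    def insert(self, record):
        # Attach the sequencing attribute, which can be read as a timestamp.
        row = dict(record, seq=self._next_seq)
        self._next_seq += 1
        # Propagate the insertion to every materialized view.
        for view in self._views:
            view.apply(row)
        return row


class SumByKeyView:
    """Materialized view computing SUM(value) GROUP BY key, updated per insertion."""

    def __init__(self, key, value):
        self.key, self.value = key, value
        self.totals = defaultdict(float)

    def apply(self, row):
        self.totals[row[self.key]] += row[self.value]


calls = Chronicle()
minutes = SumByKeyView(key="customer", value="minutes")
calls.register_view(minutes)
for rec in [
    {"customer": "a", "minutes": 3.0},
    {"customer": "b", "minutes": 1.5},
    {"customer": "a", "minutes": 2.0},
]:
    calls.insert(rec)
print(dict(minutes.totals))
```

In this sketch the view never rescans earlier insertions; each new tuple costs one dictionary update, which is the kind of cheap maintenance the restricted algebra is meant to allow.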

Arasu et al. (2006) developed the continuous

query language CQL. A formal abstract semantics

for CQL as well as some details about its query

plans are given. The language has been (almost

completely) implemented in the DSMS prototype

system STREAM.

The central idea of CQL is to reuse the large

body of research that has been conducted in the

area of relational databases. CQL therefore is

comprised of a set of stream-to-relation operators

that are essentially window definition operators

(cf. section 5.2), the relation-to-relation operators

well known from the relational algebra and three

relation-to-stream operators that produce streams

from relations.

The approach requires the definition of a relation

R(τ) corresponding to a stream at a certain

point in time (the instantaneous relation).

Definition 2 (Relation) A relation R is a map-

ping from each time instant in T to a finite but

unbounded bag of tuples belonging to the schema

of R.
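
As a rough illustration of the definition, the following Python sketch models the instantaneous relation R(τ) as a bag (`collections.Counter`) derived from a stream by a time-based window, and then turns relations back into streams. The operator names Istream, Dstream, and Rstream are taken from Arasu et al. (2006); everything else is an assumption of this sketch, including the integer time domain, the half-open window interval (τ - width, τ], and the function names.

```python
from collections import Counter


def window_relation(stream, tau, width):
    """Stream-to-relation: bag of tuples whose timestamp lies in (tau - width, tau]."""
    return Counter(value for ts, value in stream if tau - width < ts <= tau)


def istream(rel_now, rel_prev):
    """Tuples that entered the relation: bag difference R(tau) - R(tau - 1)."""
    return rel_now - rel_prev


def dstream(rel_now, rel_prev):
    """Tuples that left the relation: bag difference R(tau - 1) - R(tau)."""
    return rel_prev - rel_now


def rstream(rel_now):
    """All tuples of the relation at the current instant."""
    return rel_now


# A stream of (timestamp, value) pairs; the data is made up for illustration.
stream = [(1, "x"), (2, "y"), (2, "x"), (4, "z")]
r2 = window_relation(stream, tau=2, width=2)
r3 = window_relation(stream, tau=3, width=2)
r4 = window_relation(stream, tau=4, width=2)

print("R(2):", dict(r2))
print("Dstream at 3:", dict(dstream(r3, r2)))
print("Istream at 4:", dict(istream(r4, r3)))
```

Using a bag rather than a set matters here: the value "x" occurs twice in R(2), and only one of its occurrences expires at time 3.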

Furthermore, CQL allows operations integrat-

ing static data with streaming data.

A database models the state of the domain of

discourse at the current time. Queries are evalu-

ated once on the complete database and in return

produce an iterator over the result set. On the other

hand, queries in a DSMS are registered once with

the incoming data streams and are continuously

being evaluated. Thus, the query must produce

records incrementally as it is evaluated against

new incoming tuples.
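
The contrast between the two evaluation styles can be sketched with Python generators. This is a simplified illustration, not how any particular DSMS is built; `one_shot_query`, `continuous_query`, and the `sensor_readings` source with its speed formula are hypothetical.

```python
import itertools


def one_shot_query(table, predicate):
    """DBMS style: evaluate once over the complete, finite table, return an iterator."""
    return iter([row for row in table if predicate(row)])


def continuous_query(stream, predicate):
    """DSMS style: registered once, emits each result as soon as a matching tuple arrives."""
    for row in stream:
        if predicate(row):
            yield row


def sensor_readings():
    # Hypothetical unbounded source; it never terminates on its own.
    for i in itertools.count():
        yield {"id": i, "speed": 40 + (i * 7) % 50}


# The one-shot query needs all data up front.
table = list(itertools.islice(sensor_readings(), 10))
print(list(one_shot_query(table, lambda r: r["speed"] > 80)))

# The continuous query works on the unbounded source; we stop after three results.
fast = continuous_query(sensor_readings(), lambda r: r["speed"] > 80)
first_three = list(itertools.islice(fast, 3))
print([r["id"] for r in first_three])
```

Calling `one_shot_query` directly on `sensor_readings()` would never return, because building the full result list requires the end of the input.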

Whereas some query operators like projection

and selection can be evaluated without state on

a per-tuple basis, other operators, such as join,

aggregation, or sorting, need to consider every

available tuple before producing a result, a requirement

that cannot be fulfilled due to the potentially

infinite size of data streams. In such cases only

an approximate result can be produced. A natural

way of approximating continuous query results is

to define a window over which the query is evalu-

ated. This is actually often the desired semantics

of queries. For instance, if continuous queries

are used for decision support based on financial

tickers, the user is likely more interested in recent

information than in historic data.
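
A minimal sketch of a windowed aggregate, assuming a time-based sliding window and tuples that arrive in timestamp order: the class name `SlidingTimeWindowAverage`, the window width, and the ticker prices are all made up for illustration. The operator keeps state only for the tuples currently in the window, so its memory use is bounded by the window contents rather than by the length of the stream.

```python
from collections import deque


class SlidingTimeWindowAverage:
    """Average over tuples whose timestamp lies in (now - width, now]."""

    def __init__(self, width):
        self.width = width
        self.items = deque()  # (timestamp, price) pairs, oldest first
        self.total = 0.0

    def insert(self, ts, price):
        # Assumes timestamps arrive in non-decreasing order.
        self.items.append((ts, price))
        self.total += price
        # Expire tuples that have fallen out of the window.
        while self.items and self.items[0][0] <= ts - self.width:
            _, old_price = self.items.popleft()
            self.total -= old_price
        # The window always contains at least the tuple just inserted.
        return self.total / len(self.items)


avg = SlidingTimeWindowAverage(width=3)
ticks = [(0, 10.0), (1, 12.0), (2, 14.0), (5, 20.0)]
results = [avg.insert(ts, price) for ts, price in ticks]
print(results)
```

At timestamp 5 the three older ticks have all expired, so the reported average reflects only the most recent price, matching the preference for recent information described above.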

One way to categorize window models is by how

they determine which elements are currently in

the window and which are not, i.e., which records

fulfill the validity criterion of the window defi-

nition. The most important decision criterion in

data stream applications is time. To be usable for

window models the records in a data stream have

to be comparable, i.e., they must be monotonically

sortable according to some ordering. One method

is to use time stamps, where implicit timestamps

(created when the tuple streams in) and explicit

timestamps (an existing attribute of the tuple) can

be distinguished (Babcock et al., 2002). Another

possibility to make records sortable is to use unique

sequence numbers.
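
The three ordering options can be illustrated as follows. This is a sketch under assumptions: the helper names `stamp_implicit` and `stamp_explicit`, the field names `ts`, `seq`, and `measured_at`, and the injectable `clock` parameter are hypothetical and chosen only to keep the example deterministic.

```python
import itertools
import time

_sequence = itertools.count()  # unique, increasing sequence numbers


def stamp_implicit(record, clock=time.time):
    """Implicit timestamp: assigned by the system when the tuple streams in."""
    return {**record, "ts": clock(), "seq": next(_sequence)}


def stamp_explicit(record, attr):
    """Explicit timestamp: taken from an existing attribute of the tuple."""
    return {**record, "ts": record[attr], "seq": next(_sequence)}


# A fake clock that returns the same arrival time twice (illustration only).
fake_clock = iter([100.0, 100.0]).__next__

a = stamp_implicit({"detector": "d1"}, clock=fake_clock)
b = stamp_implicit({"detector": "d2"}, clock=fake_clock)
c = stamp_explicit({"detector": "d3", "measured_at": 99.5}, attr="measured_at")

# Sort by timestamp; the sequence number breaks ties between equal timestamps.
ordered = sorted([a, b, c], key=lambda r: (r["ts"], r["seq"]))
print([r["detector"] for r in ordered])
```

Because `d1` and `d2` share an arrival time here, the timestamp alone does not order them; the sequence number does.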

In our traffic state estimation example, we are

mostly interested in monitoring recent data stream-

ing in from the stationary and mobile detectors.
