Background
In offline streaming applications, data is collected
over a relatively long period of time.
Online streaming applications are characterized
by data that is updated in real time and must be
processed quickly as it arrives. Predicting the frequency
of Internet packet streams is an example of
mining online data streams, because the prediction
must be made in real time. Other potential
online data streaming applications include stock
tickers, network measurements, and the evaluation of
sensor data. In online data streaming applications,
data is often discarded soon after it arrives and
has been processed, because the high update
rate produces a huge amount of data.
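The process-then-discard pattern described above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the packet layout, the `packet_source` stand-in feed, and the function names are hypothetical and do not come from the text. The code makes a single pass over the stream and keeps only a per-source count, never the packets themselves.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical packet layout: (source address, size in bytes).
Packet = Tuple[str, int]


def packet_source() -> Iterator[Packet]:
    """Stand-in for a live feed; a real stream would not end."""
    yield from [
        ("10.0.0.1", 60),
        ("10.0.0.2", 1500),
        ("10.0.0.1", 40),
        ("10.0.0.1", 52),
    ]


def count_by_source(packets: Iterable[Packet]) -> Dict[str, int]:
    """Single pass: update a summary per packet, then let the packet go."""
    counts: Counter = Counter()
    for src, _size in packets:
        counts[src] += 1  # only the running count is retained
    return dict(counts)


print(count_by_source(packet_source()))  # {'10.0.0.1': 3, '10.0.0.2': 1}
```

Note that the summary itself still grows with the number of distinct sources, so a real deployment would likely need a bounded-memory summary instead of an exact counter.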
Figure 1 shows an example of a wireless sensor
stream application. In this figure, the sensors
generate a series of sensor streams that carry domain
information, such as sensor identifiers, sensor
locations, time stamps, sensed values at a
particular time, and the power remaining in each
sensor. The information is reported to the
server through single-hop or multiple-hop routing.
General stream mining methodologies do
not have a mechanism to connect the domain
information of each application to the reported
sensor values; consequently, the discovered
patterns cannot be related back to the individual
sensors that produced them.
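As a rough illustration of what keeping domain information attached to sensed values could look like, the sketch below stores each reading as one record and groups readings by sensor identifier. This is not the method of the text: the `SensorReading` type, its field names, the coordinate format, and the units are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class SensorReading:
    """One reported reading together with its domain information."""
    sensor_id: str
    location: Tuple[float, float]  # assumed (x, y) coordinates
    timestamp: float               # assumed seconds since some epoch
    value: float                   # sensed value at that time
    power_left: float              # assumed fraction of battery left, 0..1


def group_by_sensor(
    readings: Iterable[SensorReading],
) -> Dict[str, List[SensorReading]]:
    """Index readings by sensor so results can be traced to their source."""
    grouped: Dict[str, List[SensorReading]] = defaultdict(list)
    for reading in readings:
        grouped[reading.sensor_id].append(reading)
    return dict(grouped)


readings = [
    SensorReading("s1", (0.0, 0.0), 1.0, 20.5, 0.90),
    SensorReading("s2", (5.0, 2.0), 1.0, 19.0, 0.75),
    SensorReading("s1", (0.0, 0.0), 2.0, 21.0, 0.89),
]
by_sensor = group_by_sensor(readings)
print({sid: [r.value for r in rs] for sid, rs in by_sensor.items()})
# {'s1': [20.5, 21.0], 's2': [19.0]}
```

Because every value travels with its identifier, location, and time stamp, anything computed per group can be reported against a specific sensor.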
This section reviews the background for two main
areas: the sensor stream application domain and the
data warehousing and mining issues that must be
considered in this specific domain. These are covered
in the following two subsections, respectively.
Sensor Stream Application Domain
A data stream is a sequence of items that arrive in
time order. Unlike data in traditional static
databases, data streams are continuous, unbounded,
usually arrive at high speed, and have a data value
distribution that often changes with time (Guha,
2001). A data stream is represented mathematically
as an ordered pair (r, Δ), where r is a sequence of
tuples, Δ is a sequence of time intervals (i.e., rational
or real numbers), and each Δ_i > 0.
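The ordered-pair definition can be made concrete with a minimal sketch. It rests on assumptions the text does not state: only a finite prefix of the stream is held, Δ_i is taken to be the gap before the i-th tuple arrives, and the class and method names are invented for illustration.

```python
from dataclasses import dataclass
from itertools import accumulate
from typing import Any, List, Sequence, Tuple


@dataclass(frozen=True)
class DataStream:
    """A finite prefix of a stream (r, delta); names are illustrative."""
    r: Sequence[Tuple[Any, ...]]  # the tuples, in arrival order
    delta: Sequence[float]        # delta[i]: assumed gap before r[i] arrives

    def __post_init__(self) -> None:
        if len(self.r) != len(self.delta):
            raise ValueError("r and delta must have the same length")
        if any(d <= 0 for d in self.delta):
            raise ValueError("every time interval must be strictly positive")

    def arrival_times(self, start: float = 0.0) -> List[float]:
        """Arrival time of each tuple: running sum of the intervals."""
        return [start + t for t in accumulate(self.delta)]


stream = DataStream(
    r=[("s1", 20.5), ("s2", 19.0), ("s1", 21.0)],
    delta=[1, 2, 1],
)
print(stream.arrival_times())  # [1.0, 3.0, 4.0]
```

Requiring every interval to be strictly positive is what guarantees that the tuples have a strict time order, with no two arriving at the same instant.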
Applications that rely on data streams can be
classified into offline and online streaming. Offline
streaming applications are characterized by
regular bulk arrivals (Manku, 2002). Generating
reports from accumulated web log streams is
an example of mining offline data streams, because
most reports are based on log data that is
collected over a relatively long period of time.
Figure 1. Wireless sensor stream application