A SURVEY OF MODEL-BASED SENSOR DATA ACQUISITION AND MANAGEMENT - Managing and Mining Sensor Data

5.1 Overview of Sensor Data Compression System

The goal of a sensor data compression system is to approximate a sensor data stream by a set of functions. The data compression methods studied in this section are lossy: they permit approximation errors, which are characterized by a specific error norm. A standard approach to sensor data compression is to divide the data stream into data segments and then approximate each data segment so that a bound on the chosen error norm is satisfied. For example, under the $L_\infty$ norm, each sensor value of the data stream is approximated within a given error bound.
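
The segment-then-approximate idea can be illustrated with a minimal sketch. It assumes the simplest possible model, a constant per segment, which is a stand-in chosen for brevity and not a method attributed to the survey. The names `segment_constant`, `linf_error`, and the bound `eps` are hypothetical. A segment is extended greedily for as long as the spread of its values stays within `2 * eps`, so that the midrange constant keeps every value within `eps`.

```python
from typing import List, Tuple

# (first index, last index, constant value); a hypothetical representation
Segment = Tuple[int, int, float]


def segment_constant(values: List[float], eps: float) -> List[Segment]:
    """Greedily split `values` into segments, each approximated by a constant.

    Assumption: piecewise-constant models; each constant is the midrange
    (min + max) / 2, so the maximum error per segment is at most `eps`.
    """
    segments: List[Segment] = []
    if not values:
        return segments
    start = 0
    lo = hi = values[0]
    for i in range(1, len(values)):
        new_lo, new_hi = min(lo, values[i]), max(hi, values[i])
        if new_hi - new_lo > 2 * eps:
            # Adding values[i] would break the bound: close the segment.
            segments.append((start, i - 1, (lo + hi) / 2))
            start = i
            lo = hi = values[i]
        else:
            lo, hi = new_lo, new_hi
    segments.append((start, len(values) - 1, (lo + hi) / 2))
    return segments


def linf_error(values: List[float], segments: List[Segment]) -> float:
    """Largest absolute deviation between raw values and their approximation."""
    return max(abs(values[i] - c) for s, e, c in segments for i in range(s, e + 1))


readings = [1.0, 1.2, 0.9, 5.0, 5.1, 4.8, 9.0]
segs = segment_constant(readings, eps=0.25)
```

Here seven raw readings collapse into three stored segments, and `linf_error(readings, segs)` stays below the chosen bound.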

Let us assume that we have $K$ segments of a data stream. We denote these segments as $g_1, g_2, \ldots, g_K$, where $g_1$ approximates the data tuples $((t_1, v_1), \ldots, (t_{i_1}, v_{i_1}))$, while $g_k$, where $k = 2, \ldots, K$, approximates the data tuples $((t_{i_{k-1}+1}, v_{i_{k-1}+1}), (t_{i_{k-1}+2}, v_{i_{k-1}+2}), \ldots, (t_{i_k}, v_{i_k}))$. Similar to [20], we distinguish between two classes of segments used for approximation, namely connected segments and disconnected segments. In connected segments, the ending point of the previous segment is the starting point of the new segment. In disconnected segments, by contrast, the approximation of the new segment starts from the subsequent data item in the stream. Disconnected segments offer more approximation flexibility and may lead to fewer segments; however, for linear approximation [35], they require storing two data tuples (i.e., start tuple and end tuple) per data segment, whereas connected segments share each end tuple with the start of the next segment.
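
The storage difference between the two classes can be sketched as follows for linear approximation. This is an illustrative layout only: the names `eval_connected`, `eval_disconnected`, and `interpolate` are hypothetical, and the point values are made up for the example. Connected segments are held as one shared list of knots, while each disconnected segment carries its own start and end tuple.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (time, value)


def interpolate(p: Point, q: Point, t: float) -> float:
    """Linear interpolation between two stored tuples."""
    (t0, v0), (t1, v1) = p, q
    if t1 == t0:
        return v0
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def eval_connected(knots: List[Point], t: float) -> float:
    """Connected segments: segment k runs from knots[k] to knots[k + 1]."""
    for p, q in zip(knots, knots[1:]):
        if p[0] <= t <= q[0]:
            return interpolate(p, q, t)
    raise ValueError("t is outside the approximated range")


def eval_disconnected(segments: List[Tuple[Point, Point]], t: float) -> float:
    """Disconnected segments: each one stores its own start and end tuple."""
    for p, q in segments:
        if p[0] <= t <= q[0]:
            return interpolate(p, q, t)
    raise ValueError("t is not covered by any segment")


# Two connected segments need three stored tuples (the middle one is shared).
connected = [(0.0, 0.0), (2.0, 4.0), (4.0, 0.0)]
# Two disconnected segments need four stored tuples, but may jump in value.
disconnected = [((0.0, 0.0), (2.0, 4.0)), ((3.0, 10.0), (5.0, 12.0))]
```

In this layout, K connected segments occupy K + 1 tuples and K disconnected segments occupy 2K tuples, which is the trade-off against the extra flexibility described above.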

Since functions are employed for approximating data segments, only the approximated data segments are stored in the database, instead of the raw sensor values of the data stream [64, 50]. A schema for linear segments is presented in [64]. It consists of a table, referred to as FunctionTable, in which each row represents a linear model with the attributes start time, end time, slope, and intercept (i.e., base) of the segment. In the case of connected segments [20], the end time attribute can be omitted, since it coincides with the start time of the next segment.
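
A table of this shape can be sketched with Python's built-in `sqlite3` module. Several details here are assumptions and not taken from [64]: the column names `start_time` and `end_time`, the `REAL` column types, the half-open interval `[start_time, end_time)`, the helper `value_at`, and the convention that a value is reconstructed as `slope * t + intercept` with the intercept defined at `t = 0`. The inserted rows are made-up example data.

```python
import sqlite3

# In-memory database holding one row per linear segment (illustrative schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE FunctionTable (
        start_time REAL NOT NULL,
        end_time   REAL NOT NULL,
        slope      REAL NOT NULL,
        intercept  REAL NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO FunctionTable VALUES (?, ?, ?, ?)",
    [
        (0.0, 10.0, 2.0, 1.0),     # v(t) = 2t + 1   for 0 <= t < 10
        (10.0, 20.0, -1.0, 31.0),  # v(t) = -t + 31  for 10 <= t < 20
    ],
)


def value_at(db: sqlite3.Connection, t: float):
    """Reconstruct the approximated sensor value at time t, or None if uncovered."""
    row = db.execute(
        "SELECT slope, intercept FROM FunctionTable "
        "WHERE start_time <= ? AND ? < end_time",
        (t, t),
    ).fetchone()
    if row is None:
        return None
    slope, intercept = row
    return slope * t + intercept
```

With this layout, a point query touches a single row regardless of how many raw readings the segment replaced. For connected segments, the `end_time` column could be dropped and the covering row found as the one with the largest `start_time` not exceeding `t`.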

A more generic schema for storing data streams approximated by multiple models was proposed in [50]. It consists of one table, referred to as SegmentTable, for storing data segments, and a second table, ModelTable, for storing the model functions, as depicted in Figure 2.11. A tuple of SegmentTable contains the approximation data for a segment in the time interval [start time, end time]. The attribute id identifies the model that is used in the segment. The primary key in the SegmentTable is the start time, while in the
