5.1 Overview of Sensor Data Compression System
The goal of the sensor data compression system is to approximate a
sensor data stream by a set of functions. The data compression methods
studied in this section are lossy: they permit approximation errors,
which are characterized by a specific error norm. A standard approach
to sensor data compression is to segment the data stream into data
segments, and then approximate each data segment so that a specific
error norm is satisfied. For example, under the $L_\infty$ norm, each
sensor value of the data stream is approximated within an error bound
$\epsilon$.
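As a minimal sketch of what a per-value (maximum-error, i.e. $L_\infty$) bound means in practice, the following Python snippet checks that every approximated value lies within a given bound of the corresponding raw value. The function name `within_linf_bound`, the sample values, and the bound `eps` are illustrative assumptions, not part of the system described here.

```python
from typing import Sequence


def within_linf_bound(raw: Sequence[float], approx: Sequence[float], eps: float) -> bool:
    """Return True if |raw[i] - approx[i]| <= eps for every i (max-error criterion)."""
    if len(raw) != len(approx):
        raise ValueError("raw and approx must have the same length")
    return all(abs(v - a) <= eps for v, a in zip(raw, approx))


# Hypothetical sensor values and a hypothetical approximation of them.
raw_values = [20.0, 20.4, 21.1, 21.5]
approx_values = [20.0, 20.5, 21.0, 21.5]

print(within_linf_bound(raw_values, approx_values, eps=0.2))   # bound holds
print(within_linf_bound(raw_values, approx_values, eps=0.05))  # bound violated
```

A segmentation algorithm would typically keep extending a segment for as long as a check of this kind succeeds, and start a new segment once it fails.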
Let us assume that we have $K$ segments of a data stream. We denote
these segments as $g_1, g_2, \ldots, g_K$, where $g_1$ approximates the
data tuples $((t_1, v_1), \ldots, (t_{i_1}, v_{i_1}))$, while $g_k$,
where $k = 2, \ldots, K$, approximates the data items
$((t_{i_{k-1}+1}, v_{i_{k-1}+1}), (t_{i_{k-1}+2}, v_{i_{k-1}+2}), \ldots, (t_{i_k}, v_{i_k}))$.
Similar to [20], we distinguish between two classes of segments used
for approximation, namely connected segments and disconnected segments.
In connected segments, the ending point of the previous segment is the
starting point of the new segment. In disconnected segments, by
contrast, the approximation of the new segment starts from the
subsequent data item in the stream. Disconnected segments offer more
approximation flexibility and may lead to fewer segments; however, for
linear approximation [35], they require storing two data tuples (i.e.,
start tuple and end tuple) per data segment, whereas connected segments
require only one, because each segment's start tuple is the end tuple
of the previous segment.
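The storage trade-off between the two classes can be illustrated with a small Python sketch. It makes a simplifying assumption that segment endpoints coincide with raw data tuples and that the segment boundaries (`ends`) are already known; the function names and sample stream are hypothetical.

```python
from typing import List, Sequence, Tuple

DataTuple = Tuple[float, float]  # (timestamp, value)


def connected_storage(stream: Sequence[DataTuple], ends: Sequence[int]) -> List[DataTuple]:
    """Connected segments: store the first tuple once, then one end tuple per segment.

    Each segment starts at the end tuple of the previous one.
    """
    return [stream[0]] + [stream[i] for i in ends]


def disconnected_storage(
    stream: Sequence[DataTuple], ends: Sequence[int]
) -> List[Tuple[DataTuple, DataTuple]]:
    """Disconnected segments: store a (start, end) pair per segment.

    Each segment starts at the data item that follows the previous segment's end.
    """
    stored = []
    start = 0
    for end in ends:
        stored.append((stream[start], stream[end]))
        start = end + 1
    return stored


# Hypothetical stream of six tuples, split into K = 2 segments ending at indices 2 and 5.
stream = [(0, 1.0), (1, 1.5), (2, 2.0), (3, 5.0), (4, 5.5), (5, 6.0)]
ends = [2, 5]

connected = connected_storage(stream, ends)
disconnected = disconnected_storage(stream, ends)

print(len(connected))         # K + 1 tuples stored in total
print(2 * len(disconnected))  # 2K tuples stored in total
```

In this example the disconnected variant avoids having to bridge the jump between the values 2.0 and 5.0 with a line, at the cost of storing an extra tuple per segment.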
Since functions are employed for approximating data segments, only
the approximated data segments are stored in the database, instead
of the raw sensor values of the data stream [64, 50]. A schema for
linear segments is presented in [64], consisting of a table, referred to
as FunctionTable, where each row represents a linear model with
attributes start time, end time, slope and intercept (i.e., base) of the
segment. In the case of connected segments [20], the end time attribute
can be omitted.
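A table of this kind could be sketched with Python's built-in `sqlite3` module as follows. The underscored column names, the sample rows, the helper `approx_value`, and the assumption that a segment's value is computed as `slope * t + intercept` are all illustrative; the exact schema and evaluation rule of [64] may differ.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")

# One row per linear segment (disconnected variant, so end_time is kept).
conn.execute(
    """
    CREATE TABLE FunctionTable (
        start_time REAL NOT NULL,
        end_time   REAL NOT NULL,
        slope      REAL NOT NULL,
        intercept  REAL NOT NULL
    )
    """
)

# Hypothetical segments covering [0, 10] and [11, 20].
conn.executemany(
    "INSERT INTO FunctionTable VALUES (?, ?, ?, ?)",
    [(0.0, 10.0, 0.5, 20.0), (11.0, 20.0, -0.25, 27.5)],
)


def approx_value(db: sqlite3.Connection, t: float) -> Optional[float]:
    """Reconstruct the approximated sensor value at time t, or None if no segment covers t."""
    row = db.execute(
        "SELECT slope, intercept FROM FunctionTable "
        "WHERE start_time <= ? AND end_time >= ? "
        "ORDER BY start_time DESC LIMIT 1",
        (t, t),
    ).fetchone()
    if row is None:
        return None
    slope, intercept = row
    return slope * t + intercept


print(approx_value(conn, 4.0))   # value on the first segment
print(approx_value(conn, 12.0))  # value on the second segment
print(approx_value(conn, 10.5))  # falls between segments, so no value
```

Queries over the stream are thus answered from the stored model parameters rather than from raw readings.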
A more generic schema for storing data streams, approximated by
multiple models, was proposed in [50]. It consists of one table
(SegmentTable) for storing data segments, and a second table
(ModelTable) for storing the model functions, as depicted in Figure
2.11. A tuple of SegmentTable contains the approximation data for a
segment in the time interval [start time, end time]. The attribute
id identifies the model that is used in the segment.
The primary key in the SegmentTable is the start time, while in the