A SURVEY OF MODEL-BASED SENSOR DATA ACQUISITION AND MANAGEMENT - Managing and Mining Sensor Data

ratio segment, instead of storing the original data segment. The idea

here is that, in practice, the ratio segment is flat and therefore can be

significantly compressed as compared to the original data segment.

Furthermore, the objective of the GAMPS approach is to identify a

set of base segments, and associate every data segment with a base

segment, such that the ratio segment can be used for reconstructing the

data segment within an L∞ error bound. The problem of identification

of the base segments is posed as a facility location problem. Since this

problem is NP-hard, a polynomial-time approximation algorithm is used

for solving it and for producing the base segments and the assignment

between the base segments and the data segments.
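
The ratio-segment idea can be illustrated with a minimal Python sketch. It is not the GAMPS algorithm itself: it assumes a base segment has already been chosen, that all base values are positive, and that the ratio is approximated by piecewise-constant buckets. The names compress_ratio, reconstruct and eps are hypothetical and are used only for illustration.

```python
def compress_ratio(base, data, eps):
    """Greedy piecewise-constant approximation of the ratio data[i] / base[i].

    Returns a list of (start_index, ratio) buckets such that
    |base[i] * ratio - data[i]| <= eps for every i in the bucket
    (up to floating-point rounding).
    Assumes non-empty inputs and base[i] > 0 (assumptions of this sketch).
    """
    buckets = []
    start, lo, hi = 0, float("-inf"), float("inf")
    for i, (b, d) in enumerate(zip(base, data)):
        # Interval of ratios r that keep b * r within eps of d.
        point_lo, point_hi = (d - eps) / b, (d + eps) / b
        new_lo, new_hi = max(lo, point_lo), min(hi, point_hi)
        if new_lo > new_hi:
            # No single ratio fits any more: close the bucket, open a new one.
            buckets.append((start, (lo + hi) / 2))
            start, new_lo, new_hi = i, point_lo, point_hi
        lo, hi = new_lo, new_hi
    buckets.append((start, (lo + hi) / 2))
    return buckets


def reconstruct(base, buckets):
    """Rebuild the data segment from the base segment and the ratio buckets."""
    out = []
    for k, (start, ratio) in enumerate(buckets):
        end = buckets[k + 1][0] if k + 1 < len(buckets) else len(base)
        out.extend(base[i] * ratio for i in range(start, end))
    return out
```

When the data segment is close to a scaled copy of the base segment, the ratio is nearly flat and a single bucket covers the whole segment, so only one number needs to be stored in addition to the reference to the base segment.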

Prior to GAMPS, Deligiannakis et al. [14] proposed the self-based

regression (SBR) algorithm that also finds a base-signal for compressing

historical sensor data based on spatial correlations among different data

streams. The base-signal for each segment captures the prominent

features of the other signals, and SBR finds piecewise correlations (based

on linear regression) to the base-signal. Lin et al. [42] proposed an

algorithm, referred to as adaptive linear vector quantization (ALVQ), which

improves SBR in two ways: (i) it increases the precision of

compression, and (ii) it reduces the bandwidth consumption by compressing the

update of the base signal.
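
The regression step behind this kind of base-signal compression can be sketched as follows. This is only a single least-squares fit of one data segment onto a given base-signal interval, not the full SBR procedure (which also constructs the base signal and chooses the intervals); the function name fit_to_base is hypothetical.

```python
def fit_to_base(base, segment):
    """Least-squares fit segment[i] ~ a * base[i] + b; returns (a, b).

    Storing (a, b) plus a reference to the base-signal interval replaces
    storing the segment itself. Assumes equal-length, non-empty inputs.
    """
    n = len(base)
    mean_x = sum(base) / n
    mean_y = sum(segment) / n
    sxx = sum((x - mean_x) ** 2 for x in base)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(base, segment))
    # A constant base interval carries no slope information: fall back to a = 0.
    a = sxy / sxx if sxx else 0.0
    b = mean_y - a * mean_x
    return a, b


def approximate(base, a, b):
    """Reconstruct the approximated segment from the base interval and (a, b)."""
    return [a * x + b for x in base]
```

For example, fitting the segment [3, 5, 7, 9] to the base interval [1, 2, 3, 4] yields a slope of 2 and an intercept of 1, so the segment is recovered exactly from two coefficients.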

5.5 Multi-Model Data Compression

The potential burstiness of the data streams and the error introduced

by the sensors often result in limited effectiveness of a single model for

approximating a data stream within the prescribed error bound.

Acknowledging this, Lazaridis et al. [39] argue that a global approximation

model may not be the best approach and mention the potential need for

using multiple models. In [40], it is also recognized that different

approximation models are more appropriate for data streams of different

statistical properties. The approach in [40] aims to find the best model

approximating the data stream based on the overall hit ratio (i.e., the

ratio of the number of data tuples fitting the model to the total number

of data tuples).
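
The hit ratio defined above can be computed as in the following sketch. It assumes that a tuple "fits" a model when the model's prediction lies within an error bound eps of the observed value, and that each candidate model is given as a function of the tuple index; both are assumptions of this illustration, and the names hit_ratio and best_model are hypothetical rather than taken from [40].

```python
def hit_ratio(values, predict, eps):
    """Fraction of tuples whose value the model predicts within eps."""
    hits = sum(1 for t, v in enumerate(values) if abs(predict(t) - v) <= eps)
    return hits / len(values)


def best_model(values, models, eps):
    """Name of the candidate model with the highest hit ratio.

    `models` maps a model name to a function from tuple index to prediction.
    """
    return max(models, key=lambda name: hit_ratio(values, models[name], eps))


# Two illustrative candidates: a constant model and a linear model.
candidates = {
    "constant": lambda t: 2.0,
    "linear": lambda t: float(t),
}
stream = [0, 1, 2, 3, 10]
chosen = best_model(stream, candidates, eps=0.5)
```

Here the linear model fits four of the five tuples while the constant model fits only one, so the linear model is selected for this stream.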

Papaioannou et al. [50] aim to effectively find the best combination

of different models for approximating various segments of the stream

regardless of the error norm. They argue that the selection of the most

efficient model depends on the characteristics of the data stream, namely

rate, burstiness, data range, etc., which cannot always be known a priori

for sensors and may even change over time. Their approach dynamically

adapts to the properties of the data stream and approximates each data
