data streams. The linear regression model of a data segment is given as:
$$v_i = s \cdot t_i + b, \tag{2.15}$$
where $b$ and $s$ are known as the base and the slope, respectively. The
difference between $v_i$ and the fitted value $s \cdot t_i + b$ is known as the
residual for time $t_i$. To fit the linear regression model of Eq. (2.15) to the
sensor values $v_i : t_i \in [t_b; t_e]$, the ordinary least squares (OLS)
estimator is employed. The OLS estimator selects $b$ and $s$ such that they
minimize the following sum of squared residuals:
$$RSS(b, s) = \sum_{t_i = t_b}^{t_e} \left[ v_i - (s \cdot t_i + b) \right]^2.$$
Therefore, $b$ and $s$ are given as:
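$$s = \frac{\sum_{t_i = t_b}^{t_e} \left( t_i - \frac{t_b + t_e}{2} \right) v_i}{\sum_{t_i = t_b}^{t_e} \left( t_i - \frac{t_b + t_e}{2} \right) t_i}, \qquad b = \frac{\sum_{t_i = t_b}^{t_e} v_i}{t_e - t_b + 1} - s \, \frac{t_b + t_e}{2}. \tag{2.16}$$
These closed forms assume one sensor value per unit time step in $[t_b; t_e]$,
so that the segment holds $t_e - t_b + 1$ values with mean timestamp $(t_b + t_e)/2$.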
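As a rough illustration of the closed-form OLS solution, the following Python sketch fits one segment, with $s$ as the slope and $b$ as the base as in Eq. (2.15). It assumes that the segment holds one value per consecutive integer timestamp; the names `fit_segment` and `rss` and the sample values are made up for this example.

```python
def fit_segment(t_b, values):
    """Least-squares line v = s * t + b over timestamps t_b, t_b + 1, ..., t_e.

    Assumes one value per consecutive integer timestamp (an assumption of
    this sketch), so that t_e = t_b + len(values) - 1.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to fit a line")
    t_e = t_b + n - 1
    t_mid = (t_b + t_e) / 2            # mean timestamp of the segment
    times = range(t_b, t_e + 1)
    num = sum((t - t_mid) * v for t, v in zip(times, values))
    den = sum((t - t_mid) * t for t in times)
    s = num / den                      # slope
    b = sum(values) / n - s * t_mid    # base (intercept)
    return s, b


def rss(t_b, values, s, b):
    """Sum of squared residuals of the line v = s * t + b on the segment."""
    return sum((v - (s * t + b)) ** 2
               for t, v in enumerate(values, start=t_b))


# Hypothetical sensor values for timestamps 10, 11, 12, 13.
s, b = fit_segment(10, [21.0, 23.1, 24.9, 27.0])
```

Perturbing `s` or `b` away from the returned values and re-evaluating `rss` gives a quick sanity check that the fitted line is the minimizer.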
Here, the storage record of each data segment of the data stream consists
of $([t_b; t_e]; b, s)$, where $[t_b; t_e]$ is the segment interval, and $s$ and
$b$ are the slope and base of the linear regression, as obtained from Eq. (2.16).
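The storage record can be pictured as a small immutable structure from which the segment is reconstructed on demand. The sketch below is only one possible layout; the class name `SegmentRecord`, its methods, and the numbers used are hypothetical, and integer timestamps are assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentRecord:
    """Storage record ([t_b; t_e]; b, s) of one linearly approximated segment."""
    t_b: int     # first timestamp of the segment
    t_e: int     # last timestamp of the segment
    b: float     # base
    s: float     # slope

    def value_at(self, t):
        """Approximate sensor value at time t inside the segment interval."""
        if not self.t_b <= t <= self.t_e:
            raise ValueError("timestamp outside segment interval")
        return self.s * t + self.b

    def reconstruct(self):
        """Approximate values for every integer timestamp in [t_b, t_e]."""
        return [self.value_at(t) for t in range(self.t_b, self.t_e + 1)]


# Hypothetical record: four numbers are stored regardless of segment length.
record = SegmentRecord(t_b=10, t_e=13, b=1.23, s=1.98)
approx = record.reconstruct()
```

The record size stays constant while the number of reconstructed values grows with the segment length, which is where the compression comes from.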
Similarly, instead of the linear regression model, a polynomial regression
model (refer to Eq. (2.9)) can also be utilized for approximating each
segment of the data stream. The storage record of the polynomial regression
model is similar to that of the linear regression model. The only difference
is that for the polynomial regression model the storage record contains
parameters $\alpha_1, \ldots, \alpha_d$ instead of the parameters $b$ and $s$.
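A polynomial fit for one segment can be sketched in the same spirit. The code below solves the least-squares normal equations directly in plain Python. The coefficient layout (a constant term followed by increasing powers of $t$) is an assumption of this sketch and may not match the indexing used in Eq. (2.9); the function names are hypothetical.

```python
def fit_polynomial(times, values, degree):
    """Least-squares coefficients [a_0, ..., a_degree] of sum_j a_j * t**j.

    Solves the normal equations (X^T X) a = X^T v by Gaussian elimination
    with partial pivoting. Requires at least degree + 1 distinct timestamps;
    suitable for small degrees only, since the system becomes
    ill-conditioned as the degree grows.
    """
    m = degree + 1
    if len(times) < m:
        raise ValueError("need at least degree + 1 samples")
    # Augmented matrix [X^T X | X^T v], built from power sums.
    A = [[sum(t ** (r + c) for t in times) for c in range(m)]
         + [sum(v * t ** r for t, v in zip(times, values))]
         for r in range(m)]
    # Forward elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for c in range(col, m + 1):
                A[r][c] -= f * A[col][c]
    # Back substitution.
    coeffs = [0.0] * m
    for r in reversed(range(m)):
        tail = sum(A[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (A[r][m] - tail) / A[r][r]
    return coeffs


def evaluate(coeffs, t):
    """Evaluate the polynomial at t using Horner's rule."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * t + a
    return result


# Hypothetical segment over timestamps 0..4; the stored record pairs the
# segment interval with the fitted coefficients.
coeffs = fit_polynomial([0, 1, 2, 3, 4], [1.0, 3.5, 7.0, 11.5, 17.0], degree=2)
record = ((0, 4), coeffs)
```

In practice a numerically more robust solver (for example a QR-based least-squares routine) would usually be preferred over hand-written normal equations.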
5.4 Compressing Correlated Data Streams
Several approaches [14, 42, 24] exploit correlations among different
data streams for compression. The GAMPS approach [24] dynamically
identifies and exploits correlations among different data segments
and then jointly compresses them within an error bound, employing a
polynomial-time approximation algorithm. In the first phase, data
segments are individually approximated based on piecewise constant
approximation (specifically the PMC-Mean described in Section 5.3). In
the second phase, each data segment is approximated by a ratio with
respect to a base segment. The segment formed by the ratios is called
the ratio segment. GAMPS proposes to store the base segment and the