Database Reference
In-Depth Information
Then, how can we compute the upper bound and the lower bound of the top-
k probability of a stream in a sliding window using its
ε
-approximate
φ
-quantile
summaries?
Consider an interval t i =(
o i 1 ,
o i ]
in an
ε
-approximate
φ
-quantile summary. Sup-
pose R
are the actual rank of o i 1 and o i , respectively. Then, the
actual number of instances in t i is R
(
o i 1
)
and R
(
o i )
(
o i )
R
(
o i 1
)
. The membership probability of
R
(
o i )
R
(
o i 1 )
t i is Pr
(
t i )=
. However, since o i 1 and o i are approximations of the
ω
(
) φ
-quantiles, respectively, their actual ranks are not calculated. In-
stead, we use Pr
i
1
- and i
φ
(
t
)=φ
to approximate the membership probability of t i . Since
((
i
1
ε)ω
R
(
o i 1 ) ((
i
1
)φ +ε)ω
, and
(
i
φ ε)ω
R
(
o i ) (
i
φ +ε)ω
,
) Pr
we have
.
Then, we compute the upper bound and the lower bound of Pr k
Pr
(
t
(
t
)
2
ε
(
t
)
, denoted by
U (
and L (
Pr k
Pr k
(
t
))
(
t
))
, respectively, using the approximate membership proba-
bility Pr
(
)
, following with Theorem 6.6. In sequel, we can further derive the upper
bound and the lower bound of a stream by summing up the upper bound and the
lower bound of all intervals, respectively. The above approximation method has the
following quality guarantee.
t
Theorem 6.8 (Approximation quality). Given a stream O in a sliding window W,
let U (
and L (
Pr k
Pr k
be the upper and lower bounds of Pr k
(
))
(
))
(
)
O
O
O
computed
using the
ε
φ
(
)
-approximate
-quantile summary of W
O
, then,
U ( Pr k
ε
Pr k
(
O
)) −U (
(
O
))
(6.5)
and
L (
ε
Pr k
Pr k
(
O
)) −L (
(
O
))
(6.6)
Proof. Consider an
O . To analyze the approximation
error introduced by o i , we first assume that other quantiles o j (1
ε
-approximate i
φ
-quantile o i
1
φ
j
, j
=
i )
are exact. Suppose the real rank of o i is R
(
o i
)
, according to the definition of
ε
-
approximate quantile, we have
(
i
φ ε)ω
R
(
o i
) (
i
φ +ε)ω
.
are two intervals partitioned by o i . The approx-
imate numbers of instances in t i and t i + 1 are both
t i =(
o i 1 ,
o i ]
and t i + 1 =(
o i ,
o i + 1 ]
φω
.
If R
(
o i ) < φ
, then the actual number of instances in t i is R
(
o i )
1
<
φω
, and the actual number of instances in t i + 1 is
(φ +
1
R
(
o i ) > φω
. That is,
there are
φω
R
(
o i ) εω
instances that are actually in t i + 1 , but are counted into
t i due to the
ε
-approximate quantile o i . Thus, the error introduced by o i is at most
φ ( U (
Pr k
Pr k
(
t i )) −U (
(
t i + 1 )))
. Similarly, if R
(
o i ) < φ
, the error introduced by
φ ( U (
Pr k
Pr k
o i is at most
(
t i + 1 )) −U (
(
t i )))
.
Generally,
the
maximum
overall
approximation
error
introduced
by
ε
-approximate quantiles o 1 ,...,
o 1
φ 1 is
U (
= ∑
φ ( U (
1
φ 1
i
Pr k
Pr k
Pr k
Pr k
(
O
)) −U (
(
O
))
(
t i )) −U (
(
t i + 1 )))
=
1
φ ( U (
Pr k
Pr k
(
t 1 )) −U (
(
t 1
φ ))) ε
 
Search WWH ::




Custom Search