Database Reference
In-Depth Information
efficient methods using approximate quantile summaries. The central idea is to use
quantiles to summarize instances in streams. Since computing exact quantiles in
data streams is costly, we seek for high quality approximation.
Both the exact algorithm in Section 6.1 and the sampling method in Section 6.2
can be applied on the approximate quantile summaries of uncertain data streams.
Using quantiles is a trade-off between space requirement and query answering ac-
curacy. The distribution of an object is represented in a higher granularity level
using quantiles, and thus the query results are approximate. However, we show that
using approximate quantiles can save substantial space in answering top- k queries
on uncertain streams with high quality guarantees.
6.3.1 Top-k Probabilities and Quantiles
Definition 6.4 (Quantile). Let o 1 ≺···≺
o ω
be the sorted list of instances in the
ranking order in a sliding window W t
1) of W t
ω (
O
)
. The
φ
-quantile (0
< φ
ω (
O
)
-quantile summary of W t
is instance o φω .A
φ
ω (
O
)
is o 1 and a list of instances
1
φ )
o i φω (
1
i
.
-quantile summary of W t
partitions the instances of W t
1
φ
The
φ
ω (
O
)
ω (
O
)
into
intervals (in the values of the ranking function), with
φω
instances in each in-
1
terval. The first interval t 1 =[
o 1 ,
o φω ]
. Generally, the i -th (1
<
i
φ
) interval
1
ω
t i =(
o ( i 1 )φω ,
o i φω ]
. Since the membership probability of each instance is
, the
1
ω φω = φ
membership probability of each interval t i is Pr
(
t i )=
.
Example 6.4 (A quantile summary). Consider a windowW 9 (
O
)
where the sorted list
. Then, 12 , 5 and 2 are the 3 ,
2
3 , and
of instance scores is
(
21 , 20 , 12 , 10 , 9 , 5 , 4 , 3 , 2
)
1 -quantiles of W 9 (
1
O
)
, respectively. The
3 -quantile summary of O is
(
21
,
12
,
5
,
2
)
which partitions the instances into three intervals: t 1 =[
21
,
12
]
,t 2 =(
12
,
5
]
, and
. The membership probability of each interval is 3 .
t 3 =(
5
,
2
]
We can use quantiles to approximate the top- k probabilities of streams.
Maximum value
Minimum value
￿
a 1 ￿
a 2
a 3
A
B
￿
￿
￿
￿
￿
￿
￿
￿
b 1
b 2
￿ b 3
￿
￿
￿
￿
￿
￿
￿
￿
c 1
c 2
c 3 ￿
C
￿
￿
￿
￿
￿
Score decreasing, rank decreasing order
Fig. 6.2 The quantile summaries of three streams.
 
Search WWH ::




Custom Search