Database Reference
In-Depth Information
x 5
x 4
x 5
x 4
￿￿￿
( )
x 3
x 3
￿￿
￿￿
￿￿
￿￿
( )
x 2
￿￿￿
￿￿￿
￿￿￿
x 2
￿￿￿
( )
x 1
￿￿￿
￿￿￿
￿￿￿
x 1
( )
￿￿￿
x 1
... ... ...
x t
x 1
x t
... ... ...
w em
The samples of
The samples of
w em
(a) The upper bound.
(b) The lower bound.
Fig. 8.3 The upper/lower bound of F P m + 1 ( l ) .
The number of buckets is at most 2 t , as shown in the following lemma.
(
>
>
)
φ
=
Lemma 8.2 (Number of buckets). Given w P m with n samples
n
2 t
0
, let
i
[
x z i ,
x z i ](
)
2 t.
Proof. In the worst case, each bucket only contains one value in w P m , which means
that the probability sum of any two consecutive values in w P m
1
i
b
be b exclusive buckets satisfying Equation 8.3, then b
1
is greater than
t .
Then, if the number of values in w P m
is greater than 2 t , the probability sum of all
values in w P m
will be greater than 1, which conflicts with the fact that w P m
is a
discrete random variable.
Constructing the buckets only requires one scan of all values in w P m . The min-
imal value in bucket
φ i =[
x z i ,
x z i ]
is min
i )=
x z i , and the maximal value in
φ i is
max
i )=
x z i . When computing the probability distribution of w P m + 1
using w P m ,we
only select one value in each bucket
φ i
w P m as a representative, and assign Pr
i )
to the representative. If min
i )
is selected as a representative, then the so computed
F P m + 1 (
l
)
is greater than F P m + 1 (
l
)
; if max
i )
is used as a representative, then the so
computed F P m + 1 (
l
)
is smaller than F P m + 1 (
l
)
.
Example 8.2 (Bucket Approximation). Consider path P m and edge e m . Let
f P m | e m (
|
)=
.
=
x j
y
0
2 for 1
j
5 .Ift
2 , then all values in w P m are divided into
three buckets:
[
x 1
,
] , [
,
] , [
,
]
.
.
.
2 , respectively.
If the minimal value of each bucket is used as a representative, then x 1 ,x 3 and
x 5 are selected. As a result, the l-weight constrained region of P m + 1 is increased
such that the shaded area is included. So the calculated F P m + 1 (
x 2
x 3
x 4
x 5
x 5
, with probability 0
4 , 0
4 and 0
l
)
is greater than the
actual F P m + 1 (
)
.
If the maximal value of each bucket is used as a representative, then x 2 ,x 4 and
x 5 are selected. The l-weight constrained region of P m + 1 is decreased such that
the shaded area is excluded. So the calculated F P m + 1 (
l
l
)
is smaller than the actual
F P m + 1 (
)
l
.
Therefore, the average value of F P m + 1 (
and F P m + 1 (
l
)
l
)
can be used to approxi-
mate the actual F P m + 1 (
l
)
. The approximation quality is guaranteed by the following
lemma.
 
Search WWH ::




Custom Search