Database Reference
In-Depth Information
To compute the distribution of
P
, there are overall
m
−
1 steps. Thus, the conclu-
sion holds.
The complexity is analyzed as follows. Lemma 8.2 shows that there are at most
2
t
buckets constructed in the approximation. Therefore, calculating the approximate
f
P
i
+
1
(
x
)
from
f
P
i
and
f
e
i
(
x
)
(
i
≥
2) takes
O
(
t
×|
w
e
i
|
)
time. The overall complexity
of computing
f
m
+
1
(
x
)
is
O
(
t
∑
2
≤
i
≤
m
|
w
e
i
|
)
.
8.1.3 Estimating l-Weight Probabilities
The
l
-weight probability of a path can also be estimated using sampling. For a path
P
=
v
1
,...,
v
m
+
1
, let
X
P
be a random variable as an indicator to the event that
w
P
≤
l
.
X
P
=
1if
w
P
≤
l
;
X
P
=
0 otherwise. Then, the expectation of
X
P
is
E
[
X
P
]=
F
P
(
l
)
.
, we draw a set of samples uniformly with replacement. Each
sample unit
s
is an observation of the path weight, which is generated as follows. At
first,
s
is set to 0. Then, for edge
e
1
∈
To estimate
E
[
X
P
]
P
, we choose a value
x
1
in
w
e
1
following the
probability distribution
f
e
1
(
x
)
. Then, for each edge
e
i
∈
P
(2
≤
i
≤
m
), we choose a
value
x
i
in
w
e
i
. The chosen
value is added to
s
. Once the weight values of all edges have been chosen, we com-
pare
s
with
l
.If
s
following the probability distribution of
f
e
i
|
e
i
−
1
(
x
|
x
i
−
1
)
≤
l
, then the indicator
X
P
for
s
is set to 1, otherwise, it is set to
0.
We repeat the above procedure until a set of samples
S
are obtained. The mean
of
X
P
in
S
is
E
S
[
. If the sample size is
sufficiently large, the approximation quality is guaranteed following with the well
known Chernoff-Hoeffding bound [182].
X
P
]
, which can be used to estimate
E
[
X
P
]
Theorem 8.3 (Sample size).
Given a path P, for any
δ
(
0
<
δ
<
1
),
ε (ε
>
0
)
, and
3ln
2
δ
a set of samples S of P, if
.
Proof.
The theorem is an immediate application of the well known Chernoff-
Hoeffding bound [182].
|
S
|≥
then
Pr
{|
E
S
[
X
P
]
−
E
[
X
P
]
|>
ε
}≤
δ
2
ε
The complexity of estimating
E
[
X
P
]
is
O
(
|
S
|·|
P
|
)
, where
|
S
|
is the number of
samples drawn and
|
P
|
is the number of edges in
P
.
8.1.4 A Depth First Search Method
Straightforwardly, the
depth-first path search method
can be used to answer a path
query
Q
l
(
. The search starts at
u
. Each time when a new vertex
v
i
is visited, the
weight probability mass function of the path
P
u
,
v
)
=
u
,...,
v
i
is calculated, using one
of the three methods discussed in this section. If
v
i
=
v
and
F
P
(
l
)
≤
τ
, then
P
is
added to the answer set.