Probabilistic Path Queries on Road Networks - Ranking Queries on Uncertain Data

Database Reference

In-Depth Information

After generating the edge weights w e 1 = {

x 1 ,...,

x m }

and w e 2 = {

y 1 ,...,

y n }

, the

joint distribution f e 1 , e 2 (

x i ,

y j )

is randomly generated from the interval

[

p ij )

, where

p ij =

min

{

f e 1 (

x i ) ,

f e 2 (

y j ) } (

) ,

and

p ij =

min

{

f e 1 (

x i ) − ∑ 1 ≤ s ≤ j − 1 f e 1 , e 2 (

x i ,

y s ) ,

f e 2 (

y j ) − ∑ 1 ≤ t ≤ i − 1 f e 1 , e 2 (

x t ,

y j ) },

(

0 or j

) .

The path queries are generated as follows. The weight threshold is set to x %of

the diameter d of the network, where d is the maximal weight of the shortest paths

among pairs of nodes. The probability threshold varies from 0

1to0

9. The start

and the end vertices are randomly selected.

By default, the number of samples of each edge is set to 5, the weight threshold l

is set to 10%

5. 500 samples are used in

the sampling algorithm to estimate the weight probability distribution of each path.

In the hierarchical approximation algorithm, the bucket parameter t is set to 50, and

2-partitionings are used to construct HP-trees. For each different parameter setting,

we run 20 path queries and report the average results.

d , and the probability threshold

is set to 0

8.4.2 Efficiency and Memory Usage

Using data set OL , we evaluate the efficiency of the three probability calculation

methods, the exact method ( Exact ), the bucket approximation method ( Approxima-

tion ), and the sampling based method ( Sampling ). The depth first path search is used.

The results are shown in Figure 8.8. Clearly, the bucket approximation method and

the sampling based method are more efficient than the exact algorithm.

Particularly, the runtime increases as the weight threshold increases

(Figure 8.8(a)), since a larger weight threshold qualifies more paths in the answer

set. The runtime of all three algorithms decreases when the probability threshold in-

creases (Figure 8.8(b)), because fewer paths can pass the threshold when the prob-

ability threshold is high. The runtime of the algorithms decreases slightly as the

variance of the weight samples increases (Figure 8.8(c)). The runtime of the ex-

act algorithm increases significantly when the number of samples of each edge in-

creases, but the runtime of the sampling algorithm and the bucket approximation

algorithm remains stable (Figure 8.8(d)) thanks to the efficient approximation prob-

ability computation.

We also test the efficiency of the P* search algorithm against the depth first

search ( DFS ) (Figure 8.9). Three heuristic estimates are used: constant estimates

( P*+constant ), min-value estimates ( P*+min-value ), and stochastic estimates

( P*+stochastic ). The bucket approximation method is used to compute the path

probability distribution. To show the difference in efficiency of the different meth-

ods more clearly, we increase the weight threshold to 30%

d .

Ranking Queries on Uncertain Data

Search WWH ::

Custom Search

Home