Ranking Queries on Probabilistic Linkages - Ranking Queries on Uncertain Data

Database Reference

In-Depth Information

However, to compute the top- k probability of a vertex v in a PME-graph G , only

considering the vertices ranked higher than v may lead to incorrect results. This is

because, a vertex v ranked lower than v may connect the two vertices v 1 and v 2

that are ranked higher than v , therefore, not considering v may not preserve the

dependency between v 1 and v 2 .

We define a dominant graph of v as follows.

Definition 7.6 (Dominant subgraph of a linkage). For a PME-graph G

)

and a vertex v

∈

V , the dominant subgraph of v , denoted by G

(

)

, is the minimum

subgraph in G that contains the vertices v ∈

V satisfying:

1. v

f v ,or

2. v ∈

V J and v is in a path P

v ,···,

v 1

,···,

v 2

in G , where v 1

f v and v 2

f v .

Theorem 7.3 (The dominant graph property). For a PME-graph G

)

and

V,Pr k Q , G (

Pr k Q , G ( v ) (

, where Pr k Q , G (

and Pr k Q , G ( v ) (

a vertex v

∈

)

are the top-k

probabilities of v computed on G and in the dominant subgraph G

(

)

of v, respec-

tively.

Proof. If a vertex v is not in G

, then v does not appear in any path between

the two vertices ranked higher than v . Therefore, the joint probability distribution

of the vertices ranked higher than v does not depend on v , which means whether v

appears or not does not affect the top- k probability of v .

Similar to the discussion in Section 7.3.1, although all vertices in G

(

)

need to

be considered when computing the top- k probability of v , only the vertices in G

(

)

that are ranked higher than v affect the actual rank of v . Therefore, we change the

assignment of the flag attribute value of each vertex in G

(

)

as follows. For a vertex

v ∈

v )=

true and v f v , we set its flag v .

(

)

,if P

(

1, otherwise, we set

v .

Given a vertex v , how can we obtain its dominant subgraph G

? Since a PME-

graph G may contain multiple connected components, we can construct, in each

component G j , the dominant subgraph G j (

(

)

as follows.

We traverse the clique graph of G j in the depth first order. For each clique C ,we

visit its vertices one by one. If a vertex v ∈

)

C satisfies the condition v f v , then we

include v into G j (

. Moreover, we find the path from C to the last visited clique

C whose vertices are included in G j (

)

. All vertices joining the cliques along the

path are included in G j (

too. Since each vertex is at most visited twice during the

construction, the complexity of constructing G j (

)

is O

( |

V G j | )

where

V G j |

is the

number of vertices in G j .

7.3.3 Vertex Compression

For a set of private vertices V p = {

v c 1 ,···,

v c m }

in G

(

)

belonging to the same clique

C , if for any v ∈

V p , v f v , then we can replace them with a single vertex v p where

Ranking Queries on Uncertain Data

Search WWH ::

Custom Search

Home