Ranking Queries on Probabilistic Linkages - Ranking Queries on Uncertain Data

Database Reference

In-Depth Information

and Pr

(

C 2 )=

1. Similarly, we can show that other clique C i (1

≤

4) only contains

2 vertices and the probability sum of the two vertices is 1.

Case 2: The cycle in G contains k vertices v C 1 ,···,

4), as illustrated in

Figure 7.3(c). The corresponding cliques in the PME-graph are C 1 ,···,

v C k

( k

C k , respec-

tively. Let v i be the vertex contained by C i − 1 and C i (2

≤

k ) and v 1 be contained

by C 1 and C k . We show that, for any clique C i , Pr

(

v i )+

(

v i + 1

The joint distribution of v i + 2 and v i + 3 can be expressed as

(

v i + 2 v i + 3

( ¬

)

(

|¬

)

(

|¬

)

v i + 1

v i + 4

v i + 2

v i + 1

v i + 3

v i + 4

Since v i + 2 and v i + 3 belong to the same clique C i + 2 ,wehave Pr

(

v i + 2 v i + 3

0. More-

over, Pr

(

v i + 2 |¬

v i + 1 ) >

0 and Pr

(

v i + 3 |¬

v i + 4 ) >

0. Therefore, Pr

( ¬

v i + 1 ¬

v i + 4 )=

We can express Pr

( ¬

v i + 1 ¬

v i + 4 )

( ¬

v i + 1 ¬

v i + 4 )=

( ¬

v i + 1 ¬

v i + 4 v i v i + 5 )+

( ¬

v i + 1 ¬

v i + 4 v i ¬

v i + 5 )

(7.1)

Since all probability values are non-negative, each component in Equation 7.1 has

to be 0. Therefore, we have

( ¬

v i + 1 ¬

v i + 4 ¬

v i v i + 5 )+

( ¬

v i + 1 ¬

v i + 4 ¬

v i ¬

v i + 5 )=

( ¬

v i + 1 ¬

v i + 4 v i v i + 5 )=

(

v i v i + 5 )

( ¬

v i + 1 |

v i )

( ¬

v i + 4 |

v i + 5 )=

(7.2)

and

( ¬

v i + 1 ¬

v i + 4 ¬

v i v i + 5 )=

( ¬

v i v i + 5 )

( ¬

v i + 1 |¬

v i )

( ¬

v i + 4 |

v i + 5 )=

(7.3)

In Equation 7.2, since Pr

( ¬

v i + 1 |

v i )=

( ¬

v i + 4 |

v i + 5 )=

1, we have Pr

(

v i v i + 5 )=

0. Therefore, in Equation 7.3, Pr

( ¬

v i v i + 5 )=

(

v i + 5 ) −

(

v i v i + 5 ) >

0. Thus,

−

(

v i + 1 ) −

(

v i )

( ¬

v i + 1 |¬

v i )=

0. Since Pr

( ¬

v i + 1 |¬

v i )=

,wehave Pr

(

v i + 1 )+

( ¬

v i )

For example, Figures 7.1(a) and (b) satisfy the first and the second conditions in

Theorem 7.2, respectively. The links there are compatible. Figure 7.1(c) does not

satisfy Theorem 7.2, thus, the linkages there are not compatible.

(

v i )=

1. Therefore, C i only has 2 vertices

{

v i ,

v i + 1 }

and Pr

(

v i + 1 )+

(

v i )=

7.2.4 Resolving Incompatibility

Given a set of incompatible linkages

, intuitively, we want to find a subset of

such that the loss of information is minimized.

How can we measure the amount of information that is retained in a subset of

linkages? Different definitions can be adopted in different applications. For instance,

in Example 2.12, if medical experts are more interested in the patients in the hos-

pitalization registers, then we may want to select a subset of compatible linkages

such that the expected number of patients who are linked to another record in the

causes-of-death registers is maximized.

Ranking Queries on Uncertain Data

Search WWH ::

Custom Search

Home