[Figure 7.1: three panels of linkages between tuples a_i and b_j, each linkage labeled with its probability. (a) Compatible. (b) Compatible. (c) Incompatible.]
Fig. 7.1 Linkage compatibility.
7.2 Linkage Compatibility
In this section, we study the compatibility of a set of linkages and the effect on the
possible world probabilities.
7.2.1 Dependencies among Linkages
The linkage functions defined in Section 2.3.2 give only the probabilities of individual
linkages. This is the situation in all state-of-the-art probabilistic linkage methods.
In other words, existing linkage methods do not estimate the joint probabilities
of multiple linkages. Linkages are not independent: among the linkages associated
with the same tuple, at most one can appear in a possible world. What role, then,
do these dependencies play in defining the probabilities of possible worlds?
Example 7.1 (Compatible linkages). Consider the linkages shown in Figure 7.1(a)
between tuples in tables $A = \{a_1, a_2\}$ and $B = \{b_1, b_2, b_3\}$. The probabilities of the
linkages are labeled in the figure. For a linkage $l$, let $l$ and $\neg l$ denote the events
that $l$ appears and $l$ is absent, respectively. Since linkages $l_1$ and $l_2$ are mutually
exclusive, the marginal distribution of $(l_1, l_2)$, denoted by $f(l_1, l_2)$, is
$\Pr(\neg l_1, \neg l_2) = 1 - \Pr(l_1) - \Pr(l_2) = 0.6$, $\Pr(\neg l_1, l_2) = \Pr(l_2) = 0.2$,
$\Pr(l_1, \neg l_2) = \Pr(l_1) = 0.2$, and $\Pr(l_1, l_2) = 0$. Similarly, the marginal
distributions $f(l_2, l_3)$ and $f(l_3, l_4)$ can be calculated from the linkage probabilities
and the mutual exclusion rules.
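The marginal computed in the example can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `pair_marginal` and the dictionary layout (1 for "linkage appears", 0 for "linkage absent") are illustrative choices, not notation from the text. The inputs 0.2 and 0.2 are the values of $\Pr(l_1)$ and $\Pr(l_2)$ used in Example 7.1.

```python
def pair_marginal(p_first, p_second):
    """Marginal distribution of two mutually exclusive linkages.

    Keys are (x_first, x_second), where 1 means the linkage appears
    and 0 means it is absent. Hypothetical helper for illustration.
    """
    if p_first + p_second > 1.0 + 1e-12:
        raise ValueError("mutually exclusive linkages cannot sum past 1")
    return {
        (0, 0): 1.0 - p_first - p_second,  # neither linkage appears
        (0, 1): p_second,                  # only the second appears
        (1, 0): p_first,                   # only the first appears
        (1, 1): 0.0,                       # ruled out by mutual exclusion
    }


# Pr(l1) = Pr(l2) = 0.2, as in Example 7.1.
f12 = pair_marginal(0.2, 0.2)
for (x1, x2), prob in sorted(f12.items()):
    print(f"Pr(l1={x1}, l2={x2}) = {prob:.2f}")
```

The same helper applies to any other pair of linkages that share a tuple, provided their probabilities sum to at most 1.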
Does there exist a set of possible worlds (i.e., the joint distribution $f(l_1, l_2, l_3, l_4)$)
that satisfy the marginal distributions $f(l_1, l_2)$, $f(l_2, l_3)$ and $f(l_3, l_4)$? If so, can we
further determine the existence probability of each possible world? The answer is
yes in this example. Based on Bayes' theorem, we can compute the joint distribution
$$f(l_1, l_2, l_3, l_4) = f(l_1, l_2)\, f(l_3 \mid l_2)\, f(l_4 \mid l_3) = \frac{f(l_1, l_2)\, f(l_2, l_3)\, f(l_3, l_4)}{f(l_2)\, f(l_3)}.$$
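The factorization can be checked numerically by enumerating every possible world. The sketch below assumes the four linkages form a chain in which only neighbouring linkages are mutually exclusive. The values 0.2 and 0.2 for $\Pr(l_1)$ and $\Pr(l_2)$ come from Example 7.1, while 0.3 and 0.4 for $\Pr(l_3)$ and $\Pr(l_4)$ are hypothetical placeholders, since the text does not state them. The function names are illustrative as well.

```python
from itertools import product


def pair_prob(p, q, x, y):
    """Pr(first = x, second = y) for two mutually exclusive linkages."""
    if x and y:
        return 0.0
    if x:
        return p
    if y:
        return q
    return 1.0 - p - q


def chain_joint(probs):
    """Joint distribution over a chain of linkages via
    f(l1..ln) = prod f(l_i, l_{i+1}) / prod f(l_i) over interior i.
    Assumes 0 < p < 1 for every interior linkage."""
    n = len(probs)
    joint = {}
    for world in product((0, 1), repeat=n):
        value = 1.0
        # Numerator: product of the pairwise marginals.
        for i in range(n - 1):
            value *= pair_prob(probs[i], probs[i + 1], world[i], world[i + 1])
        # Denominator: single marginals of the interior linkages.
        for i in range(1, n - 1):
            value /= probs[i] if world[i] else 1.0 - probs[i]
        joint[world] = value
    return joint


# Pr(l3) = 0.3 and Pr(l4) = 0.4 are assumed values, not from the text.
joint = chain_joint([0.2, 0.2, 0.3, 0.4])
for world, prob in sorted(joint.items()):
    if prob > 0:
        print(world, round(prob, 4))
```

Summing the resulting probabilities over all worlds gives 1, and summing over the worlds in which a given linkage appears recovers that linkage's probability, which is what compatibility requires.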
As another example of compatible linkages, consider Figure 7.1(b). The joint
probabilities are $\Pr(l_1, \neg l_2, \neg l_3, l_4) = 0.6$ and $\Pr(\neg l_1, l_2, l_3, \neg l_4) = 0.4$.
Figure 7.1(c) gives an example of incompatible linkages. Linkages in Figure 7.1(c)
have the same mutual exclusion rules as the ones in Figure 7.1(b), but the proba-