Databases Reference
In-Depth Information
Top- K schema matchings can be defined recursively as follows. For K
=
1, the K -th best
matching σ 1
is any maximum weight matching in G satisfying
E, f (σ, M) f(σ 1 ,M) .
σ
Let σ i denote the i -th best matching, for any i> 1. Then, given the best i 1 match-
ings σ 1 2 ,...,σ i 1 , the i -th best matching σ i is defined as a matching of maximum weight
over matchings that differ from each of σ 1 2 ,...,σ i 1 . Therefore, given top- K matchings, any
matching σ
E such that σ σ 1 2 ,...,σ K satisfies
j k f(σ j ,M) = f(σ K ,M) .
f(σ,M) min
1
Example 5.1 Table 3.2 (p.10) represents a similarity matrix of the running case study, illustrated
graphically in Figure 5.1 . Henceforth, we use attribute numbers rather than names for the sake of
clarity. The exact matching (as determined by a human observer) is
{ e 1 , 1 ,e 2 , 2 ,e 3 , 3 ,e 4 , 4 }
, while the
best matching is σ ={ e 1 , 1 ,e 2 , 2 ,e 3 , 4 ,e 4 , 3 }
.
neigborhood
1
1
neigborhood
city
2
2
city
check In Day
3
3
check In Day
check Out Day
4
4
check Out Day
Figure 5.1: The bipartite graph of the running case study
An intuitive interpretation of top- K matchings is the following. Suppose an edge weight rep-
resents a matcher's belief in the correctness of an attribute correspondence, where a higher weight
indicates greater confidence. When switching from the i -th best matching to the (i + 1 ) best match-
ing, the matcher is forced to give up at least one attribute correspondence, while maintaining an
overall high confidence in the matching. To do so, the matcher cedes an attribute correspondence in
which it is less confident. Therefore, generating top- K matchings can be seen as a process by which
a matcher iteratively abandons attribute correspondences in which it is less confident.
It can be argued that top- K matchings are no different from any randomly chosen K match-
ings. That is, by choosing any K matchings ( not necessarily the top ones), one can derive useful
 
Search WWH ::




Custom Search