Top-K Schema Matchings - Uncertain Schema Matching - page 50

Databases Reference

In-Depth Information

Top- K schema matchings can be defined recursively as follows. For K

=

1, the K -th best

matching σ 1

is any maximum weight matching in G satisfying

⊆ E, f (σ, M) ≤ f(σ 1 ,M) .

∀ σ

Let σ i denote the i -th best matching, for any i> 1. Then, given the best i − 1 match-

ings σ 1 ,σ 2 ,...,σ i − 1 , the i -th best matching σ i is defined as a matching of maximum weight

over matchings that differ from each of σ 1 ,σ 2 ,...,σ i − 1 . Therefore, given top- K matchings, any

matching σ

⊆ E such that σ ∈ σ 1 ,σ 2 ,...,σ K satisfies

≤ j ≤ k f(σ j ,M) = f(σ K ,M) .

f(σ,M) ≤ min

1

Example 5.1 Table 3.2 (p.10) represents a similarity matrix of the running case study, illustrated

graphically in Figure 5.1 . Henceforth, we use attribute numbers rather than names for the sake of

clarity. The exact matching (as determined by a human observer) is

{ e 1 , 1 ,e 2 , 2 ,e 3 , 3 ,e 4 , 4 }

, while the

best matching is σ ∗ ={ e 1 , 1 ,e 2 , 2 ,e 3 , 4 ,e 4 , 3 }

.

neigborhood

1

1

neigborhood

city

2

2

city

check In Day

3

3

check In Day

check Out Day

4

4

check Out Day

Figure 5.1: The bipartite graph of the running case study

An intuitive interpretation of top- K matchings is the following. Suppose an edge weight rep-

resents a matcher's belief in the correctness of an attribute correspondence, where a higher weight

indicates greater confidence. When switching from the i -th best matching to the (i + 1 ) best match-

ing, the matcher is forced to give up at least one attribute correspondence, while maintaining an

overall high confidence in the matching. To do so, the matcher cedes an attribute correspondence in

which it is less confident. Therefore, generating top- K matchings can be seen as a process by which

a matcher iteratively abandons attribute correspondences in which it is less confident.

It can be argued that top- K matchings are no different from any randomly chosen K match-

ings. That is, by choosing any K matchings ( not necessarily the top ones), one can derive useful

Next Page

Uncertain Schema Matching

Search WWH ::

Custom Search

Home