Given a user-defined K or a threshold on the minimum certainty, the system can
produce alternative matchings and assign to each of them a probability estimate of
correctness. The probability is based on the similarity measure, as assigned by an
ensemble of matchers. To justify this method, we use the monotonicity principle, as
discussed before.
Equipped with the monotonicity principle, one can generate a probability space
over a set of K matchings as follows. Let $\Lambda_1, \Lambda_2, \ldots, \Lambda_K$ be the similarity measures
of the top-K matchings $\sigma_1, \sigma_2, \ldots, \sigma_K$, with $\Lambda_1 > 0$. The probability assigned
to matching $\sigma_i$ is computed to be:

$$p_i = \frac{\Lambda_i}{\sum_{j=1}^{K} \Lambda_j}$$

$p_i$ is well defined (since $\Lambda_1 > 0$). Each $p_i$ is assigned a value in $[0,1]$ and
$\sum_{i=1}^{K} p_i = 1$. Therefore, $(p_1, p_2, \ldots, p_K)$ forms a probability space over the set
of top-K matchings. For completeness, we argue that an appropriate interpretation
of this probability space is as a conditional probability, given that
the exact matching is known to be within the top-K matchings.
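To make the normalization concrete, the following minimal Python sketch (function and variable names are hypothetical, not part of the original method) turns the similarity measures of the top-K matchings into the probabilities $p_i$ defined above:

```python
def matching_probabilities(similarities):
    """Normalize the similarity measures of the top-K matchings so that
    they sum to 1, yielding a probability space over the matchings."""
    total = sum(similarities)
    if total <= 0:
        raise ValueError("the top similarity measure must be positive")
    return [s / total for s in similarities]

# Illustrative similarity measures of the top-3 matchings
top_k_similarities = [0.9, 0.6, 0.5]
probs = matching_probabilities(top_k_similarities)
print(probs)  # [0.45, 0.3, 0.25]
```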
We can compute the probability assigned to an attribute correspondence
$(A_i, A_j)$ by summing the probabilities of all schema matchings in which $(A_i, A_j)$
appears. That is, for a probabilistic attribute correspondence $(A_i, A_j, p)$ we compute $p$ to be:

$$p = \sum_{l \,\mid\, (A_i, A_j) \in \sigma_l} p_l$$
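The summation can be sketched in the same spirit; here, purely for illustration, a matching is assumed to be represented as a set of attribute-name pairs:

```python
def correspondence_probability(pair, matchings, probs):
    """Probability of an attribute correspondence (A_i, A_j): the sum of the
    probabilities of all top-K matchings that contain that correspondence."""
    return sum(p for matching, p in zip(matchings, probs) if pair in matching)

# Each matching is a set of attribute correspondences (pairs of attribute names)
matchings = [
    {("name", "fullName"), ("addr", "address")},
    {("name", "fullName"), ("addr", "location")},
    {("name", "lastName"), ("addr", "address")},
]
probs = [0.45, 0.30, 0.25]
print(correspondence_probability(("name", "fullName"), matchings, probs))  # ~0.75
```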
It is worth noting that methods for assigning probabilities to alternative schema
mappings were also suggested by other researchers, such as Magnani et al. [2005].
6 Conclusions
This chapter introduced three recent advances to the state of the art, extending
the capabilities of attribute correspondences. Contextual attribute correspondences
associate selection conditions with attribute correspondences. Semantic matching
extends attribute correspondences to be specified in terms of ontological relation-
ships. Finally, probabilistic attribute correspondences extend attribute correspon-
dences by generating multiple possible models, using probability theory to model
the uncertainty about which one is correct.
These three extensions are individually powerful in extending the expressive
power of attribute correspondences. However, combining them can generate an
even more powerful model. For example, combining all three, one can declare
that attribute A subsumes attribute B if the condition C = c holds true; if C ≠ c,
then there is a 70% chance that attribute A is actually subsumed by attribute B
and a 30% chance that the two are disjoint. Therefore, we conclude by identifying
the challenges and benefits of putting these extensions together.
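As a rough illustration of what such a combined model might look like, the following sketch (the class, field names, and string encodings are hypothetical, not a prescribed design) records, for each alternative, a selection condition, a semantic relationship, and a probability:

```python
from dataclasses import dataclass

@dataclass
class ExtendedCorrespondence:
    source: str          # attribute A
    target: str          # attribute B
    condition: str       # contextual selection condition, e.g. "C = c"
    relationship: str    # semantic relationship: "subsumes", "subsumed-by", "disjoint"
    probability: float   # confidence that this alternative is the correct one

# The combined example from the text: under C = c, A subsumes B with certainty;
# otherwise A is subsumed by B with probability 0.7, or the two are disjoint (0.3).
alternatives = [
    ExtendedCorrespondence("A", "B", "C = c",  "subsumes",    1.0),
    ExtendedCorrespondence("A", "B", "C != c", "subsumed-by", 0.7),
    ExtendedCorrespondence("A", "B", "C != c", "disjoint",    0.3),
]
```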