Given a user-defined K or a threshold on the minimum certainty, the system can
produce alternative matchings and assign to each of them a probability estimate of
correctness. The probability is based on the similarity measure, as assigned by an
ensemble of matchers. To justify this method, we use the monotonicity principle, as
discussed before.
Equipped with the monotonicity principle, one can generate a probability space
over a set of K matchings as follows. Let $\Lambda_1, \Lambda_2, \ldots, \Lambda_K$ be the similarity measures
of the top-K matchings $\sigma_1, \sigma_2, \ldots, \sigma_K$, with $\Lambda_1 > 0$. The probability assigned
to matching $\sigma_i$ is computed to be:

$$p_i = \frac{\Lambda_i}{\sum_{j=1}^{K} \Lambda_j}$$

$p_i$ is well defined (since $\Lambda_1 > 0$). Each $p_i$ is assigned a value in $[0,1]$ and
$\sum_{i=1}^{K} p_i = 1$. Therefore, $(p_1, p_2, \ldots, p_K)$ forms a probability space over the set
of top-K matchings. For completeness, we argue that an appropriate interpretation
of this probability space is as a conditional probability, given that
the exact matching is known to be within the top-K matchings.
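To make the normalization concrete, the following minimal Python sketch (function and variable names are hypothetical, not part of the original method) turns the similarity measures of the top-K matchings into the probabilities $p_i$ defined above:

```python
def matching_probabilities(similarities):
    """Normalize the similarity measures of the top-K matchings so that
    they sum to 1, yielding a probability space over the matchings."""
    total = sum(similarities)
    if total <= 0:
        raise ValueError("the top similarity measure must be positive")
    return [s / total for s in similarities]

# Illustrative similarity measures of the top-3 matchings
top_k_similarities = [0.9, 0.6, 0.5]
probs = matching_probabilities(top_k_similarities)
print(probs)  # [0.45, 0.3, 0.25]
```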
We can compute the probability assigned to an attribute correspondence
$(A_i, A_j)$ by summing the probabilities of all schema matchings in which $(A_i, A_j)$
appears. That is, for a probabilistic attribute correspondence $(A_i, A_j, p)$ we compute $p$ to be:

$$p = \sum_{l \,\mid\, (A_i, A_j) \in \sigma_l} p_l$$
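The summation can be sketched in the same spirit; here, purely for illustration, a matching is assumed to be represented as a set of attribute-name pairs:

```python
def correspondence_probability(pair, matchings, probs):
    """Probability of an attribute correspondence (A_i, A_j): the sum of the
    probabilities of all top-K matchings that contain that correspondence."""
    return sum(p for matching, p in zip(matchings, probs) if pair in matching)

# Each matching is a set of attribute correspondences (pairs of attribute names)
matchings = [
    {("name", "fullName"), ("addr", "address")},
    {("name", "fullName"), ("addr", "location")},
    {("name", "lastName"), ("addr", "address")},
]
probs = [0.45, 0.30, 0.25]
print(correspondence_probability(("name", "fullName"), matchings, probs))  # ~0.75
```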
It is worth noting that methods for assigning probabilities to alternative schema
mappings were also suggested by other researchers, such as Magnani et al. [2005].
6 Conclusions
This chapter introduced three recent advances to the state of the art, extending
the capabilities of attribute correspondences. Contextual attribute correspondences
associate selection conditions with attribute correspondences. Semantic matching
extends attribute correspondences to be specified in terms of ontological relation-
ships. Finally, probabilistic attribute correspondences extend attribute correspon-
dences by generating multiple possible models, using probability theory to model
the uncertainty about which one is correct.
These three extensions are individually powerful in extending the expressive
power of attribute correspondences. However, combining them can generate an
even more powerful model. For example, combining all three, one can declare
that attribute A subsumes attribute B if the condition C = c holds true; if C ≠ c,
then there is a 70% chance that attribute A is actually subsumed by attribute B
and a 30% chance that the two are disjoint. Therefore, we conclude by identifying
the challenges and benefits of putting these extensions together.
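As a rough illustration of what such a combined model might look like, the following sketch (the class, field names, and string encodings are hypothetical, not a prescribed design) records, for each alternative, a selection condition, a semantic relationship, and a probability:

```python
from dataclasses import dataclass

@dataclass
class ExtendedCorrespondence:
    source: str          # attribute A
    target: str          # attribute B
    condition: str       # contextual selection condition, e.g. "C = c"
    relationship: str    # semantic relationship: "subsumes", "subsumed-by", "disjoint"
    probability: float   # confidence that this alternative is the correct one

# The combined example from the text: under C = c, A subsumes B with certainty;
# otherwise A is subsumed by B with probability 0.7, or the two are disjoint (0.3).
alternatives = [
    ExtendedCorrespondence("A", "B", "C = c",  "subsumes",    1.0),
    ExtendedCorrespondence("A", "B", "C != c", "subsumed-by", 0.7),
    ExtendedCorrespondence("A", "B", "C != c", "disjoint",    0.3),
]
```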