3.4.1 Computing Weighted Correspondences
A weighted correspondence between a pair of attributes specifies the degree of
semantic similarity between them. Let $S(s_1, \ldots, s_m)$ be a source schema and
$T(t_1, \ldots, t_n)$ be a target schema. We denote by $C_{i,j}$, $i \in [1,m]$, $j \in [1,n]$, the
weighted correspondence between $s_i$ and $t_j$ and by $w_{i,j}$ the weight of $C_{i,j}$. The
first step is to compute a weighted correspondence between every pair of attributes,
which can be done by applying existing schema-matching techniques.
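As a rough illustration of this first step, the sketch below fills in a weight for every source/target pair. The text does not say which schema-matching technique is used, so plain string similarity from Python's `difflib` stands in for a real matcher here; the function name `weighted_correspondences` and the sample attribute lists are hypothetical.

```python
from difflib import SequenceMatcher


def weighted_correspondences(source_attrs, target_attrs):
    """Return a matrix w where w[i][j] is a similarity in [0, 1] between
    source attribute i and target attribute j.

    Assumption: name-based string similarity is only a stand-in for the
    "existing schema-matching techniques" mentioned in the text.
    """
    return [
        [SequenceMatcher(None, s.lower(), t.lower()).ratio() for t in target_attrs]
        for s in source_attrs
    ]


# Hypothetical schemas built around the attribute names used in the text.
source = ["current-addr", "permanent-addr", "name"]
target = ["mailing-address", "name"]
w = weighted_correspondences(source, target)

for s, row in zip(source, w):
    for t, weight in zip(target, row):
        print(f"{s} ~ {t}: {weight:.2f}")
```

Any matcher that returns a score per attribute pair could be substituted, as long as the result is an m-by-n matrix of weights.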
Although weighted correspondences tell us the degree of similarity between pairs
of attributes, they do not tell us which target attribute a source attribute should map
to. For example, a target attribute mailing-address can be similar to both the source
attributes current-addr and permanent-addr, so it makes sense to map either of
them to mailing-address in a schema mapping. In fact, given a set of weighted
correspondences, there can be several p-mappings that are consistent with it. We
can define the one-to-many relationship between sets of weighted correspondences
and p-mappings by specifying when a p-mapping is consistent with a set of weighted
correspondences.
Definition 10 (Consistent p-mapping). A p-mapping $pM$ is consistent with a
weighted correspondence $C_{i,j}$ between a pair of source and target attributes if the
sum of the probabilities of all mappings $m \in pM$ containing correspondence $(i,j)$
equals $w_{i,j}$; that is,

$$w_{i,j} = \sum_{m \in pM,\ (i,j) \in m} \Pr(m).$$

A p-mapping is consistent with a set of weighted correspondences $\mathbf{C}$ if it is
consistent with each weighted correspondence $C \in \mathbf{C}$.
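The definition can be checked mechanically. The following is a minimal sketch, assuming a p-mapping is represented as a list of (mapping, probability) pairs, where each mapping is a set of (i, j) attribute-index pairs; this representation and the name `is_consistent` are illustrative choices, not taken from the text.

```python
from math import isclose


def is_consistent(p_mapping, weights, tol=1e-9):
    """Check a p-mapping against a set of weighted correspondences.

    p_mapping: list of (mapping, probability); a mapping is a set of (i, j).
    weights:   dict {(i, j): w_ij}. Only the pairs listed here are checked.
    """
    for pair, w_ij in weights.items():
        # Sum Pr(m) over all mappings m that contain the correspondence.
        total = sum(prob for mapping, prob in p_mapping if pair in mapping)
        if not isclose(total, w_ij, abs_tol=tol):
            return False
    return True


# Hypothetical weights for two correspondences.
weights = {(1, 1): 0.5, (2, 2): 0.5}

# Two different p-mappings that are both consistent with the same weights.
pm_a = [(frozenset({(1, 1), (2, 2)}), 0.5), (frozenset(), 0.5)]
pm_b = [(frozenset({(1, 1)}), 0.5), (frozenset({(2, 2)}), 0.5)]

# A p-mapping that is not consistent: (1, 1) receives total probability 0.7.
pm_c = [(frozenset({(1, 1)}), 0.7), (frozenset({(2, 2)}), 0.3)]

print(is_consistent(pm_a, weights))
print(is_consistent(pm_b, weights))
print(is_consistent(pm_c, weights))
```

The first two p-mappings illustrate the one-to-many relationship: the same set of weighted correspondences is matched by more than one p-mapping.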
However, not every set of weighted correspondences admits a consistent p-mapping.
The following theorem shows under which conditions a consistent p-mapping exists,
and it establishes a normalization factor for weighted correspondences that will
guarantee the existence of a consistent p-mapping.
Theorem 3. Let $\mathbf{C}$ be a set of weighted correspondences between a source schema
$S(s_1, \ldots, s_m)$ and a target schema $T(t_1, \ldots, t_n)$.
There exists a consistent p-mapping with respect to $\mathbf{C}$ if and only if (1) for every
$i \in [1,m]$, $\sum_{j=1}^{n} w_{i,j} \le 1$ and (2) for every $j \in [1,n]$, $\sum_{i=1}^{m} w_{i,j} \le 1$.
Let

$$M' = \max\left\{ \max_i \left\{ \sum_{j=1}^{n} w_{i,j} \right\},\ \max_j \left\{ \sum_{i=1}^{m} w_{i,j} \right\} \right\}.$$

Then, for each $i \in [1,m]$, $\sum_{j=1}^{n} \frac{w_{i,j}}{M'} \le 1$ and for each $j \in [1,n]$,
$\sum_{i=1}^{m} \frac{w_{i,j}}{M'} \le 1$.
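Both parts of the theorem reduce to row and column sums of the weight matrix. The sketch below assumes the weights are stored as a list of rows (one row per source attribute); the function names and the sample matrix are hypothetical.

```python
def row_and_column_sums(w):
    """Return (row sums, column sums) of an m-by-n weight matrix."""
    row_sums = [sum(row) for row in w]
    col_sums = [sum(col) for col in zip(*w)]
    return row_sums, col_sums


def admits_consistent_p_mapping(w, tol=1e-9):
    """Existence condition: every row sum and every column sum is at most 1."""
    row_sums, col_sums = row_and_column_sums(w)
    return all(s <= 1 + tol for s in row_sums + col_sums)


def normalization_factor(w):
    """M': the largest row sum or column sum."""
    row_sums, col_sums = row_and_column_sums(w)
    return max(max(row_sums), max(col_sums))


# Hypothetical weights: rows are source attributes, columns are target attributes.
w = [[0.9, 0.2],
     [0.8, 0.1]]

print(admits_consistent_p_mapping(w))  # first column sums to about 1.7
print(normalization_factor(w))
```

A small tolerance is used in the comparison because the sums are computed in floating point.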
Based on Theorem 3, we normalize the weighted correspondences we generated
as described previously by dividing them by $M'$; that is,

$$w'_{i,j} = \frac{w_{i,j}}{M'}.$$
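The normalization step can be sketched as follows, again assuming a list-of-rows weight matrix. The name `normalize` and the sample weights are hypothetical, and the guard for an all-zero matrix is an added assumption, not something the text addresses.

```python
def normalize(w):
    """Divide every weight by M', the largest row or column sum.

    Returns (normalized matrix, M'). Applied unconditionally, as in the text.
    """
    row_sums = [sum(row) for row in w]
    col_sums = [sum(col) for col in zip(*w)]
    m_prime = max(max(row_sums), max(col_sums))
    if m_prime == 0:
        # Assumption: with all-zero weights there is nothing to rescale.
        return [list(row) for row in w], m_prime
    return [[x / m_prime for x in row] for row in w], m_prime


# Hypothetical weights whose first column sums to more than 1.
w = [[0.9, 0.2],
     [0.8, 0.1]]

w_norm, m_prime = normalize(w)

# After normalization, every row sum and column sum is at most 1.
norm_row_sums = [sum(row) for row in w_norm]
norm_col_sums = [sum(col) for col in zip(*w_norm)]
print(m_prime, norm_row_sums, norm_col_sums)
```

By construction, the row or column that attained $M'$ sums to 1 after normalization, and every other row and column sums to less.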
 