0: Input: Source $S$ with p-mappings $pM_1, \ldots, pM_l$ for $M_1, \ldots, M_l$.
   Output: Single p-mapping $pM$ between $S$ and $T$.
1: For each $i \in [1, l]$, modify p-mapping $pM_i$: Do the following for every
   possible mapping $m$ in $pM_i$:
     For every correspondence $(a, A) \in m$ between source attribute $a$ and
     mediated attribute $A$ in $M_i$, proceed as follows. (1) Find the set of all
     mediated attributes $B$ in $T$ such that $B \subseteq A$. Call this set
     $\mathcal{B}$. (2) Replace $(a, A)$ in $m$ with the set of all $(a, B)$'s,
     where $B \in \mathcal{B}$.
   Call the resulting p-mapping $pM'_i$.
2: For each $i \in [1, l]$, modify probabilities in $pM'_i$: Multiply the
   probability of every schema mapping in $pM'_i$ by $\Pr(M_i)$, which is the
   probability of $M_i$ in the p-med-schema. (Note that after this step the sum
   of probabilities of all mappings in $pM'_i$ is not 1.)
3: Consolidate $pM'_i$'s: Initialize $pM$ to be an empty p-mapping (i.e., with no
   mappings). For each $i \in [1, l]$, add $pM'_i$ to $pM$ as follows:
     For each schema mapping $m$ in $pM'_i$ with probability $p$: if $m$ is in
     $pM$, with probability $p'$, modify the probability of $m$ in $pM$ to
     $(p + p')$; if $m$ is not in $pM$, then add $m$ to $pM$ with probability $p$.
4: Return the resulting consolidated p-mapping, $pM$; the probabilities of all
   mappings in $pM$ add to 1.
Algorithm 4: Consolidating p-mappings
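
The claim in Step 4 that the probabilities add to 1 can be checked with a short calculation. It is a sketch under the usual definitions, which we assume here: the probabilities within each input p-mapping $pM_i$ sum to 1, and the probabilities $\Pr(M_i)$ in the p-med-schema sum to 1. Write $pM'_i$ for the p-mapping obtained from $pM_i$ in Step 1 and $p_i(m)$ for the probability of $m$ in $pM'_i$ before Step 2. Step 1 rewrites mappings without changing their probabilities, Step 2 scales each by $\Pr(M_i)$, and Step 3 only adds probabilities together, so the total mass is

$$\sum_{i=1}^{l} \Pr(M_i) \sum_{m \in pM'_i} p_i(m) \;=\; \sum_{i=1}^{l} \Pr(M_i) \cdot 1 \;=\; 1.$$
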
is equal to the union of a set of clusters in $T$. Hence, any two attributes $a_i$
and $a_j$ will be together in a cluster in $T$ if and only if they are together in
every mediated schema of $\mathbf{M}$. The algorithm initializes $T$ to $M_1$ and
then modifies each cluster of $T$ based on clusters from $M_2$ to $M_l$.
Example 10. Consider a p-med-schema $\mathbf{M} = \{M_1, M_2\}$, where $M_1$
contains three attributes $\{a_1, a_2, a_3\}$, $\{a_4\}$, and $\{a_5, a_6\}$, and
$M_2$ contains two attributes $\{a_2, a_3, a_4\}$ and $\{a_1, a_5, a_6\}$. The
target schema $T$ would then contain four attributes: $\{a_1\}$, $\{a_2, a_3\}$,
$\{a_4\}$, and $\{a_5, a_6\}$. $\Box$
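
The refinement described above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function name `consolidate_schemas` and the representation of a mediated attribute as a `frozenset` of source attribute names are our own assumptions, and the sketch assumes that every mediated schema clusters the same set of source attributes.

```python
from typing import FrozenSet, List, Set

Cluster = FrozenSet[str]   # a mediated attribute: a cluster of source attributes
Schema = Set[Cluster]      # a mediated schema: a set of disjoint clusters


def consolidate_schemas(schemas: List[Schema]) -> Schema:
    """Return the coarsest schema that refines every input schema.

    Two attributes end up in the same cluster of the result if and only if
    they share a cluster in every input schema.
    """
    if not schemas:
        return set()
    target: Schema = set(schemas[0])      # initialize T to M_1
    for schema in schemas[1:]:            # refine T with M_2, ..., M_l
        refined: Schema = set()
        for t_cluster in target:
            for m_cluster in schema:
                common = t_cluster & m_cluster
                if common:                # keep only non-empty intersections
                    refined.add(common)
        target = refined
    return target


# The two mediated schemas of Example 10.
M1 = {frozenset({"a1", "a2", "a3"}), frozenset({"a4"}), frozenset({"a5", "a6"})}
M2 = {frozenset({"a2", "a3", "a4"}), frozenset({"a1", "a5", "a6"})}
T = consolidate_schemas([M1, M2])
```

On the input of Example 10, `T` contains the four clusters listed there: `{a1}`, `{a2, a3}`, `{a4}`, and `{a5, a6}`.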
Note that in practice the consolidated mediated schema is the same as the mediated
schema that corresponds to the weighted graph with only certain edges. Here, we
show the general algorithm for consolidation, which can be applied even if we do
not know the specific pairwise similarities between attributes.
Consolidating p-mappings: Next, we consider consolidating p-mappings specified
w.r.t. $M_1, \ldots, M_l$ to a p-mapping w.r.t. the consolidated mediated schema
$T$. Consider a source $S$ with p-mappings $pM_1, \ldots, pM_l$ for
$M_1, \ldots, M_l$, respectively. We generate a single p-mapping $pM$ between $S$
and $T$ in three steps. First, we modify each p-mapping $pM_i$, $i \in [1, l]$,
between $S$ and $M_i$ to a p-mapping $pM'_i$ between $S$ and $T$. Second, we
modify the probabilities in each $pM'_i$. Third, we consolidate all possible
mappings in the $pM'_i$'s to obtain $pM$. The details are specified in
Algorithm 4.
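
The three steps can be sketched in Python as follows. This is a minimal sketch under our own assumptions, not the authors' code: a schema mapping is represented as a `frozenset` of correspondences `(a, A)`, a p-mapping as a dictionary from schema mappings to probabilities, and the function name `consolidate_p_mappings` as well as the sample schemas and probabilities are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Set, Tuple

Cluster = FrozenSet[str]                  # a mediated attribute
Mapping = FrozenSet[Tuple[str, Cluster]]  # a set of correspondences (a, A)
PMapping = Dict[Mapping, float]           # schema mapping -> probability


def consolidate_p_mappings(
    p_mappings: List[PMapping],
    schema_probs: List[float],
    target: Set[Cluster],
) -> PMapping:
    """Combine the p-mappings pM_1..pM_l into one p-mapping over target T."""
    consolidated: Dict[Mapping, float] = defaultdict(float)
    for p_mapping, pr_schema in zip(p_mappings, schema_probs):
        for mapping, p in p_mapping.items():
            # Step 1: replace each (a, A) by all (a, B) with B in T and B a subset of A.
            rewritten = frozenset(
                (a, B) for a, A in mapping for B in target if B <= A
            )
            # Step 2: scale by Pr(M_i). Step 3: add up equal mappings.
            consolidated[rewritten] += p * pr_schema
    return dict(consolidated)


# Hypothetical input (not from the text): M_1 = {{a1, a2}, {a3}} with
# Pr(M_1) = 0.6, M_2 = {{a1}, {a2}, {a3}} with Pr(M_2) = 0.4, so T = M_2.
A1, A2, A3 = frozenset({"a1"}), frozenset({"a2"}), frozenset({"a3"})
A12 = frozenset({"a1", "a2"})
T = {A1, A2, A3}
pM_1 = {frozenset({("s", A12)}): 0.5, frozenset({("s", A3)}): 0.5}
pM_2 = {frozenset({("s", A1)}): 0.25, frozenset({("s", A3)}): 0.75}
pM = consolidate_p_mappings([pM_1, pM_2], [0.6, 0.4], T)
```

In this hypothetical input, the mapping that sends `s` to `{a3}` occurs in both p-mappings, so its two scaled probabilities are added in Step 3 (about 0.3 + 0.3 = 0.6). The correspondence from `s` to `{a1, a2}` is split into two correspondences in Step 1, which yields a one-to-many mapping in the result.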
Note that the second part of Step 1 can map one source attribute to multiple
mediated attributes; thus, the mappings in the result $pM$ are one-to-many mappings
and so typically different from the p-mapping generated directly on the
consolidated schema. The following theorem shows that the consolidated mediated schema
 