taken to be the same attribute. As another example, consider a query that contains phone and address. Using $M_3$ or $M_4$ as the mediated schema will unnecessarily favor home address and phone over office address and phone, or vice versa. A system with $M_2$ will incorrectly favor answers that return a person's home address together with office phone number. A system with $M_5$ will also return a person's home address together with office phone and does not distinguish such answers from answers with correct correlations.
A probabilistic mediated schema will avoid this problem. Consider a probabilistic mediated schema $M$ that includes $M_3$ and $M_4$, each with probability 0.5. For each of them and each source schema, we generate a probabilistic mapping (Sect. 3). For example, the set of probabilistic mappings $pM$ for $S_1$ is shown in Fig. 4.5a, b. Now consider an instance of $S_1$ with a tuple

('Alice', '123-4567', '123, A Ave.', '765-4321', '456, B Ave.')

and a query
| Possible Mapping | Probability |
|---|---|
| {(name, name), (hPhone, hPPhone), (oPhone, oPhone), (hAddr, hAAddr), (oAddr, oAddr)} | 0.64 |
| {(name, name), (hPhone, hPPhone), (oPhone, oPhone), (oAddr, hAAddr), (hAddr, oAddr)} | 0.16 |
| {(name, name), (oPhone, hPPhone), (hPhone, oPhone), (hAddr, hAAddr), (oAddr, oAddr)} | 0.16 |
| {(name, name), (oPhone, hPPhone), (hPhone, oPhone), (oAddr, hAAddr), (hAddr, oAddr)} | 0.04 |

(a)

| Possible Mapping | Probability |
|---|---|
| {(name, name), (oPhone, oPPhone), (hPhone, hPhone), (oAddr, oAAddr), (hAddr, hAddr)} | 0.64 |
| {(name, name), (oPhone, oPPhone), (hPhone, hPhone), (hAddr, oAAddr), (oAddr, hAddr)} | 0.16 |
| {(name, name), (hPhone, oPPhone), (oPhone, hPhone), (oAddr, oAAddr), (hAddr, hAddr)} | 0.16 |
| {(name, name), (hPhone, oPPhone), (oPhone, hPhone), (hAddr, oAAddr), (oAddr, hAddr)} | 0.04 |

(b)

| Answer | Probability |
|---|---|
| ('Alice', '123-4567', '123, A Ave.') | 0.34 |
| ('Alice', '765-4321', '456, B Ave.') | 0.34 |
| ('Alice', '765-4321', '123, A Ave.') | 0.16 |
| ('Alice', '123-4567', '456, B Ave.') | 0.16 |

(c)
Fig. 4.5 The motivating example: (a) p-mapping for $S_1$ and $M_3$, (b) p-mapping for $S_1$ and $M_4$, and (c) query answers w.r.t. $M$ and $pM$. Here we denote {phone, hPhone} by hPPhone, {phone, oPhone} by oPPhone, {address, hAddr} by hAAddr, and {address, oAddr} by oAAddr
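
The answer probabilities in Fig. 4.5c can be reproduced from the two p-mappings with a short sketch. It rests on several assumptions that are not stated on this page: that the attributes of $S_1$ are (name, hPhone, hAddr, oPhone, oAddr) in the order of the example tuple, that the query asks for name, phone, and address, and that the probability of an answer is the sum, over every mediated schema and every possible mapping that yields the answer, of the schema probability times the mapping probability. The names `ROW`, `SCHEMAS`, and `answer_distribution` are illustrative only.

```python
from collections import defaultdict

# Example tuple of S_1, keyed by source attribute (attribute order is assumed).
ROW = {
    "name": "Alice",
    "hPhone": "123-4567",
    "hAddr": "123, A Ave.",
    "oPhone": "765-4321",
    "oAddr": "456, B Ave.",
}

# Fig. 4.5a: p-mapping between S_1 and M_3 (source attribute -> mediated attribute).
P_MAPPING_M3 = [
    ({"name": "name", "hPhone": "hPPhone", "oPhone": "oPhone",
      "hAddr": "hAAddr", "oAddr": "oAddr"}, 0.64),
    ({"name": "name", "hPhone": "hPPhone", "oPhone": "oPhone",
      "oAddr": "hAAddr", "hAddr": "oAddr"}, 0.16),
    ({"name": "name", "oPhone": "hPPhone", "hPhone": "oPhone",
      "hAddr": "hAAddr", "oAddr": "oAddr"}, 0.16),
    ({"name": "name", "oPhone": "hPPhone", "hPhone": "oPhone",
      "oAddr": "hAAddr", "hAddr": "oAddr"}, 0.04),
]

# Fig. 4.5b: p-mapping between S_1 and M_4.
P_MAPPING_M4 = [
    ({"name": "name", "oPhone": "oPPhone", "hPhone": "hPhone",
      "oAddr": "oAAddr", "hAddr": "hAddr"}, 0.64),
    ({"name": "name", "oPhone": "oPPhone", "hPhone": "hPhone",
      "hAddr": "oAAddr", "oAddr": "hAddr"}, 0.16),
    ({"name": "name", "hPhone": "oPPhone", "oPhone": "hPhone",
      "oAddr": "oAAddr", "hAddr": "hAddr"}, 0.16),
    ({"name": "name", "hPhone": "oPPhone", "oPhone": "hPhone",
      "hAddr": "oAAddr", "oAddr": "hAddr"}, 0.04),
]

# Each entry: (schema probability, mediated attributes the assumed query
# (name, phone, address) rewrites to, p-mapping for that schema).
# In M_3, phone and address fall in the clusters hPPhone and hAAddr;
# in M_4 they fall in oPPhone and oAAddr.
SCHEMAS = [
    (0.5, ("name", "hPPhone", "hAAddr"), P_MAPPING_M3),
    (0.5, ("name", "oPPhone", "oAAddr"), P_MAPPING_M4),
]


def answer_distribution(schemas, row):
    """Sum schema_prob * mapping_prob over all (schema, mapping) pairs per answer."""
    totals = defaultdict(float)
    for schema_prob, query_attrs, p_mapping in schemas:
        for mapping, mapping_prob in p_mapping:
            # Invert the mapping: mediated attribute -> source attribute.
            inverse = {med: src for src, med in mapping.items()}
            answer = tuple(row[inverse[attr]] for attr in query_attrs)
            totals[answer] += schema_prob * mapping_prob
    return dict(totals)


result = answer_distribution(SCHEMAS, ROW)
for answer, prob in sorted(result.items(), key=lambda item: -item[1]):
    print(answer, round(prob, 2))
```

Under these assumptions, each answer with matching home or office values collects 0.5 × 0.64 from one schema and 0.5 × 0.04 from the other, giving 0.34, while each mixed answer collects 0.5 × 0.16 from both schemas, giving 0.16, in agreement with Fig. 4.5c.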