Databases Reference
In-Depth Information
answer tuples, each with a probability indicating the likelihood that the tuple occurs
as an answer. We consider by-table semantics here. Given a query Q, we compute
answers by first answering Q with respect to each possible mapping and then for
each answer tuple t by summing up the probabilities of the mappings with respect
to which t is generated.
We now extend this notion for query answering that takes p-med-schema into
consideration. Intuitively, we compute query answers by first answering the query
with respect to each possible mediated schema and then for each answer tuple by
taking the sum of its probabilities weighted by the probabilities of the mediated
schemas.
Definition 12 (Query Answer). Let S be a source schema and M
Df .M 1 ;Pr
.M 1 //;:::;.M l ;Pr.M l // g
be a p-med-schema. Let pM
Df pM.M 1 /, :::;pM
.M l / g
be a set of p-mappings where pM.M i / is the p-mapping between S and M i .
Let D be an instance of S and Q be a query.
Let t be a tuple. Let Pr.t j M i /;i 2 Œ1;l; be the probability of t in the answer of
Q with respect to M i and pM.M i /.Letp D ˙ i D 1 Pr.t j M i / Pr.M i /.Ifp>0,
then we say .t;p/ is a by-table answer with respect to M and pM .
We denote all by-table answers by Q M ; pM .D/.
t
We say that query answers A 1 and A 2 are equal (denoted A 1 D A 2 )ifA 1 and
A 2 contain exactly the same set of tuples with the same probability assignments.
Expressive power: A natural question to ask at this point is whether probabilistic
mediated schemas provide any added expressive power compared to deterministic
ones. Theorem 8 shows that if we consider one-to-many schema mappings, where
one source attribute can be mapped to multiple mediated attributes, then any combi-
nation of a p-med-schema and p-mappings can be equivalently represented using a
deterministic mediated schema with p-mappings, but may not be represented using a
p-med-schema with deterministic schema mappings. Note that we can easily extend
the definition of query answers to one-to-many mappings, as one mediated attribute
can correspond to no more than one source attribute.
Theorem 8 (Subsumption). The following two claims hold.
1. Given a source schema S , a p-med-schema M , and a set of p-mappings pM
between S and possible mediated schemas in M , there exists a deterministic
mediated schema T and a p-mapping pM between S and T , such that
8 D;Q W
Q M ; pM .D/ D Q T;pM .D/ .
2. There exists a source schema S , a mediated schema T , a p-mapping pM between
S and T , and an instance D of S , such that for any p-med-schema M and any
set m of deterministic mappings between S and possible mediated schemas in
M , there exists a query Q such that Q M ; m .D/ ¤ Q T;pM .D/ .
t
In contrast, Theorem 9 shows that if we restrict our attention to one-to-one map-
pings, then a probabilistic mediated schema does add expressive power.
Search WWH ::




Custom Search