Databases Reference
In-Depth Information
Definition 8 (By-tuple Answer). Let pM D .S;T; m / be a p-mapping. Let Q be
a query over T and D S be an instance of S with d tuples.
L et t be a tuple. Let seq .t/ be the subset of seq
.pM/, such that for each seq
2
d
seq .t/ and for each D T 2
Ta r seq .D S /, t 2 Q.D T /.
Let p D P seq 2 seq .t /
Pr . seq /.Ifp>0, we call .t;p/ a by-tuple answer of Q with
respect to D S
and pM.
t
is denoted by Q table .D S /
The set of by-table answers for Q with respect to D S
is denoted by Q t u ple .D S /.
and the set of by-tuple answers for Q with respect to D S
Example 3. Consider the p-mapping pM, the source instance D S , and the query Q
in the motivating example.
In by-table semantics, Fig. 4.3 b shows a target instance that is consistent with
D S (repeated in Fig. 4.3 a) and possible mapping m 1 . Figure 4.3 d shows the by-
table answers of Q with respect to D S
and pM. As an example, for tuple t D
(“Sunnyvale”), we have
m.t/ Df m 1 ;m 2 g
, so the possible tuple (“Sunnyvale,” 0.9)
is an answer.
In by-tuple semantics, Fig. 4.3 c shows a target instance that is by-tuple consistent
with D S and the mapping sequence <m 2 ;m 3 >. Figure 4.3 e shows the by-tuple
answers of Q with respect to D S
D
(“Sunnyvale”) in the by-table answers is different from that in the by-tuple answers.
We describe how to compute the probabilities in detail in the next section.
and pM. Note that the probability of tuple t
t
3.3
Query Answering
This section studies query answering in the presence of probabilistic mappings. We
start with describing algorithms for returning all answer tuples with probabilities,
and discussing the complexity of query answering in terms of the size of the data
( data complexity ) and the size of the p-mapping ( mapping complexity ). We then
consider returning the top-k query answers, which are the k answer tuples with the
top probabilities, and answering aggregate queries.
3.3.1
By-table Query Answering
In the case of by-table semantics, answering queries is conceptually simple. Given
a p-mapping pM D .S;T; m / and an SPJ query Q, we can compute the cer-
tain answers of Q under each of the mappings m 2
m . We attach the probability
Pr.m/to every certain answer under m. If a tuple is an answer to Q under multiple
mappings in m , then we add up the probabilities of the different mappings.
Algorithm B Y T ABLE takes as input an SPJ query Q that mentions the relations
T 1 ;:::;T l
in the FROM clause. Assume that we have the p-mapping pM i
associated
with the table T i . The algorithm proceeds as follows.
 
Search WWH ::




Custom Search