Database Reference
In-Depth Information
4. EXTENSIONAL QUERY EVALUATION
4.1.4 EXAMPLES OF SAFE (TRACTABLE) QUERIES
We now illustrate how to use the rules in
Subsection 4.1.2
to evaluate query probabilities. Whenever
this is possible, the query is tractable.
Example 4.6
A Really Simple Query
We start with a simple Boolean query:
Q
=
R(x),S(x,y)
Write it as
Q
=∃
x.(R(x)
∧∃
y.S(x,y))
, and we evaluate it as follows:
P (Q)
=
1
−
(
1
−
P(R(a)
∧∃
y.S(a,y)))
by
Eq. (4.2)
a
∈
ADom(D)
=
1
−
(
1
−
P(R(a))
·
P(
∃
y.S(a,y)))
by
Eq. (4.1)
a
∈
ADom(D)
⎛
⎞
⎝
1
⎠
=
1
−
−
P(R(a))
·
(
1
−
(
1
−
P (S(a, b))))
by
Eq. (4.2)
a
∈
ADom(D)
b
∈
ADom(D)
The last line is an expression of size
O(n
2
)
, where
n
is the size of the active domain
ADom(D)
of
the database.
A Query With Self-Joins
Consider the Boolean query:
Example 4.7
Q
J
=
R(x
1
), S(x
1
,y
1
), T (x
2
), S(x
2
,y
2
)
Note
Q
2
, where
Q
1
=
∃
x
1
.
∃
y
1
.R(x
1
), S(x
1
,y
1
)
and
Q
2
=∃
x
2
.
∃
y
2
.T (x
2
), S(x
2
,y
2
)
. However, we cannot apply an
independent-join because the two queries are dependent (they share the
S
symbol), but we can
apply the inclusion-exclusion formula and obtain:
that
Q
J
is
the
conjunction
of
two
Boolean
queries
Q
1
∧
P(Q
J
)
=
P(Q
1
)
+
P(Q
2
)
−
P(Q
1
∨
Q
2
)
by
Eq. (4.7)
where
Q
U
=
Q
1
∨
Q
2
is the same query as that given in
Eq. (4.4)
. Thus,
P(Q
J
)
is expressed in
terms of the probability of three other queries: the first two can be computed as in the previous
example, the third is
Q
U
and, as we have seen, has a separator variable, therefore:
P(Q
U
)
=
−
−
P(
∃
.y
1
.R(a)
∧
S(a,y
1
)
∨∃
y
2
.T (a)
∧
S(a,y
2
)))
1
(
1
by
Eq. (4.2)
a
∈
ADom(D)
=
1
−
(
1
−
P ((R(a)
∨
T(a))
∧∃
.y.S(a, y)))
a
∈
ADom(D)
=
1
−
(
1
−
P ((R(a)
∨
T (a)))
·
P(
∃
.y.S(a, y)))
by
Eq. (4.1)
a
∈
ADom(D)