Database Reference
In-Depth Information
4. EXTENSIONAL QUERY EVALUATION
the two atoms, and therefore we cannot apply independent project on
x
2
. Instead, we rank the two
attributes of the relation
R
; in other words, we partition it into three relations:
x,y
(σ
x<y
(R(x, y)))
R
2
=
x
(σ
x
=
y
(R(x, y)))
R
3
=
y,x
(σ
x>y
(R(x, y)))
Then, rewrite
Q
1
as:
Q
1
=
R
1
(x, y), R
3
(x, y)
∨
R
2
(z)
. Note that the relations
R
1
,
R
2
and
R
3
have
no common tuples; hence, the database
R
1
=
is a tuple-independent database. It is easy to see
that
P(Q
1
)
can be computed using the rules in
Subsection 4.1.2
: first, apply an independent-union,
P(Q
1
)
=
1
−
(
1
−
P(R
1
(x, y), R
3
(x, y)))
·
(
1
−
P(R
2
(z)))
, then use
x
is a separator variable in
P(R
1
(x, y), R
3
(x, y))
, etc. Notice that the lineages
Q
1
R
1
,R
2
,R
3
and
Q
1
are the same:
Q
1
=
i,j
X
ij
X
ji
=
i<j
X
ij
X
ji
∨
i
X
ii
=
Q
r
All
n(n
n
conjuncts in the expression above are independent, which is another way to see
why
P(Q
1
)
can be computed in polynomial time.
Second, consider
−
1
)/
2
+
Q
2
=
R(x,a),R(a,x)
where
a
is a constant. Here, too, we cannot apply any rule because
x
is not a separator variable. After
we rank both attributs of
R
w.r.t. the constant
a
, we obtain four relations:
x
1
(σ
x
1
=
a
∧
x
2
=
a
(R(x
1
,x
2
)))
R
2
(x
2
)
=
x
2
(σ
x
1
=
a
∧
x
2
=
a
(R(x
1
,x
2
)))
R
3
()
=
∅
(σ
x
1
=
x
2
=
a
(R))
R
4
(x
1
,x
2
)
=
σ
x
1
=
a
∧
x
2
=
a
(R(x
1
,x
2
)
Then we rewrite the query as
Q
2
=
R
1
(x), R
2
(x)
∨
R
3
()
. This is an independent union of two
queries, the first has the separator variable
x
, while the second is a ground tuple; hence, we simply
lookup its probability in the database.
Finally, consider
Q
3
=
R
1
(x
1
)
=
R(u, v), S(v, u)
. Even though none of the two
conjunctive queries suggests that ranking is needed, the entire query has no separator, and we
must rank, first on
R
then on
S
, which partitions both
R
and
S
into three sets:
R
1
(x, y)
R(x,y),S(x,y)
∨
=
σ
x<y
(R(x, y))
,
R
2
(x)
=
x
(σ
x
=
y
(R(x, y)))
,
R
3
=
yx
(σ
x>y
(R(x, y)))
, and similarly
S
is par-
titioned into
S
1
,S
2
,S
3
. After simplifications, the ranked query is:
Q
3
=
R
1
(x
1
,y
1
), S
1
(x
1
,y
1
)
∨
R
3
(y
2
,x
2
), S
3
(y
2
,x
2
)
∨
R
1
(x
3
,y
3
), S
3
(x
3
,y
3
)
∨
R
3
(y
4
,x
4
), S
1
(y
4
,x
4
)
∨
R
2
(x
5
), S
2
(x
5
)
2
The queries
R(a,y),R(y,a)
and
R(b,y),R(y,b)
are dependent because they both depend on both
R(a, b)
and
R(b, a)
, which
shows that an independent project is not possible.