Database Reference
In-Depth Information
4. EXTENSIONAL QUERY EVALUATION
4.1.6.2 Hierarchical Queries and Non-Repeating Queries
For the entire relational calculus, it is unknown whether the rules
R
6
are complete, or whether
RC admits a dichotomy. However, we know a necessary condition and a sufficient condition for
tractability. We discuss these two conditions here.
Definition 4.24
A query expression given by
Eq. (2.1)
is called
hierarchical
if for every subexpression
∃
x.Q
, the variable
x
occurs in all atoms of
Q
(i.e.,
x
is a root variable in
Q
). A query is called
hierarchical
if it is equivalent to a hierarchical expression. For a query language
L
, we denote the class
H
; in particular,
RC
H
of hierarchical queries in
L
by
L
denotes the class of hierarchical, relational
queries.
For a simple illustration, consider
Q
=
R(x),S(x,y)
. This query is in
RC
H
because it can
be expressed as
∧
∃
y.S(x,y)
, and the variable
y
is a root variable in the expression
S(x,y)
. Notice that the notion
of a hierarchical query is a semantic notion: A query expression is hierarchical if
there exists
an
expression equivalent to the query where each variable is a root variable. For example, the query
R(x),S(x,y),S(z,y)
is in
RC
H
: while this expression is not hierarchical, the query is equivalent
to
R(x),S(x,y)
, which is hierarchical. On the other hand, the query
H
0
=
R(x),S(x,y),T(y)
is
not in
RC
H
. We will prove below that there exists no hierarchical expression equivalent to
H
0
.
We have:
∃
x.R(x)
∧∃
y.S(x,y)
: the variable
x
is a root variable in the expression
R(x)
RC
H
).
If a query Q is
R
6
-safe then it is hierarchical (i.e., Q
∈
Proposition 4.25
Proof.
We prove the following, for each of the six rules expressing
P (Q)
in terms of the probabilities
of several subqueries
P(Q
)
: if all subqueries
Q
on the right-hand-side of the rule are hierarchical,
then
Q
is also hierarchical. We illustrate a few cases. In the case of an independent join,
Q
=
Q
1
∧
Q
2
: if both subqueries
Q
1
and
Q
2
are hierarchical, then obviously
Q
is hierarchical too. In the case
of an independent project,
Q
=∃
x.Q
:if
Q
[
a/x
]
is hierarchical for some arbitrary constant
a
, then
so is
Q
because
x
is a root variable. Finally, for the inclusion exclusion formula
Q
=
Q
1
∧
...
∧
Q
k
:
if all subqueries
i
∈
s
Q
i
for
s
⊆[
k
]
are hierarchical, then each
Q
i
is hierarchical, by setting
s
={
i
}
,
hence
Q
=∧
i
Q
i
is also hierarchical.
In
general,
the
converse
fails.
For
example,
all
queries
H
1
,H
2
,H
3
,...
shown
in
Subsection 4.1.3
are
hierarchical,
yet
they
are
all
unsafe
(and
intractable).
One
can
H
1
=
even
find
a
conjunctive
query
that
is
hierarchical
and
unsafe;
for
example,
R(x
0
), S(x
0
,y
0
), S(x
1
,y
1
), T (y
1
)
.
After
applying
the
inclusion-exclusion
formula,
P (Q)
=
P(R(x
0
), S(x
0
,y
0
))
+
P(S(x
1
,y
1
), T (y
1
))
−
P(R(x
0
), S(x
0
,y
0
)
∨
S(x
1
,y
1
), T (y
1
))
,
and
the
third query is
H
1
, proving that
H
1
is unsafe (we get stuck at
H
1
), and therefore it is hard for
#P (by
Theorem 4.23
).