Database Reference
InDepth Information
P
ROBLEM
:
(
Q
)
CERTAIN
M
I
NPUT
:
Source tree
S
, tuple
s
.
Q
UESTION
:
s
∈
certain
M
(
Q
,
S
) ?
In what follows we investigate how the complexity of this problem depends on the fea
tures allowed in schemas, sttgds and queries.
13.2 An upper bound
We have seen that in the relational case the certain answers problem for CQs with inequal
ities is in
CO
NP. In the XML setting this is no longer the case: the problem can be un
decidable for some mappings and queries. This happens either with mappings and queries
permitting downward navigation, or child/nextsibling navigation, as long as inequalities
are allowed in queries. Specifically, the following is true.
+
, or the child and nextsibling
Proposition 13.2
Let
σ
include either downward axes
↓
,
↓
axes
↓
and
→
. Then there exist a mapping
M ∈
SM(
σ
)
and a query Q
∈
CTQ(
σ
,
∼
)
such
that
CERTAIN
M
(
Q
)
is undecidable.
As soon as we forbid inequalities in queries, we get the
CO
NP upper bound back, but
the proof is more involved. Below, we only give a sketch of the main ideas.
Theorem 13.3
For every schema mapping
M
from
SM(
⇓
,
⇒
,
∼
)
and every query Q from
UCTQ(
⇓
,
⇒
,
=)
, the problem
CERTAIN
M
(
Q
)
is in
CO
NP
.
Proof idea
Take a query
Q
∈
UCTQ(
⇓
,
⇒
,
∼
), an XML schema mapping
M
=
(
S
s
,
S
t
,
Σ
st
), and a source tree
S
conforming to
S
s
. Without loss of generality we can
assume that
Q
is a Boolean query.
A tree conforming to the target schema is a solution for
S
iff it satisfies every pattern
from the following set:
(
a
,
z
)
ϕ
(
a
,
b
)
Δ
S
,M
=
{
ψ
(
x
,
y
)
→∃
z
ψ
(
x
,
z
)
∈
Σ
st
and
S

=
ϕ
}
.
Note that for a fixed mapping the set
Δ
S
,M
can be computed in P
TIME
.
The certain answer to
Q
is
false
iff there exists a
counterexample
tree
T
such that
1.
T

=
S
t
,
2.
T

=
Δ
S
,M
,and
3.
T

=
Q
.
Assume that there exists such a counterexample
T
. Fix a set of nodes witnessing
Δ
S
,M
.
Observe that since
Q
does not use
=, we can assume that all the nonwitnessing nodes store
unique data values. Under this assumption, we will show that we can trim most of the non
witnessing nodes without satisfying
Q
, or violating
S
t
,sothat
T
becomes of polynomial
size. This gives an NP algorithm for checking whether certain answers are false: simply
guess a tree
T
of polynomial size, and check if it satisfies the three conditions of being a