Database Reference
In-Depth Information
Proof The CO NP upper bounds follow from Theorem 13.3 . Below we prove the lower
bound for the first claim. The proof for the second claim can be obtained by replacing
with
+ . The third claim is left for the reader (see Exercise 16.3 ).
The reduction is very similar to the one used in the proof of Proposition 13.5 , only the
selection of literals for each clause is done by permuting the data values in the L i -children:
we choose the literal encoded by the data value stored in L 1 .
Thus, the source DTD D s remains the same, the target DTD D t is equal to D s , the st-tgds
Σ st are
r / C [ L 1 ( x ) , L 2 ( y ) , L 3 ( z )]
r / C [ ( x ) , ( y ) , ( z )] ,
r / V ( x , y )
r / V ( x , y ) ,
yr V ( x , y ) , C / L 1 ( x ) , C / L 1 ( y ) .
and the query is
x
Hardness due to queries
Let us now move to the analysis of the query language. One lesson learned from the re-
lational case is that inequality in the query language leads to CO NP-hardness. Since the
usual translation from the relational setting to the XML setting produces mappings from
SM nr (
, =), we immediately get CO NP-hardness even for the simplest mappings (we have
already seen that for more expressive mappings the problem may be undecidable).
SM nr (
Corollary 13.7
There exist a mapping
M ∈
, =) and a query Q
CTQ(
, = ,
=)
such that CERTAIN M ( Q ) is CO NP -complete.
Similarly, allowing any form of horizontal navigation in queries leads to intractability even
for the simplest mappings.
SM nr (
Proposition 13.8 There exist a schema mapping
M ∈
) , and queries Q 1
CTQ(
+ , =) ,
,
, =) and Q 2
CTQ(
,
such that both CERTAIN M ( Q 1 ) and CERTAIN M ( Q 2 )
are CO NP -complete.
Proof The proof is almost identical as for Proposition 13.6 . The mapping uses the same
D s .ThetargetDTD D t is
C V
r
V :@ a 1 , @ a 2
L
C
L :@ b
and the st-tgds
Σ st are
r / C [ L 1 ( x ) , L 2 ( y ) , L 3 ( z )]
r / C [ L ( x ) , L ( y ) , L ( z )] ,
r / V ( x , y )
r / V ( x , y ) .
Intuitively, we choose the literal having more than two following siblings. Since each C
node has at least three L children, clearly at least one literal is chosen for each clause. The
query Q 1 is just
yr L ( x , y ) , C [ L ( x )
L ] . Replacing
x
L
L ] , C [ L ( y )
L
with
+ gives Q 2 .
Search WWH ::




Custom Search