Answering tuple queries - Foundations of Data Exchange

Database Reference

In-Depth Information

Proof The CO NP upper bounds follow from Theorem 13.3 . Below we prove the lower

bound for the first claim. The proof for the second claim can be obtained by replacing

with

+ . The third claim is left for the reader (see Exercise 16.3 ).

The reduction is very similar to the one used in the proof of Proposition 13.5 , only the

selection of literals for each clause is done by permuting the data values in the L i -children:

we choose the literal encoded by the data value stored in L 1 .

Thus, the source DTD D s remains the same, the target DTD D t is equal to D s , the st-tgds

Σ st are

↓

r / C [ L 1 ( x ) , L 2 ( y ) , L 3 ( z )]

→

r / C [ ( x ) , ( y ) , ( z )] ,

r / V ( x , y )

→

r / V ( x , y ) ,

yr V ( x , y ) , C / L 1 ( x ) , C / L 1 ( y ) .

and the query is

∃

Hardness due to queries

Let us now move to the analysis of the query language. One lesson learned from the re-

lational case is that inequality in the query language leads to CO NP-hardness. Since the

usual translation from the relational setting to the XML setting produces mappings from

SM nr (

, =), we immediately get CO NP-hardness even for the simplest mappings (we have

already seen that for more expressive mappings the problem may be undecidable).

↓

SM nr (

Corollary 13.7

There exist a mapping

M ∈

↓

, =) and a query Q

∈

CTQ(

↓

, = ,

such that CERTAIN M ( Q ) is CO NP -complete.

Similarly, allowing any form of horizontal navigation in queries leads to intractability even

for the simplest mappings.

SM nr (

Proposition 13.8 There exist a schema mapping

M ∈

↓

) , and queries Q 1 ∈

CTQ(

↓

+ , =) ,

→

, =) and Q 2 ∈

CTQ(

↓

→

such that both CERTAIN M ( Q 1 ) and CERTAIN M ( Q 2 )

are CO NP -complete.

Proof The proof is almost identical as for Proposition 13.6 . The mapping uses the same

D s .ThetargetDTD D t is

C ∗ V ∗

→

V :@ a 1 , @ a 2

L ∗

→

L :@ b

and the st-tgds

Σ st are

r / C [ L 1 ( x ) , L 2 ( y ) , L 3 ( z )]

→ r / C [ L ( x ) , L ( y ) , L ( z )] ,

r / V ( x , y )

→ r / V ( x , y ) .

Intuitively, we choose the literal having more than two following siblings. Since each C

node has at least three L children, clearly at least one literal is chosen for each clause. The

query Q 1 is just

yr L ( x , y ) , C [ L ( x )

L ] . Replacing

∃

→

L ] , C [ L ( y )

→

with

+ gives Q 2 .

→

Foundations of Data Exchange

Search WWH ::

Custom Search

Home