XML data exchange via relations - Foundations of Data Exchange

Database Reference

In-Depth Information

mapping

M

query Q

target T

source S

answer

XML:

shred

σ ( M )

σ ( Q )

Relations:

σ

( S )

σ

( T )

answer

Figure 15.1 XML data exchange via relations: desiderata

In a native system, we would materialize some solution T over which we could answer

queries Q .

But now we want a relational system to do the job. So we shred

S

into

σ

(

S

) and then

apply to

) to get a solution - which itself is a

shredding of an XML solution - so that the answer to Q could be reconstructed from the

result of the query

σ

(

S

) the translation of the mapping

σ

(

M

( Q ) over that relational solution.

The idea seems simple and natural on the surface, but starts looking challenging once we

look deeper into it. Before even attempting to show that the relational translation faithfully

represents the XML data-exchange problem, we need to address the following.

σ

•

Complexity mismatch . Without restrictions, there cannot be a faithful representation of

XML data exchange by a relational system. Indeed, we have seen in PART TWO that

unions of conjunctive queries can be efficiently evaluated in relational data exchange, by

means of constructing the canonical universal solution, and applying naıve evaluation.

At the same time, we saw in Chapter 13 that answering analogs of unions of conjunctive

queries in XML data exchange can be CO NP-complete. So any claim that a relational

data-exchange system correctly performs XML data exchange for arbitrary documents

and queries is bound to be wrong. We thus need to identify the cases that can be handled

by a relational system.

•

Which shredding scheme to use? There are several, which can roughly be divided into

two groups. The first group consists of shredding schemes that do not take the schema

information into account (for instance, representing edge relations of the trees, or in-

terval encodings based on associating with each node its number in pre- and post-order

traversals). The second group consists of shredding schemes based on schemas for XML

(e.g., variants of the inlining technique). Since in data-exchange scenarios we start with

two schemas, it seems more appropriate to apply schema-based techniques.

•

Target constraints . In relational data exchange, constraints in target schemas are required

to satisfy certain acyclicity conditions (weak acyclicity, to be precise); as we have seen,

without them, the chase procedure that constructs a target instance does not terminate.

Constraints imposed by general XML schema specifications need not in general be even

definable in relational calculus, let alone be acyclic. We thus need to find a shredding

Foundations of Data Exchange

Search WWH ::

Custom Search

Home