σ ( M )
σ ( Q )
( S )
( T )
Figure 15.1 XML data exchange via relations: desiderata
In a native system, we would materialize some solution T over which we could answer
queries Q .
But now we want a relational system to do the job. So we shred
) and then
) to get a solution - which itself is a
shredding of an XML solution - so that the answer to Q could be reconstructed from the
result of the query
) the translation of the mapping
( Q ) over that relational solution.
The idea seems simple and natural on the surface, but starts looking challenging once we
look deeper into it. Before even attempting to show that the relational translation faithfully
represents the XML data-exchange problem, we need to address the following.
Complexity mismatch . Without restrictions, there cannot be a faithful representation of
XML data exchange by a relational system. Indeed, we have seen in PART TWO that
unions of conjunctive queries can be efficiently evaluated in relational data exchange, by
means of constructing the canonical universal solution, and applying naıve evaluation.
At the same time, we saw in Chapter 13 that answering analogs of unions of conjunctive
queries in XML data exchange can be CO NP-complete. So any claim that a relational
data-exchange system correctly performs XML data exchange for arbitrary documents
and queries is bound to be wrong. We thus need to identify the cases that can be handled
by a relational system.
Which shredding scheme to use? There are several, which can roughly be divided into
two groups. The first group consists of shredding schemes that do not take the schema
information into account (for instance, representing edge relations of the trees, or in-
terval encodings based on associating with each node its number in pre- and post-order
traversals). The second group consists of shredding schemes based on schemas for XML
(e.g., variants of the inlining technique). Since in data-exchange scenarios we start with
two schemas, it seems more appropriate to apply schema-based techniques.
Target constraints . In relational data exchange, constraints in target schemas are required
to satisfy certain acyclicity conditions (weak acyclicity, to be precise); as we have seen,
without them, the chase procedure that constructs a target instance does not terminate.
Constraints imposed by general XML schema specifications need not in general be even
definable in relational calculus, let alone be acyclic. We thus need to find a shredding