Database Reference
In-Depth Information
Proposition 14.3 Fo r e a c h TQL query Q and a tree T , computing Q ( T ) can be done in
time polynomial in the size of T .
As we have learned, incompleteness of information cannot be avoided in data ex-
change: the database we query is only partly defined via source data and mappings between
schemas. Semantically, an incomplete description of a database is a set
of completely
described databases that it can represent. If we have a query Q returning sets of tuples,
then certain answers are obtained by taking tuples that belong to the result no matter which
complete database in
D
)= {
. But our queries
return trees, rather than sets of tuples. What is a natural analog of certain answers then?
In general, we want to define certain ( Q ,
D
is used: that is, certain ( Q ,
D
Q ( D )
|
D
∈D}
T
) - the certain answers to a query Q returning
trees over a set
of XML trees. This set of trees could be, for instance, a set of solutions
(target instances) in data exchange. For this we need an analog of the intersection operator
applied to Q (
T
T
)=
{
Q ( T )
|
T
∈T }
. What could it be? The first idea is to take the maximal
tree contained in all of trees in Q (
). But this may fail as such a tree need not be unique: if
T 1 , T 2 ,and T 3 shown in Figure 14.3 are in Q (
T
T
), there are three maximal trees subsumed
by them shown in part (d) of the figure.
r
r
r
r
r
r
a
a
a
a
a
a
a
a
a
c
c
c
c
b
d
b
d
d
b
b
d
(a) T 1
(b) T 2
(c) T 3
(d) Maximal trees
Figure 14.3 Why the “maximal subsumed tree” approach fails
One might be tempted to combine these trees in (d) into one, but this is not satisfactory
either. We have a choice of merging the roots, or the a -nodes, but either way the resulting
tree is not subsumed by T 1 , T 2 and T 3 ; in what sense, then, is it a certain answer? And so
far we have not considered data values .Whatifin T 1 , T 2 ,and T 3 ,theleft a carries value
“1” and the right a carries “2”? This is the situation depicted in Figure 14.4 on page 198 .
Then in fact, none of the three trees shown in (d) above is among the subsumed trees, and
we lose the certain knowledge about the grandchildren of the root.
Thus, even in a very simple example, it is not completely clear what the natural notion of
certain answers is. In the next section we develop a theory of certain answers for XML-to-
XML queries, based on the concepts of theory and definition borrowed from mathematical
logic.
14.2 Notion of certain answers
The idea of certain answers is to extract the maximum knowledge from answers Q ( D ),as
D ranges over some collection of databases
. What it means precisely, depends on the
logical language we use for expressing the knowledge. For instance, suppose we use the
D
 
Search WWH ::




Custom Search