Database Reference
In-Depth Information
For each forest
expression q
(
x
)
, and each valuation v of x into values
used in S, there
exists a homomorphism h
:
q
(
x
)
S
,
v
→
q
(
x
)
T
,
h
◦
v
coinciding with h on
V
AR
.
We proceed by induction on
q
(
x
).If
q
(
x
)=
ε
, the claim follows trivially.
If
q
(
x
)=
(
a
,
x
)[
q
(
x
)],then
S
,
v
=
(
a
,
v
(
x
))[
q
(
x
)
q
(
x
)
S
,
v
] and
q
(
x
)
T
,
h
◦
v
=
(
h
◦
v
(
x
))[
q
(
x
)
T
,
h
◦
v
], and the claim follows from the induction hypothesis.
Suppose that
q
(
x
)=
q
(
x
)
,
q
(
x
). By the induction hypothesis there exist homomor-
phisms
h
:
q
(
x
)
q
(
x
)
T
,
h
◦
v
and
h
:
q
(
x
)
q
(
x
)
S
,
v
→
S
,
v
→
T
,
h
◦
v
coinciding with
T
,
h
◦
v
as
h
h
on V
AR
. Pick a tree
U
∈
q
(
x
)
S
,
v
.Define
h
U
:
U
→
q
(
x
)
U
if
U
comes
S
,
v
.Let
h
be the union of
h
U
over
q
(
x
)
S
,
v
,andas
h
q
(
x
)
from
U
if
U
comes from
S
,
v
. Since all
h
U
coincide with
h
on V
AR
,
h
is defined consistently. It is a
homomorphism, because its restriction to every tree in the domain is a homomorphism.
Finally, let us assume that
q
(
x
)=
for
all
U
∈
q
(
x
)
(
a
,
x
,
y
)
return
q
(
x
,
y
).Then
π
S
,
v
=
q
S
,
v
S
x
=
v
,
(
a
,
v
(
x
,
y
))
,
v
q
(
x
)
|
=
π
q
T
,
v
T
T
,
h
◦
v
=
v
.
(
a
,
v
(
x
,
y
))
,
v
q
(
x
)
|
=
π
x
=
h
◦
S
,
v
.Let
v
be the valuation extending
v
and satisfying
Pick a tree
U
from
q
(
x
)
(
a
,
v
(
x
,
y
)), such that
S
comes from
q
S
,
v
.Since
h
:
S
S
|
=
π
→
T
, it holds that
T
|
=
v
(
x
,
y
)). Moreover,
h
v
q
T
,
h
◦
v
π
(
h
◦
◦
x
=
h
◦
v
. Consequently,
is a subforest of
q
S
,
v
→
q
T
,
h
◦
v
,
q
(
x
)
T
,
v
. By the induction hypothesis there is a homomorphism
h
v
:
U
.Let
h
be the
→
T
,
h
◦
v
be defined as
h
v
coinciding with
h
on V
AR
.Let
h
U
:
U
q
(
x
)
union of
h
U
for all
U
∈
q
(
x
)
S
,
v
.Asall
h
U
are homomorphisms coinciding with
h
on
V
AR
,sois
h
.
14.4 XML-to-XML queries in data exchange
We have shown that
TQL
queries preserve bases, and thus
Algorithm 14.1
computes a
certain answer if we can find a small finite basis. Now we would like to apply this in the
data exchange context. A
TQL
query
Q
then is posed over the target schema, and we must
find a certain answer to it over the set of data exchange solutions, i.e., over S
OL
M
(
T
),
for some source tree
T
and a mapping
M
. For that, we need to find a finite basis
B
of
S
OL
M
(
T
), and then just compute a max-description of
Q
(
B
). We know how to do the
former, but how do we find bases of sets of solutions?
In general, the existence (and properties) of bases of S
OL
M
(
T
) depend on the properties
of mappings. Our first result shows that we can compute a finite basis
B
for S
OL
M
(
T
)
whenever
only uses vertical navigation in its pattern, and nested-relational DTDs in the
target, i.e., if it is a mapping from SM
nr
(
M
,
=). Essentially, a basis is what we get when we
try to compute a universal solution but cannot. The method for computing the basis will be
a modification of the algorithm described in
Section 13.4
.
⇓
SM
nr
(
Theorem14.16
Given a mapping
M ∈
⇓
,
=)
and a source tree T , one can compute
a basis
B
for
S
OL
M
(
T
)
in time single-exponential in the size of T . The size of
B
can be
exponential, but the size of each pattern
π
∈B
is polynomial in T .