Database Reference
In-Depth Information
least we need to be able to determine whether a solution exists at all. This is the problem
of existence of solutions.
PROBLEM
:
OL
E
XISTENCE
M
.
INPUT:
A source instance
S
.
QUESTION
:
Is there a solution for
S
under
M
?
We study the existence of solutions problem in
Chapter 5
.Weshowin
Section 5.2
that the problem is undecidable in general, but that it becomes decidable - and, actually,
tractable - for a relevant class of mappings: those with a
weakly acyclic
set of tgds, as
defined in
Section 5.4
. In order to prove this, we show, in
Section 5.3
, how to construct
solutions using the well-known
chase
procedure.
Notice that in the definition of the problem of existence of solutions the mapping
is
assumed to be
fixed
- that is, the complexity is measured in terms of the size of the source
instance. This corresponds to the
data complexity
of the problem. In
Section 5.5
we also
study its combined complexity; that is, the complexity when both the source instance and
the mapping are given as input.
M
Materializing target instances
Once we know that solutions exist, we need to construct one. The problem, explained in
Chapter 1
, is that usually there is more than just one solution; in fact, as we just saw,
typically there are infinitely many of them. Thus, we need compute one that reflects ac-
curately the source data and does not contain too much redundant information. Intuitively,
this means that we want to compute solutions that are more
general
than any other solution.
The following example illustrates the notion of a solution being more general than others.
Example 4.3 Again we refer to the setting of
Example 4.1
, and three possible solutions
T
,
T
,and
T
, shown on page
39
. Intuitively, it appears that
T
and
T
are less general
than
T
. This is because
T
assumes that the values that witness the existentially quantified
variables
f#
and
arr
, in the first st-tgd of
st
, are the same, while
T
assumes that these
variables are witnessed by the constants
AF
406 and 920, respectively. On the other hand,
solution
T
contains exactly what the specification requires. Thus, it seems natural to say
that one would like to materialize a solution like
T
rather than solution
T
or
T
,as
T
is
more accurate with respect to
S
(under
Σ
)than
T
and
T
are.
M
More general solutions in data exchange are identified with
universal
solutions. We will
see in the following chapters that universal solutions have a number of good properties in
terms of representing other solutions, as well as answers to queries. In general, however, the
existence of solutions does not imply the existence of universal solutions, and the problem
of existence of universal solutions is undecidable. We show these results in
Section 6.2
,but
later, in
Section 6.3
, we show that for the same class of mappings for which the existence of
solutions becomes tractable (that is, those with a weakly acyclic set of tgds), the unpleasant
problems just mentioned disappear. For this class of mappings, the existence of solutions