The problem of relational data exchange - Foundations of Data Exchange

Database Reference

In-Depth Information

least we need to be able to determine whether a solution exists at all. This is the problem

of existence of solutions.

PROBLEM : OL E XISTENCE M .

INPUT:

A source instance S .

QUESTION :

Is there a solution for S under

M

?

We study the existence of solutions problem in Chapter 5 .Weshowin Section 5.2

that the problem is undecidable in general, but that it becomes decidable - and, actually,

tractable - for a relevant class of mappings: those with a weakly acyclic set of tgds, as

defined in Section 5.4 . In order to prove this, we show, in Section 5.3 , how to construct

solutions using the well-known chase procedure.

Notice that in the definition of the problem of existence of solutions the mapping

is

assumed to be fixed - that is, the complexity is measured in terms of the size of the source

instance. This corresponds to the data complexity of the problem. In Section 5.5 we also

study its combined complexity; that is, the complexity when both the source instance and

the mapping are given as input.

M

Materializing target instances

Once we know that solutions exist, we need to construct one. The problem, explained in

Chapter 1 , is that usually there is more than just one solution; in fact, as we just saw,

typically there are infinitely many of them. Thus, we need compute one that reflects ac-

curately the source data and does not contain too much redundant information. Intuitively,

this means that we want to compute solutions that are more general than any other solution.

The following example illustrates the notion of a solution being more general than others.

Example 4.3 Again we refer to the setting of Example 4.1 , and three possible solutions

T , T ,and T , shown on page 39 . Intuitively, it appears that T and T are less general

than T . This is because T assumes that the values that witness the existentially quantified

variables f# and arr , in the first st-tgd of

st , are the same, while T assumes that these

variables are witnessed by the constants AF 406 and 920, respectively. On the other hand,

solution T contains exactly what the specification requires. Thus, it seems natural to say

that one would like to materialize a solution like T rather than solution T or T ,as T is

more accurate with respect to S (under

Σ

)than T and T are.

M

More general solutions in data exchange are identified with universal solutions. We will

see in the following chapters that universal solutions have a number of good properties in

terms of representing other solutions, as well as answers to queries. In general, however, the

existence of solutions does not imply the existence of universal solutions, and the problem

of existence of universal solutions is undecidable. We show these results in Section 6.2 ,but

later, in Section 6.3 , we show that for the same class of mappings for which the existence of

solutions becomes tractable (that is, those with a weakly acyclic set of tgds), the unpleasant

problems just mentioned disappear. For this class of mappings, the existence of solutions

Foundations of Data Exchange

Search WWH ::

Custom Search

Home