Inverting schema mappings - Foundations of Data Exchange

Database Reference

In-Depth Information

have that ( S 3 , T 1 )

and, therefore T 1 ∈

S OL M ( S 3 ). Thus, from the fact that T 1 is a

witness for S 1 under

S OL M ( S 3 ). But this leads to

a contradiction since the instance of R 2 consisting of fact E ( n ),where n

, we conclude that S OL M ( S 1 )

⊆

= n is a null

value, is a solution for S 1 under

(since null

values n , n are treated as constants). This concludes the proof of the proposition.

, but it is not a solution for S 3 under

To distinguish handling nulls from handling constants, we shall consider closing mappings

under homomorphisms. The intuition behind this approach is that nulls are intended to

represent unknown data and, thus, it should be possible to replace them by arbitrary val-

ues. Formally, given a mapping

from a schema R 1 to a schema R 2 ,define e (

),the

homomorphic extension of

, as the mapping:

there exists ( S , T )

{

( S , T )

∈M

and

homomorphisms from S to S and from T to T

}

Thus, for a mapping

that has nulls in source and target instances, one does not have to

consider

) as the mapping to deal with when exchanging data, and computing

mapping operators, since e ( M ) treats nulls in a meaningful way.

but e (

→

Example 20.41 Assume that

B ( x ),andlet S 1

be a source instance consisting of fact A ( n ),where n is a null value. What should be the

solutions for S 1 under

is a mapping specified by tgd A ( x )

? Given that null values represent missing information, from the

fact that A ( n ) holds in S 1 , we can infer that there exists a tuple in table A in this instance,

but we cannot infer what this value is. Thus, if T is a solution for S 1 under

, then one

should be able to infer from the information in T that there exists a tuple in table B .In

particular, if T is an instance of R 2 containing a fact B ( a ),where a is either a constant or

a null value, then T should be a solution for S 1 under

. This intuition is correctly taken

into consideration in the definition of e (

),aswehavethat:

T is an instance of R 2 and B T

S OL e ( M ) ( S 1 )=

{

= 0

}

Now let S 2 be a source instance consisting of facts A ( n 1 ) and A ( n 2 ),where n 1 and n 2 are

distinct null values. What should be the solutions for S 2 under

? Given that one cannot

infer from the name of a null value any information about the constant value represented by

it, one cannot infer that two distinct null values represent two distinct constants. Thus, since

S 2 is given by facts A ( n 1 ) and A ( n 2 ), we only know that there exists a tuple in table A in this

instance and, thus, it should be the case that S OL e ( M ) ( S 2 )=S OL e ( M ) ( S 1 ). The intuition

behind the definition of e (

) takes this into consideration, as the space of solutions for

) is constructed by considering an instance S of R 1 such that

there exists a homomorphism from S to S . In fact, if two instances are homomorphically

equivalent, like S 1 and S 2 , then they have the same space of solutions under e (

an instance S under e (

The following theorem shows that with this new semantics, one can avoid anomalies such

as the one shown in Proposition 20.40 . In fact, the theorem shows that maximum recoveries

Foundations of Data Exchange

Search WWH ::

Custom Search

Home