Mapping composition - Foundations of Data Exchange

Database Reference

In-Depth Information

For relational mappings we shall follow the second approach. We shall develop a natural

extension of st-tgds that has precisely the expressive power needed to specify compo-

sitions of st-tgd mappings. In contrast, for XML mappings we shall see that any such

formalism would be undecidable, and we shall switch to the first approach, obtaining a

restricted class of mappings closed under composition.

Since our search for the right formalism is guided by the complexity analysis of the com-

position problem, we need to define it properly. The complexity of the composition of map-

pings

M 12 =(

S 1 ,

S 2 ,

M 23 =(

S 2 ,

S 3 ,

Σ 12 ),

Σ 23 ), between relational or XML schemas,

is the complexity of the following problem:

P ROBLEM : C OMPOSITION (

M 12 ,

M 23 )

I NPUT :

Instance T 1 of

S 1 and T 3 of

S 3

Q UESTION :

( T 1 , T 3 )

∈ M 12 ◦ M 23

In the following section we shall see how knowing the complexity of composition can

help in finding a formalism for composing relational mappings defined by st-tgds.

19.2 Complexity of relational composition

Example 19.2 shows that in order to express the composition of mappings specified by

st-tgds, one has to use a language more expressive than st-tgds. However, the example

gives little information about what the right language for composition is. In fact, the com-

position of mappings

M 12 and

M 23 in this example can be defined in first-order logic (see

Exercise 22.3 ):

∀

∃

∀

c ( Takes ( n , c )

→ Enrolled ( y , c )) ,

which may lead to the conclusion that FO is a good alternative to define the compo-

sition of mappings specified by st-tgds. However, a complexity argument shows that

this conclusion is incorrect. To show this, we first need to know the complexity of

C OMPOSITION (

M 12 ,

M 23 ). The following theorem shows that it is always in NP, and

can be NP-hard.

Theorem 19.3 (Complexity of composition)

For every pair of mappings

M 12 ,

M 23 spec-

ified by st-tgds, C OMPOSITION (

M 12 ,

M 23 ) is in NP . Moreover, there exist mappings

23 specified by st-tgds such that C OMPOSITION (

12 ,

23 ) is NP -complete.

and

Proof The membership of C OMPOSITION (

M 23 ) in NP can be proved by showing

that there exists a polynomial p (that depends on

M 12 ,

M 12 and

M 23 ) such that if ( S 1 , S 3 )

∈

M 12 ◦M 23 , then there exists an instance S 2 satisfying that ( S 1 , S 2 )

∈M 12 , ( S 2 , S 3 )

∈M 23

and

). We leave this proof for the reader (see Exercise 22.3 ), and we

focus here on showing that C OMPOSITION (

S 2 ≤

p (

S 1

S 3

M 23 ) can be NP-hard.

Let R 1 be a schema consisting of a unary relation node and a binary relation edge , R 2

M 12 ,

Foundations of Data Exchange

Search WWH ::

Custom Search

Home