Building solutions - Foundations of Data Exchange

Database Reference

In-Depth Information

two sections we show how to overcome them by refining the notion of kind, essentially

without modifying the main algorithm. In Section 12.4 we deal with regular schemas under

an additional assumption that the child axis is forbidden (under regular schemas it also

causes problems), and in Section 12.5 we extend our methods to cover the child axis and

the sibling order.

12.4 The algorithm for regular schemas

In the previous section we showed the algorithm for mappings in SM nr (

), i.e., those

with nested-relational DTDs and downward navigation. We also indicated the difficulties

posed by extending the algorithm to both regular schemas and horizontal navigation. In this

section we handle the first problem and develop solution-building techniques for regular

schemas (while still staying with downward navigation, in fact, for now just the descendant

relation).

For nested-relational DTDs, building a solution required manipulating contexts and

forests in a way consistent with the DTD. For regular schemas it is most convenient to de-

scribe contexts and forests in terms of existing runs of the UNFTA underlying the schema.

⇓

∼

Definition 12.16 Let p , p , p be states of an UNFTA

,andlet q , q be states of one

's horizontal automata. A tree T is a p-tree if it admits a run of

that assigns

p to the root. A context C is a p , p -context if it admits a run of

which assigns the

)and p to the

root. More generally, a multicontext M is a p 1 , p 2 ,..., p n -multicontext if it has n

state p to the port of C (formally, we add a transition

( p ,

◦

−

1 ports

u 1 , u 2 ,..., u n − 1 , each of them a leaf port, and it admits a run of

which assigns p i to u i

1, and p n to the root. A forest F = S 1 S 2 ... S n is a q , q -forest if and

only if there are states p 1 , p 2 ,..., p m of

for i = 1 , 2 ,..., n

−

such that S i is a p i -tree and there is a run of the

corresponding horizontal automaton over p 1 p 2 ... p m starting in q and ending in q .

Note that for the initial state q I ,thesetof q I -trees is exactly L (

).Moreover,if T is a

p -tree, and C is a p , p -context, then C

T is a p -tree. Indeed, the run over C

T is obtained

by substituting the run over T in the run over C . Similarly, for a p , p -context D , D · C is a

p , p -context, and for a q , q -forest F and a q , q -forest G , F + G is a q , q -forest. If M is

a p 1 , p 2 ,..., p n -multicontext and T i is a p i -tree, M ( T 1 , T 2 ,..., T n − 1 ) is a p n -tree.

It will be convenient to assume that all states of A are productive , in the sense that for

each state p there exists a p -tree, and that all states of the horizontal automata are reachable.

This can be guaranteed by polynomial preprocessing.

As we learned in Section 12.3 , building solutions requires distinguishing the areas of the

schema that are unique from those that can be repeated, or pumped . For regular schemas

the areas that can be pumped are associated with strongly connected components of the

underlying graphs. An NFA

=( Q ,

, q 0 ,

, F ) can be seen as a graph G B =( Q , E ),

where ( p , q )

∈

E if ( p ,

, q )

∈ δ

for some

σ ∈ Γ

. With an UNFTA

=( Q ,

, F ) we

can also associate a graph: let G A =( Q , E ),where( p , p )

E iff Q ∗ pQ ∗ ∩ δ

( p ,

∈

) is

Foundations of Data Exchange

Search WWH ::

Custom Search

Home