Database Reference
In-Depth Information
igation. The algorithm could not, however, be straightforwardly extended to handle reg-
ular schemas and horizontal navigation.
Then, in Section 12.4 , we showed how to incorporate regular schemas, extending the
polynomial-time algorithm to handle mappings in SM(
+ , ,
).
In this section we complete the journey by providing a polynomial-time solution-
building algorithm for arbitrary mappings: we refine our techniques to cover mappings
using
+ . For such patterns Lemma 12.19 no longer holds for arbitrary kinds;
for instance, the schema r
,
and
,
where the port admits any forest that is a sequence of b -nodes with no children. For each
d = 1 , 2 ,..., n there is a T d in L ( K ) that satisfies
ab ,where b has a single attribute, and a kind K = r a ,
( d )= r a
b ( d ) , but there is no single
π
tree in L ( K ) satisfying
( d ) for all d .
What is wrong with our definition of kind? Recall that the intuition behind kinds is to
specify completely the areas of the tree that are unique and cannot be pumped. The reason
why there is no way to combine the trees T d above is that values in the leftmost b -node are
different, and they are different because the kind does not specify this data value. But this
violates the idea behind the kinds: in the presence of sibling order, the leftmost b -node is
clearly a unique node!
Thus for expressive patterns we cannot simply rely on SCCs of the UNFTA underlying
the schema, when distinguishing between the unique and pumpable node; we also need to
think of nodes that can be uniquely selected by patterns. This leads to further difficulties.
In a tree conforming to the schema r
π
ab ,every b -node can be specified uniquely with a
pattern of the form r a
b ( x ) . Does this mean that we can give no bound
on the size of schemas? The answer is that we can, but the bound needs to depend on the
patterns used in the mapping. Intuitively, a pattern with n nodes can only specify uniquely
n
b
b
→···→
1leftmost b -nodes. To guarantee an analog of Lemma 12.19 , the kinds need to specify
large enough areas around each port (the phenomenon described above takes place also for
context ports and tree ports, and the child axis). We formalize this intuition by the notion
of a margin.
Definition 12.22 For m N ,arunλ of an UNFTA over a kind K has margins of size
m if there exist witnessing runs of the horizontal automata such that for each port u ,the
following hold:
if u is a tree port, there is a
-path v m , v m +1 ,..., v 0 where
λ
stays in the same SCC
and the only port on the path is v 0 = u ;
-path v m , v m +1 ,..., v m where
if u is a context port, there is a
λ
stays in the same SCC
and the only port on the path is v 0 = u ;
if u is a q , q -port with q , q in an SCC X of some horizontal automaton, there is a se-
quence of consecutive siblings v m , v m +1 ,..., v m where the horizontal run stays in the
same SCC and the only port in the sequence is v 0 = u .
Before we show how to combine trees, let us observe that the set of target instances can
be covered by small kinds with margins of given length.
Search WWH ::




Custom Search