Database Reference
In-Depth Information
sequence of states of
in the children of v . By the pigeon-hole principle there exist nodes
v i , v j , i < j , that are assigned the same state in this run. If we remove from T the subtrees
rooted in v i , v i +1 ,..., v j 1 , the resulting tree is also accepted by
A
A
and the homomorphism
h can be naturally adjusted.
The number of (maximal) yellow paths is at most N red + N green . Hence there are at most
2
yellow nodes. Since all blue nodes are siblings of nodes of other colors, the
number of (maximal) blue sequences of siblings is at most 2( N red + N green + N yellow )
π ·S
4
π ·
(
S
+ 1) and so N blue
4
π ·
(
S
+ 1)
S
. Altogether we have at most
2 nodes.
2
π
(
S
+1)(2
S
+1)
12
π ·S
We now can prove the NP upper bound. Let
π
and
S
be the given pattern and schema.
By Lemma 11.9 ,if
π
is satisfiable, there exists a homomorphism into a tree conforming
to
, with a polynomial support. To decide satisfiability first guess a polynomial support
together with a partial run of the UNFTA underlying
S
(a consistent labeling with states)
and a homomorphism. Verifying the homomorphism takes time polynomial in the size of
the formula and the support (which is also polynomial in the size of the formula). Verifying
that the support is actually a restriction of a tree conforming to
S
requires a consistency
check which amounts to checking if a given word is in the language defined by a given
NFA (checking correctness of the guessed labeling), and checking if the language defined
by a given UNFTA is non-empty (providing subtrees to be rooted in the yellow nodes).
Both these checks can be done in polynomial time.
To get NP-hardness, we do a standard 3CNF SAT reduction. In fact, we only use
patterns from
S
= j =1 Z j
Z j
Z j with
Π
(
, ) without variables. Take a formula
ψ
Z j ∈{
x 1 , x 2 ,..., x n , x 1 , x 2 ,..., x n }
. Consider a DTD D (without attributes)
r
x 1 x 2 ···
x n
C j
C j
Z j = x i }|{
Z j = x i }
x i →{
1
i
n
over the alphabet
. In the second rule, interpret each set as a
concatenation of all its elements. (The order in which the elements are concatenated does
not matter.)
The labels C j are intended to correspond to Z j
{
x 1 , x 2 ,..., x n , C 1 , C 2 ,..., C k }
Z j . Each tree conforming to D
encodes a valuation of all variables x i : for each x i it stores either all conjuncts made true
by assigning 1 to x i , or all conjuncts made true by assigning 0 to x i .
The
Z j
satisfiability of
ψ
is
equivalent
to the
satisfiability of
the pattern
r [ / C 1 , / C 2 ,..., / C k ] with respect to D .
11.2 XML schema mappings and their complexity
XML schema mappings resemble relational mappings a lot. Like before, each mapping
consists of a source schema, a target schema, and a set of source-to-target tuple generating
Search WWH ::




Custom Search