Database Reference
In-Depth Information
lb
[
i
]. Moreover, since
D
is fixed, it is clear that the algorithm runs in
O
(
m
)time,
where
m
is the number of location steps in
q
.Thuswehavethefollowing.
Theorem 2.
SAT(XP
{↓,↓
∗
,→
∗
,←
∗
}
)
is solvable in linear time under fixed DTDs.
5 Using Upward Axes under Fixed DTDs
We have shown that under fixed DTDs SAT(XP
{↓,↓
∗
,→
+
,←
+
}
) is in PTIME but
SAT(XP
{↓,↑}
) is NP-complete. In this section, we show sucient conditions un-
der which SAT(XP
{↓,↓
∗
,↑,↑
∗
,→
+
,←
+
}
) is in PTIME under fixed DTDs.
Suppose first that a DTD
D
is fixed, nonrecursive, and no-star. To check
whether
q ∈
XP
{↓,↓
∗
,↑,↑
∗
,→
+
,←
+
}
is satisfiable under
D
, it suces to construct
the set
T
of trees valid against
D
and check whether there is a tree
t ∈ T
on which
the result of
q
is nonempty. Since the size of
T
is fixed, SAT(XP
{↓,↓
∗
,↑,↑
∗
,→
+
,←
+
}
)
is clearly in PTIME under fixed, nonrecursive, no-star DTDs.
We next show a polynomial-time algorithm for deciding if a query
q ∈
XP
{↓,↓
∗
,↑,↑
∗
,→
+
,←
+
}
is satisfiable under a fixed nonrecursive DTD
D
.Inthis
case, there may be infinite trees valid against
D
since
D
may contain '
'opera-
tors. Thus we have to construct a set
T
of valid trees carefully so that
T
is finite
and contains valid trees enough to check the satisfiability of
q
under
D
.
Let
D
=(
d, s
) be a fixed nonrecursive DTD and
q ∈
∗
XP
{↓,↓
∗
,↑,↑
∗
,→
+
,←
+
}
be a query. In short, our algorithm decides whether
q
is satisfiable under
D
,as
follows.
1. For a label
a
and an extracted regular expression
r ∈ E
(
d
(
a
)), we use a
set
T
(
a, r
) of trees valid against DTD (
d, a
) such that
T
(
a, r
) is finite but
covers valid trees enough to check the satisfiability of
q
under (
d, a
). For any
t ∈ T
(
a, r
), each node
n
of
t
has a
type
denoted
type
(
n
), where
type
(
n
)
∈
E
(
d
(
l
(
n
))) (in particular,
type
(
n
)=
r
if
n
is the root of
t
), and the children
of
n
match
type
(
n
).
type
(
n
) is used to handle
+
axes. We assume
that every tree in
T
(
a, r
) is unordered; the order among sibling nodes of a
node
n
is managed by the Glushkov automaton of
type
(
n
).
+
and
→
←
2. Let
T
(
s
)=
r∈E
(
d
(
s
))
T
(
s, r
). If there is a tree
t ∈ T
(
s
) on which the result
of
q
is nonempty, then the algorithm returns “yes” (i.e.,
q
is satisfiable),
otherwise it returns “no”.
Example 1.
Let
D
=(
d, s
) be a DTD, where
d
(
s
)=
a
(
b|c
),
d
(
b
)=
e|f
,
d
(
a
)=
d
(
c
)=
d
(
e
)=
d
(
f
)=
.Wehave
E
(
d
(
s
)) =
∪
T
(
s, ac
).
T
(
s
) consists of three trees (Fig. 3). In this case,
T
(
s
)coversallthe
trees valid against
D
. As defined later, the label of each node is superscripted.
{ab, ac}
,thus
T
(
s
)=
T
(
s, ab
)
To define
T
(
a, r
) formally, we need some definitions. If a superscripted label
a
i
is a descendant of a '
' operator in the tree representation of
r
#
,thenwesay
that
a
i
is
starred
.By
sym
∗
(
r
#
), we mean the set of superscripted starred labels
in
r
#
. For example, if
r
#
=
a
3
b
4
(
c
7
∗
|b
8
)
∗
,then
sym
∗
(
r
#
)=
{c
7
,b
8
.Foranode
n
and trees
t
1
, ··· ,t
m
,by
n
(
t
1
, ··· ,t
m
) we mean the tree rooted at
n
with
m
}