Satisfiability of Simple XPath Fragments under 3/7/2016Fixed DTDs - Advances in Databases

Database Reference

In-Depth Information

lb [ i ]. Moreover, since D is fixed, it is clear that the algorithm runs in O ( m )time,

where m is the number of location steps in q .Thuswehavethefollowing.

Theorem 2. SAT(XP {↓,↓ ∗ ,→ ∗ ,← ∗ } ) is solvable in linear time under fixed DTDs.

5 Using Upward Axes under Fixed DTDs

We have shown that under fixed DTDs SAT(XP {↓,↓ ∗ ,→ + ,← + } ) is in PTIME but

SAT(XP {↓,↑} ) is NP-complete. In this section, we show sucient conditions un-

der which SAT(XP {↓,↓ ∗ ,↑,↑ ∗ ,→ + ,← + } ) is in PTIME under fixed DTDs.

Suppose first that a DTD D is fixed, nonrecursive, and no-star. To check

whether q ∈

XP {↓,↓ ∗ ,↑,↑ ∗ ,→ + ,← + } is satisfiable under D , it suces to construct

the set T of trees valid against D and check whether there is a tree t ∈ T on which

the result of q is nonempty. Since the size of T is fixed, SAT(XP {↓,↓ ∗ ,↑,↑ ∗ ,→ + ,← + } )

is clearly in PTIME under fixed, nonrecursive, no-star DTDs.

We next show a polynomial-time algorithm for deciding if a query q ∈

XP {↓,↓ ∗ ,↑,↑ ∗ ,→ + ,← + } is satisfiable under a fixed nonrecursive DTD D .Inthis

case, there may be infinite trees valid against D since D may contain '

'opera-

tors. Thus we have to construct a set T of valid trees carefully so that T is finite

and contains valid trees enough to check the satisfiability of q under D .

Let D =( d, s ) be a fixed nonrecursive DTD and q ∈

∗

XP {↓,↓ ∗ ,↑,↑ ∗ ,→ + ,← + }

be a query. In short, our algorithm decides whether q is satisfiable under D ,as

follows.

1. For a label a and an extracted regular expression r ∈ E ( d ( a )), we use a

set T ( a, r ) of trees valid against DTD ( d, a ) such that T ( a, r ) is finite but

covers valid trees enough to check the satisfiability of q under ( d, a ). For any

t ∈ T ( a, r ), each node n of t has a type denoted type ( n ), where type ( n )

∈

E ( d ( l ( n ))) (in particular, type ( n )= r if n is the root of t ), and the children

of n match type ( n ). type ( n ) is used to handle

+ axes. We assume

that every tree in T ( a, r ) is unordered; the order among sibling nodes of a

node n is managed by the Glushkov automaton of type ( n ).

+ and

→

←

2. Let T ( s )= r∈E ( d ( s )) T ( s, r ). If there is a tree t ∈ T ( s ) on which the result

of q is nonempty, then the algorithm returns “yes” (i.e., q is satisfiable),

otherwise it returns “no”.

Example 1. Let D =( d, s ) be a DTD, where d ( s )= a ( b|c ), d ( b )= e|f , d ( a )=

d ( c )= d ( e )= d ( f )= .Wehave E ( d ( s )) =

∪

T ( s, ac ). T ( s ) consists of three trees (Fig. 3). In this case, T ( s )coversallthe

trees valid against D . As defined later, the label of each node is superscripted.

{ab, ac}

,thus T ( s )= T ( s, ab )

To define T ( a, r ) formally, we need some definitions. If a superscripted label a i

is a descendant of a '

' operator in the tree representation of r # ,thenwesay

that a i is starred .By sym ∗ ( r # ), we mean the set of superscripted starred labels

in r # . For example, if r # = a 3 b 4 ( c 7

∗

|b 8 ) ∗ ,then sym ∗ ( r # )=

{c 7 ,b 8

.Foranode

n and trees t 1 , ··· ,t m ,by n ( t 1 , ··· ,t m ) we mean the tree rooted at n with m

}

Advances in Databases

Search WWH ::

Custom Search

Home