Database Reference
In-Depth Information
This means that if the certain answer is to be processed further within the same frame-
work, based on tree patterns and TQL queries, our only concern should be its size.
What is then the complexity of finding a max-description, and how big can it be? In
particular, how big can the smallest of them, i.e., the core-description, be?
Let
|
T
|
be the number of nodes in T ,andlet
T
be
T ∈T |
T
|
. We show next that we
can compute a max-description of
T
in E XPTIME in general, and in P TIME if the number
of trees in
is fixed. Exercise 16.3 shows that even core descriptions can be exponential
in the number of trees in T .
T
Theorem 14.11
Let
T
=
{
T 1 ,..., T n
}
be a finite set of XML trees. A max-description of
is computable in time polynomial in T
n
n
T
.
Proof For simplicity assume that every node has a single attribute; extension to the gen-
eral case is straightforward.
The pattern we are going to construct is a sort of consistent product of T 1 ,..., T n .Pro-
ceed as follows. For the root, labeled with r , take the sequence (
),i.e.,these-
quence of roots of T 1 ,..., T n . Then iteratively, under every node ( v 1 ,..., v n ) put a new node
( w 1 ,..., w n ) for every sequence w 1 ,..., w n such that w i is a child of v i in T i ,andall w i 's are
labeled with the same letter
ε
,...,
ε
and put a fresh variable in the attribute
slot. If for some node v =( v 1 ,..., v n ) some v i is a leaf, v is a leaf as well.
Define A ( v 1 ,..., v n )=( a 1 ,..., a n ) where a i is the data value attached to the node v i .For
every node v such that A ( v )=( c ,..., c ) for a constant c , replace the variable in v with the
constant c . For the remaining nodes, whenever A ( v )= A ( w ), replace the variable in w with
the variable in v .
The constructed formula is clearly satisfied in every T i . Let us see that it is indeed a
max-description. Suppose that
σ
. Label the node
σ
π is satisfied in every T i .Let h i be a homomorphism from
π to T i . It is easy to see that h =( h 1 ,..., h n ) is a homomorphism from
π to
.
The complexity of the algorithm is polynomial in the size of the output pattern, which
is bounded by
π
T
n
n
n
i =1
(by the inequality between the arithmetic and the geo-
metric means for nonnegative numbers).
| T i |≤
14.3 Certain answers for TQL queries
When we try to compute certain answers to a query Q over a set
is usually
infinite. Hence, we need a finite representation of it. Recall also that finiteness guarantees
existence of max-descriptions. Such a finite representation comes in the form of a basis
T
of trees,
T
B
of
, which is a set of patterns. Recall that each pattern can be viewed as a tree, so when
we write
T
B ⊆T
, we mean that for every
π ∈B
, the tree associated to
π
is in
T
.
Definition 14.12 A basis for a set of trees
T
is a set of patterns
B ⊆T
such that each
pattern in
T
is a model of some pattern from
B
:
T ⊆
{
Mod(
π
)
| π ∈B}
.
Search WWH ::




Custom Search