Database Reference
In-Depth Information
inclusion-exclusion formula: all three queries Q , Q , and Q
Q are inversion free, and the
claim follows by induction. If, on the other hand, Q = Q 1 Q 2 ... and each Q i
is connected,
then it must have a root variable x i . Write Q
...) : clearly z is a root
variable, and it is also a separator variable because for any two unifiable atoms L 1 ,L 2 , z occurs on
the same position π L 1 ( 1 )
=∃
z.(Q 1 [
z/x 1 ]∨
Q 2 [
z/x 2 ]∨
π L 2 ( 1 ) . Thus, inversion free queries are R 6 -safe queries and, therefore,
in UCQ(P ) . The following proposition strengthen this observation:
=
Proposition 5.19 The following holds: UCQ(OBDD)
=
UCQ(OBDD,1)
=
“inversion free queries”.
The proposition says two things. On one hand, every inversion-free query admits an OBDD
whose size is linear in the size of the database: in fact, we will show that its width is 2 k
= O( 1 ) ,
where k is the total number of atoms in the hierarchical, inversion free expression of Q . Thus, the
width of the OBDD depends only on the query, not on the database instance, and therefore the size
of the OBDD is linear, O(n) , meaning that Q
UCQ(OBDD,1) . On the other hand, if a query
is not inversion-free, then the size of the smallest OBDD grows exponentially in the size of the
database. For example, the proposition implies that the lineage of Q J ( Example 4.7 ) has an OBDD
whose size is linear in that of the input database, while the lineage of Q V ( Example 4.8 ) does not
have polynomial-size OBDDs.
We give here the main intuition behind the positive result, namely that an inversion free
query has an OBDD of width 2 k . We illustrate with the example Q J : the reader can derive the
general case from this example. Start by writing Q J
= Q 1 Q 2 , where Q 1 = R(x 1 ), S(x 1 ,y 1 )
and Q 2 =
T(x 2 ), S(x 2 ,y 2 ) . Fix a database instance D . Then, the lineage Q 1 is read-once, and
therefore it admits an OBDD of width 1. For example, on a database with with four tuples,
R( 1 ), R( 2 ) and S( 1 , 1 ), S( 1 , 2 ), S( 2 , 3 ), S( 2 , 4 ) , its lineage is X 1 Y 1
X 2 Y 4 , and
its OBDD is shown in Figure 5.2 . In general, the OBDD examines the tuples S(i,j) in row-
major order; that is, for some arbitrary data instance D , the variable order of the OBDD is
R( 1 ), S( 1 , 1 ), S( 1 , 2 ),...,R( 2 ), S( 2 , 1 ), S( 2 , 2 ),... Next, we transform the OBDD such that
every path examines all the variables. For that, we must insert on the 0-edge from R( 1 )
dummy nodes S( 1 , 1 ), S( 1 , 2 ),... , and we must insert in the 0-edge from R( 2 ) the dummy
nodes S( 2 , 1 ), S( 2 , 2 ),... Thus, we have a “complete” OBDD for Q 1 of width 2. Similarly,
we obtain a complete OBDD for Q 2 of width 2, which reads the Boolean variables in the
same order: T( 1 ), S( 1 , 1 ), S( 1 , 2 ),...,T( 2 ), S( 2 , 1 ), S( 2 , 2 ),... We insert the missing variables
T( 1 ), T ( 2 ),... in the first OBDD, and the missing variables R( 1 ), R( 2 ),... in the second OBDD,
without increasing the width. Now, we can synthesize an OBDD for Q = Q 1 Q 2 of width
4, by using the property on OBDD synthesis mentioned above (see also [ Wegener , 2004 ]). Thus,
Q J admits an OBDD of size 4 n , where n is the total number of tuples in the database.
In general, we can synthesize the OBDD inductively on the structure of the query Q , provided
that the subqueries Q 1 ,Q 2 used the same variable order: this is possible for inversion-free queries be-
cause the variable order of the tuples in a relation R(x 1 ,x 2 ,...) is the lexicographic order, determined
by the attribute-order π R ( 1 ), π R ( 2 ),... If the query has an inversion, then the synthesis is no longer
X 1 Y 2
X 2 Y 3
Search WWH ::




Custom Search