Database Reference
In-Depth Information
time. Indeed, it suffices to check whether there is a homomorphism h : T
T such that
the size of the image of T under h is strictly less than the size of T . Then we replace T by
h ( T ), and iteratively continue this process until reaching a fixed-point.
The blocks method was also extended to the case when
Σ t consists of a set of egds. There
is an extra difficulty in this case. The property mentioned above, the bounded block-size
in the Gaifman graph of the canonical universal solution T of a source instance S ,isno
longer true in the presence of egds. This is because the chase, when applied to egds, can
equate nulls from different blocks, and thus, create blocks of nulls of larger and larger size.
This problem is solved by a surprising rigidity lemma stating the following. Let T be the
canonical universal solution for a source instance with respect to a set of st-tgds
Σ st ,and
let T be the target instance obtained from T by chasing with the egds in
Σ t .Theniftwo
in T , then the
nulls
1 and
2 in different blocks of T are replaced by the same null
is rigid .Thatis,if h : T
T is a homomorphism then h (
, and thus, T has
null
)=
the bounded block-size property if we treat those nulls as constants.
The situation is much more complex in the presence of tgds. This is because the canon-
ical universal solution T for a source instance S does not have the bounded block-size
property, and in addition, it is no longer true that equated nulls are rigid. A refined version
of the blocks method has been developed; it was used to show that computing cores of
universal solutions for mappings whose set of target dependencies consists of egds and a
weakly acyclic set of tgds can be done in polynomial time.
Theorem 6.18
Σ t con-
sists of a set of egds and a weakly acyclic set of tgds. There is a polynomial-time algorithm
that, for every source instance S, checks whether a solution for S exists, and if that is the
case, computes the core of the universal solutions for S.
Let
M
=(R s , R t ,
Σ st ,
Σ t ) be a fixed relational mapping, such that
The proof of this result, which is based on the notion of hypertree decompositions, is
technically quite involved and goes beyond the scope of this topic.
Summary
In summary, the core is a universal solution that has several good properties for data ex-
change purposes: it is unique up to isomorphism, it is the smallest universal solution, and
it can be computed in polynomial time for mappings with a weakly acyclic set of tgds.
However, the choice of a solution to materialize does not boil down exclusively to the
size of such a solution. After all, we materialize solutions to answer queries; in addition,
we should also consider the cost of building a solution. We have already seen that it is
costlier to build the core than the canonical universal solution (as building the latter is the
first step in building the core). In addition, we shall see in the next chapter that although the
core and the canonical universal solution are indistinguishable with respect to some basic
query answering tasks, the canonical universal solution behaves better than the core when
handling more expressive query languages.
We conclude that there is a tradeoff in deciding which solution - canonical universal or
core - to materialize:
Search WWH ::




Custom Search