Pattern-Based Schema Mapping and Query Answering in Peer-to-Peer XML Data Integration System - Advanced Database Query Systems

Database Reference

In-Depth Information

This discussion above shows that it is useful to analyze relationships between the query and func-

tional dependencies defined over the peer's schema. The analysis can influence the decision about the

propagation and merging modes.

Sufficient Condition for Merging Modes

It is quite obvious that the full merge is more expensive than the partial one. However, during full merge

more missing values can be discovered. Thus, full merge should be performed when there is a chance

to discover missing values. The following Theorem 1 states the sufficient condition when there is no

sense in applying full merge because no missing value can be discovered.

Theorem 1 . Let S ( x ) be a target schema, F ( z ) be a TP-XFD over S ( x ), where type ( F ) = type S ( x ) for some

x ∧ x . Let q be a query with qualifier φ( y ), y ∧ x , and I be an instance of S and Ans be an answer to q

received from a propagation. Then

q ( merge ( Ans , I )) = merge ( Ans , q ( I ))

(2)

holds if one of the following two conditions holds

(a) x ∧ y (i.e. the query qualifier constraints the value of the dependent path), or

(b) z ∧ y (i.e. the query qualifier constraints values of all determining paths).

Proof . The equality (2) does not hold if there are valuations ω 1 ∧ Ω Ans , ω 2 ∧ Ω I , such that ω 1 ( x ) = ∧ ,

ω 2 ( x ) ≠ ∧ , and ω 1 ( z ) = ω 2 ( z ) (see Proposition 1). Let us consider conditions ( a ) and ( b ):

1. Condition ( a ). Let x ∧ y . Let us assume that there is ω 1 ∧ Ω Ans , such that ω 1 ( x ) = ∧ . But then φ(ω 1 ( y) )

could not be true and this contradicts the assumption. Thus, the theorem holds.

2. Condition ( b ). Let z ∧ y . Let us assume that there are valuations ω 1 ∧ Ω Ans , ω 2 ∧ Ω I , such that ω 1 ( x )

= ∧ , ω 2 ( x ) ≠ ∧ , and ω 1 ( z ) = ω 2 ( z ). Then:

◦

Let φ(ω 2 ( y) ) = true . Then ω 2 must be also in Ω q ( I ) , and Ω q ( I ) can be used for discovering miss-

ing values instead of the whole Ω I . Thus the equality (2) holds.

◦

Let φ(ω 2 ( y) ) ≠ true . Because ω 1 and ω 2 must coincide on y , then also φ(ω 1 ( y) ) ≠ true and ω 1

cannot be in Ans that contradicts the assumption. Thus the equality (2) holds.

To illustrate application of the above theorem let us consider a query about John's data in peers P 2

and P 3 in Figure 3.

1. Let q be a query with the qualifier φ 2 ( x 2 ):= x 2 = “ John ” in the peer P 2 . There is also TP-XFD F 2 ( x 2 ):=

/ pubs / pub / author [ name = x 2 ]/ university specified over S 2 ( x 1 , x 2 , x 3 ). In force of Theorem 1 there is

no chance to discover any missing value of John's university. Indeed, if we obtain an answer with

university = ∧ , then the real value is either in the local answer q ( I 2 ) or it does not occur in I 2 at all.

So, in P 2 the partial merge should be performed. Performing the full merge in this case is pointless.

Let q be a query with the qualifier φ 3 ( x 1 ):= x 1 = “ John ” issued against the peer P 3 . There is TP-

XFD F 3 ( x 2 ):= / authors / author / paper [ title = x 2 ]/ year specified over S 3 ( x 1 , x 2 , x 3 ). Assumptions of

Search WWH ::

Custom Search

Home