Databases Reference
In-Depth Information
Proof. The correctness is immediately discernible from the construction of the MD algorithm and
Definition 5.10 .As F is assumed to be computable in time linear in the number of F 's parameters,
generating M takes time O( max (n, n ) 2 m) . Thus, the overall complexity of the MD algorithm is
O( max (n, n ) 2 m + ) .
5.3.2 MATRIX-DIRECT ALGORITHM WITH BOUNDING
We now extend the MD algorithm to deal with non-commutative functions.
Consider a set of similarity matrices M ( 1 ) ,...,M (m)
over a pair of schemas S ,
Definition 5.12
f,F
f ,F
f ,F
S , and two sets of local and global aggregators
and
. We say that
dominates
f,F
f ,F f,F
) if, for every matching σ from S to S ,
on M ( 1 ) ,...,M (m)
(denoted as
we have:
f ,F
f,F
(σ )
(σ ).
(5.4)
f ,F
Likewise, if Eq. 5.4 holds for all possible sets of similarity matrices, then we say that
strongly
f,F
dominates
.
Consider
a
schema
metamatching
problem
defined
by
a
set
of
similarity
matrices
f,F
M ( 1 ) ,...,M (m)
and a set of local and global aggregators
that do not
commute on
M ( 1 ) ,...,M (m) . Suppose that there exists a pair of functions
h, H
that (i) do commute on
f,F
M ( 1 ) ,...,M (m) , and (ii) dominate
on these matrices. Corollary 5.13 , which follows im-
mediately from the definition of the MD algorithm, defines a simple property of this algorithm that
provides some intuition for the subsequent construction steps.
Corollary 5.13
Given a set of schema matchers and a pair of local and global aggregators
h, H
commuting on M ( 1 ) ,...,M (m) , the top-K result of the MD algorithm with respect to
h, H
is a correct
f,F
top-K aggregation with respect to any set of local and global aggregators
, such that both
h, H
f,F
f,F h, H
hold on M ( 1 ) ,...,M (m) .
and
In general, nothing prevents Corollary 5.13 from being realized. To illustrate, consider the fol-
lowing set of four real-valued functions: f(x)
x 2 , F (x)
x 2 / 2 , H (x)
x . While
f and F do not commute on reals ( F(f(x)) = x 2 / 2 and f(F(x)) = x 2 / 4), the functions h and H
are strongly commutative ( H (h(x))
=
=
x/ 2 , h(x)
=
=
x 2 / 2), and we have H (h(x))
F(f(x)) . Corol-
lary 5.13 provides us with some useful intuitions. Consider a schema metamatching problem defined
by a set of similarity matrices M ( 1 ) ,...,M (m)
=
h(H (x))
=
=
f,F
that do not
commute on M ( 1 ) ,...,M (m) . It is worth noting that we allow different matchers to use different
local aggregators and in such cases,
and local and global aggregators
f,F
is (trivially) defined not to commute on M ( 1 ) ,...,M (m) .
that do commute on M ( 1 ) ,...,M (m)
Suppose there exists a pair of functions
h, H
and dominate
Search WWH ::




Custom Search