Maximize: $0.88x_1 + 0.75x_2 + 0.7x_3 + 0.54x_4$
Subject to:
$x_2 + x_4 \leq 1$ (10)
$x_1 + x_3 \leq 1$ (11)
$x_1 + x_2 \leq 1$ (12)
$x_2 + x_3 \leq 1$ (13)
The a-priori confidence values of the potential correspondences are factored in
as coefficients of the objective function. Here, the ILP constraint (9) corresponds
to ground formula (5), and ILP constraints (10), (11), and (12) correspond
to the coherence ground formulas (6), (7), and (8), respectively. An optimal
solution to the ILP consists of the variables $x_1$ and $x_4$, corresponding to the
correct alignment $\{m_c(b_1, b_2), m_c(d_1, e_2)\}$. Compare this with the alignment
$\{map_c(b_1, b_2), map_c(c_1, e_2), map_p(p_1, p_2)\}$, which would be the outcome without
coherence constraints.
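The small ILP above can be checked by hand or by exhaustive search. The following is a minimal sketch, not the solver used by the authors: it enumerates all 0/1 assignments of the four variables, treating each constraint as "at most one of the two variables may be selected" ($x_i + x_j \leq 1$). The names `confidences`, `conflicts`, and `solve` are illustrative assumptions.

```python
from itertools import product

# Objective coefficients: the a-priori confidence values of x1..x4.
confidences = [0.88, 0.75, 0.70, 0.54]

# Each pair (i, j) encodes one constraint x_i + x_j <= 1 (0-based indices):
# x2 + x4, x1 + x3, x1 + x2, x2 + x3.
conflicts = [(1, 3), (0, 2), (0, 1), (1, 2)]


def solve(confidences, conflicts):
    """Brute-force search over all binary assignments (fine for tiny ILPs only)."""
    best_value, best_assignment = float("-inf"), None
    for assignment in product((0, 1), repeat=len(confidences)):
        # Skip assignments that select both variables of a conflicting pair.
        if any(assignment[i] + assignment[j] > 1 for i, j in conflicts):
            continue
        value = sum(c * x for c, x in zip(confidences, assignment))
        if value > best_value:
            best_value, best_assignment = value, assignment
    return best_value, best_assignment


best_value, best_assignment = solve(confidences, conflicts)
selected = [f"x{i + 1}" for i, x in enumerate(best_assignment) if x]
print(selected, round(best_value, 2))  # ['x1', 'x4'] 1.42
```

Exhaustive enumeration grows exponentially with the number of candidate correspondences, so it only serves to verify toy instances such as this one; realistic problems would be handed to a dedicated ILP solver.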
6 Markov Logic and Object Reconciliation
We are primarily concerned with the scenario where both A-Boxes are described
in terms of the same T-Box. This is a reasonable assumption if we want to
integrate information extraction projects with a limited set of classes and
relations. In this case, it is often straightforward to align the ontologies of the
involved knowledge bases and to exploit these links for improving the alignment
between individuals. However, for open information extraction projects, where
the number of relations is not bounded and essentially every surface form could
correspond to a particular relation, this approach is not feasible and has to be
replaced by one that aligns relations and individuals jointly. We will not
discuss this scenario here.
Instead, we present an approach that does not rely on specific types of axioms
or a set of predefined rules but computes the alignment by maximizing the
similarity of the two knowledge bases given the alignment subject to a set of
constraints. Our method factors in a-priori confidence values that quantify the
degree of trust one has in the correctness of the object correspondences based
on lexical properties. The resulting similarity measure is used to determine an
instance alignment that induces the highest agreement of object assertions in
$\mathcal{A}_1$ and $\mathcal{A}_2$ with respect to $\mathcal{T}$.
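To make the notion of an a-priori confidence value concrete, here is a minimal sketch of one way such a value could be derived from lexical properties. The text does not say which lexical measure is used; normalized Levenshtein similarity over lower-cased labels is an assumption chosen purely for illustration, and the function names `levenshtein` and `a_priori_confidence` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def a_priori_confidence(label_1: str, label_2: str) -> float:
    """Assumed measure: 1 minus the edit distance normalized by the longer label."""
    a, b = label_1.lower(), label_2.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty labels are treated as identical
    return 1.0 - levenshtein(a, b) / longest


# Hypothetical labels of two individuals, one from each A-Box.
print(a_priori_confidence("Kitten", "sitting"))
```

Values computed this way lie in [0, 1] and could serve as objective coefficients in an ILP like the one in the previous section, with identical labels receiving confidence 1.0.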
6.1 Problem Representation
The current instance matching configuration leverages terminological structure
and combines it with lexical similarity measures. The approach is presented in
more detail in [75]. The alignment system uses one T-Box $\mathcal{T}$ but two different
A-Boxes $\mathcal{A}_1 \in \mathcal{O}_1$ and $\mathcal{A}_2 \in \mathcal{O}_2$. In cases with two different T-Boxes the T-Box
 