Databases Reference
In-Depth Information
∈
Matching Based on Exemplars.
The instances associated with an exemplar
e
E
in
step 7 of Algorithm 1 are stored in a list
L
e
sorted in descending order with respect to
the distance to
e
.Let
e
m
be the elements of the list
L
e
. The goal of matching an
instance
s
from a source knowledge base to a target knowledge base w.r.t. a metric
m
is
to find all instances
t
of the target knowledge source such that
m
(
s
λ
e
1
...λ
is
a given threshold. LIMES achieves this goal by using the matching algorithm based on
exemplars shown in Algorithm 2.
,
t
)
≤ θ
,where
θ
Algorithm 2 .
LIMES' Matching algorithm
Require:
Set of exemplars
E
Require:
Instance
s
∈
S
Require:
Metric m
Require:
threshold
θ
1.
M
=∅
for
e
∈|
E
|
do
if
m
(
s
,
e
)
≤θ
then
2.
M
=
M
∪{
e
}
for
i
=
1
...|
L
e
|
do
if
(
m
(
s
,
e
)
−
m
(
e
,λ
e
i
))
≤θ
then
if
m
(
s
,λ
i
)
≤θ
then
3.
M
=
M
∪{
e
(
s
,λ
e
i
}
)
end if
else
break
end if
end for
end if
end for
return
M
LIMES only carries out a comparison when the approximation of the distance is less
than the threshold. Moreover, it terminates the similarity computation for an exemplar
e
as soon as the first
λ
e
is found such that the lower bound of the distance is larger than
θ
.
,
−
,λ
e
>θ
This break can be carried out because the list
L
e
is sorted, i.e., if
m
(
s
e
)
m
(
e
i
)
holds for the
i
th
λ
e
j
∈
element of
L
e
, then the same inequality holds for all
L
e
with
j
>
i
. In the worst case, LIMES' matching algorithm has the time complexity
O
(
|
S
||
T
|
),
leading to a total worst time complexity of
O
((
), which is larger than that of
brute force approaches. However, as the results displayed in Figure 18 show, a correct
parameterization of LIMES leads to significantly smaller numbers of comparisons and
runtimes.
|
E
|+|
S
|
)
|
T
|
3
Algorithm
HR
5.5
The
Let
S
resp.
T
be the source and target of a Link Discovery task. One of the key
ideas behind time-e
cient Link Discovery algorithms
A
is to reduce the number of