• min is such that $\min : \mathbb{N} \times \mathbb{N} \rightarrow \mathbb{N}$, $\forall (x, y) \in (\mathbb{N} \times \mathbb{N})$, $\min(x, y) = y$ if and only if $x \geq y$, and $\min(x, y) = x$ in other cases.
• length is defined as follows: $\mathrm{length} : S \rightarrow \mathbb{N}$ such that $\forall s \in S$, $\mathrm{length}(s) = x$, with $x$ the number of characters in $s$ and where $S$ is the set of all possible strings.
• lengthComSubString is defined as follows: $\mathrm{lengthComSubString} : S^2 \rightarrow \mathbb{N}$, $(s, s') \mapsto \sum_i \mathrm{length}(s_i)$, with $s_i \in S$ such that the four following conditions are fulfilled:
1. $s_i$ is a substring of both $s$ and $s'$
2. $s_i$ contains at least two characters
3. $s_i$ is maximal (i.e. there is no other string that fulfills the conditions and is longer)
4. the order in which the substrings appear in the two strings is preserved
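As an illustration of this second measure, let us consider two titles. Their similarity is computed as follows:

$$\mathrm{sim}_{ref\ strings}(\text{"The news"}, \text{"News"}) = \frac{\mathrm{lengthComSubString}(\text{"The news"}, \text{"News"})}{\min(\mathrm{length}(\text{"The news"}), \mathrm{length}(\text{"News"}))} = \frac{4}{4} = 1$$

Here the common substring is "news" (4 characters, ignoring case) and the shorter title has 4 characters.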
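The illustration above can be reproduced with a short Python sketch. It is only one possible reading of the definitions, not the authors' implementation. The function names are hypothetical. The common substrings are found with the standard-library `difflib.SequenceMatcher`, a recursive longest-match heuristic used here as an approximation of conditions 1 to 4. Lowercasing is an assumption, made so that "The news" and "News" share the four characters "news", and returning 0 when one string is empty is likewise an assumption.

```python
from difflib import SequenceMatcher


def length_com_substring(s1: str, s2: str, min_len: int = 2) -> int:
    """Sum the lengths of ordered common substrings of at least min_len characters."""
    # autojunk=False disables difflib's popularity heuristic, so no character is skipped.
    matcher = SequenceMatcher(None, s1, s2, autojunk=False)
    # Matching blocks are returned in increasing order in both strings,
    # which keeps the order of the substrings (condition 4).
    return sum(m.size for m in matcher.get_matching_blocks() if m.size >= min_len)


def sim_ref_strings(s1: str, s2: str, ignore_case: bool = True) -> float:
    """Common-substring length divided by the length of the shorter string."""
    if ignore_case:  # assumption: needed to obtain 4/4 for the example titles
        s1, s2 = s1.lower(), s2.lower()
    shorter = min(len(s1), len(s2))
    if shorter == 0:  # assumption: an empty string is similar to nothing
        return 0.0
    return length_com_substring(s1, s2) / shorter


print(sim_ref_strings("The news", "News"))         # 1.0
print(sim_ref_strings("The news", "News", False))  # 0.75
```

With a case-sensitive comparison only "ews" is shared, which gives 3/4 instead of 1. This is why the sketch exposes the `ignore_case` switch.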
Similarity of dates and numerical referents
To compare numerical referents and dates given as numerical values, we rely on a distance
and a threshold that represents the maximum amount by which two values may differ.
Definition 5.4.
Let $t$ be the threshold defined by the end user, $t \in \mathbb{R}^{+*}$. $(v_1, v_2) \in \mathbb{R}^2$ are two numerical values that must be compared.
The function $\mathrm{sim}_{ref\ num} : \mathbb{R}^2 \rightarrow [0, 1]$ is defined as follows:

$$\mathrm{sim}_{ref\ num}(v_1, v_2) = \begin{cases} 0 & \text{if } |v_1 - v_2| \geq t \\ 1 - \dfrac{|v_1 - v_2|}{t} & \text{otherwise} \end{cases}$$
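Definition 5.4 translates almost directly into code. The following Python sketch is a minimal illustration. The function name, the explicit `t` parameter, and the sample years are assumptions chosen for the example, not values from the text.

```python
def sim_ref_num(v1: float, v2: float, t: float) -> float:
    """Similarity in [0, 1] of two numerical values under a tolerance t > 0."""
    if t <= 0:
        # The definition requires a strictly positive threshold.
        raise ValueError("the threshold t must be strictly positive")
    gap = abs(v1 - v2)
    if gap >= t:
        return 0.0  # values differ by at least the tolerance
    return 1.0 - gap / t  # decreases linearly from 1 (equal values) towards 0


# Hypothetical dates given as years, with a tolerance of 4 years.
print(sim_ref_num(1998, 2000, t=4))  # 0.5
print(sim_ref_num(1998, 2002, t=4))  # 0.0
print(sim_ref_num(2000, 2000, t=4))  # 1.0
```

The similarity is 1 only for identical values and reaches 0 as soon as the gap equals the threshold, so a larger `t` makes the comparison more tolerant.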
5.3.1.3 Similarity regarding the context of the concepts
In order to compare the context in which the two concepts are expressed, we propose to
compare their immediate neighborhood. Intuitively, the similarity of two concepts
regarding their context is computed from the proportion of relations linked to the
concepts that have the same type and the proportion of relations that have different types.
Definition 5.5.
The similarity of a node $c_1$ of the graph $G_1$ and the node $c_2$ of the graph $G_2$, regarding their neighborhood, is given by the function $\mathrm{sim}_{context} : C^2 \rightarrow [0, 1]$ defined as follows.
Let $R_1$ (respectively $R_2$) be the set of relations neighboring the concept node $c_1$ (respectively $c_2$). We define $R'_1$ (respectively $R'_2$), the union of the set $R_1$ (resp. $R_2$) and the set containing the empty element, noted $\emptyset$.
Let $\mathcal{R}$ be a symmetric relation between the elements of $R'_1$ and $R'_2$ such that