Information Technology Reference
In-Depth Information
min is such that
\[ \min : \mathbb{N} \times \mathbb{N} \to \mathbb{N} \]
with, for all $(x, y) \in \mathbb{N} \times \mathbb{N}$, $\min(x, y) = y$ if and only if $x \geq y$, and $\min(x, y) = x$ in other cases.
length is defined as follows:
\[ \mathrm{length} : S \to \mathbb{N} \]
such that $\forall s \in S$, $\mathrm{length}(s) = x$, with $x$ the number of characters in $s$ and where $S$ is the set of all possible strings.
lengthComSubString is defined as follows:
\[ \mathrm{lengthComSubString} : S^2 \to \mathbb{N}, \qquad (s, s') \mapsto \sum_i \mathrm{length}(s_i) \]
with $s_i \in S$ such that the four following conditions are fulfilled:
1. $s_i$ is a substring of both $s$ and $s'$
2. $s_i$ contains at least two characters
3. $s_i$ is maximal (i.e. there is no other string that fulfills the conditions and is longer)
4. the order in which the substrings appear in the two strings is preserved
As an illustration of this second measure, let us consider the two titles "The news" and "News". Their similarity is computed as follows.
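\[ \mathrm{sim}_{\mathrm{ref\ strings}}(\text{``The news''}, \text{``News''}) = \frac{\mathrm{lengthComSubString}(\text{``The news''}, \text{``News''})}{\min(\mathrm{length}(\text{``The news''}), \mathrm{length}(\text{``News''}))} = \frac{4}{4} = 1 \]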
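The ratio used in this example can be sketched in Python. This is only an approximation under stated assumptions, not the authors' implementation: the function names `length_com_sub_string` and `sim_ref_strings` are hypothetical, the comparison is assumed to ignore letter case (otherwise "The news" and "News" would share only "ews"), and the common substrings are obtained with the matching blocks of the standard `difflib.SequenceMatcher`, which returns non-overlapping matches in the same order in both strings. Returning 0 for an empty string is also an assumption, since the text does not cover that case.

```python
from difflib import SequenceMatcher


def length_com_sub_string(s1: str, s2: str, min_len: int = 2) -> int:
    """Sum the lengths of ordered, non-overlapping common substrings."""
    # Assumption: case is ignored, so that "news" matches "News".
    a, b = s1.lower(), s2.lower()
    # Matching blocks are found longest-first and reported in increasing
    # position in both strings, which approximates conditions 3 and 4.
    blocks = SequenceMatcher(None, a, b, autojunk=False).get_matching_blocks()
    # Condition 2: keep only substrings of at least `min_len` characters.
    return sum(m.size for m in blocks if m.size >= min_len)


def sim_ref_strings(s1: str, s2: str) -> float:
    """Common-substring length divided by the length of the shorter string."""
    shortest = min(len(s1), len(s2))
    if shortest == 0:
        # Assumption: an empty string is similar to nothing.
        return 0.0
    return length_com_sub_string(s1, s2) / shortest


print(sim_ref_strings("The news", "News"))  # 1.0
```

With these assumptions the two titles share the four-character substring "news", and the shorter title has four characters, which gives the value 1 obtained above.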
Similarity of dates and numerical referents
To compare the numerical referents and dates given as numerical values, we rely on a distance
and a threshold that represents the maximum tolerated difference between two values.
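Definition 5.4. Let $t$ be the threshold defined by the end user, $t \in \mathbb{R}^{+}$. $(v_1, v_2) \in \mathbb{R}^2$ are two numerical values that must be compared.
The function $\mathrm{sim}_{\mathrm{ref\ num}} : \mathbb{R}^2 \to [0, 1]$ is defined as follows:
\[ \mathrm{sim}_{\mathrm{ref\ num}}(v_1, v_2) = \begin{cases} 0 & \text{if } |v_1 - v_2| \geq t \\ 1 - \dfrac{|v_1 - v_2|}{t} & \text{otherwise} \end{cases} \]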
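As a minimal sketch of Definition 5.4, the function below returns 0 when the two values differ by at least the threshold and decreases linearly from 1 otherwise. The Python name `sim_ref_num` and the rejection of a non-positive threshold are assumptions made for this illustration; the definition itself only states that t is chosen by the end user.

```python
def sim_ref_num(v1: float, v2: float, t: float) -> float:
    """Similarity of two numerical values given a tolerance threshold t."""
    if t <= 0:
        # Assumption: t must be strictly positive, since it is a divisor.
        raise ValueError("threshold t must be strictly positive")
    distance = abs(v1 - v2)
    if distance >= t:
        # The values differ by at least the tolerated amount.
        return 0.0
    return 1.0 - distance / t


print(sim_ref_num(1998, 1999, 4))  # 0.75
```

For instance, with a threshold of 4, two years that differ by 1 have a similarity of 1 - 1/4 = 0.75, while two years that differ by 4 or more have a similarity of 0.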
5.3.1.3 Similarity regarding the context of the concepts
To compare the context in which the two concepts are expressed, we propose to
compare their immediate neighborhood. Intuitively, the similarity of two concepts
regarding their context is computed by measuring the proportion of relations linked to the
concepts that have the same type and the proportion of relations that have different types.
Definition 5.5. The similarity of a node $c_1$ of the graph $G_1$ and the node $c_2$ of the graph $G_2$, regarding
their neighborhood, is given by the function $\mathrm{sim}_{\mathrm{context}} : C^2 \to [0, 1]$ defined as follows.
Let $R_1$ (respectively $R_2$) be the set of relations neighboring the concept node $c_1$ (respectively $c_2$). We
define $R'_1$ (respectively $R'_2$), the union of the set $R_1$ (resp. $R_2$) and the set containing the empty element
noted $\emptyset$.
Let $\mathcal{R}$ be a symmetric relation between the elements of $R'_1$ and $R'_2$ such that
 