The similarity measure of Rogers and Tanimoto (abbreviated R/T) has been used as
an affinity measure [13]. We will examine the underlying distance function to see if it
is a metric. R/T uses the following four auxiliary functions:
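$$
\begin{aligned}
a(x, y) &= \mathrm{ones}(\mathrm{AND}(x, y)) \\
b(x, y) &= \mathrm{ones}(\mathrm{AND}(x, \mathrm{NOT}(y))) \\
c(x, y) &= \mathrm{ones}(\mathrm{AND}(\mathrm{NOT}(x), y)) \\
d(x, y) &= \mathrm{ones}(\mathrm{AND}(\mathrm{NOT}(x), \mathrm{NOT}(y)))
\end{aligned}
\tag{13}
$$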
With these functions the measure is defined as
$$
R/T(x, y) = \frac{a(x, y) + d(x, y)}{a(x, y) + d(x, y) + 2\,\bigl(b(x, y) + c(x, y)\bigr)} \tag{14}
$$
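Definitions (13) and (14) can be sketched in a few lines of Python. This is only an illustration: representing a binary string as a list of 0/1 integers, the function name rogers_tanimoto, and the use of exact fractions are assumptions made here, not part of the source.

```python
from fractions import Fraction


def NOT(x):
    """Bitwise complement of a 0/1 list."""
    return [1 - p for p in x]


def AND(x, y):
    """Position-wise AND of two 0/1 lists of equal length."""
    return [p & q for p, q in zip(x, y)]


def ones(x):
    """Number of 1s in a 0/1 list."""
    return sum(x)


# The four auxiliary functions of (13).
def a(x, y):
    return ones(AND(x, y))


def b(x, y):
    return ones(AND(x, NOT(y)))


def c(x, y):
    return ones(AND(NOT(x), y))


def d(x, y):
    return ones(AND(NOT(x), NOT(y)))


def rogers_tanimoto(x, y):
    """R/T(x, y) as in (14), returned as an exact fraction."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    agree = a(x, y) + d(x, y)
    differ = b(x, y) + c(x, y)
    # agree + differ equals the string length, so the denominator is never 0.
    return Fraction(agree, agree + 2 * differ)
```

For x = 1100 and y = 1010 each of the four auxiliary functions evaluates to 1, so the sketch returns 2/6 = 1/3.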
This function has some properties which show that it cannot be interpreted as a
distance function.¹ First, since ϕ(x, y) ≥ 0 holds for all ϕ ∈ {a, b, c, d}, it follows
that 0 ≤ R/T(x, y) ≤ 1, i.e. the value domain is normalized, which seems unnatural for
a distance function. Second, R/T(x, x) = 1 because a(x, x) + d(x, x) = n, the length of
the strings, and b(x, x) = c(x, x) = 0. Thus, R/T does not even satisfy condition (ii) of
a distance function. Third, R/T(x, y) = 0 if and only if a(x, y) = d(x, y) = 0, again
because ϕ(x, y) ≥ 0 for all ϕ ∈ {a, b, c, d}. (Notice that in this case
b(x, y) + c(x, y) > 0.) But a(x, y) = 0 requires that there is no position where both
x and y have a 1, and correspondingly d(x, y) = 0 requires that there is no position
where both have a 0, i.e. x and y are complementary.
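As a small illustration of the second and third observation, take n = 4 and x = 1100 (the strings are chosen here for illustration and do not come from the source). Comparing x with itself gives a = d = 2 and b = c = 0; comparing x with its complement y = 0011 gives a = d = 0 and b = c = 2:

$$
\begin{aligned}
R/T(1100, 1100) &= \frac{2 + 2}{2 + 2 + 2\,(0 + 0)} = 1, \\
R/T(1100, 0011) &= \frac{0 + 0}{0 + 0 + 2\,(2 + 2)} = 0.
\end{aligned}
$$

Identical strings thus receive the largest value and complementary strings the smallest, the reverse of what a distance function would assign.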
These observations lead to the following alternative definition of R/T: In the numerator
of the function all positions are counted where x and y are equal (either 0 or 1). The
same value can be obtained by first applying XOR, then NOT, and then counting the
number of 1's in the result. In the denominator the same value occurs, augmented by
twice the number of positions where x and y are different. This gives
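$$
\begin{aligned}
R/T(x, y) &= \frac{\mathrm{ones}(\mathrm{NOT}(\mathrm{XOR}(x, y)))}{\mathrm{ones}(\mathrm{NOT}(\mathrm{XOR}(x, y))) + 2\,\mathrm{ones}(\mathrm{XOR}(x, y))} \\
&= \frac{n - \mathrm{ones}(\mathrm{XOR}(x, y))}{n - \mathrm{ones}(\mathrm{XOR}(x, y)) + 2\,\mathrm{ones}(\mathrm{XOR}(x, y))} \\
&= \frac{n - \mathrm{ones}(\mathrm{XOR}(x, y))}{n + \mathrm{ones}(\mathrm{XOR}(x, y))} \\
&= \frac{n - d_{\mathrm{XOR}}(x, y)}{n + d_{\mathrm{XOR}}(x, y)}
\end{aligned}
\tag{15}
$$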
where n is the length of the binary strings. In order to prove the equivalence of the
two definitions (14) and (15) we have to show that a(x, y) + d(x, y) = n −
ones(XOR(x, y)) and b(x, y) + c(x, y) = ones(XOR(x, y)). a(x, y) + d(x, y) is the
1 Actually, Rogers and Tanimoto intended to define a similarity measure [15] which is more or
less the opposite of a distance function.
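The equivalence of definitions (14) and (15) can also be checked numerically for short strings. The sketch below is a brute-force sanity check, not a proof; encoding binary strings as Python integers with an explicit length n, and the names rt_counts, rt_xor and definitions_agree, are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product


def ones(v):
    """Number of 1 bits in a non-negative integer."""
    return bin(v).count("1")


def rt_counts(x, y, n):
    """R/T via the four counts a, b, c, d of definition (14)."""
    mask = (1 << n) - 1  # keeps NOT confined to the n bit positions
    a = ones(x & y)
    b = ones(x & ~y & mask)
    c = ones(~x & y & mask)
    d = ones(~x & ~y & mask)
    return Fraction(a + d, a + d + 2 * (b + c))


def rt_xor(x, y, n):
    """R/T via the XOR form of definition (15)."""
    h = ones(x ^ y)  # number of positions where x and y differ
    return Fraction(n - h, n + h)


def definitions_agree(n):
    """Compare both forms on every pair of n-bit strings."""
    return all(
        rt_counts(x, y, n) == rt_xor(x, y, n)
        for x, y in product(range(1 << n), repeat=2)
    )
```

Calling definitions_agree(n) for a small n compares all 4^n pairs, so the check is only practical for short strings.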
 