The similarity measure of Rogers and Tanimoto (abbreviated R/T) has been used as an affinity measure [13]. We will examine the underlying distance function to see if it is a metric. R/T uses the following four auxiliary functions:
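$$
\begin{aligned}
a(x, y) &= \mathrm{ones}(\mathrm{AND}(x, y)) \\
b(x, y) &= \mathrm{ones}(\mathrm{AND}(x, \mathrm{NOT}(y))) \\
c(x, y) &= \mathrm{ones}(\mathrm{AND}(\mathrm{NOT}(x), y)) \\
d(x, y) &= \mathrm{ones}(\mathrm{AND}(\mathrm{NOT}(x), \mathrm{NOT}(y)))
\end{aligned}
\tag{13}
$$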
With these functions the measure is defined as
$$
\mathrm{R/T}(x, y) = \frac{a(x, y) + d(x, y)}{a(x, y) + d(x, y) + 2\,\bigl(b(x, y) + c(x, y)\bigr)}
\tag{14}
$$
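The definitions (13) and (14) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: binary strings are stored as non-negative integers together with an explicit length `n >= 1`, and the names `ones`, `abcd` and `rogers_tanimoto` are chosen here for the example and do not come from the text.

```python
from fractions import Fraction

def ones(v):
    """Number of 1 bits in a non-negative integer."""
    return bin(v).count("1")

def abcd(x, y, n):
    """The four auxiliary counts of (13) for n-bit strings x and y."""
    mask = (1 << n) - 1          # NOT must be restricted to n positions
    not_x, not_y = ~x & mask, ~y & mask
    a = ones(x & y)              # positions where both have a 1
    b = ones(x & not_y)          # 1 in x, 0 in y
    c = ones(not_x & y)          # 0 in x, 1 in y
    d = ones(not_x & not_y)      # positions where both have a 0
    return a, b, c, d

def rogers_tanimoto(x, y, n):
    """R/T(x, y) as defined in (14), returned as an exact fraction."""
    a, b, c, d = abcd(x, y, n)
    return Fraction(a + d, a + d + 2 * (b + c))

x, y, n = 0b1100, 0b1010, 4
print(abcd(x, y, n))             # (1, 1, 1, 1)
print(rogers_tanimoto(x, y, n))  # 1/3
```

For x = 1100 and y = 1010 each of the four counts is 1, so the measure is 2 / (2 + 2 · 2) = 1/3. Exact fractions are used so that later comparisons do not depend on floating-point rounding.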
This function has some properties which show that it cannot be interpreted as a distance function.¹ First, since $\phi(x, y) \ge 0$ holds for all $\phi \in \{a, b, c, d\}$, it follows that $0 \le \mathrm{R/T}(x, y) \le 1$, i.e. the value domain is normalized, which seems unnatural for a distance function. Second, $\mathrm{R/T}(x, x) = 1$ because $a(x, x) + d(x, x) = n$ and $b(x, x) = c(x, x) = 0$. Thus, R/T does not even satisfy condition (ii) of a distance function. Third, $\mathrm{R/T}(x, y) = 0$ if and only if $a(x, y) = d(x, y) = 0$, again because $\phi(x, y) \ge 0$ for all $\phi \in \{a, b, c, d\}$. (Notice that in this case $b(x, y) + c(x, y) > 0$.) But $a(x, y) = 0$ requires that there is no position where both $x$ and $y$ have a 1, and correspondingly $d(x, y) = 0$ requires that there is no position where both have a 0, i.e. $x$ and $y$ are complementary.
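The three observations can be checked exhaustively for short strings. The following sketch assumes, as an illustration only, that binary strings are stored as integers with an explicit length `n`. The helper names `ones` and `rt` are hypothetical, and the choice `n = 4` is arbitrary.

```python
from fractions import Fraction

def ones(v):
    """Number of 1 bits in a non-negative integer."""
    return bin(v).count("1")

def rt(x, y, n):
    """R/T(x, y) from (14) for n-bit strings stored as integers (n >= 1)."""
    mask = (1 << n) - 1                    # keeps NOT within n positions
    nx, ny = ~x & mask, ~y & mask
    a, b, c, d = ones(x & y), ones(x & ny), ones(nx & y), ones(nx & ny)
    return Fraction(a + d, a + d + 2 * (b + c))

n = 4
mask = (1 << n) - 1
for x in range(1 << n):
    complement = ~x & mask
    assert rt(x, x, n) == 1                # second observation
    assert rt(x, complement, n) == 0       # third observation
    for y in range(1 << n):
        assert 0 <= rt(x, y, n) <= 1       # first observation
        # the value 0 is reached only by the complement of x
        assert (rt(x, y, n) == 0) == (y == complement)

print("all three observations hold for n =", n)
```

Such a check covers only the chosen length and does not replace the argument in the text. It merely confirms that identical strings yield 1 and complementary strings yield 0, the reverse of what a distance function would give.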
These observations lead to the following definition of R/T: In the numerator of the function all positions are counted where $x$ and $y$ are equal (either 0 or 1). The same value can be achieved if first XOR is applied, then NOT, and then the number of 1's in the result is counted. In the denominator the same value occurs augmented by twice the number of positions where $x$ and $y$ are different. This gives
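$$
\begin{aligned}
\mathrm{R/T}(x, y)
&= \frac{\mathrm{ones}(\mathrm{NOT}(\mathrm{XOR}(x, y)))}{\mathrm{ones}(\mathrm{NOT}(\mathrm{XOR}(x, y))) + 2 \cdot \mathrm{ones}(\mathrm{XOR}(x, y))} \\
&= \frac{n - \mathrm{ones}(\mathrm{XOR}(x, y))}{n - \mathrm{ones}(\mathrm{XOR}(x, y)) + 2 \cdot \mathrm{ones}(\mathrm{XOR}(x, y))} \\
&= \frac{n - \mathrm{ones}(\mathrm{XOR}(x, y))}{n + \mathrm{ones}(\mathrm{XOR}(x, y))} \\
&= \frac{n - d_{\mathrm{XOR}}(x, y)}{n + d_{\mathrm{XOR}}(x, y)}
\end{aligned}
\tag{15}
$$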
where $n$ is the length of the binary strings. In order to prove the equivalence of the two definitions (14) and (15) we have to show that $a(x, y) + d(x, y) = n - \mathrm{ones}(\mathrm{XOR}(x, y))$ and $b(x, y) + c(x, y) = \mathrm{ones}(\mathrm{XOR}(x, y))$.
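Before working through the proof, the two identities and the resulting equality of (14) and (15) can be confirmed numerically for short strings. The sketch below again assumes binary strings stored as integers with an explicit length `n`. The names `rt_abcd` and `rt_xor` and the upper bound of 6 bits are choices made for this illustration only.

```python
from fractions import Fraction

def ones(v):
    """Number of 1 bits in a non-negative integer."""
    return bin(v).count("1")

def abcd(x, y, n):
    """Auxiliary counts of (13) for n-bit strings stored as integers."""
    mask = (1 << n) - 1
    nx, ny = ~x & mask, ~y & mask
    return ones(x & y), ones(x & ny), ones(nx & y), ones(nx & ny)

def rt_abcd(x, y, n):
    """R/T according to definition (14)."""
    a, b, c, d = abcd(x, y, n)
    return Fraction(a + d, a + d + 2 * (b + c))

def rt_xor(x, y, n):
    """R/T according to the last line of (15)."""
    d_xor = ones(x ^ y)          # number of positions where x and y differ
    return Fraction(n - d_xor, n + d_xor)

for n in range(1, 7):            # all pairs of strings of length 1..6
    for x in range(1 << n):
        for y in range(1 << n):
            a, b, c, d = abcd(x, y, n)
            assert a + d == n - ones(x ^ y)
            assert b + c == ones(x ^ y)
            assert rt_abcd(x, y, n) == rt_xor(x, y, n)

print("definitions (14) and (15) agree for all strings of length 1 to 6")
```

The XOR form needs a single bit count per pair instead of four, which is the practical benefit of (15). The exhaustive loop is an empirical check for small lengths, not a proof.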
$a(x, y) + d(x, y)$ is the
¹ Actually, Rogers and Tanimoto intended to define a similarity measure [15], which is more or less the opposite of a distance function.