Information Technology Reference
In-Depth Information
Regardless of the representation, a matching rule M may be defi ned in terms
of an a nity measure. Accordingly, d M x that denotes d matches x is defi ned as
d M x if and only if x “belongs” to the set defi ned by the detector d .
h e quotations mean that it is necessary to defi ne a notion of a point belonging
to a set. In classical set theory, a point either belongs to a set or does not; however,
a diff erent notion like the one in fuzzy set theory may be used, where a degree of
membership to the universal set is defi ned (Gonzalez, 2003).
3.3 String-Matching Rules
A matching rule, which defi nes “matching” or “recognition,” and the distance
measure that the former is based on are the cornerstones in any detection,
classifi cation, or recognition algorithms. h e choice of a matching rule depends on
the representation scheme and type of data. For instance, if you are dealing with
categorical data, then a string representation may be more suitable, and a matching
rule such as r -contiguous bits ( rcb ) or r -chunks can be used (Percus et al., 1993). In
this subsection, several string-matching rules are described in detail.
3.3.1 Hamming Distance
h e Hamming distance between two strings is defi ned as the number of diff erent
characters between the two strings. h e Hamming distance h ( x,y ) between two
strings x and y is expressed as
N
(
)
1
hx y
(, )
X
Y
i
i
i
where N is the string length, X i and Y i denote the i th bit of string n and y , respec-
tively, X i
Y i the Xor logic operation, when dealing with binary strings; for other
alphabets, X i
Y i is zero if the two symbols are equal and one otherwise.
3.3.2 Binary Distance
Some extensions of the Hamming distance, for binary strings, have been proposed
based on the relative number of bits that match or diff er; such extensions are based
on the following basic measures:
1
,
if
XY
1
N
i
i
a
,
i
i
0
,
otherwise
i
1
Search WWH ::




Custom Search