When we calculate the similarity between cases, we should consider a
weighted sum of the similarities of all features. The similarity between
cases is often defined in terms of distance. Typical distances are:
(1) Manhattan distance:
$$ d_{ij} = \sum_{k=1}^{N} \left| V_{ik} - V_{jk} \right| \tag{5.2} $$
where $V_{ik}$ and $V_{jk}$ are the values of the $k$th feature of cases $i$ and $j$, respectively.
(2) Euclidean distance:
$$ d_{ij} = \sqrt{\sum_{k=1}^{N} \left( V_{ik} - V_{jk} \right)^{2}} \tag{5.3} $$
(3) Minkowski distance:
$$ d_{ij} = \left[ \sum_{k=1}^{N} \left| V_{ik} - V_{jk} \right|^{q} \right]^{1/q}, \quad q > 0 \tag{5.4} $$
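The three definitions can be sketched in a few lines of Python. This is only an illustration, assuming each case is a plain sequence of N numeric feature values; the function names are hypothetical and do not come from the source. Note that the Minkowski distance with q = 1 reduces to the Manhattan distance and with q = 2 to the Euclidean distance.

```python
import math


def manhattan_distance(case_i, case_j):
    """Manhattan distance (5.2): sum of absolute feature differences."""
    return sum(abs(v_ik - v_jk) for v_ik, v_jk in zip(case_i, case_j))


def euclidean_distance(case_i, case_j):
    """Euclidean distance (5.3): square root of the summed squared differences."""
    return math.sqrt(sum((v_ik - v_jk) ** 2 for v_ik, v_jk in zip(case_i, case_j)))


def minkowski_distance(case_i, case_j, q):
    """Minkowski distance (5.4) of order q > 0."""
    if q <= 0:
        raise ValueError("q must be positive")
    total = sum(abs(v_ik - v_jk) ** q for v_ik, v_jk in zip(case_i, case_j))
    return total ** (1.0 / q)


# Two example cases with three features each (made-up values).
case_a = [1.0, 2.0, 3.0]
case_b = [4.0, 6.0, 3.0]

print(manhattan_distance(case_a, case_b))  # 7.0
print(euclidean_distance(case_a, case_b))  # 5.0
```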
The above definitions are unweighted, in that every feature influences the
similarity between two cases equally. In practice, each feature contributes
differently to the similarity, so we still need to add weights to the
features to reflect their importance. The above definitions can be rewritten as:
$$ d_{ij} = \sum_{k=1}^{N} w_{k} \, d(V_{ik}, V_{jk}) \tag{5.5} $$
where $w_{k}$ is the weight of the $k$th feature of the cases, and normally $\sum_{k=1}^{N} w_{k} = 1$;
and $d(V_{ik}, V_{jk})$ is the distance between the $i$th case and the $j$th one on the
dimension of the $k$th feature, which can be computed by the classical definitions
or by other definitions.
The similarity between two cases can be defined in terms of the
above definition of distance:
$$ SIM_{ij} = 1 - d_{ij}, \quad \text{if } d_{ij} \in [0, 1] $$
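The weighted distance (5.5) and a similarity derived from it can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: feature values are assumed to be scaled to [0, 1], the per-feature distance d(V_ik, V_jk) is taken to be the absolute difference (one possible choice among many), the weights are assumed to sum to 1 so that d_ij stays within [0, 1], and the similarity is taken as 1 - d_ij on that range. All function names are hypothetical.

```python
import math


def absolute_difference(v_ik, v_jk):
    """Assumed per-feature distance d(V_ik, V_jk); other definitions are possible."""
    return abs(v_ik - v_jk)


def weighted_distance(case_i, case_j, weights, feature_distance=absolute_difference):
    """Weighted distance (5.5): sum over k of w_k * d(V_ik, V_jk)."""
    if not (len(case_i) == len(case_j) == len(weights)):
        raise ValueError("cases and weights must have the same length")
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("weights are expected to sum to 1")
    return sum(
        w_k * feature_distance(v_ik, v_jk)
        for w_k, v_ik, v_jk in zip(weights, case_i, case_j)
    )


def similarity(distance):
    """Similarity taken as 1 - d_ij, defined here only for d_ij in [0, 1]."""
    if not 0.0 <= distance <= 1.0:
        raise ValueError("distance must lie in [0, 1]")
    return 1.0 - distance


# Made-up example: three features already scaled to [0, 1].
case_a = [0.0, 1.0, 0.5]
case_b = [0.5, 0.0, 0.5]
weights = [0.5, 0.25, 0.25]  # the first feature counts twice as much as the others

d_ab = weighted_distance(case_a, case_b, weights)
print(d_ab, similarity(d_ab))  # 0.5 0.5
```

Passing a different `feature_distance` function lets each application choose its own per-feature measure, for example an exact-match test for symbolic features.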