8. Mahalanobis $D^{2}$ value, as described in discriminant analysis, is also used in measuring the distance between two individuals based on multiple characters.
It must be clearly noted that some of the above-mentioned distances are based on similarity while the others are based on dissimilarity. Each of them has its merits and demerits, and not all the measures are equally useful in all situations. Hence, proper selection of the distance measure is one of the most important points in cluster analysis.
1. Minkowski distance is defined as $M_{ij} = \left( \sum_{k=1}^{p} |X_{ik} - X_{jk}|^{r} \right)^{1/r}$, where $M_{ij}$ denotes the distance between two objects $i$ and $j$, $k = 1, 2, 3, \ldots, p$, $p$ is the number of characters, and $r$ is the parameter chosen suitably.
2. If we select $r = 2$, then we get the Euclidean distance measure. Thus, the Euclidean distance measure between the $i$th and $j$th individuals is given as $E_{ij} = \sqrt{\sum_{k=1}^{p} (X_{ik} - X_{jk})^{2}}$.
3. Squared Euclidean distance, $E_{ij}^{2} = \sum_{k=1}^{p} (X_{ik} - X_{jk})^{2}$, is also used when one wants to put greater weight on objects that are further apart.
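The Minkowski family described in items 1 to 3 can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the two sample records (`plant_i`, `plant_j`) are hypothetical and are not taken from the text.

```python
def minkowski_distance(x, y, r):
    """Minkowski distance between two individuals measured on p characters."""
    if len(x) != len(y):
        raise ValueError("both individuals need the same number of characters")
    if r <= 0:
        raise ValueError("r must be positive")
    # Sum |X_ik - X_jk|^r over the p characters, then take the r-th root.
    return sum(abs(a - b) ** r for a, b in zip(x, y)) ** (1.0 / r)


def euclidean_distance(x, y):
    """Special case of the Minkowski distance with r = 2."""
    return minkowski_distance(x, y, 2)


def squared_euclidean_distance(x, y):
    """Sum of squared differences; no root is taken."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


# Hypothetical records: two individuals measured on p = 3 characters.
plant_i = [1.0, 2.0, 3.0]
plant_j = [4.0, 6.0, 3.0]

print(euclidean_distance(plant_i, plant_j))          # 5.0
print(squared_euclidean_distance(plant_i, plant_j))  # 25.0
```

In this example the squared form (25.0) is five times the Euclidean value (5.0), which illustrates how squaring stretches the distance between objects that are far apart.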
12.11.2 Clustering Technique
Amalgamation of similar individuals into different groups on the basis of an appropriate similarity or dissimilarity distance measure constitutes the clustering technique. Among the different clustering techniques, the hierarchical technique, partitioning technique, graphical method, etc., are mostly used.
4. City block distance: In the Minkowski distance measure, if we put $r = 1$ and take the absolute value, we get the measure $CB_{ij} = \sum_{k=1}^{p} |X_{ik} - X_{jk}|$. This is nothing but the sum of the distances across the $p$ dimensions between any two objects. This distance measure is also known as the Manhattan distance measure.
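Since clustering works from the distances between every pair of individuals, the city block measure can be illustrated by building a small pairwise distance matrix. The sketch below assumes three made-up individuals measured on three characters; the helper names `city_block_distance` and `distance_matrix` are hypothetical.

```python
def city_block_distance(x, y):
    """City block (Manhattan) distance: sum of absolute differences."""
    if len(x) != len(y):
        raise ValueError("both individuals need the same number of characters")
    return sum(abs(a - b) for a, b in zip(x, y))


def distance_matrix(individuals, distance):
    """Symmetric n x n matrix of distances between all pairs of individuals."""
    n = len(individuals)
    return [[distance(individuals[i], individuals[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical data: n = 3 individuals, p = 3 characters each.
individuals = [
    [1.0, 2.0, 3.0],
    [4.0, 6.0, 3.0],
    [1.0, 3.0, 5.0],
]

cb = distance_matrix(individuals, city_block_distance)
for row in cb:
    print(row)
# [0.0, 7.0, 3.0]
# [7.0, 0.0, 8.0]
# [3.0, 8.0, 0.0]
```

Any other distance function with the same two-argument form could be passed to `distance_matrix` in place of the city block measure.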
12.11.2.1 Hierarchical Technique
In this method of clustering, each of the n individuals is assumed to be in its own cluster, giving n individual clusters initially. Successive fusions then take place, linking the most similar cases or objects first, then the next most similar, and so on in successive steps. The reverse process, successive division of the cases starting from the most dissimilar, then the next most dissimilar, and so on, can also be followed. The special feature of this technique is that once an object is allocated to a cluster, it is never removed or combined with other objects belonging to some other clusters. Among the different methods of linking the objects in different clusters, single linkage (nearest neighbor), complete linkage (furthest neighbor), unweighted pair group average (UPGMA), weighted pair group average (WPGMA), unweighted pair group centroid (UPGMC), weighted pair group centroid (WPGMC), Ward's method, etc., are important and mostly used.
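The successive fusion process can be sketched as a naive agglomerative routine. This is a minimal illustration, not an efficient or definitive implementation: it assumes Euclidean distance, the function names and the four sample points are hypothetical, and only the single (nearest neighbor) and complete (furthest neighbor) linkage rules are covered, selected by passing Python's built-in `min` or `max`.

```python
def euclidean(x, y):
    """Euclidean distance between two individuals."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


def agglomerate(individuals, linkage=min):
    """Fuse clusters step by step; return a list of (cluster_a, cluster_b, distance).

    linkage=min gives single linkage, linkage=max gives complete linkage.
    """
    # Start with every individual in its own cluster (n clusters).
    clusters = [(i,) for i in range(len(individuals))]
    fusions = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest linkage distance.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(euclidean(individuals[i], individuals[j])
                            for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        fusions.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return fusions


# Hypothetical data: four individuals measured on two characters.
individuals = [[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 2.0]]

for fusion in agglomerate(individuals, linkage=min):
    print(fusion)
# ((0,), (1,), 1.0)
# ((2,), (3,), 2.0)
# ((0, 1), (2, 3), 5.0)
```

With `linkage=max` the same three fusions occur in this example, but the last one takes place at the distance between the two furthest members of the clusters rather than the two nearest.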
5. Chebychev's distance measure: According to this measure, two individuals are different if they differ in any one of the characteristics, and as such this distance is calculated as $C_{ij} = \max_{k} |X_{ik} - X_{jk}|$. Thus, in this distance measure dissimilarity is the major point of interest rather than similarity.
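6. Power distance measure: The power distance measure is defined as $M_{ij} = \left( \sum_{k=1}^{p} |X_{ik} - X_{jk}|^{q} \right)^{1/r}$, where $q$ and $r$ are the user-defined parameters. In fact, the same weight is placed on each and every dimension of the Minkowski measure, and the power distance reduces to the Minkowski measure when $q = r$.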
7. Percent disagreement: $PD_{ij} = \dfrac{\text{Number of } X_{ik} \neq X_{jk}}{p}$. When the information is categorical in nature, this type of measure becomes useful. In taxonomic studies, the difference in banding pattern in response to an enzymatic action is used to differentiate two samples.
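The last three measures (items 5 to 7) can be sketched in the same style. The function names and the sample values are hypothetical; in particular, the 0/1 banding patterns are invented for illustration and do not come from any real assay.

```python
def chebychev_distance(x, y):
    """Largest absolute difference over all characters."""
    return max(abs(a - b) for a, b in zip(x, y))


def power_distance(x, y, q, r):
    """Power distance: differences raised to q, sum taken to the power 1/r."""
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / r)


def percent_disagreement(x, y):
    """Proportion of characters on which two individuals differ."""
    if len(x) != len(y):
        raise ValueError("both individuals need the same number of characters")
    return sum(1 for a, b in zip(x, y) if a != b) / len(x)


# Hypothetical quantitative records on p = 3 characters.
plant_i = [1.0, 2.0, 3.0]
plant_j = [4.0, 6.0, 3.0]
print(chebychev_distance(plant_i, plant_j))    # 4.0
print(power_distance(plant_i, plant_j, 2, 2))  # 5.0 (q = r = 2 is the Euclidean case)

# Hypothetical presence (1) / absence (0) of five bands in two samples.
bands_i = [1, 0, 1, 1, 0]
bands_j = [1, 1, 1, 0, 0]
print(percent_disagreement(bands_i, bands_j))  # 0.4
```

The percent disagreement here is returned as a proportion between 0 and 1, following the formula as given; multiplying by 100 would express it as a percentage.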
(a) Single linkage: In this method, two objects having minimum distance (nearest neighbor)