Fig. 4.1 Some distributions of the sources used in hierarchical clustering. a Laplacian, b uniform, c K-type m = 10, d Rayleigh
coefficient
$$PE = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{n_c} u_{ij} \log_a\left(u_{ij}\right)$$
were estimated [25]. The partition
coefficient and the partition entropy both tend towards monotone behaviour as a
function of the number of clusters. Therefore, to find the optimum number of
clusters, the number is selected at which the entropy value lies below the rising
trend and the value of the partition coefficient lies above the falling trend. On
the curve connecting all the values, this point can be identified as a kink ("elbow
criterion") that marks the optimum number of clusters. Figure 4.4 shows the
evolution of the above coefficients through the clustering levels of the data of
Figure 4.2.
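As a rough illustration of how the two coefficients could be evaluated at one clustering level, the following sketch computes them from a membership matrix. It assumes that u_ij denotes the degree of membership of item i in cluster j, that the partition entropy is PE = -(1/N) Σ_i Σ_j u_ij log_a(u_ij), and that the partition coefficient has the usual form PC = (1/N) Σ_i Σ_j u_ij^2, which is not shown in this excerpt. The function names and the two example matrices are hypothetical.

```python
import math


def partition_coefficient(u):
    """PC = (1/N) * sum_i sum_j u_ij**2 (assumed standard form)."""
    n = len(u)  # N: number of items (rows)
    return sum(x * x for row in u for x in row) / n


def partition_entropy(u, base=math.e):
    """PE = -(1/N) * sum_i sum_j u_ij * log_a(u_ij); 0 * log(0) is taken as 0."""
    n = len(u)
    total = 0.0
    for row in u:
        for x in row:
            if x > 0.0:  # skip zero memberships, whose contribution is 0
                total -= x * math.log(x, base)
    return total / n


# Hypothetical membership matrices: rows are items, columns are clusters.
crisp = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]  # every item fully in one cluster
fuzzy = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]  # every item split evenly

pc_crisp = partition_coefficient(crisp)
pe_crisp = partition_entropy(crisp)
pc_fuzzy = partition_coefficient(fuzzy)
pe_fuzzy = partition_entropy(fuzzy, base=2)

print(pc_crisp, pe_crisp, pc_fuzzy, pe_fuzzy)
```

Under these assumptions a perfectly crisp partition gives the highest partition coefficient and the lowest entropy, while an evenly split partition gives the opposite. Repeating the calculation for each candidate number of clusters yields the two curves on which the kink is sought.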
The optimum partitioning of the clusters is found at the point of the dendrogram
whose value of h gives the highest cluster differentiation (maximum of
inter-cluster mean distances) together with good homogeneity within the clusters
(minimum of distances between the members of the clusters and the centroids). From