Digital Signal Processing Reference
In this sense, the main objective of cluster validity is to determine the
optimal number of clusters that provide the best characterization of a
given multidimensional data set. An incorrect assignment of values to
the parameters of a clustering algorithm results in a data-partitioning
scheme that is not optimal and thus leads to wrong decisions.
In this section, we evaluate the performance of the clustering
techniques in conjunction with three cluster validity indices: Kim's
index, the Calinski-Harabasz (CH) index, and the intraclass index. These
indices were successfully applied earlier in biomedical time-series
analysis [97]. In the following, we describe the above-mentioned indices.
Calinski-Harabasz index [39]: This index is computed for m data
points and K clusters as
\[
\mathrm{CH} = \frac{\operatorname{trace} B / (K - 1)}{\operatorname{trace} W / (m - K)} \tag{6.46}
\]
where B and W represent the between- and within-cluster scatter
matrices, respectively.
The number of clusters K at which CH attains its maximum is taken to
indicate the correct number of partitions in the data.
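
As a rough illustration of Eq. (6.46), the following sketch computes CH for a labeled set of points using only the Python standard library. It assumes the usual definitions of the scatter matrices, which the text does not spell out: trace W is the sum of squared distances from each point to its cluster mean, and trace B is the sum over clusters of the cluster size times the squared distance from the cluster mean to the overall mean. The function name calinski_harabasz and its arguments are hypothetical.

```python
def calinski_harabasz(points, labels):
    """Sketch of CH = [trace B / (K - 1)] / [trace W / (m - K)].

    points: list of equal-length numeric tuples (m data points)
    labels: list of cluster labels, one per point
    Assumes cluster means act as cluster centers (not stated in the text).
    """
    m = len(points)
    dim = len(points[0])

    # Group the points by cluster label.
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    K = len(clusters)
    if K < 2 or K >= m:
        raise ValueError("CH requires 2 <= K < m")

    def mean(rows):
        return [sum(r[d] for r in rows) / len(rows) for d in range(dim)]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    grand_mean = mean(points)
    trace_b = 0.0  # between-cluster scatter (trace only)
    trace_w = 0.0  # within-cluster scatter (trace only)
    for members in clusters.values():
        center = mean(members)
        trace_b += len(members) * sqdist(center, grand_mean)
        trace_w += sum(sqdist(p, center) for p in members)

    if trace_w == 0.0:
        # Every point coincides with its cluster mean; the ratio is unbounded.
        return float("inf")
    return (trace_b / (K - 1)) / (trace_w / (m - K))


# Two well-separated pairs of points, labeled sensibly and then badly.
data = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = calinski_harabasz(data, [0, 0, 1, 1])
bad = calinski_harabasz(data, [0, 1, 0, 1])
```

In this toy case the partition that matches the visible grouping yields the larger CH value, which is the behavior one would expect when comparing candidate partitions.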
Intraclass index [97]: This index is given as
\[
I_W = \frac{1}{n} \sum_{k=1}^{K} \sum_{i=1}^{n_k} \| x_i - w_k \|^2 \tag{6.47}
\]
where $n_k$ is the number of points in cluster k and $w_k$ is a prototype
associated with the kth cluster. $I_W$ is computed for different cluster
numbers. The maximum value of the second derivative of $I_W$ as a
function of cluster number is taken as an estimate for the optimal
partition. This index provides a possible way of assessing the quality
of a partition of K clusters.
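
The selection rule above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: n is taken to be the total number of data points, the second derivative is approximated by the central second difference I_W(K+1) - 2 I_W(K) + I_W(K-1), and the names intraclass_index and best_k_by_second_difference are hypothetical.

```python
def intraclass_index(points, labels, prototypes):
    """Sketch of I_W = (1/n) * sum_k sum_i ||x_i - w_k||^2.

    points:     list of numeric tuples
    labels:     cluster label of each point
    prototypes: mapping (dict or list) from label to prototype w_k
    Assumes n is the total number of points.
    """
    n = len(points)
    total = 0.0
    for p, lab in zip(points, labels):
        w = prototypes[lab]
        total += sum((x - y) ** 2 for x, y in zip(p, w))
    return total / n


def best_k_by_second_difference(iw_by_k):
    """Return the K with the largest second difference of I_W.

    iw_by_k: dict mapping cluster number K to I_W(K).
    Only K values with both neighbors K-1 and K+1 present are considered.
    The second difference stands in for the second derivative (assumption).
    """
    best_k = None
    best_val = float("-inf")
    for k in sorted(iw_by_k):
        if (k - 1) not in iw_by_k or (k + 1) not in iw_by_k:
            continue
        d2 = iw_by_k[k + 1] - 2.0 * iw_by_k[k] + iw_by_k[k - 1]
        if d2 > best_val:
            best_k, best_val = k, d2
    return best_k


# One cluster with prototype (1, 0): each point lies at squared distance 1.
iw_example = intraclass_index([(0.0, 0.0), (2.0, 0.0)], [0, 0], {0: (1.0, 0.0)})

# Made-up I_W values that drop sharply up to K = 3 and flatten afterwards.
k_star = best_k_by_second_difference({1: 10.0, 2: 9.0, 3: 1.0, 4: 0.8, 5: 0.7})
```

The I_W values in the usage lines are invented purely to show the mechanics; in practice each entry would come from running the clustering algorithm with that number of clusters.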
Kim's index [138]: This index equals the sum of the normalized
overpartition measure function $v_o(K, X, W)$ and the normalized
underpartition measure function $v_u(K, X, W)$:
\[
I_{\mathrm{Kim}} = \frac{v_u(K) - v_{u\min}}{v_{u\max} - v_{u\min}} + \frac{v_o(K) - v_{o\min}}{v_{o\max} - v_{o\min}}. \tag{6.48}
\]
where $v_u(K)$ is the underpartition measure, the average over the cluster
number of the mean intracluster distance, and measures the structural compactness