[Plot residue: x-axis "Number of Clusters, k", k = 2 to 10; z-scores shown on the plot: 1.98, 1.66, 1.43, 2.61, 2.26, 1.91, 1.26, 0.09, 0.11]
Fig. 15.6. Average silhouette coefficients vs. number of clusters k computed from variables V1 and V3
(solid circles), together with boxplots of the ranges of average silhouette coefficients obtained for 100
independent random permutations of V3.
tion values, it does not appear that the variables V1 and V3 alone provide the basis
for a useful clustering of the data.
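
The permutation comparison described in the caption of Fig. 15.6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the text: the clustering method is replaced by a plain Lloyd's k-means stand-in, the z-score is assumed to be the usual standardization of the observed average silhouette coefficient against the mean and standard deviation of the permutation values, and all function names (`kmeans_labels`, `average_silhouette`, `permutation_z_score`) are hypothetical.

```python
import random
import statistics
from math import dist


def kmeans_labels(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; an assumed stand-in for the clustering method."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def average_silhouette(points, labels):
    """Mean over all points of s(i) = (b - a) / max(a, b)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # s(i) = 0 for singleton clusters by convention
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def permutation_z_score(columns, target, k, n_perm=100, seed=0):
    """columns: dict of name -> list of values; target: the column to permute.

    Returns the observed average silhouette, the permutation reference
    values, and the (assumed) z-score of the observed value against them.
    """
    rng = random.Random(seed)

    def score(cols):
        points = list(zip(*cols.values()))
        return average_silhouette(points, kmeans_labels(points, k))

    observed = score(columns)
    reference = []
    for _ in range(n_perm):
        shuffled = columns[target][:]
        rng.shuffle(shuffled)  # breaks any relationship with the other columns
        reference.append(score({**columns, target: shuffled}))
    z = (observed - statistics.mean(reference)) / statistics.stdev(reference)
    return observed, reference, z
```

Calling `permutation_z_score({"V1": v1, "V3": v3}, "V3", k)` for each k would give one solid circle, one boxplot's worth of reference values, and one z-score per k. Checking `observed > max(reference)` corresponds to the criterion of exceeding all of the random permutation values.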
Including each of the nine remaining variables one by one, the best results
are obtained when V2 is added to the variable set. These results are shown in
Fig. 15.7: both S0(3) and S0(4) exceed all of their associated
random permutation values. Of these two results, the three-cluster partitioning
exhibits both the larger silhouette coefficient and the larger z-score, so it is taken
here as the basic clustering, which will serve as a reference case for all other
clusterings of this dataset.
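
The "one variable at a time" search can be sketched as a greedy forward step. This is only an illustration: the names `best_variable_to_add` and `forward_selection` are hypothetical, and `score` is an assumed user-supplied function that maps a list of variable names to a single quality figure (for example, a silhouette coefficient or its z-score). The text does not reduce "best results" to one number, so that choice is left to the caller.

```python
def best_variable_to_add(selected, candidates, score):
    """Try each unused candidate in turn; return (variable, score) for the best one."""
    trials = {v: score(list(selected) + [v])
              for v in candidates if v not in selected}
    best = max(trials, key=trials.get)
    return best, trials[best]


def forward_selection(start, candidates, score, n_steps):
    """Greedily extend the starting variable set, one variable per step."""
    selected = list(start)
    history = []
    for _ in range(n_steps):
        remaining = [v for v in candidates if v not in selected]
        if not remaining:
            break
        best, value = best_variable_to_add(selected, remaining, score)
        selected.append(best)
        history.append((tuple(selected), value))
    return history
```

Each entry of `history` records the variable set after one step together with its score, which is the kind of information a summary table of successive clusterings would hold.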
The results obtained by adding one variable at a time are summarized in
Table 15.3, including the cases illustrated in Figs. 15.6 and 15.7. The optimum
number of clusters is k = 3 in all cases, with the possible exception of the
five-variable clustering on V1, V2, V3, V7, and V9, where k = 4 achieves a slightly
larger silhouette coefficient but a slightly smaller z-score. Detailed descriptions
of these clusterings and their differences are given in Sec. 15.7, but two points
are worth noting. First, the silhouette coefficient values S for k = 3 decrease
monotonically as additional variables are included, while the associated z-scores