Biology Reference
In-Depth Information
[Figure 15.6: plot of average silhouette coefficient against Number of Clusters, k, for k = 2 to 10. z-scores listed in the plot, in order of appearance: 1.98, 1.66, 1.43, 2.61, 2.26, 1.91, 1.26, 0.09, 0.11.]
Fig. 15.6. Average silhouette coefficients vs. cluster size k computed from variables V1 and V3 (solid circles), together with boxplots of the ranges of average silhouette coefficients obtained for 100 independent random permutations of V3.
tion values, it does not appear that the variables V1 and V3 alone provide the basis for a useful clustering of the data.
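As a rough illustration of the quantity being plotted, the average silhouette coefficient of a labelled dataset can be sketched as below. This is a minimal sketch using the standard silhouette definition, s(i) = (b - a) / max(a, b), and assuming Euclidean distance; the text does not state which dissimilarity measure or clustering method was used, and the function name and toy points are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def average_silhouette(points, labels):
    """Average silhouette coefficient for labelled points (Euclidean distance assumed)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette coefficient needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton clusters contribute s(i) = 0 by convention
        # a: mean distance from point i to the other members of its own cluster
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance from point i to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical toy data: two well-separated pairs of points
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = average_silhouette(points, [0, 0, 1, 1])  # labels follow the visible groups
bad = average_silhouette(points, [0, 1, 0, 1])   # labels cut across the groups
```

Values near 1 indicate compact, well-separated clusters, while values near or below 0 indicate that points sit closer to another cluster than to their own.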
When each of the nine remaining variables is added in turn, the best results are obtained with V2. These results are shown in Fig. 15.7: both S0(3) and S0(4) exceed all of their associated random permutation values. Of these two results, the three-cluster partitioning exhibits both the larger silhouette coefficient and the larger z-score, so it is taken here as the basic clustering, which will serve as a reference case for all other clusterings of this dataset.
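The comparison against permutation values, and the choice between candidate values of k, can be sketched as follows. This is an illustration only: it assumes the z-score is the observed silhouette coefficient minus the mean of the permutation values, divided by their sample standard deviation, which this excerpt does not spell out. The function names and all numbers are hypothetical and are not taken from Fig. 15.7 or Table 15.3.

```python
from statistics import mean, stdev


def permutation_summary(observed, permuted):
    """Return (z, exceeds_all) for one observed value and its permutation values.

    Assumed definition: z = (observed - mean(permuted)) / stdev(permuted).
    """
    if len(permuted) < 2:
        raise ValueError("need at least two permutation values")
    sigma = stdev(permuted)
    if sigma == 0:
        raise ValueError("permutation values have zero spread")
    z = (observed - mean(permuted)) / sigma
    return z, observed > max(permuted)


def pick_reference_k(results):
    """Keep each k whose observed value exceeds all permutation values,
    then return the k with the largest observed silhouette coefficient."""
    candidates = {}
    for k, (observed, permuted) in results.items():
        z, exceeds_all = permutation_summary(observed, permuted)
        if exceeds_all:
            candidates[k] = (observed, z)
    if not candidates:
        return None, candidates
    best_k = max(candidates, key=lambda k: candidates[k][0])
    return best_k, candidates


# Hypothetical results: k -> (observed silhouette, permutation values)
results = {
    2: (0.30, [0.28, 0.31, 0.33, 0.29]),
    3: (0.45, [0.30, 0.32, 0.34, 0.36]),
    4: (0.42, [0.31, 0.33, 0.35, 0.37]),
}
best_k, candidates = pick_reference_k(results)
```

In this toy case, k = 3 and k = 4 both exceed all of their permutation values, and k = 3 has both the larger silhouette coefficient and the larger z-score. When the two criteria disagree, as the text notes for one five-variable case, a rule like this one would need an explicit tie-breaking choice.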
The results obtained by adding one variable at a time are summarized in Table 15.3, including the cases illustrated in Figs. 15.6 and 15.7. The optimum number of clusters is k* = 3 in all cases, with the possible exception of the five-variable clustering on V1, V2, V3, V7, and V9, where k* = 4 achieves a slightly larger silhouette coefficient but a slightly smaller z-score. Detailed descriptions of these clusterings and their differences are given in Sec. 15.7, but two points are worth noting. First, the silhouette coefficient values S* for k = 3 decrease monotonically as additional variables are included, while the associated z-scores