well-separated from the other clusters. Silhouette values that are close to zero indicate
points that are almost equidistant between two clusters, and points with negative
silhouette values may be considered to be in the “wrong” cluster.
Figure . shows a silhouette plot for our five clusters. Not surprisingly, cluster
three has the largest average silhouette value of . , but cluster five is close with
. , which is not obvious from Fig. . . In fact, the only cluster that is not
well-separated from the others is cluster one, with an average silhouette of . and several
points with negative values.
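
The interpretation of silhouette values above can be checked on a small example. The following is a minimal sketch, assuming the common definition s(i) = (b - a) / max(a, b), where a is the mean distance from point i to the other members of its own cluster and b is the smallest mean distance from point i to the members of any other cluster. The function names, the Euclidean distance, and the toy data are illustrative assumptions, not taken from the text, and the software used for the figures may compute silhouettes differently.

```python
import math


def silhouette_values(points, labels):
    """Return one silhouette value per point, assuming s = (b - a) / max(a, b)."""
    clusters = sorted(set(labels))
    values = []
    for i, (p, own) in enumerate(zip(points, labels)):
        # Mean Euclidean distance from p to the members of each cluster,
        # leaving out p itself.
        mean_dist = {}
        for c in clusters:
            dists = [
                math.dist(p, q)
                for j, (q, lab) in enumerate(zip(points, labels))
                if lab == c and j != i
            ]
            mean_dist[c] = sum(dists) / len(dists) if dists else None
        a = mean_dist[own]
        others = [d for c, d in mean_dist.items() if c != own and d is not None]
        if a is None or not others or max(a, min(others)) == 0:
            # Convention assumed here: singleton clusters and degenerate
            # cases get a silhouette of zero.
            values.append(0.0)
            continue
        b = min(others)
        values.append((b - a) / max(a, b))
    return values


def average_silhouette_by_cluster(values, labels):
    """Average the silhouette values within each cluster."""
    averages = {}
    for c in sorted(set(labels)):
        members = [v for v, lab in zip(values, labels) if lab == c]
        averages[c] = sum(members) / len(members)
    return averages


# Toy data (hypothetical): the last point lies next to cluster 1
# but is labelled as cluster 0, so its silhouette should be negative.
points = [(0, 0), (0, 1), (10, 0), (10, 1), (9, 0)]
labels = [0, 0, 1, 1, 0]
sil = silhouette_values(points, labels)
print(average_silhouette_by_cluster(sil, labels))
```

In this toy data the mislabelled point pulls down the average silhouette of cluster 0, which is the same pattern the text describes for cluster one.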
11.3.5 Cluster Location and Dispersion
The silhouette plot indicates that the clusters are actually more separated from each
other than the projection onto the first two principal components in Fig. . suggests.
Hence, we need more information on the actual location and dispersion of the
clusters. We could now start to look at projections onto principal components other
than the first two, scatterplot matrices of the original variables, etc. Two alternative
approaches are to plot all dimensions at once or to use asymmetric projections that
maximize cluster separation.
The simplest solution that uses all variables is to visualize only the cluster centroids
and to ignore how the data points scatter around the centroids. Figure . shows
the cluster centroids as bars in a barplot and plots the population centroid (here the
mean value) as dots for comparison. This is appropriate for this kind of data because
Figure . . Barplot of the five-cluster medoids in comparison to the overall population mean. The
numbers in the title strips of the panels give the absolute and relative number of points in each cluster
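
The quantities behind such a barplot can be sketched as follows. This is a minimal illustration, assuming that each centroid is the coordinate-wise mean of its cluster; the caption speaks of medoids, so the centers in the actual figure may be computed differently. The function name and the toy data are hypothetical. The sketch returns, per cluster, the values that would be drawn as bars and the absolute and relative cluster sizes that would appear in the panel title strip, together with the population mean that would be drawn as dots.

```python
def centroid_summary(rows, labels):
    """Return the population mean and, per cluster, centroid, size, and share."""
    n = len(rows)
    dim = len(rows[0])
    # Population centroid: mean of every variable over all points (the dots).
    population_mean = [sum(r[k] for r in rows) / n for k in range(dim)]
    summary = {}
    for c in sorted(set(labels)):
        members = [r for r, lab in zip(rows, labels) if lab == c]
        # Cluster centroid: mean of every variable over the members (the bars).
        centroid = [sum(r[k] for r in members) / len(members) for k in range(dim)]
        summary[c] = {
            "centroid": centroid,
            "size": len(members),         # absolute number of points
            "share": len(members) / n,    # relative number of points
        }
    return population_mean, summary


# Toy data (hypothetical): four points, two variables, two clusters.
rows = [(1, 0), (1, 0), (0, 1), (0, 0)]
labels = [0, 0, 1, 1]
population_mean, summary = centroid_summary(rows, labels)
for cluster, info in summary.items():
    print(cluster, info["size"], f"{info['share']:.0%}", info["centroid"])
print("population mean:", population_mean)
```

Any plotting library could then draw one panel per cluster from `summary`, with `population_mean` overlaid as reference dots.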