11.1 Introduction
Data visualization can greatly enhance our understanding of multivariate data
structures, and so it is no surprise that cluster analysis and data visualization often
go hand in hand, and that textbooks like Gordon ( ) or Everitt et al. ( ) are full of
figures. In particular, hierarchical cluster analysis is almost always accompanied by
a dendrogram. Results from partitioning cluster analysis can be visualized by
projecting the data into two-dimensional space or using parallel coordinates. Cluster
membership is usually represented by different colors and glyphs, or by dividing
clusters into several panels of a trellis display (Becker et al., ). In addition, silhouette
plots (Rousseeuw, ) provide a popular tool for diagnosing the quality of a partition.
Some of the popularity of self-organizing feature maps (Kohonen, ) with
practitioners in various fields can be explained by the fact that the results can be
“easily” visualized.
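To make the idea behind silhouette plots concrete, the following minimal Python sketch computes the silhouette width of each observation, using the standard definition s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the mean distance to the other members of the observation's own cluster and b(i) is the smallest mean distance to the members of any other cluster. The function name `silhouette_widths`, the use of Euclidean distance and the toy points are assumptions made for illustration only; they are not taken from this chapter.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def silhouette_widths(points, labels):
    """Return the silhouette width of every point.

    Assumes at least two clusters and Euclidean distance (an
    illustrative choice, not prescribed by the chapter).
    """
    # Group point indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Common convention: a point alone in its cluster gets width 0.
            widths.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        widths.append((b - a) / max(a, b))
    return widths


# Hypothetical toy data: two well-separated groups in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
labels = [0, 0, 0, 1, 1, 1]
widths = silhouette_widths(points, labels)
average_width = sum(widths) / len(widths)
```

Widths close to 1 indicate observations that sit well inside their cluster, while negative widths flag observations that are on average closer to another cluster. A silhouette plot displays these widths as horizontal bars, grouped by cluster.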
In this chapter we provide an overview of visualization techniques for cluster
analysis results. Using two real-world data sets, we explain the most important types
of graphs that can be used in combination with hierarchical, partitioning and
model-based cluster analysis. Many plots like dendrograms, convex cluster hulls or
silhouettes are specific to clustering, but we also demonstrate how graphical
techniques introduced in other chapters of this handbook can be used as building
blocks for cluster visualization.
11.1.1 The Data Sets
Two data sets are used throughout this chapter. The “dentitio” data set is used for
hierarchical clustering (e.g., Hartigan, ). This data set gives the counts for eight
kinds of teeth - top-jaw and bottom-jaw counts for incisors, canines, premolars and
molars - in different species of animals. A subset of the raw data is listed in
Table . .
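As a rough illustration of how such tooth-count profiles feed into a hierarchical clustering, the sketch below runs a naive agglomerative procedure and records the merge history, which is the information a dendrogram draws. Everything specific here is an assumption for illustration: the species labels and counts are invented (they are not rows of the “dentitio” data), and the choice of Manhattan distance with single linkage is arbitrary rather than the method used in this chapter.

```python
from itertools import combinations

# Hypothetical profiles, NOT taken from the real data set. Order of counts:
# top/bottom incisors, top/bottom canines, top/bottom premolars, top/bottom molars.
profiles = {
    "species_a": (2, 3, 1, 1, 3, 3, 3, 3),
    "species_b": (2, 3, 1, 1, 3, 2, 3, 3),
    "species_c": (3, 3, 1, 1, 4, 4, 1, 2),
    "species_d": (3, 3, 1, 1, 4, 4, 3, 2),
}


def manhattan(u, v):
    """Sum of absolute differences between two count vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def linkage_distance(left, right):
    """Single linkage: distance between the closest pair of members."""
    return min(manhattan(profiles[a], profiles[b]) for a in left for b in right)


def agglomerate(names):
    """Merge the two closest clusters until one remains; return the history."""
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        left, right = min(
            combinations(clusters, 2),
            key=lambda pair: linkage_distance(*pair),
        )
        height = linkage_distance(left, right)
        # Replace the two merged clusters by their union.
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(left + right)
        merges.append((left, right, height))
    return merges


merges = agglomerate(profiles)
for left, right, height in merges:
    print(f"merge {left} + {right} at height {height}")
```

Each recorded height is the distance at which two branches of the dendrogram join, so species with nearly identical tooth counts are linked near the bottom of the tree.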
The second data set, which is used for partitioning and model-based clustering in
Sects. . and . , is related to the German parliamentary elections of September ,
. A subset of the raw data is given in Table . . The data consist of the proportions
of the “second votes” obtained by the five parties that got elected to the Bundestag
(the first chamber of the German parliament) for each of the electoral districts.
The “second votes” are actually more important than the “first votes” because they
control the number of seats each party has in parliament. Note that the proportions
do not sum to because parties that did not get elected into parliament have been
omitted from the table.
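The effect of omitting the smaller parties can be seen in a short arithmetic sketch. The vote counts below are invented for a single hypothetical district and are not taken from the election data; only the party labels come from this chapter.

```python
# Hypothetical second-vote counts for one district (illustrative values only).
votes = {
    "SPD": 52000,
    "CDU/CSU": 48000,
    "GRUENE": 12000,
    "FDP": 14000,
    "LINKE": 11000,
    "other": 6000,  # parties that did not enter parliament
}

elected = ("SPD", "CDU/CSU", "GRUENE", "FDP", "LINKE")
total = sum(votes.values())

# Proportions are taken relative to all votes cast, including "other".
proportions = {party: votes[party] / total for party in elected}

# The five listed proportions fall short of a full share by exactly
# the share of the omitted parties.
covered = sum(proportions.values())
omitted_share = votes["other"] / total
```

Because each row is a vector of shares of all votes cast, the five variables are not constrained to a fixed total, which is worth keeping in mind when choosing a distance measure or a projection for plotting.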
Before election day, the German government comprised a coalition of Social
Democrats (SPD) and the Green Party (GRUENE); their main opposition consisted
of the conservative party (Christian Democrats, CDU/CSU) and the Liberal Party
(FDP). The latter two intended to form a coalition after the election if they gained
a joint majority, so the two major “sides” during the campaign were SPD+GRUENE
versus CDU/CSU+FDP. In addition, a new “party of the left” (LINKE) canvassed for