for which those two are the closest and second-closest nodes; see also Martinetz and Schulten. Let

\[
\tilde{c}(x) = \operatorname*{arg\,min}_{c \,\in\, C_K \setminus \{c(x)\}} d(x, c)
\]

denote the second-closest centroid to $x$, and let

\[
A_{ij} = \{\, x_n \mid c(x_n) = c_i,\ \tilde{c}(x_n) = c_j \,\}
\]

be the set of all points where $c_i$ is the closest centroid and $c_j$ is the second-closest.
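The two definitions above can be sketched in a few lines of NumPy. This is only an illustration: Euclidean distance is assumed for $d$, and the names `two_closest` and `neighbor_sets` as well as the toy data are hypothetical, not taken from the text.

```python
import numpy as np

def two_closest(X, centroids):
    """Indices of the closest and second-closest centroid for each row of X."""
    # Pairwise Euclidean distances (an assumption), shape (n_points, n_centroids).
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    order = np.argsort(dist, axis=1)
    return order[:, 0], order[:, 1]

def neighbor_sets(first, second, k):
    """A[i][j] lists the points whose closest centroid is i and second-closest is j."""
    A = [[[] for _ in range(k)] for _ in range(k)]
    for n, (i, j) in enumerate(zip(first, second)):
        A[i][j].append(n)
    return A

# Toy data: four points on a line and three centroids.
X = np.array([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0], [9.0, 0.0]])
centroids = np.array([[0.0, 0.0], [5.0, 0.0], [10.0, 0.0]])

first, second = two_closest(X, centroids)
A = neighbor_sets(first, second, len(centroids))
```

Here `A[0][1]` holds the indices of the two leftmost points, which are closest to the first centroid and second-closest to the middle one.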
Now the shadow value $s(x)$ for each observation $x$ is defined as

\[
s(x) = \frac{2\, d(x, c(x))}{d(x, c(x)) + d(x, \tilde{c}(x))}
\]

If $s(x)$ is close to 0, then the point is close to its cluster centroid; if $s(x)$ is close to 1, it is almost equidistant from the two centroids. Thus, a cluster that is well separated from all other clusters should have many points with small $s$ values. The average shadow value of all points where cluster $i$ is closest and $j$ is second-closest can be used as a simple measure of cluster proximity:

\[
s_{ij} =
\begin{cases}
|A_i|^{-1} \sum_{x \in A_{ij}} s(x), & A_{ij} \neq \emptyset \\
0, & A_{ij} = \emptyset
\end{cases}
\]

If $s_{ij} > 0$, then at least one data point in segment $i$ has $c_j$ as its second-closest centroid, and segments $i$ and $j$ have a common border. If $s_{ij}$ is close to 1, then most points in segment $i$ are almost equidistant from $c_i$ and $c_j$ and the clusters are not separated very well. Finally, if $s_{ij}$ is close to $|A_{ij}|/|A_i|$, then those points that are "between" segments $i$ and $j$ are almost equidistant from the two centroids. A denominator of $|A_i|$ rather than $|A_{ij}|$ is used so that a small set $A_{ij}$ consisting of only badly clustered points with large shadow values does not induce large cluster similarity. The graph with nodes $c_k$ and edge weights $s_{ij}$ is a directed graph; to simplify matters we use the corresponding undirected graph with average values of $s_{ij}$ and $s_{ji}$ as edge weights in this chapter.
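As a rough illustration of how these quantities fit together, the following NumPy sketch computes shadow values, the directed proximities $s_{ij}$ (normalized by $|A_i|$), and the symmetrized edge weights. It assumes Euclidean distance for $d$ and distinct centroids; the function name `shadow_graph` and the toy data are hypothetical.

```python
import numpy as np

def shadow_graph(X, centroids):
    """Shadow values, directed proximities s_ij, and undirected edge weights."""
    # Pairwise Euclidean distances (an assumption), shape (n_points, n_centroids).
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    order = np.argsort(dist, axis=1)
    first, second = order[:, 0], order[:, 1]

    rows = np.arange(len(X))
    d1 = dist[rows, first]    # distance to the closest centroid
    d2 = dist[rows, second]   # distance to the second-closest centroid
    s = 2.0 * d1 / (d1 + d2)  # shadow value; assumes d1 + d2 > 0

    k = len(centroids)
    sums = np.zeros((k, k))
    np.add.at(sums, (first, second), s)      # sum of s(x) over each A_ij
    sizes = np.bincount(first, minlength=k)  # |A_i|, not |A_ij|

    # s_ij stays 0 where A_ij (or the whole cluster i) is empty.
    s_ij = np.divide(sums, sizes[:, None],
                     out=np.zeros((k, k)), where=sizes[:, None] > 0)
    weights = (s_ij + s_ij.T) / 2.0          # average of s_ij and s_ji
    return s, s_ij, weights

# Toy data: four points on a line and three centroids.
X = np.array([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0], [9.0, 0.0]])
centroids = np.array([[0.0, 0.0], [5.0, 0.0], [10.0, 0.0]])
s, s_ij, weights = shadow_graph(X, centroids)
```

In this toy example the first and third centroids never appear as each other's closest and second-closest pair, so the corresponding edge weight is zero and the graph has no edge between them.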
Figures ., ., and . all contain the same graph using different projections, all of which show the linear structure of the four western clusters. The projection in Fig. . may give the misleading impression that clusters three and five overlap; the missing connection between the two nodes of the graph indicates correctly that this is an artefact of this particular projection.
11.3.4 Cluster Silhouettes
For high-dimensional data it can be hard (or even impossible) to check from any two-dimensional projection of the data whether clusters of points are well separated. From Fig. . we know that cluster three is separated from the others, but the remaining four clusters may either split into a wide continuum of electoral districts, or they may be separated from each other in a direction orthogonal to the projection.