Cluster Analysis - Visual Data Mining: The VisMiner Approach

\[
\mathrm{SSE} = \sum_{x \in C} \operatorname{dist}(c, x)^2
\]

where x is an observation in cluster C and c is the cluster centroid. If all

observations are tightly packed around the centroid, the SSE is relatively low.

When observations are spread out, the SSE is greater.
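
As a rough illustration of the SSE definition, the sketch below computes it for a tightly packed cluster and a spread-out one. It assumes that dist is the Euclidean distance and that the centroid is the component-wise mean of the observations; the helper names squared_distance, centroid, and sse are hypothetical and chosen only for this example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points (assumed distance measure)."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def centroid(points):
    """Component-wise mean of a list of equal-length numeric tuples."""
    m = len(points)
    return tuple(sum(coords) / m for coords in zip(*points))

def sse(points):
    """Sum of squared distances from each observation to the cluster centroid."""
    c = centroid(points)
    return sum(squared_distance(c, x) for x in points)

# Two illustrative clusters with four observations each.
tight = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0)]
spread = [(0.0, 0.0), (0.0, 4.0), (4.0, 0.0), (4.0, 4.0)]

print(sse(tight))   # 2.0
print(sse(spread))  # 32.0
```

Both clusters have the same number of observations, so the difference in SSE here reflects only how far the observations lie from their centroid.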

Since the individual clusters in a clustering will vary in size (number of observations), SSE will generally be larger in clusters containing more observations. To directly compare cohesiveness between clusters, we compute the mean squared error (MSE) of a cluster as:

\[
\mathrm{MSE} = \frac{\mathrm{SSE}}{m}
\]

where m is the number of observations belonging to the cluster. A special case to be aware of is the single-observation cluster, where SSE and MSE will always be zero because the lone observation coincides with its centroid.
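
The effect of cluster size can be seen in a small sketch. It again assumes Euclidean distance and a mean-based centroid, and the helper names (squared_distance, centroid, sse, mse) are hypothetical. The second cluster simply repeats the observations of the first, so it is equally cohesive but twice as large.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points (assumed distance measure)."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def centroid(points):
    """Component-wise mean of a list of equal-length numeric tuples."""
    m = len(points)
    return tuple(sum(coords) / m for coords in zip(*points))

def sse(points):
    """Sum of squared distances from each observation to the cluster centroid."""
    c = centroid(points)
    return sum(squared_distance(c, x) for x in points)

def mse(points):
    """SSE divided by the number of observations m in the cluster."""
    return sse(points) / len(points)

small = [(0.0, 0.0), (2.0, 0.0)]
large = [(0.0, 0.0), (2.0, 0.0), (0.0, 0.0), (2.0, 0.0)]  # same spread, twice the size
single = [(3.0, 5.0)]                                      # single-observation cluster

print(sse(small), mse(small))    # 2.0 1.0
print(sse(large), mse(large))    # 4.0 1.0
print(sse(single), mse(single))  # 0.0 0.0
```

The SSE doubles with the cluster size while the MSE stays the same, which is why the MSE is the more direct basis for comparing cohesiveness between clusters.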

To compare clusterings (that is, clusterings generated by different proximity-based clustering algorithms or by multiple executions of the same algorithm) with respect to overall cluster cohesion, we compute the total sum of squared error (TSSE) as:

\[
\mathrm{TSSE} = \sum_{c \in CL} \mathrm{SSE}_c
\]

where c is a cluster within the full clustering CL. Be forewarned when comparing clusterings with significantly different cluster counts: the greater the number of clusters in a clustering, the lower the TSSE tends to be. At the extreme, a clustering of one observation per cluster has a TSSE of zero, yet one would certainly not expect this to be a useful clustering.
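
The dependence of TSSE on the number of clusters can be sketched on a four-observation dataset. The example assumes Euclidean distance and mean-based centroids, and it represents a clustering as a list of clusters, each a list of points; that representation and the helper names (squared_distance, centroid, sse, tsse) are hypothetical choices for illustration only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points (assumed distance measure)."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def centroid(points):
    """Component-wise mean of a list of equal-length numeric tuples."""
    m = len(points)
    return tuple(sum(coords) / m for coords in zip(*points))

def sse(points):
    """Sum of squared distances from each observation to the cluster centroid."""
    c = centroid(points)
    return sum(squared_distance(c, x) for x in points)

def tsse(clustering):
    """Total SSE: the sum of the SSE of every cluster in the clustering."""
    return sum(sse(cluster) for cluster in clustering)

data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

one_cluster = [data]                      # K = 1
two_clusters = [data[:2], data[2:]]       # K = 2
singletons = [[point] for point in data]  # K = 4, one observation per cluster

print(tsse(one_cluster))   # 104.0
print(tsse(two_clusters))  # 4.0
print(tsse(singletons))    # 0.0
```

The singleton clustering achieves the lowest possible TSSE while telling us nothing about the data, so TSSE values are most meaningful when the clusterings being compared have similar cluster counts.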

An overall measure of cluster separation is the total “between group” sum of squares (TSSB). TSSB is the sum of the squared distances of the cluster centroids from the overall dataset mean (the dataset centroid), each weighted by the number of observations in the cluster. It is computed as:

\[
\mathrm{TSSB} = \sum_{i=1}^{K} m_i \operatorname{dist}(c_i, c)^2
\]

where m_i is the number of observations in cluster i, K is the total number of clusters, c_i is the centroid of cluster i, and c is the overall dataset centroid. When comparing clusterings, the greater the TSSB of the clustering, the better the separation.
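
A minimal sketch of the TSSB computation follows, comparing two ways of splitting the same four observations into two clusters. It assumes Euclidean distance, mean-based centroids, and a clustering represented as a list of clusters; the helper names (squared_distance, centroid, tssb) are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points (assumed distance measure)."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def centroid(points):
    """Component-wise mean of a list of equal-length numeric tuples."""
    m = len(points)
    return tuple(sum(coords) / m for coords in zip(*points))

def tssb(clustering):
    """Sum over clusters of m_i times the squared distance from c_i to c."""
    all_points = [x for cluster in clustering for x in cluster]
    c = centroid(all_points)  # overall dataset centroid
    return sum(
        len(cluster) * squared_distance(centroid(cluster), c)
        for cluster in clustering
    )

# Same dataset, split along the wide axis versus the narrow axis.
well_separated = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
poorly_separated = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 2.0), (10.0, 2.0)]]

print(tssb(well_separated))    # 100.0
print(tssb(poorly_separated))  # 4.0
```

Both clusterings have K = 2, so the comparison is not distorted by differing cluster counts; the split whose centroids lie far from the dataset centroid has the larger TSSB.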
