6.7.2 Methodology
Performance of the algorithms on all the datasets has been analyzed using
mutual information (MI) between the cluster and class labels. MI quantifies
the amount of statistical similarity between the cluster and class labels (16).
If X is a random variable for the cluster assignments and Y is a random
variable for the pre-existing labels on the same data, then their MI is given
by $I(X;Y) = E\left[\ln \frac{p(X,Y)}{p(X)\,p(Y)}\right]$, where the expectation is computed over the joint
distribution of $(X,Y)$ estimated from a particular clustering of the dataset
under consideration. To facilitate computing MI, for soft-moVMF we "harden"
the clustering by labeling each point with the cluster for which it
has the highest posterior probability (ties broken arbitrarily). Note
that variants of MI have been used to evaluate clustering algorithms by several
researchers. The authors of (32) used a related concept called variation of
information to compare clusterings. An MDL-based formulation that uses
the MI between cluster assignments and class labels was proposed in (22).
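For concreteness, the MI estimate described above can be computed directly from the empirical joint distribution of the (hardened) cluster labels and the class labels. The following Python sketch shows one way to do this; the function names are illustrative and not part of the original implementation.

import numpy as np

def mutual_information(cluster_labels, class_labels):
    # Estimate I(X;Y) in nats from the empirical joint distribution of
    # cluster assignments X and class labels Y.
    n = len(cluster_labels)
    clusters = np.unique(cluster_labels)
    classes = np.unique(class_labels)
    joint = np.zeros((len(clusters), len(classes)))
    for i, x in enumerate(clusters):
        for j, y in enumerate(classes):
            joint[i, j] = np.sum((cluster_labels == x) & (class_labels == y)) / n
    px = joint.sum(axis=1, keepdims=True)  # marginal p(X)
    py = joint.sum(axis=0, keepdims=True)  # marginal p(Y)
    nz = joint > 0                         # skip empty cells: 0 * ln 0 = 0
    return np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz]))

def harden(posteriors):
    # "Harden" a soft clustering: each point gets the cluster with the
    # highest posterior probability (np.argmax takes the first maximum,
    # so ties are broken deterministically rather than arbitrarily).
    return np.argmax(posteriors, axis=1)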
All results reported herein have been averaged over 10 runs. Within each
run, all algorithms were started from the same random initialization to
ensure fairness of comparison; different runs used different random
initializations. No algorithm was restarted within a given run, and all
were allowed to run to completion. Since the standard deviations of MI were reasonably small
for all algorithms, to reduce clutter, we have chosen to omit a display of error
bars in our plots. Also, for practical reasons, the estimate of κ was upper
bounded by a large number ($10^4$, in this case) in order to prevent numeric
overflows. For example, during the iterations, if a cluster has only one point,
the estimate of κ will be infinity (a divide-by-zero error). Upper bounding
the estimate of κ is similar in flavor to ensuring the non-singularity of the
estimated covariance of a multivariate Gaussian in a mixture of Gaussians.
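As a concrete illustration of this safeguard, the sketch below caps the concentration estimate at the $10^4$ bound mentioned above. It assumes the commonly used approximation $\hat{\kappa} \approx (\bar{r}d - \bar{r}^3)/(1 - \bar{r}^2)$, where $\bar{r}$ is the mean resultant length of a cluster's unit vectors; the names are illustrative.

import numpy as np

KAPPA_MAX = 1e4  # the 10^4 upper bound discussed in the text

def estimate_kappa(points):
    # points: (n, d) array of unit-norm vectors assigned to one cluster.
    n, d = points.shape
    r_bar = np.linalg.norm(points.sum(axis=0)) / n  # mean resultant length
    if r_bar >= 1.0:
        # A single-point cluster gives r_bar = 1, so the denominator below
        # vanishes and the estimate of kappa diverges; return the cap.
        return KAPPA_MAX
    kappa = (r_bar * d - r_bar ** 3) / (1.0 - r_bar ** 2)
    return min(kappa, KAPPA_MAX)  # bound the estimate to avoid overflow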
6.7.3 Simulated Datasets
First, to build some intuition and confidence in the working of our vMF-based
algorithms, we exhibit relevant details of soft-moVMF's behavior on the
small-mix dataset shown in Figure 6.4(a).
The clustering produced by our soft cluster assignment algorithm is shown
in Figure 6.4(b). The four points (taken clockwise) marked with solid circles
have cluster labels $(0.15, 0.85)$, $(0.77, 0.23)$, $(0.82, 0.18)$, and $(0.11, 0.89)$,
where a cluster label $(p, 1-p)$ for a point means that the point has
probability $p$ of belonging to Cluster 1 and probability $1-p$ of belonging
to Cluster 2. All other points are categorized as belonging to a single
cluster by ignoring small (less than 0.10) probability values.
The confusion matrix, obtained by "hardening" the clustering produced
by soft-moVMF for the small-mix dataset, is
$\begin{pmatrix} 26 & 1 \\ 0 & 23 \end{pmatrix}$. As is evident from
this confusion matrix, the clustering performed by soft-moVMF is excellent,
with only a single point misclassified.
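A confusion matrix such as this one is obtained by cross-tabulating the hardened cluster labels against the class labels; a minimal sketch follows, with hypothetical names, assuming both label sets are integers in 0..k-1.

import numpy as np

def confusion_matrix(cluster_labels, class_labels, k):
    # cm[i, j] counts the points assigned to cluster i whose class is j.
    cm = np.zeros((k, k), dtype=int)
    for x, y in zip(cluster_labels, class_labels):
        cm[x, y] += 1
    return cm

For small-mix, the single off-diagonal entry corresponds to the one misclassified point.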