know the clustering result before running the clustering algorithm. Priebe and
Marchette [26] address this problem by using a nonparametric method. Interested
readers are referred to it for details.
The reader should note the analogy between the K-means and E-M algorithms. The analogy is closest when d_M is used as the dissimilarity measure between two objects in the K-means algorithm. Therefore, in the following discussion of the analogy between the two algorithms, we assume that d_M, defined in Eq. 5.3, is the dissimilarity measure for the K-means algorithm.
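As a concrete illustration, the following Python sketch computes the Mahalanobis distance between an object and a cluster center. It assumes that Eq. 5.3 defines d_M in the standard way (with the square root); the function and variable names are ours, not the text's.

```python
import numpy as np

def mahalanobis(x, mu, sigma):
    """Mahalanobis distance d_M between an object x and a center mu,
    given a variance-covariance matrix sigma (assumed per Eq. 5.3)."""
    diff = x - mu
    return float(np.sqrt(diff @ np.linalg.inv(sigma) @ diff))

# Example: with sigma = I, d_M reduces to the Euclidean distance.
x = np.array([1.0, 2.0])
mu = np.array([0.0, 0.0])
print(mahalanobis(x, mu, np.eye(2)))  # sqrt(5) ~ 2.236
```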
The initialization steps of the two algorithms are similar. In the K-means algorithm, randomly selecting K objects as the initial centers of the K clusters is analogous to the E-step in the first iteration of the E-M algorithm. In the E-M algorithm, the K mean vectors µ_j are randomly selected, which corresponds to the initialization of the K cluster centers in the K-means algorithm. We also randomly select K variance-covariance matrices Σ_j, j = 1, 2, ..., K, in the E-M algorithm, which corresponds to the initialization of the K variance-covariance matrices in the K-means algorithm.
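A minimal initialization sketch, assuming the data are the rows of a NumPy array X. The text says the Σ_j are randomly selected; for simplicity this sketch starts every cluster from the identity matrix, which is one common concrete choice rather than the text's prescription.

```python
import numpy as np

def initialize(X, K, rng=None):
    """Randomly pick K objects as the initial cluster centers and start
    every cluster with the identity variance-covariance matrix
    (a simplification of the random Sigma_j in the text)."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    centers = X[rng.choice(n, size=K, replace=False)]  # K initial mu_j
    covs = np.stack([np.eye(d) for _ in range(K)])     # K initial Sigma_j
    return centers, covs
```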
The analogy also exists in assigning each object to a cluster. In Eq. 5.13, the cluster label of object i at iteration t is determined by the maximum similarity, or the minimum dissimilarity, with the cluster center. In Eq. 5.16, it is determined by the maximum likelihood of the object under the distribution with mean µ_j and variance-covariance matrix Σ_j. In calculating this likelihood, the term (x_i − µ_j)^T Σ_j^{−1} (x_i − µ_j) in Eq. 5.16 is just the Mahalanobis distance, which is used as the dissimilarity measure in the K-means algorithm.
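The sketch below implements this assignment rule: each object receives the label of the cluster whose quadratic form (x_i − µ_j)^T Σ_j^{−1} (x_i − µ_j) is smallest. (When the Σ_j differ across clusters, the full Gaussian log-likelihood also carries a log-determinant term, so the two rules coincide exactly when the Σ_j are equal.) Again, the function names are illustrative.

```python
import numpy as np

def assign(X, centers, covs):
    """Assignment step: label each object with the cluster that minimizes
    the quadratic form (x_i - mu_j)^T Sigma_j^{-1} (x_i - mu_j)."""
    n, K = X.shape[0], centers.shape[0]
    d2 = np.empty((n, K))
    for j in range(K):
        diff = X - centers[j]
        inv = np.linalg.inv(covs[j])
        # Row-wise quadratic form: sum_d sum_e diff[n,d] inv[d,e] diff[n,e]
        d2[:, j] = np.einsum('nd,de,ne->n', diff, inv, diff)
    return d2.argmin(axis=1)
```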
Another analogy between the two algorithms is the update of the cluster centers (the mean vectors in the E-M algorithm) and of the variance-covariance matrices. In both algorithms, the cluster centers and the variance-covariance matrices are updated to the mean vectors and the variance-covariance matrices, respectively, of the objects assigned to the same cluster.
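A corresponding update-step sketch, with the caveat that a production implementation would also need to handle clusters that end up empty or with a single member:

```python
import numpy as np

def update(X, labels, K):
    """Update step: re-estimate each cluster's center and
    variance-covariance matrix from the objects assigned to it.
    Assumes every cluster keeps at least two members."""
    d = X.shape[1]
    centers = np.empty((K, d))
    covs = np.empty((K, d, d))
    for j in range(K):
        members = X[labels == j]
        centers[j] = members.mean(axis=0)
        covs[j] = np.cov(members, rowvar=False)
    return centers, covs
```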
Because of the similarity between the K-means and E-M algorithms, the convergence of the K-means algorithm can be studied through the convergence properties of the E-M algorithm. Xu and Jordan give the details of the convergence of the E-M algorithm [32].
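Putting the pieces together, a sketch of the full iteration, reusing the initialize, assign, and update functions above. Stopping when the centers stop moving is one simple convergence criterion, not the only possible one.

```python
import numpy as np

def cluster(X, K, max_iter=100, tol=1e-6, rng=0):
    """Alternate the assignment and update steps until the centers stop
    moving -- the hard-assignment counterpart of E-M convergence."""
    centers, covs = initialize(X, K, rng)
    for _ in range(max_iter):
        labels = assign(X, centers, covs)
        new_centers, new_covs = update(X, labels, K)
        shift = np.linalg.norm(new_centers - centers)
        centers, covs = new_centers, new_covs
        if shift < tol:
            break
    return labels, centers, covs
```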
5.3.3. Hierarchical Clustering
Hierarchical clustering comes in two forms: agglomerative and divisive hierarchical clustering, which work in opposite directions. Agglomerative hierarchical clustering initially takes each object as a single cluster. At each step, it merges the two clusters with the maximum similarity, or the minimum dissimilarity, into one. The algorithm stops when all objects are assigned to the same cluster.
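For example, SciPy's hierarchy module performs exactly this bottom-up merging; the sketch below uses single linkage (merging the clusters with the minimum pairwise dissimilarity) on synthetic data of our own making.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Two well-separated synthetic groups of 10 points each.
X = np.vstack([rng.normal(0, 1, (10, 2)), rng.normal(5, 1, (10, 2))])

# Agglomerative clustering: start from 20 singleton clusters and
# repeatedly merge the two closest clusters (single linkage).
Z = linkage(X, method='single', metric='euclidean')

# Cut the resulting tree to recover two flat clusters.
labels = fcluster(Z, t=2, criterion='maxclust')
print(labels)
```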