Initialization methods
The standard initialization method for K-means, usually simply referred to as the random
method, starts by randomly assigning each data point to a cluster before proceeding with
the first update step.
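As a rough illustration of the random method described above, the following minimal sketch assigns each point to a random cluster and then runs the first update step to get initial centroids. It is plain Python, not MLlib code. The function name random_partition_init, the sample points, and the fallback used for an empty cluster are all assumptions made for this example.

```python
import random


def random_partition_init(points, k, seed=None):
    """Randomly assign each point to one of k clusters, then compute
    the initial centroids as the mean of each cluster's points."""
    rng = random.Random(seed)
    dim = len(points[0])

    # Step 1: give every data point a random cluster label.
    labels = [rng.randrange(k) for _ in points]

    # Step 2 (first update step): accumulate per-cluster sums and counts.
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for point, label in zip(points, labels):
        counts[label] += 1
        for d, value in enumerate(point):
            sums[label][d] += value

    centroids = []
    for c in range(k):
        if counts[c] == 0:
            # Assumption: an empty cluster falls back to a random data point.
            centroids.append(list(rng.choice(points)))
        else:
            centroids.append([s / counts[c] for s in sums[c]])
    return labels, centroids


# Hypothetical sample data: two loose groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
labels, centroids = random_partition_init(points, k=2, seed=42)
```

Because the labels are random, the initial centroids tend to land near the overall mean of the data, which is one reason other initialization methods are often preferred.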
MLlib provides a parallel variant of the K-means++ initialization method, called K-means||,
which is the default initialization method used.
Note
See http://en.wikipedia.org/wiki/K-means_clustering#Initialization_methods and
http://en.wikipedia.org/wiki/K-means%2B%2B for more information.
The results of using K-means++ are shown here. Note that this time, the difficult lower-
right points have been mostly correctly clustered.