Training a clustering model
Training for K-means in MLlib takes an approach similar to the other models: we pass an RDD that contains our training data to the train method of the KMeans object. Note that here we do not use LabeledPoint instances, as labels are not used in clustering; only the feature vectors are. Thus, we use an RDD[Vector] as input to the train method.
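
As a minimal sketch of the call described above, the following trains a K-means model on an RDD[Vector] built from a handful of made-up points. The toy data, the object name KMeansTrainSketch, the local master setting, and the values chosen for the number of clusters and iterations are all assumptions for illustration; they are not taken from the text.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.{KMeans, KMeansModel}
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.rdd.RDD

object KMeansTrainSketch {
  // Hypothetical toy data: two loose groups of 2-D feature vectors.
  // There are no labels, so plain Vectors are used rather than LabeledPoint.
  val points: Seq[Vector] = Seq(
    Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.2), Vectors.dense(0.2, 0.1),
    Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 8.9), Vectors.dense(8.9, 9.2)
  )

  // Assumed hyperparameters, chosen only for this illustration.
  val numClusters = 2
  val numIterations = 10

  def train(sc: SparkContext): KMeansModel = {
    // K-means makes several passes over the data, so caching the RDD helps.
    val data: RDD[Vector] = sc.parallelize(points).cache()
    // The input is an RDD[Vector]; k and the maximum iteration count follow.
    KMeans.train(data, numClusters, numIterations)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("KMeansTrainSketch")
    val sc = new SparkContext(conf)
    try {
      val model = train(sc)
      // Print the learned cluster centers, one Vector per cluster.
      model.clusterCenters.foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

The returned KMeansModel exposes the learned centers through clusterCenters, and its predict method assigns a feature vector to the index of its nearest center.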