Building a Clustering Model with Spark - Machine Learning with Spark

Making predictions using a clustering model

Using the trained K-means model is straightforward and similar to the other models we

have encountered so far, such as classification and regression. We can make a prediction

for a single Vector instance as follows:

val movie1 = movieVectors.first

val movieCluster = movieClusterModel.predict(movie1)

println(movieCluster)

We can also make predictions for multiple inputs by passing an RDD[Vector] to the

predict method of the model:

val predictions = movieClusterModel.predict(movieVectors)

println(predictions.take(10).mkString(","))

The resulting output is the cluster assignment for each of the first 10 data points:

0,0,1,1,2,1,0,1,1,1
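
To make the idea behind predict concrete, the following is a minimal plain-Scala sketch of nearest-center assignment, which is what a K-means prediction amounts to: each point receives the index of the closest cluster center. It does not use Spark; the object name NearestCenter, the helper functions, and the toy centers and points are assumptions made up for illustration and are not part of MLlib or of the movie data set.

```scala
object NearestCenter {
  // Squared Euclidean distance between two vectors of equal dimension.
  def squaredDistance(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same dimension")
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum
  }

  // Index of the center closest to the given point (hypothetical helper).
  def predict(centers: Seq[Array[Double]], point: Array[Double]): Int =
    centers.indices.minBy(i => squaredDistance(centers(i), point))

  def main(args: Array[String]): Unit = {
    // Toy cluster centers and points, invented for this example.
    val centers = Seq(Array(0.0, 0.0), Array(5.0, 5.0), Array(10.0, 0.0))
    val points = Seq(
      Array(0.5, -0.2), Array(4.0, 6.0), Array(9.0, 1.0), Array(6.0, 4.5))

    // Single prediction, analogous to passing one Vector.
    println(predict(centers, points.head)) // prints 0

    // Batch prediction, analogous to passing a collection of vectors.
    println(points.map(predict(centers, _)).mkString(",")) // prints 0,1,2,1
  }
}
```

The single-point and batch calls mirror the two forms of predict shown above; in Spark the batch form operates on a distributed RDD rather than a local Seq.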

Tip

Note that due to random initialization, the cluster assignments might change from one

training run to another, so your results might differ from those shown above. The cluster

IDs themselves have no inherent meaning; they are simply arbitrary labels, starting from

0.
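
Because the IDs are arbitrary, two runs can produce the same grouping under different labels. One way to compare them, sketched below in plain Scala, is to relabel each run's IDs in order of first appearance. The object name CanonicalLabels, the canonicalize helper, and the second run's assignments (runB) are hypothetical and only serve to illustrate the point.

```scala
object CanonicalLabels {
  // Relabel cluster IDs by order of first appearance:
  // the first ID seen becomes 0, the next new ID becomes 1, and so on.
  def canonicalize(labels: Seq[Int]): Seq[Int] = {
    val firstSeen = labels.distinct.zipWithIndex.toMap
    labels.map(firstSeen)
  }

  def main(args: Array[String]): Unit = {
    // Assignments shown in the output above.
    val runA = Seq(0, 0, 1, 1, 2, 1, 0, 1, 1, 1)
    // Hypothetical second run: same grouping, different IDs.
    val runB = Seq(2, 2, 0, 0, 1, 0, 2, 0, 0, 0)

    println(canonicalize(runB).mkString(",")) // prints 0,0,1,1,2,1,0,1,1,1
    println(canonicalize(runA) == canonicalize(runB)) // prints true
  }
}
```

Relabeling only removes differences in naming; if random initialization leads to a genuinely different grouping, the canonical sequences will still differ.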
