14/09/02 21:53:58 INFO KMeans: KMeans reached the max number of iterations: 10.
14/09/02 21:53:58 INFO KMeans: The cost for the best run is 2586.298785925147.
...
movieClusterModel:
org.apache.spark.mllib.clustering.KMeansModel =
org.apache.spark.mllib.clustering.KMeansModel@71c6f512
As the KMeans log messages show, model training reached the maximum number of
iterations, so the training process did not stop early based on the convergence
criterion. The output also shows the training set error (that is, the value of the
K-means objective function) for the best run.
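To make the reported cost concrete, here is a minimal plain-Scala sketch of the K-means objective: the sum, over all points, of the squared distance to the nearest cluster center. It is an illustration only and not MLlib's code; the object name KMeansCostSketch, the Point alias, and the toy data are assumptions made for this example.

```scala
// Plain-Scala sketch of the K-means objective (within-cluster sum of squares).
// Points and centers are ordinary arrays here, not MLlib Vectors or RDDs.
object KMeansCostSketch {
  type Point = Array[Double]

  // Squared Euclidean distance between two points of equal dimension.
  def squaredDistance(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Cost = sum over all points of the squared distance to the nearest center.
  def cost(points: Seq[Point], centers: Seq[Point]): Double =
    points.map(p => centers.map(c => squaredDistance(p, c)).min).sum

  def main(args: Array[String]): Unit = {
    // Hypothetical toy data: two points near (0, 1) and one on (10, 10).
    val points = Seq(Array(0.0, 0.0), Array(0.0, 2.0), Array(10.0, 10.0))
    val centers = Seq(Array(0.0, 1.0), Array(10.0, 10.0))
    println(cost(points, centers)) // prints 2.0
  }
}
```

On a trained MLlib model, the same quantity can be obtained for an RDD of vectors with the model's computeCost method, for example movieClusterModel.computeCost(movieVectors).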
We can try a much larger setting for the maximum iterations and use only one training run
to see an example where the K-means model converges:
val movieClusterModelConverged = KMeans.train(movieVectors,
numClusters, 100)
You should see the text KMeans converged in ... iterations in the model output.
It indicates that training stopped before the iteration limit because the solution
no longer changed by more than the tolerance level (MLlib checks how far the cluster
centers move between iterations):
...
14/09/02 22:04:38 INFO SparkContext: Job finished: collectAsMap at KMeans.scala:193, took 0.040685 s
14/09/02 22:04:38 INFO KMeans: Run 0 finished in 34 iterations
14/09/02 22:04:38 INFO KMeans: Iterations took 0.812 seconds.
14/09/02 22:04:38 INFO KMeans: KMeans converged in 34 iterations.
14/09/02 22:04:38 INFO KMeans: The cost for the best run is 2584.9354332904104.
...
movieClusterModelConverged:
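
The difference between the two training runs (stopping at the iteration limit versus converging first) can be illustrated with a small plain-Scala sketch of the standard K-means loop. This is a simplified illustration under stated assumptions, not MLlib's implementation: the object name KMeansConvergenceSketch, the run signature, and the toy data are hypothetical, and epsilon is used here as a threshold on how far any center moves in one iteration.

```scala
// Plain-Scala sketch of K-means iterations with two stopping rules:
// an iteration limit and a tolerance on cluster-center movement.
object KMeansConvergenceSketch {
  type Point = Array[Double]

  def squaredDistance(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Component-wise mean of a non-empty group of points.
  def mean(ps: Seq[Point]): Point = {
    val dim = ps.head.length
    Array.tabulate(dim)(i => ps.map(_(i)).sum / ps.size)
  }

  // Returns the final centers, the number of iterations run, and whether
  // the run converged before reaching maxIterations.
  def run(points: Seq[Point], initial: Seq[Point], maxIterations: Int,
          epsilon: Double): (Seq[Point], Int, Boolean) = {
    var centers = initial
    var iteration = 0
    var converged = false
    while (iteration < maxIterations && !converged) {
      // Assign each point to the index of its nearest center.
      val assigned = points.groupBy { p =>
        centers.indices.minBy(i => squaredDistance(p, centers(i)))
      }
      // Recompute each center; a center with no points keeps its position.
      val updated = centers.indices.map { i =>
        assigned.get(i).map(mean).getOrElse(centers(i))
      }
      // Converged when no center moved farther than epsilon.
      converged = centers.zip(updated).forall { case (old, cur) =>
        squaredDistance(old, cur) <= epsilon * epsilon
      }
      centers = updated
      iteration += 1
    }
    (centers, iteration, converged)
  }

  def main(args: Array[String]): Unit = {
    val points = Seq(Array(0.0, 0.0), Array(0.0, 2.0),
                     Array(10.0, 10.0), Array(10.0, 12.0))
    val initial = Seq(Array(0.0, 0.0), Array(10.0, 10.0))
    val (_, iterations, converged) = run(points, initial, 100, 1e-4)
    println(s"iterations=$iterations converged=$converged")
    // prints iterations=2 converged=true
  }
}
```

With maxIterations set to 1 on the same toy data, the loop stops after a single iteration with converged=false, which mirrors the "reached the max number of iterations" message from the first training run.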