Generating predictions for the Kaggle/StumbleUpon evergreen classification dataset
We will use our logistic regression model as an example (the other models are used in the
same way):
val dataPoint = data.first
val prediction = lrModel.predict(dataPoint.features)
The following is the output:
prediction: Double = 1.0
We can see that, for the first data point in our training dataset, the model predicted a label
of 1 (that is, evergreen). Let's examine the true label for this data point:
val trueLabel = dataPoint.label
You can see the following output:
trueLabel: Double = 0.0
So, in this case, our model got it wrong!
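To see why the prediction is a hard 0.0 or 1.0 rather than a probability, it can help to sketch what a logistic regression prediction does: take the dot product of the weights and the features, add the intercept, pass the result through the logistic (sigmoid) function, and compare the score with a threshold. The following is a minimal, Spark-free sketch under stated assumptions, not MLlib's actual implementation; the names LogisticSketch, sigmoid, and predictLabel are hypothetical, and the 0.5 threshold is assumed here because it is the MLlib default:

```scala
// Hypothetical helper names, for illustration only (not part of MLlib).
object LogisticSketch {
  // Logistic (sigmoid) function: maps any real-valued margin into (0, 1).
  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  // Score = sigmoid(w . x + b); the label is 1.0 when the score
  // exceeds the threshold (assumed to be 0.5), and 0.0 otherwise.
  def predictLabel(
      weights: Array[Double],
      intercept: Double,
      features: Array[Double],
      threshold: Double = 0.5): Double = {
    require(weights.length == features.length, "dimension mismatch")
    val margin =
      weights.zip(features).map { case (w, x) => w * x }.sum + intercept
    if (sigmoid(margin) > threshold) 1.0 else 0.0
  }
}
```

Under this reading, a data point is labeled 1.0 whenever its margin is positive, so a prediction can be wrong even when the underlying score is only slightly above 0.5.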
We can also make predictions in bulk by passing in an RDD[Vector] as input:
val predictions = lrModel.predict(data.map(lp => lp.features))
predictions.take(5)
The following is the output:
Array[Double] = Array(1.0, 1.0, 1.0, 1.0, 1.0)
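Once we have predictions in bulk, a natural next step is to check how often they agree with the true labels, as we did by hand for the first data point. The following is a minimal sketch using plain Scala collections; the names AccuracySketch and accuracy are hypothetical, and the input is assumed to be a sequence of (prediction, label) pairs:

```scala
// Hypothetical helper, for illustration only.
object AccuracySketch {
  // Fraction of (prediction, label) pairs in which the two values agree.
  def accuracy(pairs: Seq[(Double, Double)]): Double = {
    require(pairs.nonEmpty, "need at least one (prediction, label) pair")
    pairs.count { case (prediction, label) => prediction == label }.toDouble /
      pairs.size
  }
}
```

On the RDD itself, the same pairs could presumably be built with data.map(lp => (lrModel.predict(lp.features), lp.label)), reusing the predict method and the features and label fields shown earlier.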