Using MLlib's built-in evaluation functions
While we have computed MSE, RMSE, and MAPK from scratch, and it is a useful learning exercise to do so, MLlib provides convenience functions to do this for us in the RegressionMetrics and RankingMetrics classes.
RMSE and MSE
First, we will compute the MSE and RMSE metrics using RegressionMetrics. We will instantiate a RegressionMetrics instance by passing in an RDD of key-value pairs that represent the predicted and true values for each data point, as shown in the following code snippet. Here, we will again use the ratingsAndPredictions RDD we computed in our earlier example:
import org.apache.spark.mllib.evaluation.RegressionMetrics
val predictedAndTrue = ratingsAndPredictions.map { case ((user, product), (predicted, actual)) => (predicted, actual) }
val regressionMetrics = new RegressionMetrics(predictedAndTrue)
We can then access various metrics, including MSE and RMSE. We will print out these
metrics here:
println("Mean Squared Error = " +
regressionMetrics.meanSquaredError)
println("Root Mean Squared Error = " +
regressionMetrics.rootMeanSquaredError)
You will see that the output for MSE and RMSE is exactly the same as the metrics we computed earlier:
Mean Squared Error = 0.08231947642632852
Root Mean Squared Error = 0.2869137090247319
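Note that the RegressionMetrics instance exposes a few other standard regression metrics beyond MSE and RMSE, such as the mean absolute error and the R-squared coefficient. For example, reusing the regressionMetrics instance from above:
println("Mean Absolute Error = " + regressionMetrics.meanAbsoluteError)
println("R-squared = " + regressionMetrics.r2)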
MAP
As we did for MSE and RMSE, we can compute ranking-based evaluation metrics using MLlib's RankingMetrics class. Similar to our own average precision function, we need to pass in an RDD of key-value pairs, where the key is an array of predicted item IDs for a user and the value is an array of the actual item IDs.
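As an illustrative sketch (not the exact code from the earlier example), suppose we had an RDD named predictedAndTrueForRanking of type RDD[(Array[Int], Array[Int])], pairing each user's predicted item IDs with the item IDs they actually interacted with; the name is hypothetical and used here only for illustration. Computing MAP would then look like this:
import org.apache.spark.mllib.evaluation.RankingMetrics
// predictedAndTrueForRanking: RDD[(Array[Int], Array[Int])] of
// (predicted item IDs, actual item IDs) per user -- assumed to exist
val rankingMetrics = new RankingMetrics(predictedAndTrueForRanking)
println("Mean Average Precision = " + rankingMetrics.meanAveragePrecision)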