Building a Classification Model with Spark - Machine Learning with Spark


Training classification models

Now that we have extracted some basic features from our dataset and created our input RDD, we are ready to train a number of models. To compare the performance and use of different models, we will train a model using logistic regression, SVM, naïve Bayes, and a decision tree. You will notice that training each model looks nearly identical, although each has its own specific model parameters that can be set. MLlib sets sensible defaults in most cases, but in practice, the best parameter setting should be selected using evaluation techniques, which we will cover later in this chapter.
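
As a rough illustration of how similar the four training calls look, here is a minimal sketch using the RDD-based MLlib API. It assumes a Spark 1.x-style MLlib (in particular, the static `LogisticRegressionWithSGD.train` method is not available in recent Spark releases). The object name `TrainClassifiers`, the tiny hand-made dataset, and the values of `numIterations` and `maxTreeDepth` are placeholders chosen for this example, not values taken from the text.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.classification.{LogisticRegressionWithSGD, NaiveBayes, SVMWithSGD}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.tree.DecisionTree
import org.apache.spark.mllib.tree.configuration.Algo
import org.apache.spark.mllib.tree.impurity.Entropy

// Hypothetical driver object; the name is an assumption for this sketch.
object TrainClassifiers {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("TrainClassifiers").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Toy stand-in for the input RDD of labeled feature vectors.
    // Features are kept non-negative because MLlib's naive Bayes requires that.
    val data = sc.parallelize(Seq(
      LabeledPoint(0.0, Vectors.dense(0.1, 0.2)),
      LabeledPoint(0.0, Vectors.dense(0.2, 0.1)),
      LabeledPoint(1.0, Vectors.dense(0.9, 0.8)),
      LabeledPoint(1.0, Vectors.dense(0.8, 0.9))
    )).cache()

    // Illustrative parameter values only; tune them with proper evaluation.
    val numIterations = 10
    val maxTreeDepth = 5

    // Each model is trained with one call on the same RDD[LabeledPoint].
    val lrModel = LogisticRegressionWithSGD.train(data, numIterations)
    val svmModel = SVMWithSGD.train(data, numIterations)
    val nbModel = NaiveBayes.train(data)
    val dtModel = DecisionTree.train(data, Algo.Classification, Entropy, maxTreeDepth)

    // Compare each model's prediction for the first data point with its true label.
    val point = data.first()
    println(s"true label:          ${point.label}")
    println(s"logistic regression: ${lrModel.predict(point.features)}")
    println(s"SVM:                 ${svmModel.predict(point.features)}")
    println(s"naive Bayes:         ${nbModel.predict(point.features)}")
    println(s"decision tree:       ${dtModel.predict(point.features)}")

    sc.stop()
  }
}
```

Apart from the model-specific arguments (the iteration count for the two linear models, the impurity measure and maximum depth for the tree), each call takes the same input RDD and returns a model with a `predict` method, which is what makes side-by-side comparison straightforward.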
