Tuning model parameters
The previous section showed how model performance is affected by feature extraction and
selection, by the form of the input data, and by a model's assumptions about the data
distribution. So far, we have discussed model parameters only in passing, but they also
play a significant role in model performance.
MLlib's default train methods use default values for the parameters of each model. Let's
take a deeper look at them.
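For example, a minimal sketch of training with defaults (assuming a Spark context and an existing RDD of labeled points; this will not run outside a Spark application):

```scala
import org.apache.spark.mllib.classification.LogisticRegressionWithSGD
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// 'data' is assumed to be an existing RDD[LabeledPoint].
// Passing only the number of iterations leaves stepSize, regParam,
// and miniBatchFraction at their default values.
val model = LogisticRegressionWithSGD.train(data, 10)
```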
Linear models
Both logistic regression and SVM share the same parameters, because they use the same
underlying optimization technique of stochastic gradient descent ( SGD ). They differ only
in the loss function applied. If we take a look at the class definition for logistic regression
in MLlib, we will see the following definition:
class LogisticRegressionWithSGD private (
    private var stepSize: Double,
    private var numIterations: Int,
    private var regParam: Double,
    private var miniBatchFraction: Double)
  extends GeneralizedLinearAlgorithm[LogisticRegressionModel] ...
We can see that the arguments that can be passed to the constructor are stepSize,
numIterations, regParam, and miniBatchFraction. Of these, all except regParam
are related to the underlying optimization technique.
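To make the role of each argument concrete, here is a minimal, self-contained sketch of minibatch SGD for the logistic loss in plain Scala. It is not MLlib's implementation (MLlib operates on RDDs and delegates the update to an Updater class; the names SgdSketch and train here are purely illustrative, and the L2 term is folded directly into the step for brevity), but it shows where stepSize, numIterations, regParam, and miniBatchFraction each enter the computation:

```scala
import scala.util.Random

object SgdSketch {
  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  // data: (label in {0, 1}, feature vector); returns the learned weights
  def train(
      data: Array[(Double, Array[Double])],
      stepSize: Double,          // base learning rate for each gradient step
      numIterations: Int,        // number of SGD iterations
      regParam: Double,          // L2 regularization strength
      miniBatchFraction: Double  // fraction of the data sampled per iteration
  ): Array[Double] = {
    val rng = new Random(42)
    val dims = data.head._2.length
    val w = Array.fill(dims)(0.0)
    for (i <- 1 to numIterations) {
      // Sample a minibatch; miniBatchFraction controls its expected size
      val batch = data.filter(_ => rng.nextDouble() < miniBatchFraction)
      if (batch.nonEmpty) {
        // Accumulate the logistic-loss gradient over the minibatch
        val grad = Array.fill(dims)(0.0)
        for ((label, x) <- batch) {
          val margin = w.zip(x).map { case (a, b) => a * b }.sum
          val err = sigmoid(margin) - label
          for (j <- 0 until dims) grad(j) += err * x(j)
        }
        // Step size decays as stepSize / sqrt(iteration), as in
        // MLlib's GradientDescent; the L2 penalty scales with regParam
        val thisStep = stepSize / math.sqrt(i)
        for (j <- 0 until dims)
          w(j) -= thisStep * (grad(j) / batch.length + regParam * w(j))
      }
    }
    w
  }
}
```

Increasing miniBatchFraction toward 1.0 makes each gradient estimate less noisy at the cost of more computation per iteration, while stepSize and numIterations trade off convergence speed against stability.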
The instantiation code for logistic regression initializes the Gradient, Updater, and
Optimizer and sets the relevant arguments for the Optimizer (GradientDescent in
this case):
private val gradient = new LogisticGradient()
private val updater = new SimpleUpdater()
override val optimizer = new GradientDescent(gradient, updater)
  .setStepSize(stepSize)
  .setNumIterations(numIterations)
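Because the optimizer is exposed as a public field, we can also override these parameters ourselves instead of relying on the defaults. A sketch of this usage, assuming MLlib 1.x (where the no-argument constructor is public and trainingData is an existing RDD[LabeledPoint]):

```scala
import org.apache.spark.mllib.classification.LogisticRegressionWithSGD

val lr = new LogisticRegressionWithSGD()
// Override the optimizer's settings before training
lr.optimizer
  .setStepSize(0.1)
  .setNumIterations(200)
  .setRegParam(0.01)
  .setMiniBatchFraction(1.0)
// val model = lr.run(trainingData)
```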