Cross-validation
So far, we have only briefly mentioned the idea of cross-validation and out-of-
sample testing. Cross-validation is a critical part of real-world machine learning and is
central to many model selection and parameter tuning pipelines.
The general idea behind cross-validation is that we want to know how our model will per-
form on unseen data. Evaluating this on real, live data (for example, in a production sys-
tem) is risky, because we don't really know whether the trained model is the best in the
sense of being able to make accurate predictions on new data. As we saw previously with
regard to regularization, our model might have over-fit the training data and be poor at
making predictions on data it has not been trained on.
Cross-validation provides a mechanism where we use part of our available dataset to train
our model and another part to evaluate the performance of this model. As the model is
tested on data that it has not seen during the training phase, its performance, when evalu-
ated on this part of the dataset, gives us an estimate of how well our model generalizes
to new data points.
Here, we will implement a simple cross-validation evaluation approach using a train-test
split. We will divide our dataset into two non-overlapping parts. The first dataset is used to
train our model and is called the training set. The second dataset, called the test set or hold-
out set, is used to evaluate the performance of our model using our chosen evaluation
measure. Common splits used in practice include 50/50, 60/40, and 80/20 splits, but you
can use any split as long as the training set is not too small for the model to learn (gener-
ally, at least 50 percent is a practical minimum).
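As an illustration, the following is a minimal sketch of such a split in Python using scikit-learn's train_test_split; the feature matrix X, the labels y, and the 80/20 ratio are illustrative assumptions rather than part of the original example.

import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical dataset: 1,000 examples with 10 features and binary labels.
X = np.random.rand(1000, 10)
y = np.random.randint(0, 2, size=1000)

# Hold out 20 percent of the data; the model never sees X_test or y_test during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

print(X_train.shape, X_test.shape)  # (800, 10) (200, 10)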
In many cases, three sets are created: a training set, an evaluation set (which is used like the
above test set to tune the model parameters such as lambda and step size), and a test set
(which is never used to train a model or tune any parameters, but is only used to generate
an estimated true performance on completely unseen data).
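One way to obtain such a three-way split is simply to apply the same random splitting twice, as in the sketch below; the 60/20/20 proportions and the hypothetical X and y arrays are assumptions for illustration only.

import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 10)             # hypothetical features
y = np.random.randint(0, 2, size=1000)   # hypothetical labels

# First split off a 20 percent test set that is only used for the final estimate.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Then carve a validation set out of the remaining data for tuning parameters
# such as the regularization strength and step size.
# 0.25 of the remaining 80 percent is 20 percent of the original data.
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 600 200 200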
Note
Here, we will explore a simple train-test split approach. There are many cross-validation
techniques that are more exhaustive and complex.
One popular example is K-fold cross-validation, where the dataset is split into K non-over-
lapping folds. The model is trained on K-1 folds of data and tested on the remaining, held-
out fold; this is repeated K times so that each fold serves once as the test set, and the K
performance estimates are averaged.
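For concreteness, here is a small sketch of 5-fold cross-validation in Python; the use of scikit-learn's KFold, a logistic regression model, and accuracy as the evaluation measure are illustrative assumptions, not the pipeline described in the text.

import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression

X = np.random.rand(500, 10)              # hypothetical features
y = np.random.randint(0, 2, size=500)    # hypothetical labels

kf = KFold(n_splits=5, shuffle=True, random_state=42)
scores = []
for train_idx, test_idx in kf.split(X):
    # Train on 4 folds and evaluate on the held-out fold the model has not seen.
    model = LogisticRegression()
    model.fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[test_idx], y[test_idx]))

# Average the per-fold scores to get the cross-validated performance estimate.
print("mean accuracy over 5 folds:", np.mean(scores))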