Training and Testing

Separate a dataset into training and testing sets.

Chapter Goals:

  • Learn about splitting a dataset into training and testing sets

A. Training and testing sets

We've discussed in depth how to fit a model on data and labels. However, once we fit the model, how do we evaluate it? It is a bad idea to evaluate a model solely on the same dataset it was fitted on, because the model's parameters are already tuned for that dataset. Instead, we need to split the original dataset into two datasets: one for training and one for testing.

The training set is used for fitting the model on data (i.e. training the model), while the testing set is ...