XGBoost Hyperparameters: Early Stopping

Learn how early stopping can reduce the overfitting of a gradient boosted tree ensemble trained with XGBoost.

Early stopping as a method for reducing overfitting

When training ensembles of decision trees with XGBoost, there are many options for reducing overfitting and navigating the bias-variance trade-off. Early stopping is one of the simplest, and it provides an automated answer to the question “How many boosting rounds are needed?” Note that early stopping relies on having a validation set of data that is separate from the training set. However, this validation set is actually used during the model training process, so it does not qualify as “unseen” data that was held out from model training; this is similar to how we used validation sets in cross-validation to select model hyperparameters in the chapter “The Bias-Variance Trade-Off.”

As XGBoost trains successive decision trees to reduce error on the training set, adding more and more trees to the ensemble may provide increasingly better fits to the training data while starting to degrade performance on held-out data. To avoid this, we can use a validation set, also called an evaluation set or eval_set by XGBoost. The evaluation set is supplied as a list of tuples of features and their corresponding response variables. Whichever tuple comes last in this list is the one XGBoost monitors for early stopping.
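The sketch below illustrates this workflow with the scikit-learn-compatible XGBClassifier interface, assuming a recent xgboost version (1.6 or later, where early_stopping_rounds is a constructor argument; older versions accepted it in fit() instead). The synthetic dataset and the parameter values are illustrative, not taken from the course.

```python
# A minimal sketch of early stopping with XGBoost's scikit-learn API.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Synthetic binary classification problem for illustration.
X, y = make_classification(n_samples=5000, n_features=20, random_state=42)

# Hold out a validation set. It is consulted during training for early
# stopping, so it no longer counts as "unseen" data.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = XGBClassifier(
    n_estimators=1000,         # an upper bound on the number of boosting rounds
    early_stopping_rounds=10,  # stop if the metric hasn't improved in 10 rounds
    eval_metric="logloss",
)

# eval_set is a list of (features, response) tuples; XGBoost monitors the
# last tuple in the list when deciding whether to stop early.
model.fit(
    X_train, y_train,
    eval_set=[(X_train, y_train), (X_val, y_val)],
    verbose=False,
)

print("Best iteration:", model.best_iteration)
```

Setting n_estimators high and letting early stopping choose the effective number of trees is the usual pattern: the best_iteration attribute then records the round at which the validation metric was best.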
