Tuning XGBoost with tidymodels

Learn how to tune the XGBoost algorithm using tidymodels.

XGBoost hyperparameters

tidymodels supports several gradient boosting engines in addition to XGBoost. To provide a common programming interface across them, tidymodels exposes a shared set of hyperparameters that can be marked for tuning with the tune() function. The hyperparameters supported for the xgboost engine are:

Supported xgboost Hyperparameters

| Hyperparameter | Description | Default Value |
|----------------|-------------|---------------|
| mtry | The number of randomly selected predictive features used | The number of predictive features |
| trees | The number of trees used in the ensemble | 15 |
| min_n | The minimum number of observations required to make a split | 1 |
| tree_depth | The maximum depth (number of successive splits) of each tree | 6 |
| learn_rate | The weight (shrinkage) applied to each tree's predictions | 0.3 |
| loss_reduction | The minimum reduction in the loss function required to make a further split | 0.0 |
| sample_size | The proportion of data sampled to train each tree (1.0 denotes 100 percent of the data) | 1.0 |
| stop_iter | The number of iterations without improvement before training stops early | Infinite (no early stopping) |

The hyperparameters tidymodels supports for tuning are a subset of those available natively in the xgboost package. They were chosen both for their commonality across gradient boosting algorithms and for their usefulness when training gradient boosted ensembles.
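For illustration, here is a minimal sketch of a model specification that marks several of the hyperparameters above for tuning. It assumes the tidymodels meta-package is installed; the fixed number of trees and the regression mode are placeholder choices, not recommendations.

```r
# A minimal sketch: mark selected xgboost hyperparameters for tuning.
# Assumes the tidymodels meta-package (parsnip, tune, dials) is installed.
library(tidymodels)

xgb_spec <- boost_tree(
  trees          = 500,         # fix the ensemble size
  tree_depth     = tune(),      # tune maximum tree depth
  min_n          = tune(),      # tune minimum observations per split
  learn_rate     = tune(),      # tune the shrinkage applied to each tree
  loss_reduction = tune(),      # tune the minimum loss reduction for a split
  sample_size    = tune(),      # tune the proportion of rows sampled per tree
  mtry           = tune()       # tune the number of predictors sampled
) %>%
  set_engine("xgboost") %>%
  set_mode("regression")
```

A specification like this can then be combined with preprocessing in a workflow() and passed to a tuning function such as tune_grid(), along with a resampling scheme, to search over the marked hyperparameters.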

Any xgboost hyperparameters not listed above can be set directly in the set_engine() call by supplying the hyperparameter name and the value it should take. The following code demonstrates setting the verbose argument.
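The snippet below is a minimal sketch of that idea; it assumes parsnip passes engine arguments such as verbose through to the underlying xgboost training routine at fit time.

```r
# A minimal sketch: set an engine-specific argument (verbose) via set_engine().
# Arguments supplied here are forwarded to the underlying xgboost training call.
library(tidymodels)

xgb_verbose_spec <- boost_tree(trees = 100) %>%
  set_engine("xgboost", verbose = 1) %>%  # print evaluation output during training
  set_mode("classification")
```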