Using XGBoost with tidymodels
Explore how to implement the XGBoost algorithm using tidymodels in R. Learn to prepare data with one-hot encoding for categorical features, specify an XGBoost model, and train it with default hyperparameters. Understand the key steps for transforming data and training gradient-boosted trees for classification tasks.
Data preparation
The XGBoost algorithm only supports numeric data. In particular, the R xgboost package doesn't recognize R factors, not even factors with ordered levels. When using the recipes package to prepare data for xgboost, we have to follow these steps:
1. Prepare the training data according to best practices using dplyr (e.g., the mutate() function) and recipes functions (e.g., step_num2factor()).
2. Transform the categorical predictive features into numeric representations using data preparation functions from the recipes package (see the sketch below).
Note: This applies to the predictive features only.
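To make these steps concrete, here is a minimal sketch of a recipes pipeline that one-hot encodes the categorical predictors while leaving the factor outcome untouched. The data frame attrition_df and the outcome column Attrition are hypothetical placeholders; step_dummy() with one_hot = TRUE is one way to produce the numeric indicator columns the xgboost engine expects.

```r
library(tidymodels)

# Hypothetical training data: a data frame `attrition_df` with a factor
# outcome `Attrition` and a mix of numeric and categorical predictors.
set.seed(1234)
data_split <- initial_split(attrition_df, strata = Attrition)
train_df   <- training(data_split)

# The recipe keeps the outcome as a factor (required for classification)
# and one-hot encodes every categorical predictor into numeric 0/1 columns.
xgb_recipe <- recipe(Attrition ~ ., data = train_df) %>%
  step_dummy(all_nominal_predictors(), one_hot = TRUE)

# prep()/bake() expose the transformed training set for inspection.
xgb_recipe %>%
  prep() %>%
  bake(new_data = NULL) %>%
  glimpse()
```

In a full tidymodels analysis, a recipe like this is typically combined with a boost_tree() model specification using the "xgboost" engine inside a workflow() before fitting.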
When performing classification, ensure the label is a factor. Each label ...