Using XGBoost with tidymodels

Build on your knowledge of tidymodels to use the XGBoost algorithm in your machine learning code.

Data preparation

The XGBoost algorithm only supports numeric data. For example, the R xgboost package doesn’t recognize R factors, including ordered factors. Categorical features therefore have to be converted to numbers first, typically with one-hot encoding, which turns each categorical value into a new binary column and assigns that column a value of 1 or 0. When using the recipes package to prepare data for xgboost, we have to follow these steps:

  1. Prepare the training data according to best practices using dplyr (e.g., the mutate() function) and recipes functions (e.g., step_num2factor()).

  2. Transform categorical predictive features into numeric representations using data preparation functions from the ...
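
As a rough illustration of the two steps above, here is a minimal sketch on a small made-up data frame. The data, the column names (price, fuel, gears) and the level labels are hypothetical. The choice of step_dummy() with one_hot = TRUE is an assumption about which recipes function is suitable for step 2; other encoding steps may fit better depending on the data.

```r
library(dplyr)
library(recipes)

# Hypothetical toy data: `fuel` is text, `gears` is an integer code (1, 2, 3).
cars <- data.frame(
  price = c(21.0, 18.5, 30.2, 15.8, 24.4, 27.9),
  fuel  = c("gas", "diesel", "gas", "electric", "diesel", "electric"),
  gears = c(1L, 2L, 3L, 1L, 2L, 3L)
)

# Step 1: tidy the data with dplyr, then declare preprocessing with recipes.
cars <- cars %>% mutate(fuel = factor(fuel))

rec <- recipe(price ~ ., data = cars) %>%
  # Turn the integer codes into a factor; the labels are illustrative only.
  step_num2factor(gears, levels = c("three", "four", "five")) %>%
  # Step 2 (assumed choice): one-hot encode every categorical predictor.
  step_dummy(all_nominal_predictors(), one_hot = TRUE)

# Estimate the recipe on the training data and apply it to that same data.
prepped <- prep(rec, training = cars)
baked   <- bake(prepped, new_data = NULL)

# Every column is now numeric, so the predictors can become a plain matrix.
x <- as.matrix(select(baked, -price))
y <- baked$price
```

With one_hot = TRUE, each level of fuel and gears gets its own 0/1 column, so exactly one of the fuel columns is 1 in every row. The resulting x and y are the kind of numeric inputs the xgboost package expects.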