Using XGBoost with tidymodels

Build on your knowledge of tidymodels to use the XGBoost algorithm in your machine learning code.

We'll cover the following...

Data preparation
Transforming categories to numbers
Specifying XGBoost
Training an XGBoost ensemble

Data preparation

The XGBoost algorithm only supports numeric data. For example, the R xgboost package doesn’t recognize R factors, including the ordering of factor levels. A common fix is one-hot encoding, which converts each categorical value into a new column and assigns a binary value of 1 or 0 to that column. When using the recipes package to prepare data for xgboost, we have to follow these steps:

Prepare the training data according to best practices using dplyr functions (e.g., mutate()) and recipes functions (e.g., step_num2factor()).
Transform categorical predictive features into numeric representations using data preparation functions from the recipes package.

Note: This applies to the predictive features only.
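The two steps above can be sketched as follows. This is a minimal illustration, not the lesson's own code: the toy data frame `train` and its columns (`amount`, `channel`, `is_fraud`) are hypothetical, and one-hot encoding through step_dummy() with one_hot = TRUE is assumed as the numeric transformation.

```r
library(dplyr)
library(recipes)

# Hypothetical toy training data (assumed for illustration only).
train <- tibble(
  amount   = c(12.5, 250.0, 31.2, 980.0),
  channel  = c("web", "store", "web", "phone"),
  is_fraud = c("no", "no", "no", "yes")
) %>%
  mutate(
    channel  = factor(channel),   # categorical predictive feature
    is_fraud = factor(is_fraud)   # classification label stays a factor
  )

# One-hot encode the nominal predictors only; the label is left untouched.
xgb_recipe <- recipe(is_fraud ~ ., data = train) %>%
  step_dummy(all_nominal_predictors(), one_hot = TRUE)

# Estimate the recipe on the training data and return the processed version.
prepared <- xgb_recipe %>%
  prep() %>%
  bake(new_data = NULL)

glimpse(prepared)
```

After baking, `channel` is replaced by one 0/1 column per level (`channel_phone`, `channel_store`, `channel_web`), while `is_fraud` remains a factor because all_nominal_predictors() selects predictive features only.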

When performing classification, ensure the label is a factor. Each label ...
