Regression Trees with tidymodels

Learn how to train a CART regression tree using tidymodels.

Training a custom imputation model

This lesson demonstrates how to train CART regression trees using tidymodels by crafting a custom imputation model to predict missing Age feature values in the Titanic training dataset.

Note: Using this imputation model could lead to information leakage (e.g., if the model is fit on the full training dataset before cross-validation, the imputed values already reflect data from the held-out folds). It’s for demonstration purposes only. Use the imputation functions of the recipes package in real-world projects.
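As a rough illustration of the recipes-based alternative mentioned in the note, the sketch below imputes Age with step_impute_bag(), one of several imputation steps in recipes. The toy data frame and its column values are made up for the example (they are not the Titanic data), and the choice of bagged-tree imputation and of the predictors in imp_vars() is an assumption, not something the lesson prescribes.

```r
library(tidymodels)  # attaches recipes, dplyr, tibble, and friends

# Hypothetical stand-in data with Titanic-like column names (values are random)
set.seed(42)
n <- 100
toy <- tibble(
  Survived = factor(sample(c(0, 1), n, replace = TRUE)),
  Pclass   = factor(sample(1:3, n, replace = TRUE)),
  Sex      = factor(sample(c("male", "female"), n, replace = TRUE)),
  Fare     = round(runif(n, 5, 100), 2),
  Age      = round(runif(n, 1, 70))
)
toy$Age[sample(n, 20)] <- NA  # knock out some Age values

# Imputation is declared as a recipe step; the label (Survived) is not
# among the variables used to impute Age
rec <- recipe(Survived ~ ., data = toy) %>%
  step_impute_bag(Age, impute_with = imp_vars(Pclass, Sex, Fare))

# prep() estimates the imputation model; bake() applies it
imputed <- rec %>% prep() %>% bake(new_data = NULL)
```

When a recipe like this is part of a workflow that is resampled, the imputation step is re-estimated on each analysis set, which is what avoids the leakage described in the note.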

The following R code trains and visualizes the custom imputation model. Because the model’s purpose is to predict missing Age values, it is trained only on observations where Age is present.

An imputation model must eventually be applied to the test dataset, where the label is not available. For that reason, an imputation model can’t use the original label (target) as a predictor. In the case of the Titanic dataset, Survived is excluded when training imputation models.

The recipe() function call specifies the model formula: Age is the outcome, predicted by other features (e.g., Pclass). The set_mode() function call sets the model to regression, so the tree predicts a numeric value rather than a class.
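The steps described above can be sketched roughly as follows. This is not the lesson's original code: the toy data frame is a hypothetical stand-in for the Titanic training data, the object names (age_train, age_recipe, age_spec, age_fit) are invented for the example, the predictor list is an assumption, and the rpart engine is assumed as the CART implementation.

```r
library(tidymodels)  # attaches recipes, parsnip, workflows, dplyr, tibble

# Hypothetical stand-in for the Titanic training data (values are random)
set.seed(42)
n <- 200
titanic_train <- tibble(
  Survived = factor(sample(c(0, 1), n, replace = TRUE)),
  Pclass   = factor(sample(1:3, n, replace = TRUE)),
  Sex      = factor(sample(c("male", "female"), n, replace = TRUE)),
  SibSp    = sample(0:3, n, replace = TRUE),
  Parch    = sample(0:2, n, replace = TRUE),
  Fare     = round(runif(n, 5, 100), 2),
  Age      = round(runif(n, 1, 70))
)
titanic_train$Age[sample(n, 40)] <- NA

# Train only on observations that have an Age value
age_train <- titanic_train %>% filter(!is.na(Age))

# Age is the outcome; Survived is deliberately left out of the predictors
age_recipe <- recipe(Age ~ Pclass + Sex + SibSp + Parch + Fare,
                     data = age_train)

# CART via rpart (assumed engine), in regression mode
age_spec <- decision_tree() %>%
  set_engine("rpart") %>%
  set_mode("regression")

age_fit <- workflow() %>%
  add_recipe(age_recipe) %>%
  add_model(age_spec) %>%
  fit(data = age_train)

# Predict Age for the rows where it is missing
age_missing <- titanic_train %>% filter(is.na(Age))
age_pred <- predict(age_fit, new_data = age_missing)

# Optional visualization, if the rpart.plot package is installed
if (requireNamespace("rpart.plot", quietly = TRUE)) {
  rpart.plot::rpart.plot(extract_fit_engine(age_fit), roundint = FALSE)
}
```

The predictions come back as a tibble with a single .pred column, one row per observation with a missing Age.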
