Tuning a Classification Tree
Build on cross-validation by learning to tune a CART classification decision tree using tidymodels.
Preparing the data
The following code prepares the Titanic training data as part of a tidymodels
workflow.
#================================================================================================
# Load libraries - suppress messages
#
suppressMessages(library(tidyverse))
suppressMessages(library(tidymodels))
suppressMessages(library(rattle))

#================================================================================================
# Load the Titanic training data and transform Embarked to a factor
#
titanic_train <- read_csv("titanic_train.csv", show_col_types = FALSE) %>%
  mutate(Sex = factor(Sex),
         Embarked = factor(case_when(Embarked == "C" ~ "Cherbourg",
                                     Embarked == "Q" ~ "Queenstown",
                                     Embarked == "S" ~ "Southampton",
                                     is.na(Embarked) ~ "missing")))

#================================================================================================
# Craft the recipe - recipes package
#
titanic_recipe <- recipe(Survived ~ Sex + Pclass + SibSp + Parch + Fare + Embarked,
                         data = titanic_train) %>%
  step_num2factor(Survived,
                  transform = function(x) x + 1,
                  levels = c("perished", "survived")) %>%
  step_num2factor(Pclass,
                  levels = c("first", "second", "third"))
Configuring the model for tuning
In a tidymodels workflow, hyperparameters are flagged for tuning when the algorithm is specified with the parsnip package. The parsnip package's decision_tree() function supports tuning of the following hyperparameters:
cost_complexity: A positive number for the cost/complexity parameter (aka cp) used by the rpart package.
tree_depth: A positive integer for the maximum depth of the tree. ...
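As a sketch of how these hyperparameters are flagged, the tune() placeholder marks each one for later tuning. The specification below assumes the rpart engine and reuses the titanic_recipe defined earlier; the object names titanic_tree_spec and titanic_workflow are illustrative.

```r
# Mark cost_complexity and tree_depth as tuning targets with tune();
# their values are supplied later by a tuning function such as tune_grid()
titanic_tree_spec <- decision_tree(cost_complexity = tune(),
                                   tree_depth = tune()) %>%
  set_engine("rpart") %>%
  set_mode("classification")

# Bundle the model specification with the recipe into a workflow
titanic_workflow <- workflow() %>%
  add_recipe(titanic_recipe) %>%
  add_model(titanic_tree_spec)
```

Printing titanic_tree_spec shows the two hyperparameters listed as "tune()" rather than fixed values, confirming they are open for tuning.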