Data Science with R: Decision Trees and Random Forests

Performing Cross-Validation

Learn how to use cross-validation with tidymodels to estimate a model's accuracy.

We'll cover the following...

Coding the workflow
Setting up cross-validation
Tracking model accuracy
Performing cross-validation
Evaluating accuracy

#================================================================================================
# Load libraries - suppress messages
#
suppressMessages(library(tidyverse))
suppressMessages(library(tidymodels))
suppressMessages(library(rattle))
#================================================================================================
# Load the Titanic training data and transform Sex and Embarked to factors
#
titanic_train <- read_csv("titanic_train.csv", show_col_types = FALSE) %>%
  mutate(Sex = factor(Sex),
         Embarked = factor(case_when(
           Embarked == "C" ~ "Cherbourg",
           Embarked == "Q" ~ "Queenstown",
           Embarked == "S" ~ "Southampton",
           is.na(Embarked) ~ "missing")))
#================================================================================================
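# Craft the recipe - recipes package
# Survived is coded 0/1, so the transform adds 1 to map it onto level
# positions 1 ("perished") and 2 ("survived")
#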
titanic_recipe <- recipe(Survived ~ Sex + Pclass + SibSp + Parch + Fare + Embarked, data = titanic_train) %>%
  step_num2factor(Survived,
                  transform = function(x) x + 1,
                  levels = c("perished", "survived")) %>%
  step_num2factor(Pclass,
                  levels = c("first", "second", "third"))
#================================================================================================
# Specify the algorithm - parsnip package
#
titanic_model <- decision_tree() %>%
  set_engine("rpart") %>%
  set_mode("classification")
#================================================================================================
# Set up workflow - workflows package
#
titanic_workflow <- workflow() %>%
  add_recipe(titanic_recipe) %>%
  add_model(titanic_model)
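
The remaining steps in the outline (setting up cross-validation, tracking accuracy, performing the cross-validation, and evaluating the result) can be sketched as follows. This is a minimal illustration, not the lesson's own code: to stay runnable without titanic_train.csv it uses a small synthetic stand-in data set, and the names toy_data, toy_workflow, toy_folds, toy_metrics, and toy_cv, the seed, and the choice of 10 stratified folds are all assumptions made for the example.

```r
# Minimal cross-validation sketch with tidymodels (requires the rpart package).
suppressMessages(library(tidymodels))

set.seed(498)  # arbitrary seed so the fold assignment is reproducible

# Synthetic stand-in data (NOT the Titanic data), just to make the sketch runnable
toy_data <- tibble(
  Survived = factor(sample(c("perished", "survived"), 200, replace = TRUE)),
  Sex      = factor(sample(c("female", "male"), 200, replace = TRUE)),
  Fare     = runif(200, min = 5, max = 100)
)

# Same kind of model specification as above, wrapped in a simple workflow
toy_workflow <- workflow() %>%
  add_formula(Survived ~ Sex + Fare) %>%
  add_model(decision_tree() %>%
              set_engine("rpart") %>%
              set_mode("classification"))

# Setting up cross-validation: 10 folds, stratified on the outcome - rsample package
toy_folds <- vfold_cv(toy_data, v = 10, strata = Survived)

# Tracking model accuracy - yardstick package
toy_metrics <- metric_set(accuracy)

# Performing cross-validation: fit on each analysis set and
# score on the matching assessment set - tune package
toy_cv <- fit_resamples(toy_workflow,
                        resamples = toy_folds,
                        metrics = toy_metrics)

# Evaluating accuracy: mean and standard error across the 10 folds
cv_accuracy <- collect_metrics(toy_cv)
print(cv_accuracy)
```

Passing titanic_workflow to fit_resamples() and titanic_train to vfold_cv() in place of the stand-ins should apply the same steps to the workflow built above. Because the stand-in data is random, the accuracy it reports carries no meaning beyond showing the shape of the output.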
