Data Science with R: Decision Trees and Random Forests

Regression Tree Basics

Build on your knowledge of CART classification trees to understand CART regression trees.

We'll cover the following...

Regression trees vs. classification trees
Calculating SSE

Regression trees vs. classification trees

The CART algorithm can learn decision tree models that predict numeric values based on a training dataset. Decision tree models that predict numeric values are known as regression trees.

In general, the CART algorithm works the same way whether the tree is trained for classification or regression. What differs is the calculation used to score candidate splits. Classification trees use Gini-based calculations, whereas regression trees use the sum of squared errors (SSE): the sum of the squared differences between each training value in a node and the mean of that node.

Regression trees learn by splitting the training data so that the SSE is minimized. This is similar to how classification trees learn by splitting the training data to minimize Gini impurity.
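
To illustrate how a split is scored, here is a minimal sketch in base R. The data frame `toy`, its values, and the helper `sse()` are hypothetical and are not taken from the Titanic dataset. The sketch assumes the SSE of a split is the sum of the SSEs of its child nodes, each measured around that child's own mean.

```r
# Sum of squared errors of a numeric vector around its own mean.
sse <- function(y) sum((y - mean(y))^2)

# Hypothetical toy data, made up for illustration only.
toy <- data.frame(
  Sex = c("female", "female", "female", "male", "male", "male"),
  Age = c(20, 30, 40, 50, 60, 70)
)

# SSE of the parent node, before any split.
parent_sse <- sse(toy$Age)

# SSE after splitting on Sex: add up the SSE of each child node.
child_sse <- sum(tapply(toy$Age, toy$Sex, sse))

# How much the split reduces SSE. A larger reduction means a better split.
sse_reduction <- parent_sse - child_sse

print(c(parent = parent_sse, children = child_sse, reduction = sse_reduction))
```

With these made-up values, the parent SSE is 1750 and the combined child SSE is 400, so splitting on Sex reduces the SSE by 1350. A tree learner would compare this reduction against those of the other candidate splits and keep the largest.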

When making predictions, a regression tree returns the average of the training values in the leaf node that an observation falls into. Again, this is similar to the majority-rule predictions made by classification trees.

To make these concepts concrete, consider a hypothetical imputation model for the Age feature of the Titanic dataset. Because Age is numeric, the CART algorithm can build a regression tree that predicts Age from the Pclass and Sex features.

The following table represents a leaf node for this hypothetical imputation model:

Pclass	Sex	Age
first	female	23
first	female	7
first	female	61
first	female	37
first	female	11
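
As a rough sketch of what this leaf node implies, the base R snippet below computes the leaf's prediction and its SSE from the five rows in the table. The variable names `leaf`, `leaf_prediction`, and `leaf_sse` are illustrative choices, not part of the course code.

```r
# Leaf node from the table above: first-class female passengers.
leaf <- data.frame(
  Pclass = rep("first", 5),
  Sex    = rep("female", 5),
  Age    = c(23, 7, 61, 37, 11)
)

# Prediction for any observation landing in this leaf: the mean Age.
leaf_prediction <- mean(leaf$Age)   # 27.8

# SSE of the leaf: squared distance of each Age from the prediction, summed.
leaf_sse <- sum((leaf$Age - leaf_prediction)^2)   # 1924.8

print(leaf_prediction)
print(leaf_sse)
```

Under this model, a first-class female passenger with a missing Age would be imputed as 27.8, and the leaf's SSE of 1924.8 reflects how widely the five observed ages spread around that value.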
