Exercise: Find Optimal Hyperparameters for a Decision Tree

Learn to find the optimal maximum depth hyperparameter for a decision tree by using the grid search method.

We'll cover the following...

Using GridSearchCV to tune hyperparameters
Comparison between models
Try it yourself

Using GridSearchCV to tune hyperparameters

In this exercise, we will use GridSearchCV to tune the hyperparameters for a decision tree model. You will learn about a convenient way of searching different hyperparameters with scikit-learn. Perform the following steps to complete the exercise:

Import the GridSearchCV class with this code:
```
from sklearn.model_selection import GridSearchCV
```
The next step is to define the hyperparameters that we want to search using cross-validation. We will find the best maximum depth of tree, using the max_depth parameter. Deeper trees have more node splits, which partition the training set into smaller and smaller subspaces using the features. While we don’t know the best maximum depth ahead of time, it is helpful to consider some limiting cases when considering the range of parameters to use for the grid search.

We know that one is the minimum depth, consisting of a tree with just one split. As for the largest depth, you can consider how many samples you have in your training data, or, more appropriately in this case, how many samples will be in the training fold for each split of the cross-validation. We will perform a 4-fold cross-validation like we did in the previous section. So, how many samples will be in each training fold, and how does this relate to the depth of the tree?
Find the number of samples in the training data using this code:
```
X_train.shape
```
The output should be as follows:
```
(21331, 17)
```
With 21,331 training samples and 4-fold cross-validation, there will be three-fourths of the samples, or about 16,000 samples, in each training fold.

What does this mean for how deep we may wish to grow our tree?

A theoretical limitation is that we need at least one sample in each leaf. From our discussion regarding how the depth of the tree relates to the number of leaves, we know a tree that splits at every node before the last level, with $n$ levels, has $2n$ ...

Introduction

Data Exploration and Cleaning

(Challenge) Exploring Remaining Financial Features in Dataset

Introduction to scikit-learn and Model Evaluation

Fake News Detection Using Scikit-learn

(Challenge) Logistic Regression and Precision-Recall Curve

Details of Logistic Regression and Feature Extraction

(Challenge) Logistic Regression Model and Coefficients

The Bias-Variance Trade-Off

(Challenge) Cross-Validation and Feature Engineering

Decision Trees and Random Forests

(Challenge) Cross-Validation Grid Search with Random Forest

Gradient Boosting, XGBoost, and SHAP Values

(Challenge) XGBoost and SHAP Explanation for Case Study Data

Predict Frog Toxicity with Python and XGBoost

Test Set Analysis, Financial Insights, and Delivery to the Client

(Challenge) Deriving Financial Insights

Appendix

Exercise: Find Optimal Hyperparameters for a Decision Tree

Using GridSearchCV to tune hyperparameters