Machine Learning
Train a single decision tree, bagged decision trees, and a random forest, and evaluate their performance.
Since our focus is machine learning, let's split the data and move on to training the models.
# Separating features and the target into X, y
X = df.drop('target', axis=1)  # X contains the features; drop the target column
y = df['target']               # y is the target

# train_test_split
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=42)
We'll start by training a single decision tree and then compare its results with a random forest.
Single decision tree
Let's train a single decision tree. The default splitting criterion is Gini impurity; here we set it to entropy instead (information gain is computed from entropy).
# importing the decision tree classifier
from sklearn.tree import DecisionTreeClassifier

# creating an instance "dtree" of the classifier
dtree = DecisionTreeClassifier(criterion='entropy')

# fitting to the training data; the default parameters are fine at the moment
dtree.fit(X_train, y_train)
Notice that we’re leaving everything as default, other than the criterion.
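As a quick aside on that one change, here is a small illustration (not part of the pipeline, with made-up class probabilities) of how entropy and Gini impurity score the same node:

import numpy as np

# class probabilities for an illustrative node: 80% of one class, 20% of the other
p = np.array([0.8, 0.2])

entropy = -np.sum(p * np.log2(p))    # information gain is computed from this quantity
gini = 1 - np.sum(p ** 2)            # the scikit-learn default criterion

print(f"entropy: {entropy:.3f}")     # ~0.722
print(f"gini impurity: {gini:.3f}")  # ~0.320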
Prediction and evaluation
Evaluation is important because it shows how well the model performs on data it has not seen during training.
# making predictions on the test set
dtree_pred = dtree.predict(X_test)

# imports for the evaluation metrics
from sklearn.metrics import classification_report, confusion_matrix

# display evaluations
print(classification_report(y_test, dtree_pred))
print(confusion_matrix(y_test, dtree_pred))
With a single decision tree, the model mislabels some of the test samples. We also know that decision trees are easy to overfit, which limits generalization and leads to poor performance on unseen data.
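One quick way to see that tendency is to compare accuracy on the training data with accuracy on the test data; a large gap suggests the tree has memorized the training set. This is only a sketch, reusing the dtree model and the splits created above:

from sklearn.metrics import accuracy_score

# accuracy on the data the tree was fitted on
train_acc = accuracy_score(y_train, dtree.predict(X_train))
# accuracy on the held-out test data
test_acc = accuracy_score(y_test, dtree.predict(X_test))

print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")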
Bagged decision trees
We learned about bagging (bootstrap aggregation) as a general-purpose procedure for reducing the high variance of decision trees. So, if we opt for bagged decision trees, they are expected to perform better than a single decision tree. However, because the bagged trees share a similar structure, their predictions remain strongly correlated. The random forest method, which additionally decorrelates the trees, is generally preferred over both a single tree and bagged trees. Let's try bagged trees and then move on to the random forest for comparison.
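Before handing the work to scikit-learn, here is a minimal sketch of what bagging does under the hood: each tree is trained on a bootstrap sample of the rows, and the final prediction is a majority vote. It assumes the X_train, y_train, and X_test splits from above; the variable names are illustrative only.

import numpy as np
from scipy import stats
from sklearn.tree import DecisionTreeClassifier

n_bags = 5                           # illustrative number of bagged trees
rng = np.random.RandomState(42)
all_predictions = []

for _ in range(n_bags):
    # bootstrap sample: draw rows with replacement
    idx = rng.randint(0, len(X_train), size=len(X_train))
    tree = DecisionTreeClassifier(criterion='entropy')
    tree.fit(X_train.iloc[idx], y_train.iloc[idx])
    all_predictions.append(tree.predict(X_test))

# majority vote across the individual trees
bagged_vote = stats.mode(np.array(all_predictions), axis=0)[0].ravel()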
# import required for bagging
from sklearn.ensemble import BaggingClassifier

# creating an instance for bagging and passing the dtree classifier along with other parameters
base_estimator = DecisionTreeClassifier(criterion='entropy')  # base estimator for BaggingClassifier
bagged_trees = BaggingClassifier(base_estimator=base_estimator,
                                 n_estimators=5,           # number of trees we want; try different numbers
                                 bootstrap=True,           # default value
                                 bootstrap_features=True,  # in case we want to bootstrap features as well
                                 max_features=8,           # maximum number of features in each bootstrapped sample
                                 random_state=42)          # ensure reproducible results
bagged_trees.fit(X_train, y_train)  # fitting/training
We have trained five bagged trees, and the final prediction for any test point comes from a vote over these trees (each built from the base estimator). Since we set the module to bootstrap features (columns), let's see which features were used to train the first two bagged trees. Please note that changing the random_state
...
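Because we set bootstrap_features=True, the fitted BaggingClassifier records which feature columns were drawn for each tree in its estimators_features_ attribute. A minimal sketch of inspecting the first two trees, assuming the bagged_trees model and the feature frame X from above:

# indices of the feature columns drawn for the first two bagged trees
for i, feature_idx in enumerate(bagged_trees.estimators_features_[:2]):
    print(f"Tree {i}: {list(X.columns[feature_idx])}")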