Decision Trees and Ensembles

Discover why single decision trees are not commonly used in the real world and why machine learning ensembles are considered state-of-the-art.

Decision trees have high variance

Under the bias-variance tradeoff, a certain amount of variance (i.e., complexity) must be added to craft the most valuable machine learning models. Not enough variance leads to models that underfit, and too much variance leads to overfitting models.

Machine learning algorithms differ in the amount of variance that can be introduced. For example, many classic statistical algorithms (e.g., logistic regression) have relatively low variance and can have difficulty learning complex decision boundaries.

By way of comparison, CART decision trees are a high-variance algorithm. This high variance manifests in practical form as the CART algorithm produces radically different models by changing the training data or changing hyperparameter values.

An example of high variance

The best way to build an understanding of the CART algorithm’s high variance is through an example. The following table represents a sample of data from the Adult Census Income dataset:

Get hands-on with 1300+ tech skills courses.