As we saw in the previous exercise, decision trees are prone to overfitting. This is one of the principal criticisms of decision trees, even though they are highly interpretable. However, we were able to reduce this overfitting, to an extent, by limiting the maximum depth to which the tree could be grown.

Concept behind random forests

Building on the concepts of decision trees, machine learning researchers have leveraged multiple trees as the basis for more complex procedures, resulting in some of the most powerful and widely used predictive models. In this section, we will focus on random forests of decision trees. Random forests are examples of what are called ensemble models, because they are formed by combining other, simpler models. By combining the predictions of many models, it is possible to compensate for the deficiencies of any given one of them. This is sometimes described as combining many weak learners to make a strong learner.
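As a minimal sketch of this idea, the snippet below, assuming scikit-learn is available, fits both a single depth-limited decision tree and a random forest (an ensemble of many trees) on synthetic classification data and compares their test accuracy; the dataset and parameter values here are illustrative, not from the original exercise.

```python
# Illustrative comparison: one decision tree vs. a random forest ensemble.
# Dataset and hyperparameters are hypothetical choices for demonstration.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# A single tree, with max_depth capped to limit overfitting
tree = DecisionTreeClassifier(max_depth=4, random_state=42)
tree.fit(X_train, y_train)

# A random forest: an ensemble of 100 trees whose predictions are combined
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)

print(f"Single tree test accuracy:   {tree.score(X_test, y_test):.3f}")
print(f"Random forest test accuracy: {forest.score(X_test, y_test):.3f}")
```

On most runs the ensemble matches or beats the single tree, since averaging over many trees reduces the variance that makes an individual tree overfit.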
