Model Tuning Intuition
Learn how tuning machine learning models is one of the primary ways to avoid overfitting.
We'll cover the following
Combatting overfitting
As a machine learning practitioner, there are two explicit goals when choosing how to train machine learning models:
Optimizing the value of the model’s predictions (e.g., attaining high accuracy).
Ensuring the value of the model’s predictions remains high over time.
There are many options for training machine learning models. Examples later in the course include dividing data into training and test datasets and techniques like cross-validation. The course also covers tuning models as part of their training regimen.
Model tuning and automobiles
Conceptually, tuning machine learning models is like tuning an automobile engine for optimal performance. For example, consider someone living in a warm climate at sea level and then moving to live in a cold environment high in the mountains. How might that impact the automobile’s engine?
In the first case, there is ample warm oxygen for the engine. In the second case, there will be less oxygen and colder oxygen. It’s easy to imagine the engine would need different settings (i.e., be tuned differently) after the move to the new city.
To continue the analogy, the Adult Census Income and Titanic datasets are like the two cities, and the CART classification tree algorithm is like an automobile’s engine. For any trained models to have optimal performance for each dataset, the models will typically need different settings (i.e., the models will need to be tuned differently).
Decision tree tuning example
As we’ll cover later, CART classification trees have many settings that can be tuned to produce the most valuable models. In machine learning, these tunable settings are called hyperparameters. Each machine learning algorithm has a specific set of hyperparameters that can be tuned.
One example of a hyperparameter for CART classification trees is the minimum number of observations required to make leaf nodes. Consider the following example tree trained on only ten observations.
Get hands-on with 1400+ tech skills courses.