Bias is the difference between the average value predicted by our Machine Learning model and the correct target value. A model with high bias makes systematically wrong predictions because it oversimplifies the target function to make it easier to learn.
Underfitting: A model with High Bias tends to underfit the data because it oversimplifies the solution, failing to learn the underlying patterns in the training data. This often results in an overly simple (for example, linear) function.
Oversimplification: Because the model is too simple, it cannot learn the complex features of the training data, making it ineffective on complex problems.
Low Training Accuracy: Because it cannot capture the structure of the training data, a biased model shows high training loss, resulting in low training accuracy.
High bias reveals itself as a high training error. There are multiple ways to reduce the bias of a model, such as:
- Using a more complex model (for example, adding polynomial features or more layers)
- Adding more informative input features
- Decreasing regularization
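As a minimal sketch of both the symptom and the first remedy (assuming NumPy and scikit-learn are available; the sine-shaped data is an arbitrary illustration), a straight line fit to curved data leaves a large training error, while adding polynomial features shrinks it:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(-3, 3, 100)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, 100)  # curved target with mild noise

# High-bias model: a straight line cannot follow the sine curve,
# so even the training error stays high (underfitting).
linear = LinearRegression().fit(X, y)
print("linear train MSE:  ", mean_squared_error(y, linear.predict(X)))

# Reducing bias by increasing complexity: degree-5 polynomial features.
X_poly = PolynomialFeatures(degree=5).fit_transform(X)
flexible = LinearRegression().fit(X_poly, y)
print("degree-5 train MSE:", mean_squared_error(y, flexible.predict(X_poly)))
```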
Variance is the amount by which the model's estimate of the target function changes in response to changes in the training data. When a model also captures the noise and random fluctuations in the data, it is said to have High Variance.
Overfitting: A model with High Variance tends to overfit the data because it overcomplicates the solution and fails to generalize to new test data. This often results in an overly complex, highly non-linear function.
Overcomplication: Because the model is too complex, it learns a far more complicated curve than necessary and performs poorly even on simple problems.
Low Testing Accuracy: Although these models achieve high accuracy on the training data, they perform poorly on test data, where they show a large test loss.
High variance reveals itself as a validation error that is much higher than the training error. There are multiple ways of reducing the variance of a model, such as:
- Training on more data
- Adding regularization (for example, L1 or L2 penalties)
- Using a simpler model, or averaging several models with ensembling techniques such as bagging
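The mirror image can be sketched the same way (again assuming scikit-learn; the degree and penalty strength are arbitrary illustrative choices): a very high-degree polynomial nearly memorizes a small training set but does poorly on held-out points, and a ridge penalty, one of the regularization options above, narrows that gap:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
X = np.sort(rng.uniform(-3, 3, 40)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.3, 40)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

poly = PolynomialFeatures(degree=15)
Xp_tr, Xp_te = poly.fit_transform(X_tr), poly.transform(X_te)

# High-variance model: the wiggly curve chases the training noise,
# so training error is tiny while test error blows up (overfitting).
over = LinearRegression().fit(Xp_tr, y_tr)
print("train MSE:", mean_squared_error(y_tr, over.predict(Xp_tr)))
print("test MSE: ", mean_squared_error(y_te, over.predict(Xp_te)))

# Regularization shrinks the coefficients, trading a little bias
# for a large reduction in variance.
ridge = Ridge(alpha=1.0).fit(Xp_tr, y_tr)
print("ridge test MSE:", mean_squared_error(y_te, ridge.predict(Xp_te)))
```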
As seen above, if the algorithm is too simple, it will have a high bias and a low variance. Similarly, if the algorithm is too complex, it will have a high variance and a low bias. Therefore, it is clear that:
“Bias and variance are complements of each other.” The decrease of one typically results in the increase of the other, and vice versa. Hence, finding the right balance between the two is known as the Bias-Variance Tradeoff.
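For squared-error loss this relationship can be made precise with the standard bias-variance decomposition: the expected test error of a learned function splits into a bias term, a variance term, and irreducible noise (σ²):

```latex
E\!\left[(y - \hat{f}(x))^2\right]
  = \mathrm{Bias}\!\left[\hat{f}(x)\right]^2
  + \mathrm{Var}\!\left[\hat{f}(x)\right]
  + \sigma^2
```

Only the first two terms depend on the model; the noise term puts a floor under the achievable error, which is why even a well-balanced model cannot be perfect.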
An ideal algorithm should neither underfit nor overfit the data. The end goal of all Machine Learning Algorithms is to produce a function that has both low bias and low variance.
Hypothetically, the dotted line above is the required optimal solution. But in the real world this is very difficult to achieve, because the best target function is unknown. The goal, therefore, is to find an iterative process through which we can keep improving our Machine Learning Algorithm so that its predictions improve.
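One simple iterative process of this kind (a sketch of model selection by validation error, not the only approach) is to sweep model complexity and keep the setting where held-out error bottoms out, which is exactly where bias and variance are best balanced:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(2)
X = np.sort(rng.uniform(-3, 3, 200)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.2, 200)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

best_degree, best_err = None, float("inf")
for degree in range(1, 12):
    # Low degrees underfit (high bias); high degrees overfit (high variance).
    poly = PolynomialFeatures(degree=degree)
    model = LinearRegression().fit(poly.fit_transform(X_tr), y_tr)
    err = mean_squared_error(y_val, model.predict(poly.transform(X_val)))
    if err < best_err:
        best_degree, best_err = degree, err

print("best degree:", best_degree, "| validation MSE:", round(best_err, 4))
```

In practice the same sweep works for any complexity knob: tree depth, number of neighbors, regularization strength, and so on.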