A machine learning model predicts likely outcomes based on historical data. A loss function measures how far a predicted value is from the actual value, and mean square error is one of the most commonly used loss functions.
To understand how it works, let’s consider a plane with an X-axis and a Y-axis.
In the diagram above, the actual value at x8 is y8, while the predicted value is y8’. The predicted value lies away from the actual value, which means we have a loss, and it is calculated as y8 - y8’: the difference between the actual and predicted values at x8. We can do the same for every data point.
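As a quick sketch in Python (with made-up numbers standing in for y8 and its prediction), the difference at a single point is just a subtraction:

```python
# Hypothetical values for one data point at x8
y8 = 5.0        # actual value
y8_pred = 3.5   # predicted value (written y8' in the text)

loss_at_x8 = y8 - y8_pred   # difference between actual and predicted
print(loss_at_x8)           # 1.5
```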
Let’s consider another example on the same kind of plot, this time with two data points.
In the diagram above, the red line represents the model’s predictions for the two data points. To calculate the error, we perform the following operations.
We add the two differences to calculate the total loss: (y1 - y1') + (y2 - y2') ≈ 0. From the diagram, the red line is equally far from both points but on opposite sides, so the two differences cancel each other and the sum is approximately zero. This is misleading, because the line clearly has a fair amount of loss.
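The sketch below uses hypothetical values to show this cancellation: the two errors are equal in size and opposite in sign, so their plain sum is zero even though both predictions are off.

```python
# Hypothetical two-point example: the line overshoots one point and
# undershoots the other by the same amount
y1, y1_pred = 4.0, 6.0   # error: -2.0
y2, y2_pred = 8.0, 6.0   # error: +2.0

total_loss = (y1 - y1_pred) + (y2 - y2_pred)
print(total_loss)        # 0.0, even though both predictions are wrong
```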
To avoid this cancellation, we square each difference before adding:
(y1 - y1')² + (y2 - y2')² > 0
Now the loss is greater than zero, which is what we expect, since the line does not fit the points well.
Finally, we take the mean of the squared errors by dividing by the number of points (here, two):
((y1 - y1')² + (y2 - y2')²) / 2 > 0
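Continuing with the same hypothetical numbers, squaring each error before summing and averaging gives a loss that is clearly greater than zero:

```python
# Same hypothetical points as before
y1, y1_pred = 4.0, 6.0
y2, y2_pred = 8.0, 6.0

squared_errors = [(y1 - y1_pred) ** 2, (y2 - y2_pred) ** 2]   # [4.0, 4.0]
mse = sum(squared_errors) / len(squared_errors)               # mean of the squares
print(mse)   # 4.0, greater than zero as expected
```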
The generic equation for n data points looks like the following:

MSE = (1/n) × Σ (yi - yi')², where the sum runs over all n points.

This is exactly why the loss function is called mean square error.
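Putting the formula into code, here is a minimal NumPy sketch (the function name and sample values are just for illustration, not part of any particular library):

```python
import numpy as np

def mean_square_error(y_actual, y_predicted):
    """Average of the squared differences between actual and predicted values."""
    y_actual = np.asarray(y_actual, dtype=float)
    y_predicted = np.asarray(y_predicted, dtype=float)
    return np.mean((y_actual - y_predicted) ** 2)

# Hypothetical usage
actual = [3.0, -0.5, 2.0, 7.0]
predicted = [2.5, 0.0, 2.0, 8.0]
print(mean_square_error(actual, predicted))   # 0.375
```

Each point contributes its squared error, and dividing by the number of points keeps the loss comparable across datasets of different sizes.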