Regression Model Assessment
Learn how to assess a regression model and why we need different metrics to build better models.
In machine learning, we create different kinds of models (supervised learning models, unsupervised learning models, recommenders, etc.). We define a dataset, choose an algorithm with its parameters, and train the model. After this, the trained model is ready to make predictions. Now the question is how we make sure that our model is good. What are the different ways to assess a model’s performance? In this lesson, we will learn about the standard techniques for model assessment.
Need for different metrics
In classification problems, we define our classes. For a test record, our model has to either choose one of the available classes or give the probability of each class. But how can we make sure that our model is predicting correctly? One simple way is to use accuracy, which is the percentage of correctly classified instances in the test data. But is that metric enough? The answer is no.
Think of a problem in medical disease analysis. Suppose we start conducting a test for flu spreading in a particular area. Our dataset contains records of previously analyzed patients and the doctor’s comments, noting whether the patient was infected or not. We have to create a flu prediction system. The person will provide their test observations, and our system should predict whether they are infected or not.
Data = Test observations + result.
Now, in most of the previous cases, the person was not infected. Say that of 10,000 records, only 50 people were infected.
10,000 People = 9,950 Not infected + 50 infected.
So, we take this dataset and build multiple prediction models. Now, before putting a model into use, we test its predictions against the doctor’s comments. We test with 1,000 new records. Suppose these follow the same distribution: out of these 1,000, only 5 are infected and the remaining 995 are not infected.
First Model (M1): Predicts 950 as not infected and 50 as infected. (These 50 include all 5 actually infected persons.)
Second Model (M2): Predicts all 1,000 as not infected.
Third Model (M3): Predicts 990 as not infected and 10 as infected. (These 10 include only 2 of the 5 actually infected persons.)
Accuracy of M1: (950+5)/1000 = 0.955
Accuracy of M2: (995)/1000 = 0.995
Accuracy of M3: (987+2)/1000 = 0.989
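The accuracy arithmetic can be checked with a short sketch. The confusion counts below are not stated in the lesson. They are derived from the model descriptions under the assumption that the test set holds 5 infected and 995 not infected people. For example, M3 flags 10 people of whom 2 are infected, so 8 healthy people are flagged and 995 - 8 = 987 healthy people are correctly left unflagged. The names `models` and `accuracy` are illustrative.

```python
# Confusion counts on the 1,000 test records (5 infected, 995 not infected).
# tp: infected and flagged, fp: healthy but flagged,
# fn: infected but missed, tn: healthy and not flagged.
models = {
    "M1": {"tp": 5, "fp": 45, "fn": 0, "tn": 950},
    "M2": {"tp": 0, "fp": 0, "fn": 5, "tn": 995},
    "M3": {"tp": 2, "fp": 8, "fn": 3, "tn": 987},
}


def accuracy(counts):
    """Fraction of records whose predicted label matches the true label."""
    total = counts["tp"] + counts["fp"] + counts["fn"] + counts["tn"]
    return (counts["tp"] + counts["tn"]) / total


accuracies = {name: accuracy(counts) for name, counts in models.items()}
for name, value in accuracies.items():
    print(f"{name}: accuracy = {value:.3f}")
```

Note that M2 gets the highest accuracy without flagging a single infected person.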
So, which model should we choose? If we go by accuracy alone, we should choose M2. But is that the right choice? No.
If we always predict that a person is not infected, then there is no need for a prediction system or for performance testing.
Our objective here is not to get the most accurate system but to identify the possible infected persons so that they can be filtered out from the rest, and doctors can perform further testing.
Hence, accuracy is not always the best way to judge a model’s performance. We need some other measure that gives M1 a higher score than the other models.
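One measure with this property is recall: the share of actually infected people that a model flags. The lesson does not name this measure, so treating it as the right choice here is an assumption made for illustration. The sketch below uses only the counts given for the three models, and the names `detected` and `recall` are illustrative.

```python
# (infected people flagged by the model, infected people in the test set)
detected = {"M1": (5, 5), "M2": (0, 5), "M3": (2, 5)}


def recall(true_positives, actual_positives):
    """Share of actually infected people that the model flagged."""
    return true_positives / actual_positives


recalls = {name: recall(tp, pos) for name, (tp, pos) in detected.items()}
for name, value in recalls.items():
    print(f"{name}: recall = {value:.1f}")
```

Under this measure M1 scores highest and M2 scores zero, which matches the stated goal of finding every possibly infected person.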
Multiple metrics exist for regression as well. We can choose from these or define our own metric for quality assessment. After selecting a metric, we can optimize our model for that score.
Here is another example:
Consider a classification problem. We have created two linear classifiers. One is represented with a yellow line and the other with a black line. We define two types of scores: S1 and S2. For one classifier, S1 is higher, and for the other, S2 is higher. Depending on our requirements (i.e., whether we want a higher S1 or a higher S2), we can choose one.
Now you understand the need for different metrics for model evaluation. So, let’s go deeper into the evaluation metrics that exist for regression models.
Mean squared error
Mean squared error (MSE) is the most general-purpose metric for regression problems. It is the average of the squared differences between the actual values and the predicted values. Let the total number of observations be N, let yi denote the actual value, and let ŷi denote the predicted value of the i-th observation. The mean squared error is then given as:

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$$
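The definition above can be sketched in a few lines of plain Python. The function name `mean_squared_error` and the sample values are made up for illustration and do not come from the lesson.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / n


# Hypothetical actual and predicted values for four observations.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

mse = mean_squared_error(actual, predicted)
print(mse)  # squared errors are 1, 0, 4, 1, so the average is 1.5
```

Because the errors are squared, a single large miss (here the error of 2) contributes more to the score than several small ones.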
Quiz: Regression
We are assessing the quality of a regression model using MSE. If we had to choose a single prediction value for all the records, which value would give the lowest MSE?
A) The largest number from target values
B) The smallest number from target values
C) Mean of target values
D) Median of target values
Explanation:
What we want to know is the single prediction value, call it alpha, for which the MSE is minimum.
...
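As a numeric check rather than a proof, the four quiz options can be compared on one small list of targets. The list `targets` and the helper `mse_for_constant` are hypothetical, and a single example only illustrates the result for that data.

```python
from statistics import mean, median


def mse_for_constant(targets, alpha):
    """MSE when every record is predicted as the same constant alpha."""
    return sum((y - alpha) ** 2 for y in targets) / len(targets)


# Made-up target values, skewed so that the mean and median differ.
targets = [1.0, 2.0, 3.0, 4.0, 10.0]

candidates = {
    "largest": max(targets),
    "smallest": min(targets),
    "mean": mean(targets),
    "median": median(targets),
}

scores = {name: mse_for_constant(targets, alpha) for name, alpha in candidates.items()}
for name, score in scores.items():
    print(f"{name:>8} (alpha = {candidates[name]}): MSE = {score}")

best = min(scores, key=scores.get)
print("lowest MSE:", best)
```

On this list the mean gives an MSE of 10, the median 11, the smallest value 19, and the largest value 46.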