Scikit-learn is a Python library widely used for machine learning tasks, including classification, regression, clustering, and model selection.
While training a model we've built from scratch, it is crucial to monitor its accuracy. Various metrics can be used to quantify how well a model performs. In this Answer, we will cover one such metric: the mean absolute error.
We can confirm the accuracy of a trained model by passing validation data to it and observing the results. Here, we compare our predicted and actual values for the data, and this is precisely where the mean absolute error comes in handy.
Mean absolute error, abbreviated as MAE, is a metric that measures the average absolute difference between the predicted and actual values in a regression problem, i.e., one that models relationships between variables. In deep learning, MAE is also commonly used as a loss function.
Simply put, MAE tells us how much our predictions are off from the actual values in the dataset. It helps us understand the accuracy of our model by measuring the absolute errors between predicted and true values.
A lower MAE means our model's predictions are closer to the actual values. This indicates better performance.
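To make this concrete, here is a small pure-Python check (the sample values are made up for illustration) showing that predictions closer to the actual values yield a lower MAE:

```python
actual = [10, 20, 30]
close_preds = [11, 19, 31]  # each prediction is off by 1
far_preds = [15, 25, 35]    # each prediction is off by 5

def mae(y_true, y_pred):
    # Average of the absolute differences between actual and predicted values
    return sum(abs(a - p) for a, p in zip(y_true, y_pred)) / len(y_true)

print(mae(actual, close_preds))  # 1.0
print(mae(actual, far_preds))    # 5.0
```

The closer set of predictions produces the lower MAE, which is exactly what "better performance" means under this metric.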
To calculate MAE, we take the absolute difference between each predicted value and its corresponding actual value. Then, we add up all these absolute differences and divide the sum by the total number of data points.
The manual calculation of the mean absolute error is represented by the code below.
```python
actual_values = [25, 30, 20, 35, 41]
predicted_values = [23, 28, 19, 33, 38]

n = len(actual_values)
absolute_differences = [abs(actual_values[i] - predicted_values[i]) for i in range(n)]
sum_of_absolute_differences = sum(absolute_differences)
mae = sum_of_absolute_differences / n

print("Actual values = ", actual_values)
print("Predicted values = ", predicted_values)
print("Absolute differences = ", absolute_differences)
print("Sum of absolute differences = ", sum_of_absolute_differences)
print("The MAE = ", mae)
```
First off, we begin by defining sample data with actual_values and predicted_values. Next, we calculate the number of data points n in our dataset.

We then calculate the absolute difference between the two values for each data point using a list comprehension; abs() is used to get the absolute value of each difference. Then, we sum these differences using sum(). Finally, we calculate the MAE by dividing the sum by the number of data points n.

We print our variables to understand the code better.
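In practice, the same computation is usually vectorized. The sketch below redoes the manual calculation with NumPy on the same sample values:

```python
import numpy as np

actual_values = np.array([25, 30, 20, 35, 41])
predicted_values = np.array([23, 28, 19, 33, 38])

# np.abs gives the element-wise absolute differences; .mean() averages them
mae = np.abs(actual_values - predicted_values).mean()
print("The MAE = ", mae)
```

This replaces the explicit loop, sum, and division with two array operations and produces the same result as the manual version above.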
```python
from sklearn.metrics import mean_absolute_error


y_true = [3.5, 2.1, 5.2, 7.8, 4.6]
y_pred = [3.0, 2.5, 4.8, 8.0, 5.2]


mae = mean_absolute_error(y_true, y_pred)
print("MAE = ", mae)
```
Line 1: We import the mean_absolute_error function from the sklearn.metrics module.

Lines 4–5: We define our target (y_true) and predicted (y_pred) values.

Line 8: We use the built-in function to calculate the MAE, which keeps the code compact.

Line 9: Finally, we print the calculated error.
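To see where mean_absolute_error fits into a typical workflow, here is a sketch that trains a LinearRegression model and evaluates it on held-out validation data, as described earlier. The synthetic dataset and split parameters are illustrative assumptions:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic regression data (illustrative values)
X, y = make_regression(n_samples=200, n_features=3, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)

# Compare predictions against the held-out actual values
predictions = model.predict(X_test)
print("Validation MAE = ", mean_absolute_error(y_test, predictions))
```

The reported MAE is in the same units as the target variable, which makes it easy to interpret directly.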
To recap: the abs() function is used to calculate the absolute value of the variable passed to it, while MAE is used to find how well the model has been trained.