Model Evaluation Part 1

This lesson covers Part 1 of model evaluation. Evaluation measures help us accurately judge a model's performance.

Model evaluation measures

In this lesson, we look at model evaluation for classification. In classification, we predict a discrete-valued output, which calls for its own evaluation measures.

Confusion Matrix

The confusion matrix serves as an evaluation measure for classification problems. For a binary classification problem with only two class labels (0 and 1), the confusion matrix looks like this:

                            Actual Class Label (1)    Actual Class Label (0)
Predicted Class Label (1)   TP                        FP
Predicted Class Label (0)   FN                        TN

Here:

TP (True Positive): We predicted a positive Class Label (1), and the prediction is True (the actual label is 1).

TN (True Negative): We predicted a negative Class Label (0), and the prediction is True (the actual label is 0).

FP (False Positive): Also called a Type 1 error. We predicted a positive Class Label (1), and the prediction is False (the actual label is 0).

FN (False Negative): Also called a Type 2 error. We predicted a negative Class Label (0), and the prediction is False (the actual label is 1).

Note that positive and negative refer to the predicted class label, whereas True and False refer to whether the prediction matches the actual class label. For example, a False Positive means the model predicted 1 but the actual label was 0.

This confusion matrix is the basis of many of the evaluation measures we calculate below.

def confusion_matrix_calculation(y_true, y_pred):
    """
    This function calculates the following values (TP, FP, TN, FN).
    It takes in y_true (which are the actual class labels) and y_pred
    (which are the predicted class labels).
    """
    TP = 0
    FP = 0
    TN = 0
    FN = 0
    for i in range(len(y_pred)):
        if y_true[i] == y_pred[i] == 1:
            TP += 1
        if y_pred[i] == 1 and y_true[i] != y_pred[i]:
            FP += 1
        if y_true[i] == y_pred[i] == 0:
            TN += 1
        if y_pred[i] == 0 and y_true[i] != y_pred[i]:
            FN += 1
    return (TP, FP, TN, FN)

y_true = [1, 0, 1, 0, 0, 1]
y_pred = [0, 0, 1, 1, 0, 1]

TP, FP, TN, FN = confusion_matrix_calculation(y_true, y_pred)

print("The True Class Labels are {}.".format(y_true))
print("The Predicted Class Labels are {}.".format(y_pred))
print("The number of True Positives is {}.".format(TP))
print("The number of False Positives is {}.".format(FP))
print("The number of False Negatives is {}.".format(FN))
print("The number of True Negatives is {}.".format(TN))
  • Line 1 defines the function that calculates the values inside a confusion matrix.

  • Lines 12 to 20 calculate the values present inside a confusion matrix and return them. The key thing to learn here is the comparison logic used to compute each count.

  • On Lines 27 and 28, we print the actual and the predicted class labels.

  • Lines 29 to 32 print the respective counts.
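If scikit-learn is available in your environment, you can cross-check these counts against its built-in confusion_matrix. This is an optional sketch, not part of the lesson's code; note that scikit-learn arranges the binary matrix as [[TN, FP], [FN, TP]], with rows as actual labels and columns as predicted labels.

from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 0, 0, 1]
y_pred = [0, 0, 1, 1, 0, 1]

# scikit-learn orders the binary matrix as [[TN, FP], [FN, TP]]:
# rows are the actual labels (0, 1), columns are the predicted labels (0, 1).
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("TP={}, FP={}, TN={}, FN={}".format(tp, fp, tn, fn))
# For this data: TP=2, FP=1, TN=2, FN=1, matching confusion_matrix_calculation.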

Accuracy

Accuracy is calculated as the ratio of the number of correctly classified instances to the total number of instances.

$$Accuracy = \frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}}$$
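In terms of the confusion-matrix counts defined above, this can equivalently be written as

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$

As a minimal sketch, reusing confusion_matrix_calculation and the example labels from the code above:

TP, FP, TN, FN = confusion_matrix_calculation(y_true, y_pred)
accuracy = (TP + TN) / (TP + TN + FP + FN)
print("Accuracy is {:.2f}.".format(accuracy))  # 4 correct out of 6 predictions -> 0.67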