What is the f1_score function in Sklearn?

Overview

In Python, the f1_score function of the sklearn.metrics package calculates the F1 score for a set of predicted labels.

The F1 score is the harmonic mean of precision and recall, as shown below:

F1_score = 2 * (precision * recall) / (precision + recall)

An F1 score ranges from 0 to 1, with 0 being the worst score and 1 being the best.
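As a quick sanity check of the formula, here is a minimal sketch using hypothetical precision and recall values (chosen only for illustration):

# Hypothetical precision and recall values
precision = 0.5
recall = 1.0
# F1 is the harmonic mean of precision and recall
f1 = 2 * (precision * recall) / (precision + recall)
print(f1)  # 0.6666666666666666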

To use the f1_score function, we’ll import it into our program, as shown below:

from sklearn.metrics import f1_score

Syntax

sklearn.metrics.f1_score(y_true, y_pred, *, labels=None, pos_label=1, average='binary', sample_weight=None, zero_division='warn')
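For instance, a minimal binary-classification call using the defaults (average='binary' and pos_label=1) might look like the following sketch; the labels are made up for illustration:

from sklearn.metrics import f1_score

# Hypothetical binary labels, where 1 is the positive class
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

# With average='binary' (the default), only the positive class is scored
print(f1_score(y_true, y_pred))  # 0.8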

Parameters

The f1_score function accepts the following parameters:

  • y_true: These are the true labels.

  • y_pred: These are the predicted labels.

  • labels: This parameter identifies the labels to be included when there is a multiclass problem.

  • pos_label: This is the class to report in case of a binary classification problem.

  • average: This is the type of averaging to be performed in the case of multiclass data.

  • sample_weight: These are any sample weights to be used in the calculation of the F1 score (see the sketch below).

Note: Find a comprehensive list of parameters and their possible values in the official scikit-learn documentation.
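As a rough sketch of how the labels and sample_weight parameters can be used (the toy labels below are our own, not from the official documentation):

from sklearn.metrics import f1_score

true_labels = ["a", "c", "b", "a"]
pred_labels = ["c", "c", "b", "a"]

# labels restricts (and orders) the classes included in the averaging;
# here only classes "a" and "b" contribute to the macro average
print(f1_score(true_labels, pred_labels, labels=["a", "b"], average="macro"))

# sample_weight makes some samples count more toward the score than others
print(f1_score(true_labels, pred_labels, average="macro",
               sample_weight=[1, 1, 2, 2]))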

Return value

This function returns the F1 score of the positive class for binary classification problems. For multiclass problems, it returns the average of the per-class F1 scores according to the average parameter, or an array of per-class scores when average=None.
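A brief illustration of the two return shapes, again with made-up labels:

from sklearn.metrics import f1_score

# Binary case: a single float for the positive class
print(f1_score([0, 1, 1, 0], [0, 1, 0, 0]))

# Multiclass case with average=None: one score per class (a, b, c)
print(f1_score(["a", "c", "b", "a"], ["c", "c", "b", "a"], average=None))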

Example

from sklearn.metrics import f1_score
# define true labels
true_labels = ["a", "c", "b", "a"]
# define corresponding predicted labels
pred_labels = ["c", "c", "b", "a"]
# find f1 scores for different averaging methods
score = f1_score(true_labels, pred_labels, average="macro")
print("Macro F1-Score: ", score)
score = f1_score(true_labels, pred_labels, average="micro")
print("Micro F1-Score: ", score)
score = f1_score(true_labels, pred_labels, average="weighted")
print("Weighted F1-Score: ", score)

Explanation

  • Line 1: We import the f1_score function from the sklearn.metrics module.

  • Lines 2–5: We define the true labels and the corresponding predicted labels. As there are 3 classes (a, b, c), this is a multiclass problem.

  • Lines 7–8: We calculate the macro-averaged F1 score of the predictions through the f1_score function and print the result.

  • Lines 9–10: We calculate the micro-averaged F1 score of the predictions through the f1_score function and print the result.

  • Lines 11–12: We calculate the weighted-average F1 score of the predictions through the f1_score function and print the result.
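To verify these averages by hand: the per-class F1 scores are 2/3 for class a (one of its two true instances is predicted correctly, with no false positives), 1.0 for class b, and 2/3 for class c. The macro average is therefore (2/3 + 1 + 2/3) / 3 = 7/9 ≈ 0.78. The micro average pools all predictions into one count, and since 3 of the 4 labels are predicted correctly, it equals 0.75. The weighted average weights each per-class score by that class's support (2, 1, 1), which also works out to 0.75.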