Model Performance Metrics for Binary Classification

Learn about the metrics for binary classification used to assess the predictive quality of a model.

Before we start building predictive models in earnest, we would like to know how we can determine, once we’ve created a model, whether it is “good” in some sense of the word. As you may imagine, this question has received a lot of attention from researchers and practitioners. Consequently, there is a wide variety of model performance metrics to choose from.

Note: For an idea of the range of options, have a look at the scikit-learn model evaluation page.

When selecting a model performance metric to assess the predictive quality of a model, it’s important to keep the following two things in mind:

  • The appropriateness of the metric for the problem
  • Whether or not the metric answers the business question

Appropriateness of the metric for the problem

Metrics are typically defined only for a specific class of problems, such as classification or regression. For a binary classification problem, several metrics characterize how often the model correctly answers the yes-or-no question it is asked. An additional level of detail is how often the model is correct within each class, positive and negative. We will go into detail on these metrics here.

Regression metrics, on the other hand, measure how close a prediction is to the target quantity. If we are trying to predict the price of a house, how close did we come? Are we systematically overestimating or underestimating? Are we getting the more expensive houses wrong but the cheaper ones right? There are many possible ways to look at regression metrics.
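To make the contrast concrete, here is a minimal sketch in plain Python. It is an illustration rather than the lesson's own implementation: the function names, the labels, and the house prices are all made up for this example. The first function reports overall correctness and correctness within each class; the second reports the average signed error (which reveals systematic over- or under-estimation) and the average absolute error (how close we came).

```python
def binary_classification_summary(y_true, y_pred):
    """Overall accuracy plus the fraction correct within each class.

    Hypothetical helper; assumes labels are 0/1 and both classes are present.
    """
    pairs = list(zip(y_true, y_pred))
    positives = [p for t, p in pairs if t == 1]  # predictions for true positives
    negatives = [p for t, p in pairs if t == 0]  # predictions for true negatives
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "true_positive_rate": sum(p == 1 for p in positives) / len(positives),
        "true_negative_rate": sum(p == 0 for p in negatives) / len(negatives),
    }


def regression_summary(y_true, y_pred):
    """Mean signed error (bias) and mean absolute error (closeness).

    Hypothetical helper; a positive mean error means we overestimate on average.
    """
    errors = [p - t for t, p in zip(y_true, y_pred)]
    return {
        "mean_error": sum(errors) / len(errors),
        "mean_absolute_error": sum(abs(e) for e in errors) / len(errors),
    }


# Made-up binary labels: 4 positive samples and 6 negative samples.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
clf = binary_classification_summary(labels, predictions)
print(clf)

# Made-up house prices in thousands of dollars.
true_prices = [100, 200, 300, 400]
predicted_prices = [110, 190, 330, 460]
reg = regression_summary(true_prices, predicted_prices)
print(reg)
```

In this toy data the classifier is right 70% of the time overall, but only half the time on the positive class, which is the kind of detail a single overall number hides. The regression example overestimates on average, and its largest errors fall on the most expensive houses.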
