...

/

Introduction to Logistic Regression

Introduction to Logistic Regression

Learn about classification problems and logistic regression.

In this chapter of the course, we’ll discuss:

  • The classification problem and logistic regression to find the answer to our problem.

  • How to interpret the results from logistic regression through the confusion matrix.

Overview

In the classification problems, rather than predicting a continuous or quantitative output value (that is, today’s stock price, house price, and others), we are interested in nonnumerical value, a categorical or qualitative output (that is, the stock index increase or decrease). In this section, our focus will be on learning logistic regression as a method for classification. The logistic regression model is one of the most widely used machine learning algorithms for binary classification problems.

Examples of binary classification

The convention for binary classification is to have two classes, 0 and 1, like the followings:

  • Win or loss

  • Pass or fail

  • Dead or alive

  • Spam or ham email

  • Insurance or loan defaults (Yes/No or 1/0)

  • Healthy or sick (Yes/No or 1/0)

Classification problem

Linear regression is not appropriate for a qualitative (classification problem) response. Let’s try to understand with a simple example below:

We want to predict the probability of default based on the customer’s outstanding balance of their credit cards. What if we use linear regression, as shown in the plot above? We get some negative values of the estimated probability of default for balances close to zero. On the other hand, what if the balance is very large?

If we just extrapolate our linear regression line, we will get a value much bigger than 1 of our estimated probability of default. According to the knowledge of ...