Introduction to Logistic Regression
Learn about classification problems and logistic regression.
In this chapter of the course, we’ll discuss:
The classification problem and how logistic regression addresses it.
How to interpret the results of logistic regression using the confusion matrix.
Overview
In classification problems, rather than predicting a continuous or quantitative output value (for example, today’s stock price or a house price), we are interested in a nonnumerical value: a categorical or qualitative output (for example, whether a stock index will increase or decrease). In this section, we focus on logistic regression as a method for classification. The logistic regression model is one of the most widely used machine learning algorithms for binary classification problems.
Examples of binary classification
The convention for binary classification is to have two classes, labeled 0 and 1, such as the following:
Win or loss
Pass or fail
Dead or alive
Spam or ham email
Insurance or loan defaults (Yes/No or 1/0)
Healthy or sick (Yes/No or 1/0)
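As a minimal sketch of the 0/1 convention, the snippet below maps text labels to integer classes. The label values and the choice of "spam" as the positive class (1) are assumptions made for illustration, not data from this lesson.

```python
# Hypothetical labels for illustration only; not data from the lesson.
labels = ["spam", "ham", "spam", "ham", "ham"]

# Assumption: treat "spam" as the positive class (1) and "ham" as the negative class (0).
class_to_int = {"ham": 0, "spam": 1}

# Encode each text label as its integer class.
y = [class_to_int[label] for label in labels]

print(y)  # [1, 0, 1, 0, 0]
```

Which class gets the label 1 is a modeling choice. It is usually the outcome of interest, such as a default or a spam email.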
Classification problem
Linear regression is not appropriate for a qualitative response (a classification problem). Let’s see why with the simple example below:
We want to predict the probability of default based on the outstanding balance on a customer’s credit card. What if we use linear regression, as shown in the plot above? We get negative estimated probabilities of default for balances close to zero. On the other hand, what if the balance is very large?
If we simply extrapolate the linear regression line, we get an estimated probability of default much greater than 1. According to the knowledge of ...
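The behavior described above can be reproduced with a small sketch. It fits an ordinary least squares line to a 0/1 default indicator and evaluates it at a balance of zero and at a very large balance. The data, the helper names (`fit_simple_linear_regression`, `predict`), and the balance values are all made up for illustration and are not the dataset behind the lesson's plot.

```python
# Synthetic, made-up data for illustration only (not the lesson's dataset).
balances = [0, 500, 1000, 1500, 2000, 2500]  # outstanding credit card balance
defaulted = [0, 0, 0, 1, 1, 1]               # 1 = default, 0 = no default


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the ordinary least squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_simple_linear_regression(balances, defaulted)


def predict(balance):
    """Linear estimate of the probability of default for a given balance."""
    return intercept + slope * balance


pred_low = predict(0)      # balance close to zero
pred_high = predict(5000)  # extrapolation to a very large balance

print(f"Estimated P(default) at balance 0:    {pred_low:.3f}")
print(f"Estimated P(default) at balance 5000: {pred_high:.3f}")
```

On this toy data, the first estimate falls below 0 and the second rises above 1, so neither can be read as a probability.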