Credit Scoring Problem

Learn about credit scoring problems using the German Credit Dataset.

The credit scoring problem

Before we dive deep into AI fairness, it is time to introduce a motivating example.

Credits and loans are essential aspects of modern society. Credit decisions can shape people’s lives, for example, when they want to buy a house. Making these decisions well is also vital for financial institutions: loans generate a lot of income, but unpaid loans are a considerable cost. The impact can reach even further, as we saw during the 2007 financial crisis, which was tied to unpaid mortgages.

In an ideal world, a bank could perfectly predict whether a borrower will repay all liabilities. In such a situation, there would be no unpaid installments and therefore no debt collection costs, while people who couldn’t repay wouldn’t get a loan in the first place. That’s why banks invest a lot of money and effort in creating the rules by which loan applications are accepted. These models may or may not use machine learning. As George Box says, “All models are wrong, but some are useful” (Box GEP, Draper NR. Empirical Model-Building and Response Surfaces. New York, NY: Wiley, 1987: Vol. 424).

As machine learning practitioners, we know there is no perfect model, and we must deal with wrong predictions. We can easily formulate the credit decision problem as a binary classification. We can denote credit denial as the negative class (y=0) and credit approval as the positive one (y=1). Of course, we could reverse the labels. However, when analyzing fairness, doing so might change the interpretation of the results, so we must be cautious about which class we consider good. Comparing the model's prediction with what actually happens gives us four possible outcomes: true positives, false positives, true negatives, and false negatives.
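To make the four outcomes concrete, here is a minimal sketch that counts them for a handful of applicants. The labels are made-up toy values, not rows from the German Credit Dataset, and the helper name `outcome` is hypothetical. It also assumes that a ground-truth label of 1 means the applicant repaid (so approval was the right call) and 0 means the applicant did not.

```python
from collections import Counter

# Hypothetical toy data for illustration only (not from the German Credit Dataset).
# y_true: 1 = applicant repaid (approval was correct), 0 = applicant did not repay
# y_pred: 1 = model approves the credit, 0 = model denies it
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]


def outcome(actual: int, predicted: int) -> str:
    """Map one (actual, predicted) pair to one of the four outcomes."""
    if predicted == 1 and actual == 1:
        return "TP"  # approved, and the applicant repaid
    if predicted == 1 and actual == 0:
        return "FP"  # approved, but the applicant did not repay
    if predicted == 0 and actual == 1:
        return "FN"  # denied, although the applicant would have repaid
    return "TN"      # denied, and the applicant would not have repaid


counts = Counter(outcome(a, p) for a, p in zip(y_true, y_pred))

for name in ("TP", "FP", "FN", "TN"):
    print(name, counts[name])
```

With approval as the positive class, a false positive costs the bank money, while a false negative denies credit to someone who deserved it. If we reversed the labels, the same two mistakes would swap names, which is why the choice of the positive class matters when we interpret fairness results.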
