In the previous chapters of this course, we looked at data cleaning, data exploration, and statistical inference. Now it is time to move to the last stage of the data science lifecycle: making predictions that help us in decision making. Until now, we have discovered interesting patterns in the data that we know are significant, but how do we use these patterns to predict future events? To do that, we build models.

Modeling

A model is a representation of a system. It tries to approximate real-world phenomena. For instance, Isaac Newton gave us a model of gravity. Using that model, we can predict how far or how high a ball will go if we throw it with a certain force. In the gravity model, certain factors affect the outcome, such as the force with which the ball is thrown and the mass of the ball. In the same way, we can build a model to predict whether a client will default on their credit card payment next month, where the client’s payment history and some other factors may affect the outcome of our model.

There are many different ways of making models and measuring their effectiveness. But first, let’s start with a very simple model.

One very simple model is that the customer always pays $15\%$ of the total bill as the tip. Mathematically:

$$\text{predicted tip} = 0.15 \times \mathrm{total\_bill}$$

Here $0.15$ is called our model parameter. If we denote the model parameter by $\theta$, the total_bill value by $x$, and the predicted tip by $\hat{y}$, then the above equation becomes

$$\hat{y} = \theta x$$

So our simple model is a mathematical function, $f(x) = \theta x$, that takes an input $x$ and returns the predicted tip.
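To make the idea of a model as a function concrete, here is a minimal sketch in Python. The function name predict_tip and the example bill amounts are assumptions chosen for illustration; only the rate of $0.15$ comes from the text above.

```python
def predict_tip(total_bill: float, theta: float = 0.15) -> float:
    """Constant-rate model: predicted tip = theta * total_bill."""
    return theta * total_bill


# Hypothetical bill amounts, used only to show how the function is called.
bills = [10.0, 20.0, 50.0]
predicted = [predict_tip(bill) for bill in bills]

for bill, tip in zip(bills, predicted):
    print(f"total_bill={bill:.2f} -> predicted tip={tip:.2f}")
```

Keeping $\theta$ as an argument rather than hard-coding it makes it easy to try other values of the parameter later.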

Let’s first create a column that holds the tip as a percentage of the total bill.
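A minimal sketch of this step, assuming the data sits in a pandas DataFrame with total_bill and tip columns. The three rows below are made up for illustration and are not the course dataset, and the column name pct_tip is also an assumption.

```python
import pandas as pd

# Made-up rows for illustration only; the column names "total_bill" and "tip"
# are assumed to match the dataset used in the course.
tips = pd.DataFrame(
    {
        "total_bill": [20.0, 10.0, 40.0],
        "tip": [3.0, 2.0, 5.0],
    }
)

# Tip expressed as a percentage of the total bill ("pct_tip" is a hypothetical name).
tips["pct_tip"] = tips["tip"] / tips["total_bill"] * 100

print(tips)
```

With the real data in place of the toy rows, a summary such as tips["pct_tip"].describe() would show how close typical tips are to the $15\%$ assumed by the simple model.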
