Ranking

In this lesson, we'll explore different modeling options for the Tweet ranking problem.

Modeling options

Some questions to consider are: Which model will work best for this task? How should you set up these models? Should you set up completely different models for each task? Let’s go over the modeling options and try to answer these questions. We will also discuss the pros and cons of every approach.

Logistic regression

A sensible first model to train is a logistic regression model with regularization that predicts engagement from the dense features we discussed in the feature engineering lesson.
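To make this concrete, here is a minimal sketch of an L2-regularized logistic regression trained with batch gradient descent, using only the Python standard library. The three feature names, the synthetic data, and the hyperparameters (`l2`, `lr`, `epochs`) are assumptions made for illustration; they are not taken from the feature engineering lesson. In practice you would more likely rely on an established library.

```python
import math
import random


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_regression(rows, labels, l2=0.01, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on L2-regularized log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        for j in range(n_features):
            # l2 * weights[j] is the gradient of the (l2 / 2) * ||w||^2 penalty.
            weights[j] -= lr * (grad_w[j] / n + l2 * weights[j])
        bias -= lr * grad_b / n  # The bias is left unregularized.
    return weights, bias


def predict_proba(weights, bias, x):
    """Predicted probability of engagement for one feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical dense features per (user, Tweet) pair, already scaled to [0, 1]:
# [author_follow_affinity, tweet_recency, tweet_like_rate]
rng = random.Random(0)
rows, labels = [], []
for _ in range(400):
    x = [rng.random(), rng.random(), rng.random()]
    # Synthetic ground truth: engagement depends on affinity and like rate, not recency.
    p = sigmoid(4.0 * x[0] + 1.5 * x[2] - 3.0)
    rows.append(x)
    labels.append(1 if rng.random() < p else 0)

weights, bias = train_logistic_regression(rows, labels)
high = predict_proba(weights, bias, [0.9, 0.5, 0.9])
low = predict_proba(weights, bias, [0.1, 0.5, 0.1])
```

The predicted probability can then serve as the engagement score when ordering candidate Tweets.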

A key advantage of logistic regression is that it is reasonably fast to train, so you can quickly test whether new features improve the AUC or reduce the validation error. The model is also easy to interpret: provided the features are on comparable scales, the feature weights show which features have turned out to be more important than others.
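The feature-testing loop described above can be sketched as follows: compute the AUC of a baseline model and of a candidate model with a new feature on the same validation labels, then inspect the learned weights. This is a minimal illustration; the scores, weights, and feature names are made up, and the pairwise AUC computation is quadratic, so it only suits small validation sets.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_features_by_weight(feature_names, weights):
    """Sort features by absolute weight, largest first."""
    return sorted(zip(feature_names, weights), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical validation labels (1 = engaged) and model scores.
val_labels = [1, 1, 0, 1, 0, 0]
baseline_scores = [0.9, 0.4, 0.5, 0.6, 0.3, 0.2]   # model without the new feature
candidate_scores = [0.9, 0.7, 0.5, 0.6, 0.3, 0.2]  # model with the new feature

baseline_auc = roc_auc(val_labels, baseline_scores)    # 8 of 9 pairs ordered correctly
candidate_auc = roc_auc(val_labels, candidate_scores)  # all 9 pairs ordered correctly

# Hypothetical learned weights for three illustrative features.
ranking = rank_features_by_weight(
    ["author_follow_affinity", "tweet_recency", "tweet_like_rate"],
    [2.1, -0.3, 0.9],
)
```

Comparing absolute weights is only meaningful when the features share a comparable scale, which is one reason to normalize dense features before training.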

A major limitation of the linear model is that it assumes a linear relationship between the input features and the log-odds of the prediction. Therefore, you have to model feature interactions manually. For example, if you believe that the day of the week before a major holiday will have a major impact on your engagement prediction, you will have to create this feature in your training data yourself. Other models, such as tree-based models and neural networks, are able to learn these feature interactions and use them effectively for predictions.
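As a rough sketch of what creating such a feature manually can look like, the snippet below crosses day-of-week indicators with a "day before a major holiday" flag. Reading the holiday example as this particular cross is an assumption, and the holiday calendar, function name, and feature names are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical holiday calendar; a real system would load this from a maintained source.
MAJOR_HOLIDAYS = {date(2024, 12, 25)}
DAY_NAMES = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]


def holiday_eve_features(day: date) -> dict:
    """Build day-of-week indicators plus hand-crafted crosses with the holiday-eve flag."""
    is_eve = int((day + timedelta(days=1)) in MAJOR_HOLIDAYS)
    features = {"is_holiday_eve": is_eve}
    for index, name in enumerate(DAY_NAMES):
        is_this_day = int(day.weekday() == index)
        features[f"dow_{name}"] = is_this_day
        # Interaction term: 1 only when both conditions hold. A linear model sums
        # weighted inputs, so it cannot form this product on its own.
        features[f"dow_{name}_x_holiday_eve"] = is_this_day * is_eve
    return features


eve = holiday_eve_features(date(2024, 12, 24))               # Tuesday before the holiday
ordinary_tuesday = holiday_eve_features(date(2024, 12, 17))  # Tuesday, no holiday next day
```

With the cross in place, the linear model can learn a separate weight for "Tuesday and holiday eve" instead of being limited to the sum of the Tuesday weight and the holiday-eve weight.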

Another key question is whether, given your production needs, you want to train a single classifier for overall engagement or a separate classifier for each engagement action. In the single-classifier case, you can train one logistic regression model to predict the overall engagement on a Tweet. Tweets with any form of engagement will be considered as ...
