
Document Selection

From the one hundred billion documents on the internet, let's retrieve the top one hundred thousand that are relevant to the searcher's query.

We'll cover the following...

Document selection process
- Selection criteria
- Relevance scoring scheme


Using information retrieval techniques, we want to narrow the one hundred billion documents on the internet down to the top one hundred thousand that are relevant to the searcher’s query.

Let’s get some terminology out of the way before we start.

📝 Information retrieval is the science of searching for information in a document. It focuses on comparing the query text with the document text and determining what is a good match.
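To make "comparing the query text with the document text" concrete, here is a minimal sketch of the idea. It assumes a simple term-overlap score (the fraction of query terms that appear in the document), which is chosen only for illustration and is not necessarily the scoring scheme this lesson uses. The helper names `tokenize`, `overlap_score`, and `select_top_k`, and the sample documents, are hypothetical.

```python
import heapq
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query, document):
    """Fraction of distinct query terms that also appear in the document."""
    query_terms = set(tokenize(query))
    if not query_terms:
        return 0.0
    doc_terms = set(tokenize(document))
    return len(query_terms & doc_terms) / len(query_terms)


def select_top_k(query, documents, k):
    """Return up to k (score, doc_id) pairs with a non-zero score, best first."""
    scored = ((overlap_score(query, text), doc_id) for doc_id, text in documents.items())
    matches = [pair for pair in scored if pair[0] > 0]
    return heapq.nlargest(k, matches)


# Hypothetical toy corpus standing in for the documents on the internet.
documents = {
    "d1": "Italian restaurants in Seattle with outdoor seating",
    "d2": "How to train a neural network",
    "d3": "Best restaurants to visit in Rome",
}
top = select_top_k("italian restaurants", documents, k=2)
print(top)
```

Here `d1` contains both query terms and `d3` contains one, so they are selected in that order, while `d2` shares no terms with the query and is dropped. At web scale, scoring every document like this would be far too slow, so treat the sketch as an illustration of the matching idea rather than a practical retrieval system.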

Documents

Document types are as follows:

- Web pages
- Emails
- Books
- News stories
- Scholarly papers
- Text messages
- Word™ documents
- PowerPoint™ presentations

...
