Feed Ranking System Design

Learn about the Feed Ranking system design for the LinkedIn application.

4. Calculation & estimation

Assumptions

  • 300 million monthly active users
  • On average, a user sees 40 activities per visit. Each user visits 10 times per month.
  • This gives 300 million * 40 * 10 = 12 * 10^{10}, or 120 billion, observations/samples per month.
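
The sample count follows directly from the three assumptions above. A minimal sketch of that arithmetic, where the constant and function names are illustrative and not part of the original design:

```python
# Traffic assumptions taken from the list above.
MONTHLY_ACTIVE_USERS = 300_000_000
ACTIVITIES_PER_VISIT = 40
VISITS_PER_USER_PER_MONTH = 10


def monthly_observations(users: int, activities_per_visit: int, visits_per_month: int) -> int:
    """Return how many (user, activity) impressions are logged in one month."""
    return users * activities_per_visit * visits_per_month


observations = monthly_observations(
    MONTHLY_ACTIVE_USERS, ACTIVITIES_PER_VISIT, VISITS_PER_USER_PER_MONTH
)
print(f"{observations:,} observations per month")
```

Each impression becomes one training row, so this number drives every estimate that follows.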

Data size

  • Assume the click-through rate is about 1% over 1 month. Out of 120 billion observations, that gives roughly 1.2 billion positive labels and about 118.8 billion negative labels. This is a huge dataset.

  • Generally, we can assume that for every data point, we collect hundreds of features. For simplicity, each row takes 500 bytes to store.

  • In one month, we need 120 billion rows. Total size: 500 * 120 * 10^{9} = 60 * 10^{12} bytes = 60 terabytes. To save costs, we can keep the last 6 months or 1 year of data in the data lake and archive old data in cold storage.
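
The label split and storage figures can be checked with a few lines of arithmetic. The sketch below assumes an exact 1% click-through rate, a flat 500 bytes per row, and decimal terabytes (10^12 bytes); the helper names are hypothetical.

```python
# Inputs taken from the estimates above.
OBSERVATIONS_PER_MONTH = 120 * 10**9
CLICK_THROUGH_RATE_PERCENT = 1  # assumption: exactly 1%
BYTES_PER_ROW = 500             # assumption: flat row size
BYTES_PER_TERABYTE = 10**12     # decimal terabyte


def label_counts(observations: int, ctr_percent: int) -> tuple[int, int]:
    """Split observations into (positive, negative) labels for a given CTR."""
    positives = observations * ctr_percent // 100
    return positives, observations - positives


def storage_terabytes(rows_per_month: int, bytes_per_row: int, months: int = 1) -> float:
    """Return the raw storage needed to retain `months` of rows, in terabytes."""
    return rows_per_month * bytes_per_row * months / BYTES_PER_TERABYTE


positives, negatives = label_counts(OBSERVATIONS_PER_MONTH, CLICK_THROUGH_RATE_PERCENT)
print(f"positives: {positives:,}  negatives: {negatives:,}")

# Compare the retention windows mentioned above: 1, 6, and 12 months.
for months in (1, 6, 12):
    size = storage_terabytes(OBSERVATIONS_PER_MONTH, BYTES_PER_ROW, months)
    print(f"{months:>2} month(s): {size:.0f} TB")
```

Under these assumptions, a 6-month window needs about 360 TB in the data lake and a 1-year window about 720 TB, before any compression or replication.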

Scale

  • Supports 300 million users

5. High-level design
