Ads Recommendation System Design

4. Calculation and estimation

Assumptions

  • 40K ad requests per second, or roughly 100 billion ad requests per month
  • Each observation (record) has hundreds of features and takes about 500 bytes to store.

Data size

  • Data: historical ad click data includes [user, ads, click_or_not]. With an estimated 1% CTR, 100 billion monthly ad requests yield about 1 billion clicked ads. We can start with 1 month of data for training and validation. One month of data holds 100 * 10^{9} * 500 = 5 * 10^{13} bytes, or 50 TB. One way to make this more manageable is to downsample the data, i.e., keep only 1%-10% of it, or to use 1 week of data for training and the next day for validation.
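
The back-of-the-envelope numbers above can be checked with a short script. This is a minimal sketch, not part of the original lesson: it assumes a 30-day month and decimal units (1 TB = 10^12 bytes), and the function names are hypothetical helpers chosen for illustration.

```python
# Back-of-the-envelope sizing for the ad click dataset.
# Assumptions (not from the lesson): 30-day month, 1 TB = 10**12 bytes.

SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_TB = 10**12


def requests_in_days(requests_per_second: int, days: int) -> int:
    """Total ad requests served over the given number of days."""
    return requests_per_second * SECONDS_PER_DAY * days


def storage_bytes(num_records: int, bytes_per_record: int = 500) -> int:
    """Raw storage needed if every request becomes one stored record."""
    return num_records * bytes_per_record


# 40K requests/second over 30 days is slightly above 100 billion.
exact_monthly = requests_in_days(40_000, days=30)

# The lesson rounds this to 100 billion records per month.
rounded_monthly = 100 * 10**9

# 1% CTR on 100 billion requests gives 1 billion clicks.
monthly_clicks = rounded_monthly // 100

# Full month of raw data: 100e9 records * 500 bytes.
full_month_bytes = storage_bytes(rounded_monthly)

# Downsampling options: keep 1% or 10% of the month.
sample_1pct_bytes = full_month_bytes // 100
sample_10pct_bytes = full_month_bytes // 10

# Alternative: train on one week of un-sampled data.
one_week_bytes = storage_bytes(requests_in_days(40_000, days=7))

if __name__ == "__main__":
    print(f"requests/month (exact):  {exact_monthly:,}")
    print(f"clicks/month at 1% CTR:  {monthly_clicks:,}")
    print(f"full month:              {full_month_bytes / BYTES_PER_TB:.1f} TB")
    print(f"1% sample:               {sample_1pct_bytes / BYTES_PER_TB:.1f} TB")
    print(f"10% sample:              {sample_10pct_bytes / BYTES_PER_TB:.1f} TB")
    print(f"one week, no sampling:   {one_week_bytes / BYTES_PER_TB:.1f} TB")
```

Under these assumptions, a full month comes to 5 * 10^13 bytes (50 TB), a 1%-10% sample to 0.5-5 TB, and one un-sampled week to roughly 12 TB, which is why either reduction makes the training set far easier to handle.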

Scale

  • Supports 100 million users

5. High level design
