Design a Newsfeed

Learn to design a newsfeed.

What is a newsfeed?

The newsfeed of a social media platform (such as Twitter, Facebook, or Instagram) is a list of stories generated by the entities that a user follows. It contains text, images, and videos, along with activity such as likes, comments, shares, and advertisements. The list is continuously updated and presented on the user’s home page. It draws on friends, followers, groups, and other pages, and it also includes the user’s own posts.

Requirements

Functional requirements

  • Newsfeed generation: The system will generate newsfeeds from the pages, groups, and people (friends and followers) associated with a user. It must curate the content effectively by selecting and ranking it to decide which of the many candidate items should be displayed to the user first.

  • Newsfeed contents: The newsfeed may contain text, images, and videos.

  • Newsfeed display: The system should add new incoming posts to the newsfeed of every active user according to some ranking mechanism. Once the posts are ranked, we show higher-ranked content to the user first.
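To make the selection and ranking requirement concrete, here is a minimal sketch. The `Post` fields, the `score` formula (likes decayed by age), and the `generate_newsfeed` function are all hypothetical choices made for illustration; the requirements above do not prescribe a ranking mechanism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: int
    author_id: int     # the page, group, or person that created the post
    created_at: float  # seconds since epoch
    likes: int = 0


def score(post: Post, now: float) -> float:
    """Hypothetical relevance score: engagement divided by age in hours."""
    age_hours = max(now - post.created_at, 0.0) / 3600.0
    return (1 + post.likes) / (1.0 + age_hours)


def generate_newsfeed(posts, followed_ids, now, limit=10):
    """Select posts from followed entities and return them highest-ranked first."""
    candidates = [p for p in posts if p.author_id in followed_ids]
    ranked = sorted(candidates, key=lambda p: score(p, now), reverse=True)
    return ranked[:limit]
```

A real system would likely use many more signals than likes and recency. The sketch only shows the two steps the requirement names: select candidates from followed entities, then order them by rank.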

Non-functional requirements

  • Scalability: Our proposed system should be highly scalable to support the ever-increasing number of users on any platform, such as Twitter, Facebook, and Instagram.

  • Fault tolerance: The system will handle a large amount of data spread across many components; therefore, partition tolerance (the ability to keep operating in the event of network failure between the system’s components) is necessary.

  • Availability: The service must be highly available to keep the users engaged with the platform. The system can compromise strong consistency for availability and fault tolerance, according to the PACELC theorem. (The PACELC theorem is an extension of the CAP theorem. It states that, in the event of a network partition, a system must choose between availability and consistency; otherwise, it must choose between latency and consistency.)

  • Low latency: The system should provide newsfeeds in real time. Therefore, the latency should not exceed two seconds.

Building blocks we will use

The design of the newsfeed system utilizes the following building blocks:

  • Databases are required to store the posts from different entities and the generated personalized newsfeed. They’re also used to store users’ metadata and their relationships with other entities, such as friends and followers.
  • The cache is an important building block for keeping frequently accessed data, such as posts, newsfeeds, and users’ metadata.
  • Blob storage is essential to store media content, for example, images and videos.
  • The CDN effectively delivers content to end users, reducing delay and burden on back-end servers.
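As a rough illustration of how the cache and the database might cooperate when a newsfeed is read, the sketch below uses a cache-aside lookup. This pattern is an assumption made for illustration, not something the list above specifies. The `NewsfeedStore` class is hypothetical, and plain dictionaries stand in for the real cache and database.

```python
class NewsfeedStore:
    """Hypothetical read path: check the cache first, fall back to the database."""

    def __init__(self, database: dict):
        self.database = database  # stand-in for the newsfeed database
        self.cache = {}           # stand-in for an in-memory cache
        self.db_reads = 0         # counts database lookups, for demonstration only

    def get_feed(self, user_id):
        # Cache hit: serve the frequently accessed feed without touching the database.
        if user_id in self.cache:
            return self.cache[user_id]
        # Cache miss: read from the database and keep the result for later requests.
        self.db_reads += 1
        feed = self.database.get(user_id, [])
        self.cache[user_id] = feed
        return feed
```

With this sketch, repeated calls to `get_feed` for the same user reach the database only once. A production cache would also need expiry and invalidation when new posts arrive, which are omitted here.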
...