Design Typeahead Suggestion
Learn to design a typeahead suggestion system.
Introduction
Typeahead suggestion, also referred to as the autocomplete feature, enables users to search for a known and frequently searched query. This feature comes into play when a user types a query in the search box. The typeahead system provides a list of suggestions to complete a query based on the user’s search history, the current context of the search, and trending content across different users and regions. Frequently searched queries always appear at the top of the suggestion list. The typeahead system doesn’t make the search faster. However, it helps the user form a sentence more quickly. It’s an essential part of all search engines that enhances the user experience.
Requirements
In this section, we look into the requirements and estimated resources that are necessary for the design of the typeahead suggestion system. Our proposed design should meet the following requirements.
Functional requirements
The system should suggest the top N (let’s say the top 10) most frequent and relevant terms to the user based on the text the user types in the search box.
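The functional requirement can be illustrated with a minimal sketch. It assumes an in-memory map from query to search count and ranks by frequency only; the `suggest` function, the `query_counts` sample data, and the alphabetical tie-break are hypothetical choices made for illustration, not the design this lesson goes on to propose.

```python
import heapq
from collections import Counter


def suggest(query_counts, prefix, k=10):
    """Return up to k queries that start with prefix, most frequent first."""
    prefix = prefix.lower()
    matches = (
        (count, query)
        for query, count in query_counts.items()
        if query.startswith(prefix)
    )
    # Highest count first; ties are broken alphabetically (an assumption).
    top = heapq.nsmallest(k, matches, key=lambda item: (-item[0], item[1]))
    return [query for _, query in top]


# Hypothetical sample data: query -> number of times it was searched.
query_counts = Counter({
    "uber": 500,
    "unicorn": 120,
    "universal": 300,
    "university": 450,
    "unit test": 80,
    "apple": 900,
})

print(suggest(query_counts, "uni", k=3))
# ['university', 'universal', 'unicorn']
```

Scanning every stored query on each keystroke is too slow for a real system; the sketch only pins down what "top N frequent terms for a prefix" means.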
Non-functional requirements
- Low latency: The system should show the suggested queries in real time as the user types, and the latency shouldn’t exceed 200 ms. A study suggests that the average time between two keystrokes is 160 milliseconds. Our time budget for suggestions can therefore be slightly greater than 160 ms: a user who types faster than that already knows what to search for and might not need suggestions. However, the response time shouldn’t be much higher, because a late suggestion might be stale and less useful.
- Fault tolerance: The system should be reliable enough to provide suggestions despite the failure of one or more of its components.
- Scalability: The system should support the ever-increasing number of users over time.
Building blocks we will use
The design of the typeahead suggestion system consists of the following building blocks that have been discussed in the elementary design problems of the course:
Databases are required to keep the data related to the queries’ prefixes.
Load balancers are required to distribute incoming queries among a number of active servers.
Caches are used to keep the top N suggestions for fast retrieval.
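To make the role of the cache concrete, here is a minimal sketch of a store that maps a prefix to its precomputed top N suggestions. The `SuggestionCache` class, its capacity parameter, and the least-recently-used eviction policy are assumptions for illustration; the text above only states that caches hold the top N suggestions for fast retrieval.

```python
from collections import OrderedDict


class SuggestionCache:
    """Maps a prefix to its top-N suggestions, evicting the least recently used prefix."""

    def __init__(self, capacity=1000):
        self.capacity = capacity
        self._entries = OrderedDict()  # prefix -> list of suggestions

    def get(self, prefix):
        """Return the cached suggestions for prefix, or None on a cache miss."""
        if prefix not in self._entries:
            return None
        self._entries.move_to_end(prefix)  # mark as most recently used
        return self._entries[prefix]

    def put(self, prefix, suggestions):
        """Store suggestions for prefix, evicting the oldest entry if over capacity."""
        self._entries[prefix] = list(suggestions)
        self._entries.move_to_end(prefix)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # drop the least recently used prefix


cache = SuggestionCache(capacity=2)
cache.put("un", ["university", "universal"])
cache.put("ub", ["uber"])
cache.get("un")            # touching "un" makes "ub" the eviction candidate
cache.put("ap", ["apple"])
print(cache.get("ub"))     # None
print(cache.get("un"))     # ['university', 'universal']
```

On a miss, the caller would fall back to the database and then call `put` with the result, so popular prefixes are answered from memory.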
High-level design
According to our requirements, the system shouldn’t just suggest queries in real time with minimum latency but should also store new search queries in the database. This way, the user gets suggestions based on popular and recent searches.
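The two responsibilities described above, serving suggestions and storing new searches, can be sketched as two methods on one small service. This is a simplified illustration: the `TypeaheadService` class and its method names are hypothetical, and an in-memory counter stands in for the database.

```python
from collections import Counter


class TypeaheadService:
    """Toy service with a read path (suggest) and a write path (record_search)."""

    def __init__(self, k=10):
        self.k = k
        self._counts = Counter()  # stand-in for the database of searched queries

    def record_search(self, query):
        """Write path: store a completed search so it can influence later suggestions."""
        normalized = query.strip().lower()
        if normalized:
            self._counts[normalized] += 1

    def suggest(self, prefix):
        """Read path: return up to k stored queries that start with prefix."""
        prefix = prefix.lower()
        matches = [(q, c) for q, c in self._counts.items() if q.startswith(prefix)]
        # Most frequent first; ties broken alphabetically (an assumption).
        matches.sort(key=lambda item: (-item[1], item[0]))
        return [q for q, _ in matches[: self.k]]


service = TypeaheadService(k=2)
for q in ["system design", "system design", "systemd",
          "sysadmin", "sysadmin", "sysadmin"]:
    service.record_search(q)
print(service.suggest("sys"))   # ['sysadmin', 'system design']

# Newly stored searches change what is suggested next.
service.record_search("systemd")
service.record_search("systemd")
print(service.suggest("sys"))   # ['sysadmin', 'systemd']
```

In a real deployment, the write path would typically be decoupled from the read path so that storing queries never slows down suggestions; here both run in the same process only to keep the example short.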
Our proposed system should do the following: ...