Introduction to Typeahead Suggestion System
Understand the typeahead suggestion system and its functional and non-functional requirements.
Introduction
The typeahead suggestion, or autocomplete, feature helps users complete a query by offering known and frequently searched queries. It comes into play as soon as a user starts typing in the search box. The typeahead provides a list of suggestions to complete the query based on the user’s search history, the current context of the search, and trending content across users or regions. Frequently searched queries appear at the top of the suggestion list. The typeahead does not make the search itself faster; however, it helps the user form a query quickly. It is an essential part of search engines and enhances the user experience.
In this lesson, we will look at the requirements and estimate the resources needed for this design problem.
Requirements
Our design of the Typeahead system should meet the following requirements.
Functional requirements
The system should suggest the top N (say, the top 10) frequent and relevant terms to the user based on the text the user types in the search box.
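To make this requirement concrete, here is a minimal sketch of "top N suggestions for a typed prefix". It is a naive linear scan over an in-memory dictionary, not the data structure a system at this scale would use. The function name `suggest`, the `query_counts` mapping, and the sample data are all hypothetical, and "relevance" is approximated by raw search frequency alone.

```python
import heapq


def suggest(prefix, query_counts, n=10):
    """Return up to n stored queries that start with prefix, most frequent first.

    query_counts maps a query string to how often it was searched.
    Assumption: stored queries are already lowercase.
    """
    prefix = prefix.lower()
    # Keep only the queries that extend what the user has typed so far.
    matches = (
        (count, query)
        for query, count in query_counts.items()
        if query.startswith(prefix)
    )
    # Highest count first; ties are broken alphabetically for stable output.
    top = heapq.nsmallest(n, matches, key=lambda pair: (-pair[0], pair[1]))
    return [query for _, query in top]


# Hypothetical sample data for illustration only.
sample_counts = {
    "system design": 120,
    "system of a down": 300,
    "systemd": 80,
    "syntax error": 50,
}

print(suggest("sys", sample_counts, n=2))
# ['system of a down', 'system design']
```

Each keystroke would call `suggest` again with the longer prefix, which is why the latency budget discussed in the non-functional requirements matters.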
Non-functional requirements
- Low latency: The system should show suggested queries in real time as the user types; therefore, the latency should not exceed 200 ms. A study suggests that the average time between two keystrokes is 160 milliseconds, so our time budget for suggestions should be greater than 160 ms, but not by much, to give a real-time response.
- Fault tolerance: The system should be reliable enough to provide suggestions despite the failure of one or more of its components.
- Scalability: The system should support the ever-increasing number of users over time.
Capacity estimation
As stated earlier, the typeahead feature is used to enhance the user experience while typing a query. We need to design a system that works at a scale similar to that of Google Search. Google receives more than 3.5 billion searches each day. Designing such an enormous system is therefore a challenging task that requires various kinds of resources. Let’s estimate the storage and bandwidth requirements for the proposed system.
Storage estimation
Assume that out of 3.5 billion queries per day, 2 billion are unique and need to be stored. Also assume that, on average, each query consists of 15 characters and each character takes 2 bytes of storage. According to this formulation, we would require:
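As a quick sanity check, the sketch below simply multiplies the three figures assumed above (2 billion unique queries per day, 15 characters per query, 2 bytes per character). The constant and function names are our own, and the result is expressed in decimal gigabytes (1 GB = 10^9 bytes).

```python
# Figures taken from the assumptions stated in the text.
UNIQUE_QUERIES_PER_DAY = 2_000_000_000
AVG_CHARS_PER_QUERY = 15
BYTES_PER_CHAR = 2


def daily_storage_bytes(queries, chars_per_query, bytes_per_char):
    """Raw bytes needed to store one day's worth of new unique queries."""
    return queries * chars_per_query * bytes_per_char


daily_bytes = daily_storage_bytes(
    UNIQUE_QUERIES_PER_DAY, AVG_CHARS_PER_QUERY, BYTES_PER_CHAR
)
print(f"{daily_bytes / 10**9:.0f} GB per day")
# 60 GB per day
```

This counts only the raw query text; any index or metadata overhead is outside these assumptions.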
...