Final Design of Quora
Learn about the limitations of Quora's design and improve the design.
Limitations of the proposed design
The proposed design serves all the functional requirements. However, it has a number of serious drawbacks that emerge as we scale. This means that we are unable to fulfill the non-functional requirements. Let’s explore the main shortcomings below:
-
Limitations of web and application servers: To entertain the user’s request, payloads are transferred between web and application servers, which increases latency because of network I/O between these two types of servers. Even if we achieve parallel computation by separating the web from application servers (that is, the manager and worker processes), the added latency due to an additional network link erodes a user’s experience. Apart from data transfer, control communication between the router library with manager and worker processes also imposes additional performance penalties.
-
In-memory queue failure: The internal architecture of application servers log tasks and forward them to the in-memory queues, which serve them to the workers. These in-memory queues of different priorities can be subject to failures. For instance, if a queue gets lost, all the tasks in that queue are lost as well, and manual engineering is required to recover those tasks. This greatly reduces the performance of the system. On the other hand, replicating these queues requires increasing RAM size. Also, with the number of features (functional requirements) that our system offers, many tasks can get assembled, which results in insufficient memory. At the same time, it is not desirable to choke application servers with not-so-urgent tasks. For example, application servers should not be burdened with tasks like storing view counts for answers, adding statistics to the database for later analysis, and so on.
-
Increasing QPS on MySQL: Because we have a higher number of features offered by our system, few MySQL tables receive a lot of user queries. This results in a higher number of QPS on certain MySQL servers, which can result in higher latency. Furthermore, there is no scheme defined for disaster recovery management in our design.
-
Latency of HBase: Even though HBase allows high real-time throughput, its
latency is not among the best. A number of Quora features require the ML engine that has a latency of its own. Due to the addition of the higher latency of HBase, the overall performance of the system degrades over time.P99 P99 stands for 99th percentile. It means that 99 percent of the queries are entertained below a specific number. For example, P99 of 20 ms means that 99 percent of the queries will be replied to within 20 ms by a server.
The issues highlighted above require changes to the earlier proposed design. Therefore, we’ll make the following adjustments and update our design: