Estimation of Processing Time of an API

Learn to estimate the processing time of an API.

We'll cover the following...

Processing time
- Request processing estimation
- Query execution time
Estimating processing time
Discussion
Quiz

Let's start by estimating the processing time of an API.

Processing time

The processing time of a server is the time it takes to process a request and prepare a response. It is one of the main factors that affect response time, so estimating it is an important part of estimating the total response time of a service.

The illustration below shows, at a high level, what constitutes the processing time of an API. The server interacts with the database to execute queries for data retrieval, which might also involve file handling. Processing time includes the round trip from the API gateway to downstream services, the request execution time, and the response preparation time.


There is no rule of thumb for calculating the exact processing time. It depends on several things, such as the services, the components within the services, and the technologies (both hardware and software). Usually, processing involves analyzing a query and fetching data from the server’s memory or the corresponding database. The processing time depends primarily on the three factors listed below:

The type of request
The application server’s time to handle a request
Database query execution time
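
As a rough illustration of how these pieces combine, the sketch below adds up the components of processing time named above. It assumes the stages run one after another, so their times simply add. The class name, field names, and numbers are hypothetical placeholders chosen only to show the arithmetic, not measurements.

```python
from dataclasses import dataclass


@dataclass
class ProcessingTimeEstimate:
    """Hypothetical breakdown of one request's processing time, in milliseconds."""

    gateway_round_trip_ms: float    # API gateway <-> downstream services
    request_handling_ms: float      # application server work for this request type
    query_execution_ms: float       # database query execution
    response_preparation_ms: float  # building the response

    def total_ms(self) -> float:
        # Assumption: the stages are sequential, so their times add up.
        return (
            self.gateway_round_trip_ms
            + self.request_handling_ms
            + self.query_execution_ms
            + self.response_preparation_ms
        )


# Placeholder values for illustration only.
estimate = ProcessingTimeEstimate(
    gateway_round_trip_ms=1.0,
    request_handling_ms=50.0,
    query_execution_ms=10.0,
    response_preparation_ms=2.0,
)
print(f"Estimated processing time: {estimate.total_ms():.1f} ms")
```

If some stages overlap in practice (for example, queries issued in parallel), a plain sum like this overestimates the total, so it is best read as an upper bound.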

The processing time also depends on the specifications of the machine that processes the user’s request. Many servers are available, with different specifications to support different requirements. We’ll consider a typical server from Amazon Web Services (AWS) whose specifications are listed below:

Server Specifications

Component	Specification
Sockets	2
Processor	Intel Xeon X2686
RAM	240 GB
Cores	36 cores (72 hardware threads)
Cache (L3)	45 MB
Storage	15 TB

Request processing estimation

In this section, we’ll estimate the time a server takes to handle a request, depending on the type of request. There are two main types of requests, bound by either the CPU or memory.

CPU bound: These are requests where the CPU is the limiting factor.
Memory bound: These are requests where memory is the limiting factor.

Let's say that each CPU-bound request takes 200 milliseconds (ms) and each memory-bound request takes 50 ms to complete. The requests per second (RPS) for each type are calculated using the following formulas.
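
One common back-of-the-envelope way to turn these numbers into RPS can be sketched as follows. It uses the 72 hardware threads and 240 GB of RAM from the server specifications, and the 200 ms and 50 ms request times assumed in the text. Three things are assumptions of this sketch rather than figures from the lesson: each hardware thread serves one CPU-bound request at a time, each memory-bound worker holds 300 MB of RAM, and 1 GB is treated as 1000 MB. The function names are hypothetical.

```python
# Server figures taken from the specification table.
HARDWARE_THREADS = 72
RAM_GB = 240

# Per-request times assumed in the text, in milliseconds.
CPU_BOUND_TASK_MS = 200
MEMORY_BOUND_TASK_MS = 50

# Assumption (not from the text): RAM held by one worker serving a memory-bound request.
WORKER_MEMORY_MB = 300


def cpu_bound_rps(threads: int, task_time_ms: int) -> float:
    """Each hardware thread completes 1000 / task_time_ms requests per second."""
    return threads * 1000 / task_time_ms


def memory_bound_rps(ram_gb: int, worker_memory_mb: int, task_time_ms: int) -> float:
    """RAM caps the number of concurrent workers; each completes 1000 / task_time_ms requests per second."""
    concurrent_workers = (ram_gb * 1000) // worker_memory_mb  # 1 GB treated as 1000 MB
    return concurrent_workers * 1000 / task_time_ms


rps_cpu = cpu_bound_rps(HARDWARE_THREADS, CPU_BOUND_TASK_MS)
rps_memory = memory_bound_rps(RAM_GB, WORKER_MEMORY_MB, MEMORY_BOUND_TASK_MS)
print(f"CPU-bound:    {rps_cpu:.0f} RPS")
print(f"Memory-bound: {rps_memory:.0f} RPS")
```

With these inputs, the sketch gives 360 RPS for CPU-bound requests and 16,000 RPS for memory-bound requests. The memory-bound figure scales directly with the assumed worker size, so a different per-worker footprint changes the result proportionally.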
