Estimation of Latency of an API
Learn the factors involved in calculating the API latency induced by the network.
We know that the response time of an API consists of network latency and processing time, as depicted in the following equation:

Response time = Network latency + Processing time
We calculated the processing time of an API in the previous lesson. Now, let's estimate the latency of an API for different HTTP methods in the subsequent sections.
Introduction
Latency (also known as network latency) is the message propagation time between a client and a server. To estimate the total response time of a service, we first need to estimate its network latency. Latency arising from the network (such as the global Internet) is an important factor in the design of an API because it answers the following key questions:
Is our service usable across different parts of the world?
What is the maximum time we can take to process a request (that is, the back-end processing time)? If network latency is high for some clients, the back-end services will have less time to complete the processing.
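To make the second question concrete, here is a minimal sketch of the back-end time budget: given a target response time and an estimated round-trip network latency, whatever remains is all the time the back end may spend. The target and latency figures below are assumed purely for illustration.

```python
# Back-end time budget: what is left of the response-time target
# after accounting for round-trip network latency.
# All figures are illustrative assumptions, not measurements.

def processing_budget_ms(target_response_ms: float, rtt_ms: float) -> float:
    """Time the back end may spend before the response-time target is missed."""
    return max(target_response_ms - rtt_ms, 0.0)

# A client near the data center vs. a client across the globe,
# both served under the same 300 ms response-time target.
nearby = processing_budget_ms(target_response_ms=300.0, rtt_ms=20.0)
faraway = processing_budget_ms(target_response_ms=300.0, rtt_ms=250.0)

print(nearby)   # 280.0 ms left for processing
print(faraway)  # 50.0 ms left for processing
```

The farther client leaves the back end barely 50 ms, which is why high-latency clients force either faster processing or geographically closer servers.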
In a request-response architecture, the aim is to store or retrieve data from a server. This architecture usually relies on the POST, GET, PUT, and DELETE methods. Although each method has its own importance, all of them follow the same structure and request pattern. We can generally categorize these methods into two types of requests:
Pull request: The type of request in which data is retrieved from the server. The GET method is an example of such a request. For such requests, the server's response message size is usually large.

Push request: The type of request in which we want to save data to the server or ask the server to perform an operation. Examples of such requests are POST, PUT, and DELETE. Generally, these methods have a larger request message size.