API Caching

Understand caching in API Gateway.

What is caching?

Caching is a savior for overloaded services. There are times when the demand for data far exceeds the velocity of that data, that is, the data is queried far more often than it changes. In such a situation, caching responses can save resources and improve latency. When we make an API call, the first request goes through the entire process of computing the response. Subsequent identical requests don’t have to repeat that work: we can remember the response of the first call and return it as is. The response is cached for some time, defined by the caching configuration, and then cleared so that subsequent requests go through the whole process again.
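The idea is easy to sketch in code. Below is a minimal, illustrative TTL cache in Python; the key stands in for “an identical request,” and compute_response is a hypothetical placeholder for the full backend computation, not part of any real gateway API.

```python
import time

class TTLCache:
    """Minimal in-memory response cache; entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, str]] = {}

    def get(self, key: str) -> str | None:
        entry = self._store.get(key)
        if entry is None:
            return None                      # never cached
        stored_at, response = entry
        if time.monotonic() - stored_at > self.ttl:
            del self._store[key]             # expired: force recomputation
            return None
        return response

    def put(self, key: str, response: str) -> None:
        self._store[key] = (time.monotonic(), response)

def compute_response(key: str) -> str:
    # Placeholder for the expensive backend work (queries, analysis, etc.).
    return f"response for {key}"

cache = TTLCache(ttl_seconds=300)            # remember responses for 5 minutes

def handle_request(key: str) -> str:
    cached = cache.get(key)
    if cached is not None:                   # hit: return the remembered response
        return cached
    response = compute_response(key)         # miss: go through the whole process
    cache.put(key, response)
    return response
```

Only the first call within each five-minute window pays the full computation cost; every other identical request is answered from memory.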

Example

Consider a weather forecast API. The accuracy expected of this API depends on the use case. An air force pilot may need a highly accurate response based on the most recent inputs. The average person, however, can tolerate lower accuracy; it’s perfectly fine if the response is a few minutes old. The forecast computation may be a complex process based on inputs from several external systems, followed by intense computation and analysis of historical data. Repeating that process every time a user invokes the API doesn’t make sense.

This is a classic scenario for caching. We can define a cache at the API Gateway that remembers the response for some time. If another request for the same API arrives within the defined time, it doesn’t go through all the computations; the gateway returns the cached response instead. If many users are calling the API, such a design can lead to significant savings. This comes at the cost of accuracy, but that’s not a serious problem if we can live with a stale response for some time.
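How this is configured depends on the gateway product. As one concrete possibility, if the gateway were Amazon API Gateway, a stage cache and a per-method TTL can be enabled through its update_stage API. Below is a sketch using boto3, where the REST API ID and stage name are placeholders, not values from this lesson:

```python
import boto3

apigw = boto3.client("apigateway")

# Enable a cache on a deployed stage and cache all methods' responses
# for 300 seconds. "my-api-id" and "prod" are placeholder identifiers.
apigw.update_stage(
    restApiId="my-api-id",
    stageName="prod",
    patchOperations=[
        # Provision the stage-level cache (size in GB).
        {"op": "replace", "path": "/cacheClusterEnabled", "value": "true"},
        {"op": "replace", "path": "/cacheClusterSize", "value": "0.5"},
        # Turn on caching for every resource/method and set the TTL.
        {"op": "replace", "path": "/*/*/caching/enabled", "value": "true"},
        {"op": "replace", "path": "/*/*/caching/ttlInSeconds", "value": "300"},
    ],
)
```

With such a configuration, the backend is invoked only on a cache miss or after the TTL expires; every request in between is served directly from the gateway’s cache.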
