Optimizing API Interactions—Caller

Learn how to improve API interactions as a caller.

We can make and receive API calls now, but we’re not necessarily doing so efficiently. Our goal isn’t just to be able to get things done but to do so scalably.
API latency is the amount of time it takes for a request to be processed and a response to be received, and it is a crucial component of user experience. It is also one of the critical aspects in most distributed systems that we need to optimize.

Because, as back-end engineers, we can be both the caller and the receiver, learning how to optimize our API interactions in both scenarios is essential. Let’s start with the caller.

Use caching effectively

Caching is one of the most effective ways to reduce API latency. By storing the results of frequently used API calls, we can quickly retrieve them from the cache instead of making a new request to the server. This can significantly reduce latency for repeated requests, especially when we make GET calls to read a resource that is unlikely to change frequently.

If the resource we request has a small memory footprint (a few KB at most), we could use an in-memory cache. For larger or more complex data structures, an external cache such as Redis might be wiser. That adds a network round trip, though, so make sure the trip to the cache takes less time than calling the server that owns the resource.
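As a minimal sketch of the in-memory option, the cache below keeps each response body for a fixed time to live (TTL) and only calls the server when an entry is missing or stale. The names TTLCache, GetOrFetch, and defaultTTL, the 30-second TTL, and the "GET /users/42" key are all assumptions made for illustration; they are not part of any library or of the network package built earlier.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// defaultTTL is an illustrative freshness window, not a recommendation.
const defaultTTL = 30 * time.Second

// entry is one cached response body plus the moment it stops being fresh.
type entry struct {
	body      []byte
	expiresAt time.Time
}

// TTLCache is a small in-memory cache that is safe for concurrent use.
type TTLCache struct {
	mu    sync.Mutex
	ttl   time.Duration
	items map[string]entry
}

// NewTTLCache returns an empty cache whose entries stay fresh for ttl.
func NewTTLCache(ttl time.Duration) *TTLCache {
	return &TTLCache{ttl: ttl, items: make(map[string]entry)}
}

// GetOrFetch returns the cached body for key while it is fresh. Otherwise it
// calls fetch (the real API call), stores the result, and returns it.
func (c *TTLCache) GetOrFetch(key string, fetch func() ([]byte, error)) ([]byte, error) {
	c.mu.Lock()
	e, ok := c.items[key]
	c.mu.Unlock()
	if ok && time.Now().Before(e.expiresAt) {
		return e.body, nil // cache hit: no network call
	}

	// The lock is released here so a slow call does not block other keys.
	body, err := fetch()
	if err != nil {
		return nil, err // failures are not cached
	}

	c.mu.Lock()
	c.items[key] = entry{body: body, expiresAt: time.Now().Add(c.ttl)}
	c.mu.Unlock()
	return body, nil
}

func main() {
	cache := NewTTLCache(defaultTTL)
	calls := 0
	fetchUser := func() ([]byte, error) {
		calls++ // stands in for a GET request to the upstream server
		return []byte(`{"id": 42}`), nil
	}
	for i := 0; i < 3; i++ {
		body, _ := cache.GetOrFetch("GET /users/42", fetchUser)
		fmt.Println(string(body))
	}
	fmt.Println("upstream calls:", calls)
}
```

Three reads result in a single upstream call here. The sketch never evicts old entries, and concurrent callers that miss on the same key may each call the server, so a production cache would need a size bound and probably request coalescing as well.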

Use meaningful time-outs

When we make a synchronous API call to a server, we depend entirely on how long the server takes to respond before we can move forward. Some APIs are asynchronous, and the client doesn’t wait for the server’s response. For synchronous API requests, however, we must set meaningful time-outs, after which we abort the call and return an error. Our user can then try again later.

But why not wait?

If a server we’re calling faces an outage and accepts our request but then struggles to respond, our call is stuck indefinitely. As our server keeps receiving incoming requests and tries to service them by calling this malfunctioning server, every stuck call holds on to resources such as connections and memory. We will soon run out of them, and our application will crash. Even before that point, our concurrency suffers, and we cannot handle any meaningful scale of requests. It doesn’t sound pleasant.

So, what do we do?

Specify time-outs! Go’s default HTTP client (http.DefaultClient) sets no time-out on our calls: its Timeout field is zero, which means no limit. A big problem, if you ask us. When we built our network package, we made sure it applied a default time-out even if the calling code didn’t specify one, precisely to avoid such unpleasant scenarios.
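One way to enforce such a default is sketched below. NewClient, isTimeout, and the 5-second defaultTimeout are hypothetical names and values chosen for illustration, not the actual network package from this course. The only standard-library behavior relied on is that http.Client aborts a request once its Timeout elapses and returns an error that reports itself as a time-out.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// defaultTimeout is an assumed fallback; tune it to the API being called.
const defaultTimeout = 5 * time.Second

// NewClient returns an HTTP client that always has a time-out. A zero or
// negative value falls back to defaultTimeout instead of meaning "no limit".
func NewClient(timeout time.Duration) *http.Client {
	if timeout <= 0 {
		timeout = defaultTimeout
	}
	return &http.Client{Timeout: timeout}
}

// isTimeout reports whether err was caused by a time-out.
func isTimeout(err error) bool {
	var netErr net.Error
	return errors.As(err, &netErr) && netErr.Timeout()
}

func main() {
	// A local test server stands in for a struggling upstream: it accepts
	// the request but waits before answering.
	slow := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(300 * time.Millisecond):
		case <-r.Context().Done(): // the caller gave up
		}
	}))
	defer slow.Close()

	client := NewClient(50 * time.Millisecond)
	resp, err := client.Get(slow.URL)
	if err != nil {
		// Abort and surface the error so the user can retry later.
		fmt.Println("request failed, timed out:", isTimeout(err))
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```

Client.Timeout covers the whole exchange, from connecting through reading the response body. When different calls need different limits, a per-request deadline created with context.WithTimeout and attached through http.NewRequestWithContext is a common alternative.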
