What is the least response time load balancing technique?

Overview

The purpose of load balancers is to improve application performance and reduce the burden on individual servers by efficiently distributing incoming traffic across a group of servers. How efficient this distribution is depends on how servers are selected. For user-facing applications, this results in improved response times.

Note: Here, we will mainly talk about application load balancers.

Now, let’s jump right into the details of the least response time load balancing technique.

About the technique

The least response time load balancing technique takes into account both the current number of active connections on each server and each server's average response time. The load balancer forwards each new request to the server with the fewest active connections; when several servers tie on that count, it picks the one with the shortest average response time.

Note: The least response time load balancer is a dynamic load balancer, as it takes into account the current state of the servers while distributing incoming traffic.
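The selection rule can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the Server class, its field names, and the sample values are hypothetical, and the sketch assumes the load balancer already knows each server's connection count and average response time.

```python
from dataclasses import dataclass


@dataclass
class Server:
    # Hypothetical record of what the load balancer tracks per server.
    name: str
    active_connections: int
    avg_response_time: float  # any consistent unit, e.g. seconds


def pick_server(servers):
    # Tuples compare element by element, so the connection count is
    # compared first and the average response time only breaks ties.
    return min(servers, key=lambda s: (s.active_connections, s.avg_response_time))


pool = [
    Server("s1", active_connections=3, avg_response_time=0.20),
    Server("s2", active_connections=1, avg_response_time=0.35),
    Server("s3", active_connections=1, avg_response_time=0.12),
]
print(pick_server(pool).name)  # s3: ties with s2 on connections, but responds faster
```

Because the decision reads live counters on every call, the same function can return a different server from one request to the next, which is what makes the technique dynamic.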

Example

Let’s understand this with the help of an example:

  • Suppose three servers (ServerA, ServerB, and ServerC) are serving requests behind the load balancer, with 4, 2, and 0 active connections and average response times (measured as TTFB, time to first byte) of 3, 2, and 1, respectively. Assume that no existing connection closes while the following requests arrive.

  • ServerC will receive the 1st request, as it currently has no active connections.

  • ServerC will also receive the 2nd request, as it currently has the lowest number of active connections.

  • ServerB and ServerC both have the lowest number of active connections, but ServerC has the shortest average response time. Hence, ServerC will also receive the 3rd request.

  • ServerB will receive the 4th request, as it currently has the lowest number of active connections.

  • Again, ServerB and ServerC have the lowest number of active connections, but ServerC has the shortest average response time. Hence, ServerC will also receive the 5th request, and the cycle will continue in this way.

Note: If two or more servers share both the lowest number of active connections and the same average response time, the round-robin load balancing technique is used to choose among them.
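The walkthrough above can be replayed with a short simulation. This sketch makes two simplifying assumptions: connections never close during the five requests, and the average response times stay fixed. The route function is a hypothetical helper, and the round-robin tie-break from the note is left out because no exact tie occurs in this example.

```python
# Starting state from the example: name -> [active connections, avg response time]
servers = {
    "ServerA": [4, 3],
    "ServerB": [2, 2],
    "ServerC": [0, 1],
}


def route(servers):
    # Lowest number of active connections wins; the shortest average
    # response time breaks ties.
    name = min(servers, key=lambda n: (servers[n][0], servers[n][1]))
    servers[name][0] += 1  # the new request opens one more connection
    return name


assignments = [route(servers) for _ in range(5)]
print(assignments)
# ['ServerC', 'ServerC', 'ServerC', 'ServerB', 'ServerC']
```

The printed order matches the five steps listed above: ServerC takes the first three requests, ServerB takes the fourth, and ServerC takes the fifth.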

Algorithmic explanation

  1. Find the server(s) with the lowest number of active connections.

  2. If multiple servers share the lowest number of active connections, find the server(s) among them with the shortest average response time. There are two cases to note:

    Case 1: If multiple servers share the shortest average response time, apply the round-robin method and assign the new request to the server whose turn it is.

    Case 2: If exactly one server has the shortest average response time, assign the new request to this server.

  3. If exactly one server has the lowest number of active connections, assign the new request to this server.
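The steps above can be sketched as a small Python class. All names here (Server, LeastResponseTimeBalancer, choose) are hypothetical. The round-robin tie-break is one possible interpretation: it picks the next tied server after the one chosen previously, in list order. The sketch also compares response times for exact equality, which real measurements would rarely satisfy.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    active_connections: int = 0
    avg_response_time: float = 0.0


class LeastResponseTimeBalancer:
    def __init__(self, servers):
        self.servers = list(servers)
        self._last = -1  # index of the previously chosen server

    def choose(self):
        # Step 1: indices of servers with the lowest number of active connections.
        fewest = min(s.active_connections for s in self.servers)
        idx = [i for i, s in enumerate(self.servers)
               if s.active_connections == fewest]

        # Step 2: if several remain, keep those with the shortest average response time.
        if len(idx) > 1:
            fastest = min(self.servers[i].avg_response_time for i in idx)
            idx = [i for i in idx if self.servers[i].avg_response_time == fastest]

        # Case 1: still tied, so take the next candidate after the previous pick,
        # wrapping around to the first candidate.
        # Case 2 / step 3: a single candidate is simply returned.
        pick = next((i for i in idx if i > self._last), idx[0])
        self._last = pick

        server = self.servers[pick]
        server.active_connections += 1  # the new request occupies a connection
        return server


balancer = LeastResponseTimeBalancer(
    [Server("a", 0, 1.0), Server("b", 0, 1.0), Server("c", 0, 1.0)]
)
print([balancer.choose().name for _ in range(4)])  # ['a', 'b', 'c', 'a']
```

With three identical servers every request ends in a tie, so the balancer falls back to plain rotation. A real load balancer would also decrement active_connections when a request finishes, which this sketch omits.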

Advantages

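  • The least response time load balancing technique helps keep servers available, because new requests are steered away from servers that are busy or slow to respond.

  • It assigns new requests according to each server's current load, which helps prevent any single server from being overloaded.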

Limitations

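  • Because its routing decisions depend on the constantly changing state of the servers, the least response time load balancer behaves non-deterministically and is difficult to troubleshoot.

  • The algorithm for the least response time load balancer is more complex than static techniques such as round-robin and requires more processing, since connection counts and response times must be tracked for every server.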

  • Its performance is dependent on how good the response time estimates are.
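To illustrate the last limitation, here is one way a response time estimate might be maintained. The text above does not say how the average is computed, so this is purely an assumption: the sketch uses an exponentially weighted moving average, and the update_average function and the alpha value of 0.2 are hypothetical.

```python
def update_average(previous_avg, sample, alpha=0.2):
    """Blend a new response-time sample into the running estimate.

    alpha is a hypothetical smoothing factor: higher values react faster
    to recent samples but make the estimate noisier.
    """
    if previous_avg is None:
        return sample  # the first sample seeds the estimate
    return alpha * sample + (1 - alpha) * previous_avg


avg = None
for sample_ms in [100.0, 100.0, 500.0, 100.0]:
    avg = update_average(avg, sample_ms)
    print(round(avg, 1))
```

A single slow response (500 ms here) raises the estimate for several subsequent requests, so the choice of smoothing directly affects which server looks fastest to the load balancer.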

Copyright ©2025 Educative, Inc. All rights reserved