...


Achieve Scalability with Load Balancing
Learn how to achieve scalability with load balancing.

Load balancing

We should always load balance our production application across at least two servers available to serve requests, so the application can stay online when a server restarts. At its simplest, load balancing is the act of making sure that all servers receive roughly the same number of requests over a given time. A well-balanced application is less likely to develop hot nodes whose resource usage is more stressed than that of other nodes. We can also add new servers to a well-balanced application to reduce the load on every other server in the cluster.

We’ll discuss the basics of load balancing before looking at how WebSockets can make achieving a well-balanced system more complex than a traditional HTTP-powered application.

The basics of load balancing

A load balancer is specialized software that acts as a proxy between a client and the servers that respond to its requests. Requests are distributed across back-end servers using round-robin, least-connections, or other criteria we define. Load balancers provide many benefits, such as the ability to add or remove back-end servers quickly, a fair distribution of work, and increased redundancy.
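To make the two most common strategies concrete, here is a minimal sketch of round-robin and least-connections selection in Python. The server names are placeholders, and a real load balancer would also decrement the connection count when a request finishes; this sketch only shows how each strategy picks the next server.

```python
from itertools import cycle

# Hypothetical back-end pool; the host names are illustrative only.
servers = ["app1:4000", "app2:4000", "app3:4000"]

# Round-robin: hand out servers in a fixed rotation, one after another.
_rotation = cycle(servers)

def pick_round_robin():
    return next(_rotation)

# Least-connections: track active connections per server and pick the
# least-loaded one. A real balancer would decrement on request completion.
connections = {server: 0 for server in servers}

def pick_least_connections():
    server = min(connections, key=connections.get)
    connections[server] += 1
    return server
```

Three consecutive round-robin picks visit every server exactly once, while least-connections always routes to whichever server currently has the fewest open connections, which adapts better when some requests (such as long-lived WebSocket connections) last much longer than others.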

Here’s an example of an application whose load is not correctly balanced. The top application server has received many more requests than the other servers in the application.

Each server in this figure would have roughly 22 requests in a well-balanced application. A better balance allows for a more predictable usage of system resources.

As with all software, we can use either free open-source or commercial closed-source load balancers with our application. Most cloud service providers, such as Amazon Web Services, Google Cloud, and DigitalOcean, provide load balancers that work out of the box. We may also opt for an open-source load balancer like HAProxy or nginx.
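As a sketch of what configuring one of these looks like, here is a minimal nginx example that proxies traffic to a pool of back-end servers. The upstream name, host names, and ports are placeholders; nginx defaults to round-robin, and the `least_conn` directive switches the pool to least-connections balancing.

```nginx
# Illustrative nginx configuration; host names and ports are placeholders.
upstream app_backend {
    least_conn;  # route each request to the server with the fewest active connections
    server app1.example.com:4000;
    server app2.example.com:4000;
    server app3.example.com:4000;
}

server {
    listen 80;

    location / {
        proxy_pass http://app_backend;
    }
}
```

Adding or removing a `server` line and reloading nginx is all it takes to grow or shrink the pool, which is one of the operational benefits mentioned above.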

We will need to pick a load balancer that supports ...