...

Dynamic Horizontal Scaling

Learn how to dynamically adjust an application's capacity based on incoming or predicted traffic.

One important advantage of modern cloud-based infrastructure is the ability to dynamically adjust the capacity of an application based on the current or predicted traffic. This is also known as dynamic scaling. If implemented properly, this practice can reduce the cost of the IT infrastructure enormously while still keeping the application highly available and responsive.

The idea is simple: if our application is experiencing performance degradation caused by a traffic peak, the system automatically spawns new servers to cope with the increased load. Similarly, if we see that the allocated resources are underutilized, we can shut some servers down to reduce the cost of the running infrastructure. We can also perform scaling operations on a schedule; for instance, we can shut down some servers during the hours of the day when we know the traffic will be lighter, and restart them just before peak hours. These mechanisms require the load balancer to stay up to date with the current network topology, so that it knows which servers are up at any given time.
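To make the reactive variant of this idea concrete, here is a minimal sketch of an autoscaling decision loop. Everything in it is an assumption for illustration: the shape of the metrics, the thresholds, and the instance limits are placeholders, not the API of any particular cloud provider.

```typescript
// Minimal sketch of an autoscaling decision. Illustrative assumptions only:
// the Metrics shape, thresholds, and limits are not any provider's real API.

interface Metrics {
  avgCpuPercent: number;  // average CPU utilization across all instances
  instanceCount: number;  // instances currently running
}

const MIN_INSTANCES = 2;
const MAX_INSTANCES = 10;
const SCALE_UP_THRESHOLD = 75;   // % CPU above which we add capacity
const SCALE_DOWN_THRESHOLD = 25; // % CPU below which we remove capacity

// Decide how many instances we should be running given the current metrics.
function desiredInstances(m: Metrics): number {
  if (m.avgCpuPercent > SCALE_UP_THRESHOLD) {
    return Math.min(m.instanceCount + 1, MAX_INSTANCES);
  }
  if (m.avgCpuPercent < SCALE_DOWN_THRESHOLD) {
    return Math.max(m.instanceCount - 1, MIN_INSTANCES);
  }
  return m.instanceCount; // within the comfort zone: do nothing
}

// Example: 3 instances averaging 82% CPU -> scale up to 4.
console.log(desiredInstances({ avgCpuPercent: 82, instanceCount: 3 })); // 4
```

A scheduler-based variant would simply call the same scale-up/scale-down actions at fixed times of day instead of in response to metrics.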

Using a service registry

A common pattern to solve this problem is to use a central repository called a service registry, which keeps track of the running servers and the services they provide.

The illustration below shows a multiservice architecture with a load balancer on the front, configured dynamically using a service registry.

Figure: A multiservice architecture with a load balancer on the front, configured dynamically using a service registry

The architecture in the illustration above assumes the presence of two services, API and WebApp. There can be one or many instances of each service, spread across multiple servers.
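To show what interacting with a registry might look like, here is a minimal in-memory sketch. Production systems usually rely on a dedicated registry such as Consul, etcd, or ZooKeeper; the class below only illustrates the interface such a system exposes, and all identifiers and addresses in it are illustrative assumptions.

```typescript
// Minimal in-memory sketch of a service registry. In production this would
// be a dedicated, replicated system; the interface here is illustrative.

interface ServiceInstance {
  id: string;       // unique instance identifier
  service: string;  // logical service name, e.g., "api" or "webapp"
  address: string;  // host:port where the instance can be reached
}

class ServiceRegistry {
  private instances = new Map<string, ServiceInstance>();

  // A server calls register() when an instance starts...
  register(instance: ServiceInstance): void {
    this.instances.set(instance.id, instance);
  }

  // ...and unregister() when it shuts down (or fails a health check).
  unregister(id: string): void {
    this.instances.delete(id);
  }

  // The load balancer queries the registry to discover live instances.
  lookup(service: string): ServiceInstance[] {
    return Array.from(this.instances.values()).filter(
      (i) => i.service === service
    );
  }
}

// Usage mirroring the illustration: three API instances and one WebApp
// instance (hostnames and ports are made up for the example).
const registry = new ServiceRegistry();
registry.register({ id: 'api-1a', service: 'api', address: 'api1.example.com:3000' });
registry.register({ id: 'api-1b', service: 'api', address: 'api1.example.com:3001' });
registry.register({ id: 'api-2a', service: 'api', address: 'api2.example.com:3000' });
registry.register({ id: 'web-1', service: 'webapp', address: 'web1.example.com:8080' });

console.log(registry.lookup('api').length); // 3
```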

When a request to example.com is received, the load balancer checks the prefix of the request path. If it’s the /api prefix, the request is load balanced between the available instances of the API service. In the illustration above, we have two instances running on the api1.example.com server and one instance running on the api2.example.com server. For all the other path prefixes, the request is load balanced between the available instances of the WebApp service. In the illustration, we have only one WebApp instance, which is running on ...
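The following sketch shows how the load balancer's routing logic might be implemented: choose a service by path prefix, then round-robin among that service's instances. It uses only Node's built-in http module; the instance table, hostnames, and ports are illustrative assumptions, and in a real deployment the table would be kept up to date from the service registry rather than hardcoded.

```typescript
// Sketch of prefix-based routing with round-robin balancing.
// Hostnames, ports, and the hardcoded instance table are assumptions.
import * as http from 'http';

// Snapshot of live instances per service. In the pattern described above,
// this table would be refreshed from the service registry at runtime.
const instances: Record<string, string[]> = {
  api: ['api1.example.com:3000', 'api1.example.com:3001', 'api2.example.com:3000'],
  webapp: ['web1.example.com:8080'],
};

const counters: Record<string, number> = {};

// Round-robin: rotate through the instances of the given service.
function pickInstance(service: string): string | undefined {
  const live = instances[service] ?? [];
  if (live.length === 0) return undefined;
  const n = counters[service] ?? 0;
  counters[service] = n + 1;
  return live[n % live.length];
}

const balancer = http.createServer((req, res) => {
  // Route by path prefix: /api goes to the API service,
  // everything else to the WebApp service.
  const service = req.url?.startsWith('/api') ? 'api' : 'webapp';
  const target = pickInstance(service);
  if (!target) {
    res.writeHead(502);
    res.end('No available instances');
    return;
  }
  const [host, port] = target.split(':');
  // Forward the request to the chosen instance, streaming the body both ways.
  const upstream = http.request(
    { host, port: Number(port), path: req.url, method: req.method, headers: req.headers },
    (upstreamRes) => {
      res.writeHead(upstreamRes.statusCode ?? 502, upstreamRes.headers);
      upstreamRes.pipe(res);
    }
  );
  req.pipe(upstream);
});

balancer.listen(8000); // the entry point that would sit behind example.com
```

Because the instance table is driven by the registry, scaling a service up or down changes the routing behavior without any reconfiguration of the balancer itself.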