What is horizontal and vertical scaling?

Scaling means adding or removing resources from a machine or application so that it can perform its tasks at optimal cost and with adequate processing capability. Scalability is a challenge that every engineering team eventually faces.

Various scenarios require an application to:

  • Handle growth or decline in request volume
  • Store more or less data
  • Increase or reduce processing power

At this point, you will have two scaling options:

  1. Horizontal scaling
  2. Vertical scaling

Horizontal scaling (scaling out)

Horizontal scaling means adding more machines to the existing system. The data is scattered across multiple machines, each with its own capacity. Because existing machines are not modified, this process involves little downtime.

This method enables distributed programming, which entails distributing jobs across machines. However, horizontal scaling increases complexity as the system spans more machines: coordinating, updating, and sharing data across them becomes more expensive.

Adding three new racks to the system
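As a minimal sketch of how data can be scattered across machines, the snippet below hashes each key to pick a node. The node names and keys are hypothetical placeholders, not from any real cluster:

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical machines in the cluster

def node_for(key: str) -> str:
    """Map a key to one of the nodes by hashing it."""
    digest = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    return NODES[digest % len(NODES)]

# Each key consistently lands on one machine, so the data set as a whole
# is scattered across the cluster rather than stored on a single server.
placement = {key: node_for(key) for key in ["user:1", "user:2", "user:3"]}
```

Note that this naive modulo scheme reshuffles most keys when a node is added or removed; production systems such as Cassandra use consistent hashing to limit that movement.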

Vertical scaling (scaling up)

Vertical scaling means attaching more resources to an existing machine. Consider a server rack, as before: in this method, we add more resources, such as RAM, to the same rack. The data resides on a single machine and is not distributed as in horizontal scaling. Activities on such machines typically rely on multi-threading and in-process data passing. Vertical scaling is limited by the capacity of the existing machine, and scaling beyond that capacity causes downtime.

Adding CPU and RAM to the server rack
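The in-process, multi-threaded style mentioned above can be sketched as follows; `process` is a hypothetical stand-in for a real unit of work:

```python
from concurrent.futures import ThreadPoolExecutor

def process(item: int) -> int:
    # Stand-in for a unit of work; real tasks would do I/O or computation.
    return item * item

# All work stays on one machine and shares its memory; the achievable
# parallelism is capped by that machine's cores, which is exactly the
# capacity limit of vertical scaling.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(process, range(8)))
# results == [0, 1, 4, 9, 16, 25, 36, 49]
```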

Let’s look at the key differences between these methods:

| | Horizontal Scaling | Vertical Scaling |
|---|---|---|
| Maintenance | Complex, because you need to manage many machines. | Cheaper and less complex, because there are fewer nodes to manage. |
| Costs | Initial costs are higher, since new machines must be purchased, but machines with modest processing power are affordable. | Cheaper initially, since upgrading an existing machine costs less than buying new ones. |
| Fault tolerance | If one machine fails, the others can still provide the service. | A failure leads to loss of service. |
| Communication complexity | Multiple machines require complex protocols for exchanging data between them. | Data exchange is relatively straightforward, since there is only one machine. |
| Load balancing | Traffic and processing tasks can be distributed between the machines. | With a single machine, tasks cannot be spread out. Some parallelism is achievable with a multi-threading programming model, but it is limited by the machine's capacity. |
| Examples | Cassandra, Google Cloud Spanner | MySQL |
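To make the load-balancing difference concrete, here is a minimal round-robin balancer sketch for the horizontal case; the server addresses are hypothetical placeholders:

```python
from itertools import cycle

class RoundRobinBalancer:
    """Distribute incoming requests evenly across backend servers."""

    def __init__(self, servers):
        self._servers = cycle(servers)

    def next_server(self) -> str:
        # Hand out servers in a fixed rotation, one per request.
        return next(self._servers)

balancer = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
assignments = [balancer.next_server() for _ in range(6)]
# Each server receives every third request in turn.
```

Round-robin is the simplest strategy; real load balancers often weight servers by capacity or route by least active connections instead.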