Rebalancing

Learn about the different ways to rebalance partitions.

Introduction

Rebalancing in a distributed system is the movement of data between host instances. In the context of database partitioning, rebalancing means moving partitions from one host instance to another.

These are scenarios where a distributed database requires rebalancing:

  • If an existing host instance crashes, the database must reassign the partitions of the crashed host to the remaining hosts.

  • If a new host instance joins the cluster, the database must move some partitions from the existing hosts to the new host so that the distribution stays uniform.

  • If host instances are scaled up with more CPU, memory, or other resources, the database must redistribute partitions between the hosts to make use of the added capacity.

  • As query throughput and dataset size grow over time, the database must redistribute partitions between host instances.
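
The crash and join scenarios above can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm of any particular database: the function name rebalance, the partition-to-host dictionary, and the host names are all hypothetical. It assumes every host has equal capacity, so it does not cover the scale-up scenario, and it tries to move as few partitions as possible.

```python
def rebalance(assignment, live_hosts):
    """Return a new {partition: host} mapping spread evenly over live_hosts.

    Hypothetical sketch: partitions stay on their current host when possible;
    only partitions on missing or overloaded hosts are moved.
    """
    hosts = sorted(live_hosts)
    if not hosts:
        raise ValueError("at least one live host is required")

    # Group partitions by current owner; partitions whose owner is no longer
    # live (for example, a crashed host) become orphans that must be moved.
    owned = {host: [] for host in hosts}
    orphans = []
    for partition, host in sorted(assignment.items()):
        if host in owned:
            owned[host].append(partition)
        else:
            orphans.append(partition)

    # Every host gets `base` partitions; `extra` hosts get one more. The extra
    # slots go to the hosts that already hold the most, to reduce movement.
    base, extra = divmod(len(assignment), len(hosts))
    by_load = sorted(hosts, key=lambda h: -len(owned[h]))
    quota = {h: base + (1 if i < extra else 0) for i, h in enumerate(by_load)}

    # Take partitions away from hosts that are over quota...
    for host in hosts:
        while len(owned[host]) > quota[host]:
            orphans.append(owned[host].pop())

    # ...and hand all orphans to hosts that are under quota.
    for host in hosts:
        while len(owned[host]) < quota[host]:
            owned[host].append(orphans.pop())

    return {p: host for host, parts in owned.items() for p in parts}


# Six partitions on two hosts; host "c" joins, then host "b" crashes.
before = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
after_join = rebalance(before, ["a", "b", "c"])
after_crash = rebalance(after_join, ["a", "c"])
```

In this sketch, a join moves only the partitions needed to fill the new host, and a crash moves only the partitions that lived on the failed host. A real system would also have to copy the partition data and keep serving requests while the move is in progress.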

Prerequisites

Rebalancing partitions in a distributed database is expected to satisfy certain prerequisites:

  • After the rebalance operation, data storage and read/write load should be distributed uniformly across the host instances.

  • During the rebalance process, the ...