What Is Partitioning in Databases?

Learn about database partitioning and its benefits.

Introduction

Partitioning a database is the process of breaking down a massive dataset into smaller datasets, called partitions, and distributing them across multiple host machines. A single host can hold more than one partition.

Every record in the database belongs to exactly one partition. Each partition acts as a small database of its own and can serve read and write operations independently. A query can either target a single partition or be scattered across multiple partitions, with the partial results combined afterward.
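To make the routing idea concrete, here is a minimal in-memory sketch in Python. It assumes hash-based partitioning, which is only one of several possible schemes and is not prescribed by this lesson. The names NUM_PARTITIONS, HOSTS, partition_for, host_for, and PartitionedStore are hypothetical and chosen purely for illustration.

```python
import hashlib

# Hypothetical values for illustration only.
NUM_PARTITIONS = 8
HOSTS = ["host-a", "host-b", "host-c"]


def partition_for(key: str) -> int:
    """Map a record key to exactly one partition using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def host_for(partition: int) -> str:
    """Assign partitions to hosts round-robin, so one host holds several partitions."""
    return HOSTS[partition % len(HOSTS)]


class PartitionedStore:
    """Toy store: each partition is an independent dict acting as a small database."""

    def __init__(self):
        self.partitions = [dict() for _ in range(NUM_PARTITIONS)]

    def put(self, key, value):
        # A write touches only the partition that owns the key.
        self.partitions[partition_for(key)][key] = value

    def get(self, key):
        # Single-partition query: the key alone tells us where to look.
        return self.partitions[partition_for(key)].get(key)

    def scan(self, predicate):
        # Scatter-gather query: ask every partition, then merge the results.
        results = []
        for part in self.partitions:
            results.extend((k, v) for k, v in part.items() if predicate(k, v))
        return results
```

A lookup such as store.get("user:42") touches one partition only, while store.scan(...) has to visit all of them. That difference is why key-based lookups tend to scale better than queries that cannot be narrowed to a single partition.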

There are two ways to scale a database:

  • Vertical scaling: Vertical scaling means upgrading the capacity of existing hardware by adding resources such as disk space, CPU, and memory. It is also called scaling up. It cannot continue indefinitely, because every machine has a maximum amount of CPU, memory, and other resources it can be upgraded to. Vertical scaling also becomes expensive beyond a certain point.

  • Horizontal scaling: Horizontal scaling means adding more machines and spreading data and load across them. It is also called scaling out. Partitioning is an excellent example of horizontal scaling. Scaling out is cost-effective because it can run on commodity hardware.

These are the advantages of partitioning a database:

  • Support for large datasets: The data is distributed across multiple machines beyond what a single machine can handle, thereby supporting large dataset use cases.

  • Support for high throughput: Distributing data across multiple machines implies read and write queries can be independently handled by individual partitions. As a result, the database as a whole can support a larger throughput than what a single machine can handle.
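Both advantages can be turned into a rough back-of-the-envelope estimate. The Python sketch below assumes that data and load are spread evenly across machines, which real workloads rarely achieve. The function name machines_needed and all numbers in the usage note are hypothetical.

```python
import math


def machines_needed(dataset_gb, per_machine_gb, target_qps, per_machine_qps):
    """Estimate the minimum number of machines, assuming a perfectly even spread."""
    # Machines required just to store the data.
    by_storage = math.ceil(dataset_gb / per_machine_gb)
    # Machines required just to serve the query load.
    by_throughput = math.ceil(target_qps / per_machine_qps)
    # The tighter of the two constraints decides the cluster size.
    return max(by_storage, by_throughput)
```

For example, with a 10,000 GB dataset, 2,000 GB of usable storage per machine, a target of 50,000 queries per second, and 8,000 queries per second per machine, storage alone would need 5 machines but throughput would need 7, so the estimate is 7.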
