Data Partitioning

Learn about data partitioning in a distributed system.

We'll cover the following:

- What is partitioning?
  - An example of partitioning
- Why choose partitioning in the first place?
- Key takeaways

Data size is not the only problem: serving a large number of queries also becomes difficult when all the data is stored on a single machine.

Partitioning addresses both problems.

What is partitioning?

Partitioning is a mechanism for dividing data into smaller chunks based on the values of specific attributes. These chunks are called partitions.

Partitions are independent of one another, so two partitions can be stored on two different machines in a distributed system. Each partition can also be treated as a standalone database table.
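
To make the definition concrete, here is a minimal sketch of one possible way to divide records into partitions by an attribute. It assumes hash-based assignment, which is only one of several strategies, and the names `partition_for`, `partition_records`, the `users` sample data, and the choice of three partitions are all hypothetical, chosen for illustration.

```python
import hashlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index using a stable hash.

    hashlib is used instead of the built-in hash() because hash() of a
    string can differ between Python processes.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def partition_records(records, key_attr, num_partitions):
    """Group records into partitions based on the value of one attribute."""
    partitions = defaultdict(list)
    for record in records:
        index = partition_for(str(record[key_attr]), num_partitions)
        partitions[index].append(record)
    return dict(partitions)


# Hypothetical sample data: each record is a row of a "users" table.
users = [
    {"user_id": "alice", "country": "DE"},
    {"user_id": "bob", "country": "US"},
    {"user_id": "carol", "country": "IN"},
    {"user_id": "dave", "country": "US"},
]

# Each entry in `parts` could live on a different machine.
parts = partition_records(users, key_attr="user_id", num_partitions=3)
for index, rows in sorted(parts.items()):
    print(index, [row["user_id"] for row in rows])
```

Each record lands in exactly one partition, and the same key always maps to the same partition, so a lookup by `user_id` only needs to contact the machine that holds that partition.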

Note that another well-known term for this concept is sharding ...
