...
Partitioning and Replication in DynamoDB
Learn how tables are partitioned and replicated in DynamoDB.
We chose a NoSQL schema because it allows for easy partitioning of tables. Let's quickly refresh our understanding of partitioning and learn how we will partition our tables. We will conclude this lesson by looking at how our design replicates partitions.
Partitioning
As the name suggests, partitioning means dividing a database or a table and storing the pieces on multiple nodes. This concept is also known as sharding. It is important to note that partitioning can be of two types, vertical and horizontal. We will briefly revisit both types in this lesson. Then, we will discuss how we will partition the tables in our design.
The purpose of partitioning is to distribute the load of read and write requests across several nodes. Each of the two types achieves this in a different way.
Vertical
Vertical partitioning is the splitting of a table by columns. The illustration below demonstrates vertical sharding.
In the example above, we have partitioned a table into two tables. Note how both tables have the same primary key.
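To make the idea concrete, here is a minimal Python sketch of vertical partitioning on an in-memory table. The table contents, the column names, and the helper `vertical_partition` are hypothetical and chosen only for illustration; they are not part of the lesson's design.

```python
# Hypothetical table: each row is a dict, and "user_id" is the primary key.
users = [
    {"user_id": 1, "name": "Ada", "email": "ada@example.com", "bio": "Mathematician"},
    {"user_id": 2, "name": "Linus", "email": "linus@example.com", "bio": "Engineer"},
]


def vertical_partition(rows, primary_key, column_groups):
    """Split a table by columns; every partition keeps the primary key."""
    partitions = []
    for columns in column_groups:
        partition = [
            {primary_key: row[primary_key], **{col: row[col] for col in columns}}
            for row in rows
        ]
        partitions.append(partition)
    return partitions


# One partition holds the "name" column, the other holds "email" and "bio".
names, contacts = vertical_partition(users, "user_id", [["name"], ["email", "bio"]])
```

Both partitions contain one entry per original row and share the `user_id` key, so a full row can be rebuilt by joining them on that key.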
Horizontal
Horizontal partitioning is the splitting of a table by rows. This is useful for large tables: each partition holds only a subset of the rows, so it is smaller than the original table. Unlike vertical partitioning, where every partition contains the same number of rows as the original table, horizontal partitions divide the rows among themselves. When the number of rows is expected to be large, horizontal partitioning is the better choice. Read-write access to a large table stored on a single server is limited to the throughput capabilities of that server. If we split the entries in the table equally and store them on two different servers, requests for the table are served by both servers, so the table as a whole can sustain higher throughput. The illustration below demonstrates horizontal partitioning.
Here, we can see that the resultant tables have the same schema. We've only split the entries in the original table into two tables with the same schema.
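The following Python sketch shows one way rows could be split horizontally. It assumes rows are assigned to partitions by hashing the primary key, which is a common approach but an assumption here, since the lesson has not yet said how rows are assigned. The helpers `partition_for` and `horizontal_partition` and the sample data are hypothetical.

```python
import hashlib


def partition_for(key, num_partitions):
    """Map a primary-key value to a partition index using a stable hash."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions


def horizontal_partition(rows, primary_key, num_partitions):
    """Split a table by rows; every partition keeps the full set of columns."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        index = partition_for(row[primary_key], num_partitions)
        partitions[index].append(row)
    return partitions


# Hypothetical table with 100 rows, split across two partitions (servers).
orders = [{"order_id": i, "item": f"item-{i}"} for i in range(100)]
shards = horizontal_partition(orders, "order_id", 2)
```

Every row lands in exactly one partition, and each partition has the same columns as the original table. Because `partition_for` is deterministic, a read for a given `order_id` can be routed to the partition that stores it without consulting the others.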
We will use horizontal sharding for two reasons. First, our data will not have a fixed schema. Second, we are expecting a vast number of rows in our tables, and we wish to distribute the throughput of our nodes across those entries. Automating horizontal sharding is also usually much easier than automating vertical sharding: achieving a fair distribution of throughput among partitions with vertical sharding requires knowing how frequently columns are accessed.
Note: For a detailed explanation of partitioning, visit the Data Partitioning lesson. ...