Apache Cassandra Primary Key and Partition Keys

Learn the significance and usage of the primary key in Apache Cassandra, with a detailed look at the partition key and its role in organizing and querying data efficiently.

The primary key is the most important part of the Apache Cassandra data model. A table’s primary key accomplishes two tasks in Apache Cassandra:

  1. Guarantees the uniqueness of the record

  2. Defines placement of the record in the cluster

A table’s primary key in Cassandra comprises one or more partition key columns and zero or more clustering columns. The partition key column(s) always appear first in the primary key, followed by any clustering columns.
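The structure described above can be sketched in CQL. This is a minimal illustration, assuming a keyspace is already selected; the table `sensor_readings` and all of its columns are hypothetical names chosen for this example.

```cql
-- Hypothetical table, for illustration only.
CREATE TABLE sensor_readings (
    sensor_id    uuid,
    day          date,
    reading_time timestamp,
    temperature  double,
    -- The inner parentheses group the partition key columns (sensor_id, day).
    -- Any columns listed after them (reading_time) are clustering columns.
    PRIMARY KEY ((sensor_id, day), reading_time)
);
```

Here the partition key has two columns and there is one clustering column; a table with a single partition key column and no clustering columns is equally valid.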

The CQL data type uuid is a good option for primary key columns, since generated UUIDs are unique for all practical purposes.
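As a small sketch of this advice, the following uses a uuid column as the primary key and lets Cassandra generate the value with the built-in `uuid()` function. The `users` table and its columns are hypothetical, and a keyspace is assumed to be selected.

```cql
-- Hypothetical table with a uuid primary key.
CREATE TABLE users (
    id   uuid PRIMARY KEY,
    name text
);

-- uuid() generates a random (version 4) UUID on the server side.
INSERT INTO users (id, name) VALUES (uuid(), 'Alice');
```

Generating the identifier on the client works just as well; the point is only that each row receives a distinct key without coordination.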

In Apache Cassandra, a table’s primary key cannot be altered once the table has been created. This is because the primary key dictates how the data is distributed in the cluster and how it is stored on disk.

Figure: Apache Cassandra primary key

Partition key

The partition key is the part of the primary key that defines where a record resides in the cluster. Cassandra uses it to locate the node from which the record is read or to which it is written. A partition is the set of rows that share the same value for their partition key.
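To make the idea of a partition concrete, here is a hedged sketch in which several rows share one partition key value. The table `orders_by_customer` and its data are hypothetical, and a keyspace is assumed to be selected.

```cql
-- Hypothetical table: customer_id is the partition key,
-- order_id is a clustering column.
CREATE TABLE orders_by_customer (
    customer_id int,
    order_id    int,
    total       decimal,
    PRIMARY KEY (customer_id, order_id)
);

-- These two rows share customer_id = 1, so they belong to the same partition.
INSERT INTO orders_by_customer (customer_id, order_id, total) VALUES (1, 100, 25.00);
INSERT INTO orders_by_customer (customer_id, order_id, total) VALUES (1, 101, 40.50);

-- customer_id = 2 forms a separate partition.
INSERT INTO orders_by_customer (customer_id, order_id, total) VALUES (2, 200, 12.75);

-- Restricting on the partition key reads a single partition.
SELECT * FROM orders_by_customer WHERE customer_id = 1;

-- token() exposes the value the partitioner derives from the partition key;
-- rows in the same partition have the same token.
SELECT token(customer_id), customer_id, order_id FROM orders_by_customer;
```

The first query needs to contact only the replicas that own the partition for customer 1, which is why queries that specify the full partition key are efficient.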

Simple partition key

A simple partition key comprises a single partition key column. A simple partition key is defined using the following syntax in the table definition:

PRIMARY KEY '(' partitionKeyColumn ')'
PRIMARY KEY '(' partitionKeyColumn ( ',' clusteringColumn )* ')'
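As an illustration of the two forms, the sketch below defines tables with a simple partition key. All table and column names are hypothetical, and a keyspace is assumed to be selected.

```cql
-- First form: a single partition key column and no clustering columns.
CREATE TABLE authors (
    author_id uuid,
    name      text,
    PRIMARY KEY (author_id)
);

-- Inline shorthand that is equivalent for a single-column primary key.
CREATE TABLE publishers (
    publisher_id uuid PRIMARY KEY,
    name         text
);

-- Second form: the first column (author_id) is the partition key;
-- the remaining columns (published, title) are clustering columns.
CREATE TABLE books_by_author (
    author_id uuid,
    published date,
    title     text,
    PRIMARY KEY (author_id, published, title)
);
```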

The above statements instruct Cassandra to partition and distribute the table’s rows in the cluster based on the first column provided in the primary key. For example, the courses table described in the previous lessons had a simple partition key, i.e., the id ...