Table option: CLUSTERING ORDER BY

Discover how the CLUSTERING ORDER BY table option in Cassandra defines the sort order of clustering columns within a partition, pre-optimizing data retrieval for queries.

As mentioned earlier, a table’s primary key in Cassandra consists of one or more partition key columns, listed first, followed by zero or more clustering columns.

Clustering columns distinguish rows within a partition and determine the order in which those rows are sorted. They are listed after the partition key in the PRIMARY KEY clause, and together with the partition key they form the table’s primary key. Without clustering columns, each partition contains only a single row.

The CLUSTERING ORDER BY table option

By default, rows in a partition are sorted in ascending order by clustering column values. If queries need the data in descending order (for instance, retrieving time series data from newest to oldest), the desired sort order can be specified in the table schema.
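The time series case can be sketched in CQL as follows. The table and column names (sensor_readings, sensor_id, reading_time, temperature) and the UUID literal are hypothetical, chosen only for illustration, and the statements assume a keyspace is already selected.

```cql
-- Hypothetical time series table: one partition per sensor.
CREATE TABLE sensor_readings (
    sensor_id    uuid,
    reading_time timestamp,
    temperature  double,
    PRIMARY KEY ((sensor_id), reading_time)
) WITH CLUSTERING ORDER BY (reading_time DESC);

-- Rows in the partition are stored newest first, so the ten most
-- recent readings come back without an ORDER BY clause.
SELECT reading_time, temperature
FROM sensor_readings
WHERE sensor_id = 123e4567-e89b-12d3-a456-426614174000
LIMIT 10;
```

Without the CLUSTERING ORDER BY option, the same table would store each sensor's readings oldest first, and the query would need ORDER BY reading_time DESC to return the newest rows first.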

Apache Cassandra achieves fast reads by storing the rows of a partition on disk in sorted order, so a query that wants rows in that order does not need to sort them at read time. The CLUSTERING ORDER BY table option takes a comma-separated list of clustering columns, each followed by ASC or DESC, to set the sort direction for each column.

Let’s recreate the courses table, specifying that the data should be partitioned on category and stored on disk sorted in ascending order of instructor and descending order of course title.
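A minimal sketch of such a schema is shown below. It includes only the columns named in this lesson, and the column name title and the text types are assumptions; the real courses table may well have additional columns. The statements assume a keyspace is already selected.

```cql
-- Remove the earlier version of the table, if there is one.
DROP TABLE IF EXISTS courses;

-- category is the partition key; instructor and title are clustering columns.
CREATE TABLE courses (
    category   text,
    instructor text,
    title      text,   -- assumed column name for the course title
    PRIMARY KEY ((category), instructor, title)
) WITH CLUSTERING ORDER BY (instructor ASC, title DESC);
```

The columns in CLUSTERING ORDER BY are listed in the same order as they appear in the PRIMARY KEY clause. Within each category partition, rows are sorted first by instructor from A to Z, and rows that share an instructor are then sorted by title from Z to A.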
