In Cassandra, a query that does not restrict the partition key is generally inefficient: the coordinator cannot tell which nodes hold the matching rows, so the query has to scan the table across all the cluster nodes.
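The cost difference can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the "cluster" is a Python list of dictionaries, `node_for` is a hypothetical stand-in for Cassandra's token ring, and the `user_id` and `city` columns are made up for the example.

```python
import zlib

NUM_NODES = 3  # hypothetical cluster size


def node_for(key: str) -> int:
    """Pick the node that owns a partition key (stand-in for the token ring)."""
    return zlib.crc32(key.encode()) % NUM_NODES


def insert(cluster, row):
    """Store the row on the node that owns its partition key."""
    cluster[node_for(row["user_id"])][row["user_id"]] = row


def get_by_partition_key(cluster, user_id):
    """Partition key known: one node is contacted and one lookup is done."""
    return cluster[node_for(user_id)].get(user_id)


def scan_by_column(cluster, column, value):
    """Partition key unknown: every node is contacted, every row is examined."""
    rows_examined = 0
    matches = []
    for node in cluster:
        for row in node.values():
            rows_examined += 1
            if row[column] == value:
                matches.append(row)
    return matches, rows_examined


cluster = [dict() for _ in range(NUM_NODES)]
for uid, city in [("u1", "Paris"), ("u2", "Oslo"), ("u3", "Paris"), ("u4", "Rome")]:
    insert(cluster, {"user_id": uid, "city": city})

matches, rows_examined = scan_by_column(cluster, "city", "Paris")
```

In this model, the lookup by `user_id` touches a single node, while the lookup by `city` examines every row stored on every node.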

Methods to perform queries efficiently

Two alternatives can be used to solve the above problem:

  • Secondary indexes
  • Materialized views

Secondary indexes

A secondary index can be defined on a column of a table. Each node then indexes its own local portion of the table by that column. A query on the indexed column still has to contact all the nodes in the cluster, but each node can retrieve the matching rows through its local index instead of scanning all of its data.
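The behaviour described above can be sketched as a toy model. This is not Cassandra's implementation: the `Node` class, the `city_index` attribute, and the `user_id` and `city` columns are hypothetical names chosen for illustration. In CQL, the index itself would be created with a `CREATE INDEX` statement on the column.

```python
import zlib
from collections import defaultdict


class Node:
    """One cluster node holding its own rows and a local index on 'city'."""

    def __init__(self):
        self.rows = {}                      # partition key -> row
        self.city_index = defaultdict(set)  # city -> partition keys on THIS node

    def write(self, row):
        # Remove the stale index entry if the row already exists locally.
        old = self.rows.get(row["user_id"])
        if old is not None:
            self.city_index[old["city"]].discard(row["user_id"])
        # The row and its index entry are updated together on the same node.
        self.rows[row["user_id"]] = row
        self.city_index[row["city"]].add(row["user_id"])

    def find_by_city(self, city):
        # Index lookup: no scan over self.rows is needed.
        return [self.rows[k] for k in sorted(self.city_index.get(city, ()))]


def node_for(key: str, num_nodes: int) -> int:
    """Stand-in for the token ring: choose the owner of a partition key."""
    return zlib.crc32(key.encode()) % num_nodes


def query_by_city(nodes, city):
    """The coordinator does not know where matches live, so it asks every node."""
    contacted = 0
    result = []
    for node in nodes:
        contacted += 1
        result.extend(node.find_by_city(city))
    return result, contacted


nodes = [Node() for _ in range(3)]
for uid, city in [("u1", "Paris"), ("u2", "Oslo"), ("u3", "Paris"), ("u4", "Rome")]:
    nodes[node_for(uid, len(nodes))].write({"user_id": uid, "city": city})

result, contacted = query_by_city(nodes, "Paris")
```

Every node is still contacted, but each one answers from its local index rather than examining all of its rows.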

Materialized views

A materialized view can be defined as a query on an existing table with a newly defined partition key. This materialized view is maintained as a separate table, and any changes to the original table are eventually propagated to it. The two approaches involve the following trade-offs.
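As a rough sketch of this idea, the toy model below keeps a base table partitioned by `user_id` and a view partitioned by `city`, with view updates applied later by an explicit `propagate` step. All names here are hypothetical, the `pending` queue only imitates asynchronous propagation, and the model handles inserts only (a change of `city` would also require removing the old view row). In CQL, the view would be declared with a `CREATE MATERIALIZED VIEW` statement.

```python
import zlib
from collections import deque

NUM_NODES = 3  # hypothetical cluster size


def node_for(key: str) -> int:
    """Stand-in for the token ring: choose the owner of a partition key."""
    return zlib.crc32(key.encode()) % NUM_NODES


base = [dict() for _ in range(NUM_NODES)]  # base table, partitioned by user_id
view = [dict() for _ in range(NUM_NODES)]  # view table, partitioned by city
pending = deque()                          # view updates not yet applied


def write(row):
    """Write to the base table now; the view update is only queued."""
    base[node_for(row["user_id"])][row["user_id"]] = row
    pending.append(row)


def propagate():
    """Apply the queued updates to the view (imitates eventual propagation)."""
    while pending:
        row = pending.popleft()
        partition = view[node_for(row["city"])].setdefault(row["city"], {})
        partition[row["user_id"]] = row


def read_view(city):
    """Only the node that owns the 'city' partition is contacted."""
    return sorted(view[node_for(city)].get(city, {}))


for uid, city in [("u1", "Paris"), ("u2", "Oslo"), ("u3", "Paris")]:
    write({"user_id": uid, "city": city})

before = read_view("Paris")  # view has not caught up yet
propagate()
after = read_view("Paris")   # view now reflects the base table
```

A read from the view goes to a single node, but until the queued updates are applied it can miss rows that already exist in the base table.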

Trade-offs with secondary indexes and materialized views

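  • Secondary indexes are more suitable for low cardinality columns, while materialized views are suitable for high cardinality columns as they are stored as regular tables. With a high cardinality column, a secondary index query contacts every node to return only a few rows.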

  • Materialized views are more efficient during read operations than secondary indexes because only the nodes that contain the corresponding partition are queried.

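  • Secondary indexes are updated locally on each node together with the base data, so they stay consistent with that node's data, while materialized views are updated asynchronously and are only eventually consistent with the base table.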
