...

/

Log-Structured Merge-Tree

Log-Structured Merge-Tree

Understand the semantics of the log-structured merge-tree.

Introduction

The Log Structured Merge Tree (LSM) is a disk-resident data structure to persist key-value pairs optimized for write-heavy query patterns.

LSM includes an append-only storage structure as the primary data structure to handle high write volumes. These storage structures are periodically merged and compacted in the background to eliminate duplicate and deleted records. LSM is a prominent data structure used in multiple NoSQL datastores such as Cassandra and Hbase.

Data structure

LSM includes four main data structures:

  • Memtable

  • SSTable

  • WAL

  • Bloom filter

Memtable

Memtable is an in-memory data structure that acts as the frontier for all the client requests handling reads and writes. Memtable implements a balanced binary search tree such as AVL tree, guaranteeing that the time complexity of insert, delete, update, and read requests be O(log2M)O(log_2M) ...