System Design: Sequencer
Explore why large distributed systems need globally unique identifiers (UIDs) for event tracing and primary key assignment. Understand why traditional auto-increment features fail in sharded database environments.
Motivation
Large distributed systems can process millions of events per second, such as user posts or financial transactions. Each event must be assigned a globally unique identifier, and these identifiers typically serve as primary keys in storage systems. Single-node databases often use auto-increment columns, but this approach breaks down in distributed systems: when multiple nodes generate identifiers independently, their local counters can produce the same value. Distributed environments, including horizontally sharded tables, therefore require a coordinated or decentralized strategy for generating unique IDs.
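To make the auto-increment problem concrete, here is a minimal sketch in Python. The `Shard` class and its `insert` method are hypothetical stand-ins for a database shard with its own auto-increment column, not a real database API. It assumes each shard starts its counter at 1 and has no knowledge of the other shard.

```python
from itertools import count


class Shard:
    """Hypothetical shard that assigns IDs from its own local counter."""

    def __init__(self, name):
        self.name = name
        # Assumption: each shard starts at 1, like a fresh auto-increment column.
        self._next_id = count(1)
        self.rows = {}

    def insert(self, payload):
        # Assign the next local ID; no coordination with other shards.
        row_id = next(self._next_id)
        self.rows[row_id] = payload
        return row_id


shard_a = Shard("a")
shard_b = Shard("b")

ids_a = [shard_a.insert(f"post-{i}") for i in range(3)]
ids_b = [shard_b.insert(f"txn-{i}") for i in range(3)]

# IDs that both shards handed out, so they are not globally unique.
collisions = set(ids_a) & set(ids_b)
print(sorted(collisions))  # [1, 2, 3]
```

Every ID issued by one shard is also issued by the other, so the ID alone cannot serve as a global primary key. Avoiding this requires either coordination between nodes or an ID scheme that is unique without coordination.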
Additionally, unique IDs facilitate debugging and tracing. For example, Facebook’s
How do we design a sequencer?
We divide the comprehensive design of a sequencer into two lessons:
Design of a unique ID generator: Define system requirements and discuss three generation methods: UUIDs, databases, and range handlers.
Unique IDs with causality: Incorporate time as a factor in ID generation to handle causality.
Designing a unique ID generator for a distributed system is challenging. The next lesson examines the specific requirements for this system.