Introduction to Two-Phase Locking (2PL)

Let's introduce the working principle of the two-phase locking mechanism for concurrency management.

Concurrency is everywhere in reading and writing to a data store. It poses challenges to prove that certain concurrent code is safe to execute in parallel with others, potentially interleaving arbitrarily. The problem of concurrent data access was thoroughly studied in the context of databases and transactions in 1974. The transaction is an abstraction to safely deal with concurrency (and many other data issues). We will discuss concurrency control in the context of transactions, though the concepts are more widely applicable.

This chapter will elaborate on potential problems of data transactions, and a specific strategy databases employ to prevent them, i.e., Two-Phase Locking (2PL). We will focus mainly on:

  • Concurrency management
  • Addressing numerous race conditions due to concurrency insurance
  • Implementations of databases’ isolation levels like serializability using 2PL

Race conditions

Multiple clients typically access databases at once. That is not a problem if they read and write to different database entries. We might still encounter concurrency issues if they access the exact database records. These are called race conditions.

Example

Imagine that two clients are incrementing a database-stored counter at the same time. Each client must read the current value, add 1, and then write back the new value, assuming the database doesn’t have a built-in increment operation. Due to the race condition, the counter only gets to 13 when it should have gone from 12 to 14 due to two increments.

Level up your interview prep. Join Educative to access 80+ hands-on prep courses.