Introduction to Spanner

Learn why we need a strongly consistent, distributed database whose replicas can be anywhere on Earth.

A history of distributed databases

It has long been a system designer’s dream to build a globally distributed database with all the good features of a traditional relational database, such as strong consistency, complex transactions, and consistent snapshots. However, achieving these features with good performance and high availability proved hard. In pursuit of that dream, many different kinds of NoSQL databases emerged.

Google’s Spanner system was a significant leap toward realizing this dream. Spanner keeps clock skew under control and relies on high-quality network infrastructure to provide a globally distributed database with strongly consistent reads and writes. We will study this innovation in detail in this chapter.

Motivation

NoSQL databases are widely used for their flexible, evolving data models, scalability, and high performance. In prioritizing these properties along with availability, however, they often trade off strong data consistency. This trade-off is a consequence of the CAP theorem: in the presence of a network partition, a distributed system cannot be both consistent and available, and must choose one or the other.

The PACELC theorem extends the CAP theorem. It states that in the case of a network partition (P), a distributed system has to choose between availability (A) and consistency (C), as per the CAP theorem; else (E), even when the system is running normally in the absence of partitions, it has to choose between latency (L) and consistency (C). [source: Wikipedia]

For example, a NoSQL database might accept the same record a second time without raising an error, whereas a relational database with a primary key or unique constraint rejects the duplicate row through its integrity checks.
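The contrast in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SQLite (through the standard `sqlite3` module) stands in for a relational database, and a plain Python list named `document_store` is a hypothetical stand-in for a schemaless collection with no uniqueness constraint. It is not the API of any real NoSQL product, and many real document stores do enforce uniqueness on their key field.

```python
import sqlite3

# Relational side: the PRIMARY KEY constraint is an integrity check
# that rejects a second row with the same key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ada')")

try:
    conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ada')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    # SQLite raises IntegrityError when a constraint is violated.
    duplicate_rejected = True

relational_rows = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

# Schemaless side (hypothetical stand-in, not a real database):
# nothing checks whether the same record was already stored.
document_store = []
record = {"id": 1, "name": "Ada"}
document_store.append(dict(record))
document_store.append(dict(record))  # accepted silently

print("relational duplicate rejected:", duplicate_rejected)
print("relational row count:", relational_rows)
print("document count:", len(document_store))
```

In the relational case the constraint lives in the database, so every writer is protected by it. In the toy schemaless case, avoiding duplicates would be the application's responsibility.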

Most NoSQL solutions ...
