Spanner Operations
Let's study the operations supported by Spanner.
Spanner supports the following types of operations:
- Read-write transactions
- Read-only transactions
- Standalone (strong or stale) reads
Read-write transaction
A read-write transaction can contain read operations, write operations, or both. It provides full ACID properties for the operations of the transaction. More specifically, read-write transactions are not just serializable but strictly serializable.
Note: The Spanner documentation also refers to strict serializability as "external consistency"; the two terms describe essentially the same guarantee.
A read-write transaction executes a set of read and write operations atomically at a single logical point in time.
Note: As explained earlier, Spanner achieves these properties using two-phase locking for isolation and two-phase commit for atomicity across multiple splits.
Workflow
A read-write transaction proceeds through the following sequence:
- After opening a transaction, a client directs all the read operations to the leader of the replica group that manages the split with the required rows. This leader acquires read locks for the rows and columns involved before serving the read request. Every read also returns the timestamp of any data read.
- Any write operations are buffered locally in the client until the point the transaction is committed. While the transaction is open, the client sends keepalive messages to prevent participant leaders from timing out a transaction.
- When a client has completed all reads and buffered all writes, it starts the two-phase commit protocol. It chooses one of the participant leaders as the coordinator leader and sends a prepare request to all the participant leaders along with the identity of the coordinator leader. The participant leaders involved in write operations also receive the buffered writes at this stage.
  Note: The two-phase commit is required only if the transaction accesses data from multiple replica groups. Otherwise, the leader of the single replica group can commit the transaction only through Paxos.
- Every participant leader acquires the necessary write locks, chooses a prepare timestamp s_i that is larger than any timestamps of previous transactions, and logs a prepare record in its replica group through Paxos. The leader also replicates the lock acquisition to the replicas to ensure they will be held even in the case of a leader failure. It then responds to the coordinator leader with the prepare timestamp.
The following illustration contains a visualization of this sequence:
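The steps above can also be sketched in code. The following is a minimal, single-process sketch, not Spanner's actual implementation: `ParticipantLeader`, `Transaction`, and every method name are hypothetical, a plain list stands in for the Paxos-replicated log, timestamps are simple integers, and a lock conflict raises an error instead of blocking.

```python
from dataclasses import dataclass, field


@dataclass
class ParticipantLeader:
    """Hypothetical stand-in for the leader of one replica group."""
    name: str
    data: dict = field(default_factory=dict)         # row key -> (value, timestamp)
    read_locks: dict = field(default_factory=dict)   # row key -> set of transaction ids
    write_locks: dict = field(default_factory=dict)  # row key -> transaction id
    paxos_log: list = field(default_factory=list)    # stands in for the replicated log
    last_timestamp: int = 0                          # largest timestamp used so far

    def read(self, txn_id, key):
        # Step 1: acquire a read lock, then return the value with its timestamp.
        if self.write_locks.get(key) not in (None, txn_id):
            raise RuntimeError(f"{key} is write-locked by another transaction")
        self.read_locks.setdefault(key, set()).add(txn_id)
        return self.data.get(key, (None, 0))

    def prepare(self, txn_id, writes, coordinator):
        # Step 4: acquire write locks, choose a prepare timestamp larger than
        # any earlier one, and log a prepare record.
        for key in writes:
            holder = self.write_locks.get(key)
            readers = self.read_locks.get(key, set()) - {txn_id}
            if holder not in (None, txn_id) or readers:
                raise RuntimeError(f"{key} is locked by another transaction")
            self.write_locks[key] = txn_id
        self.last_timestamp += 1
        self.paxos_log.append(
            ("prepare", txn_id, self.last_timestamp, coordinator, dict(writes)))
        return self.last_timestamp


class Transaction:
    """Hypothetical client-side view of a read-write transaction."""

    def __init__(self, txn_id, leaders):
        self.txn_id = txn_id
        self.leaders = leaders     # split name -> ParticipantLeader
        self.participants = set()  # splits touched by reads or writes
        self.buffered = {}         # split name -> {key: value}

    def read(self, split, key):
        self.participants.add(split)
        return self.leaders[split].read(self.txn_id, key)

    def write(self, split, key, value):
        # Step 2: writes stay in the client until commit time.
        self.participants.add(split)
        self.buffered.setdefault(split, {})[key] = value

    def prepare_all(self):
        # Step 3: pick a coordinator (arbitrarily, the smallest split name)
        # and send a prepare request, with buffered writes, to each participant.
        coordinator = min(self.participants)
        return {
            split: self.leaders[split].prepare(
                self.txn_id, self.buffered.get(split, {}), coordinator)
            for split in sorted(self.participants)
        }


leaders = {
    "A": ParticipantLeader("A", data={"x": (10, 3)}, last_timestamp=3),
    "B": ParticipantLeader("B", last_timestamp=7),
}
txn = Transaction("t1", leaders)
value, read_ts = txn.read("A", "x")   # read goes to the leader of split A
txn.write("B", "y", value + 1)        # buffered locally, not yet sent to B
prepare_timestamps = txn.prepare_all()
```

In this sketch each leader returns a prepare timestamp one larger than the last timestamp it used, and the buffered write reaches split B only when the prepare request is sent.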
Spanner mitigating availability problems
It is worth noting that the availability problems of the two-phase commit are partially mitigated in this scheme because each participant and the coordinator is a Paxos group rather than a single node. So, if one of the leader nodes crashes, another replica from that replica group will eventually detect the failure, take over, and help the protocol make progress.
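To make the takeover concrete, here is a minimal sketch of a replica group whose prepare record and locks survive a leader crash. It rests on several assumptions: `ReplicaGroup` and its methods are hypothetical names, a shared list stands in for the Paxos-replicated log, failure detection is reduced to an explicit `crash` call, and the first surviving replica becomes the new leader as long as a majority of the group is still alive.

```python
class ReplicaGroup:
    """Hypothetical replica group; a shared list stands in for the Paxos log."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.alive = set(replicas)
        self.leader = self.replicas[0]
        self.replicated_log = []  # state that any surviving replica can recover

    def log_prepare(self, txn_id, timestamp, locked_keys):
        # The leader replicates the prepare record together with its locks.
        self.replicated_log.append(
            ("prepare", txn_id, timestamp, tuple(locked_keys)))

    def crash(self, replica):
        self.alive.discard(replica)
        if replica == self.leader:
            self.elect_leader()

    def elect_leader(self):
        # Assumption: the first surviving replica takes over, provided a
        # majority of the group is still alive. Real leader election is
        # more involved.
        survivors = [r for r in self.replicas if r in self.alive]
        if len(survivors) * 2 <= len(self.replicas):
            raise RuntimeError("no majority left in the replica group")
        self.leader = survivors[0]

    def recover_locks(self):
        # The new leader rebuilds the held write locks from the replicated log.
        return {
            key: txn_id
            for kind, txn_id, _, keys in self.replicated_log
            if kind == "prepare"
            for key in keys
        }


group = ReplicaGroup(["r1", "r2", "r3"])
group.log_prepare("t1", 8, ["y"])  # r1 is the leader when the prepare is logged
group.crash("r1")                  # the leader fails after preparing
new_leader = group.leader
locks = group.recover_locks()
```

Because the prepare record and the lock acquisition were replicated before the crash, the new leader can still see that transaction `t1` holds the lock on `y` and can continue the protocol from that point.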
Spanner handling deadlocks
The two-phase locking protocol can result in deadlocks. Spanner resolves these situations via a