Achieving Serializability
Let's learn how to achieve serializability. Look at the mechanisms for concurrency control and the cases where each performs well.
We know that there are some potential anomalies from concurrent transactions that are not properly isolated.
There are different isolation levels that prevent these anomalies. Stronger isolation levels prevent more anomalies at the cost of performance.
We also provided some examples to illustrate the real-life consequences of these anomalies.
As the previous lesson, Consistency and Isolation, explained, the system should be strictly serializable to be completely protected against any of these anomalies.
Strictly serializable is the strongest isolation level. Then comes serializability, which we will see in this lesson.
A system that provides serializability guarantees that the result of any allowed execution of transactions is the same as that produced by some serial execution of the same transactions(hence its name).
In the context of isolation, the execution of multiple transactions that correspond to an ordering of the associated operations is also called a schedule.
You might ask: what does the same mean in the previous sentence? Let’s explain.
Types of serializability
There are two major types of serializability that establish two different notions of similarity.
View serializability
A schedule is a view equivalent to a serial schedule with the same transactions when all the operations from transactions in the two schedules read and write the same data values (“view” the same data).
Conflict serializability
A schedule is a conflict equivalent to a serial schedule with the same transactions when every pair of conflicting operations between transactions is ordered in the same way in both schedules.
It turns out that calculating whether a schedule is view serializable is a very computationally hard problem. More specifically, it is an NP-complete problem. This means that the time required to solve the problem using any currently known algorithm increases rapidly as the size of the problem grows. So, we will not analyze view serializability further.
However, determining whether a schedule is conflict serializable is a much easier problem to solve, which is one of the reasons conflict serializability is widely used.
Determining whether a schedule is a conflict serializable
To understand conflict serializability, we first have to define what it means for two operations to be conflicting.
Two operations are conflicting (or conflict) if:
- They belong to different transactions
- They are on the same data item, and at least one of them is a write operation, where a write operation inserts, modifies or deletes an object
As a result, we can have three different forms of conflicts:
- A read-write conflict
- A write-read conflict
- A write-write conflict
Trivial way
A trivial way to check if a schedule is a conflict serializable is to calculate all possible serial schedules, identify conflicting operations in them, and check if their order is the same as in the schedule under examination.
This is computationally heavy, as it requires us to compute all the possible permutations of transactions.
A more practical way of determining whether a schedule is conflict serializable is through a precedence graph.
Precedence graph
A precedence graph is a directed graph, where the:
- Nodes represent transactions in a schedule
- Edges represent conflicts between operations
The edges in the graph denote the order in which transactions must execute in the corresponding serial schedule.
As a result, a schedule is a conflict serializable if and only if its precedence graph of committed transactions is acyclic.
Let’s look at an example to get a better idea about this rule.
Example
The following illustration contains a schedule of three transactions , ...