What is a causal inference?

Overview

Causality is the relationship between a cause and effect. In distributed systems, ordering of events is challenging, as timestamps across nodes cannot be compared. This is where causality is useful in the distributed system.

Causality and the happens-before relationship

Let’s define an event as something happening at one node. This something can be either sending or receiving a message or a local execution step.

An Event A is said to have happened before an Event B if, and only if, the following holds true:

  1. A and B occurred on the same node, and A occurred before B in that node’s local execution order.
  2. A is the sending of some message, and B is the receipt of that same message.
  3. There exists an event C such that A happened before C and C happened before B, as per conditions 1 and 2.

The happens-before relationship is a way to determine causality in a distributed system.

widget

In the figure above, we see:

  • B -> C: The B denotes the sending of message M1, and C denotes the receiving of the same message.
  • C -> D: They are events on the same node.
  • D -> E: The D denotes the sending of message M2 and E denotes the receiving of the same message.

Thus, we have a causal relationship A -> B -> C -> D -> E.

As there is no chain of events that can determine a relationship between A and F, we say that A||F, which means A is concurrent with F.

Conclusion

The event ordering process in the distributed system uses the notion of causality. This forms the base for most distributed system algorithms.

Free Resources