Introduction

Any system model includes a collection of processes pi,(i=1,2,n)p_{i},(i=1,2, \ldots n) and communication by message passing. An important requirement of the model is to reach a consensus even in the presence of faults. We assume that communication is reliable (cf. this definition :Reliable_communication ), but that process may fail. In this lesson, we only consider crash failures, while the system under consideration in this lesson may also contain Byzantine processes. In the following lesson, we denote the number of faulty processes by ff, and the total number of processes by nn. Thus, we sometimes specify an assumption that up to some number ff of the nn processes are faulty, meaning that the remainder of the processes is correct.

Consensus problem

In order to reach consensus, every process pip_{i} begins in the undecided state and proposes a single value viv_{i}, i=1,2,,ni=1,2, \ldots, n. Then, the processes communicate with each other, exchange values, and engage in a consensus algorithm or consensus protocol. Each process then sets the value of a decision variable did_{i}, entering the decided state in which it may no longer change.

Baliga (2017)Arati Baliga. Understanding Blockchain Consensus Models, 2017. Available online at https://www.persistent.com/wp-content/uploads/2018/02/wp-understandingblockchain-consensus-models.pdf. gives the following definition of the properties of a consensus protocol:

Properties of a consensus protocol

A consensus protocol has the following properties which determine its applicability and efficiency.

  1. Safety: A consensus protocol is determined to be safe if all processes produce the same output and the outputs produced by the processes are valid according to the rules of the protocol. This is also referred to as the consistency of the shared state.
  2. Liveness: A consensus protocol guarantees liveness if all non-faulty processes participating in consensus eventually produce a value.
  3. Fault tolerance: A consensus protocol provides fault tolerance if it can recover from the failure of a process participating in consensus.

Note: The CAP theorem (cf. this lesson) states that it’s impossible to achieve both safety and liveness in an unreliable system.

As we will see in the following, the FLP impossibility result states that although all three properties are crucial for a consensus protocol, it is impossible to construct deterministic consensus protocols which can ensure safety, liveness, and fault tolerance in an asynchronous system. This means that any distributed consensus system, especially on the Internet, cannot have all three features, thus it must sacrifice one of them. According to Baliga, the following statement holds:

“While fault tolerance is crucial for globally distributed networks to operate, distributed systems tend to choose between safety and liveness depending on their system requirements and assumptions”.

Note that this statement corresponds to our the statement where we determined that the only two reasonable approaches when dealing with unreliable networks are to sacrifice availability or to sacrifice consistency. The problem of processes being able to reach an agreement on a value has been formalized as the agreement or consensus problem and is often formally defined through properties like the following (Tushar Deepak Chandra et al. (1996)Tushar Deepak Chandra and Sam Toueg. Unreliable failure detectors for reliable distributed systems. J. ACM, 43(2):225-67, March 1996., George Coulouris et al. (2013)George Coulouris, Jean Dollimore, Tim Kindberg, and Gordon Blair. Distributed Systems. International computer science series. Boston: Addison-Wesley, 2013. Pearson Education Limited., Aljosha Judmayer et al. (2017)Aljosha Judmayer, Nicholas Stifter, Katharina Krombholz, Edgar Weippl, Elisa Bertino, and Ravi Sandhu. Blocks and Chains: Introduction to Bitcoin, Cryptocurrencies, and Their Consensus Mechanisms. Synthesis Lectures on Information Security, Privacy, and Trust. San Rafael, CA, 2017. Morgan & Claypool., Ajay D. Kshemkalyani et al. (2011)Ajay D. Kshemkalyani and Mukesh Singhal. Distributed Computing: Principles, Algorithms, and Systems. Cambridge, 2011. Cambridge University Press. and Nancy A. Lynch (1996)Nancy A. Lynch. Distributed Algorithms. The Morgan Kaufmann Series in Data Management Systems. San Francisco, California, 1996. Elsevier Science.):

Get hands-on with 1400+ tech skills courses.