...

/

Failure Detection Strategies

Failure Detection Strategies

Learn about different failure detection strategies in a distributed environment.

Introduction

In a distributed environment with multiple replicas, failure detection of replicas is a critical component. A failure detector is a component that determines the liveness of a particular node. In other words, it detects whether the node is alive or if it has crashed. Unfortunately, detecting failures in a distributed environment is difficult, as it is impossible to determine whether the replica has crashed or is responding slowly due to network congestion or load.

Strategies

There are multiple strategies to detect failures in replicas:

  • Ping

  • Heartbeat

  • Phi Accrual failure detection

Ping

Ping is a ...