High Availability

This lesson explains how high availability is implemented for HDFS.

High Availability

High availability is a characteristic of a distributed system. It is defined as the ability of a system or system component to remain continuously operational for a long period of time. For example, Amazon’s ubiquitous S3 storage service is designed for 99.99% availability over a given year.
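To make a figure like 99.99% concrete, it helps to convert it into the downtime it permits. The following is a minimal sketch of that arithmetic; the function name `allowed_downtime_minutes` is a hypothetical helper chosen for illustration, and a 365-day year is assumed.

```python
def allowed_downtime_minutes(availability_percent, days=365):
    """Return the downtime (in minutes) permitted by an availability target.

    Assumes the target is measured over `days` days of wall-clock time.
    """
    total_minutes = days * 24 * 60            # minutes in the period
    unavailable_fraction = 1 - availability_percent / 100
    return total_minutes * unavailable_fraction


if __name__ == "__main__":
    # 99.99% availability over a 365-day year
    print(round(allowed_downtime_minutes(99.99), 2))  # about 52.56 minutes
```

In other words, a 99.99% target leaves room for a little under an hour of unavailability per year.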

To achieve high availability for HDFS, we need more than one instance of the Namenode, so that failures and software or hardware upgrades do not cause downtime. In an HA setup, one Namenode serves client queries and is known as the active Namenode. The rest are known as standby Namenodes. If the active Namenode experiences a failure, a standby Namenode takes over.
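The active/standby arrangement described above can be illustrated with a small toy model. This is only a sketch of the idea, not how HDFS implements failover: the classes `Namenode` and `HACluster`, the `healthy` flag, and the method names are all hypothetical, and real concerns such as state synchronization and failure detection are left out.

```python
class Namenode:
    """Toy stand-in for a Namenode (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.state = "standby"


class HACluster:
    """Keeps exactly one active Namenode; the rest are standbys."""

    def __init__(self, namenodes):
        self.namenodes = list(namenodes)
        self.namenodes[0].state = "active"   # first node starts as active

    def active(self):
        for nn in self.namenodes:
            if nn.state == "active":
                return nn
        return None

    def failover_if_needed(self):
        current = self.active()
        if current is not None and current.healthy:
            return current                   # nothing to do
        if current is not None:
            current.state = "failed"         # demote the failed active node
        for nn in self.namenodes:
            if nn.state == "standby" and nn.healthy:
                nn.state = "active"          # promote the first healthy standby
                return nn
        raise RuntimeError("no healthy standby Namenode available")

    def handle_request(self, request):
        nn = self.failover_if_needed()
        return f"{nn.name} served {request}"


if __name__ == "__main__":
    cluster = HACluster([Namenode("nn1"), Namenode("nn2")])
    print(cluster.handle_request("ls /"))    # served by nn1
    cluster.namenodes[0].healthy = False     # simulate a failure of the active node
    print(cluster.handle_request("ls /"))    # served by nn2 after failover
```

From the client's point of view, requests keep being served even though the node answering them has changed.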

Working

Imagine a cluster with two Namenodes. In order for the standby Namenode to successfully take over in case of failure, it exactly ...
