High Availability
This lesson explains how high availability is implemented for HDFS.
High Availability
High availability is a characteristic of a distributed system. It is defined as the ability of a system or system component to remain continuously operational for a long period of time. For example, Amazon’s ubiquitous S3 storage boasts 99.99% availability over a given year.
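To make a figure like 99.99% concrete, it can be converted into a yearly downtime budget. The short sketch below does that arithmetic. The function name `allowed_downtime_minutes` is illustrative, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year


def allowed_downtime_minutes(availability: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    print(f"{allowed_downtime_minutes(0.9999):.1f} minutes per year")
```

At 99.99% availability, the budget works out to a little under an hour of downtime per year.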
To achieve high availability for HDFS, we need more than one instance of the Namenode to avoid downtime during failures and software/hardware upgrades. In an HA setup, one Namenode serves client queries and is known as the active Namenode. The rest are known as standby Namenodes. If the active Namenode experiences a failure, a standby Namenode takes over.
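The active/standby arrangement can be pictured with a toy model. This is only a sketch of the idea, not how HDFS implements failover: the class names `Namenode` and `HACluster` and the `healthy` flag are hypothetical, and failure detection is reduced to checking that flag.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Namenode:
    name: str
    healthy: bool = True


@dataclass
class HACluster:
    """Toy model: exactly one active Namenode, the rest on standby."""
    active: Namenode
    standbys: List[Namenode] = field(default_factory=list)

    def serve(self, query: str) -> str:
        # Only the active Namenode answers clients; fail over first if it is down.
        if not self.active.healthy:
            self.fail_over()
        return f"{self.active.name} handled {query!r}"

    def fail_over(self) -> None:
        # Promote the first healthy standby. A real cluster relies on a failover
        # controller and fencing, which this sketch leaves out.
        for i, candidate in enumerate(self.standbys):
            if candidate.healthy:
                failed = self.active
                self.active = self.standbys.pop(i)
                self.standbys.append(failed)
                return
        raise RuntimeError("no healthy standby Namenode available")


if __name__ == "__main__":
    nn1, nn2 = Namenode("nn1"), Namenode("nn2")
    cluster = HACluster(active=nn1, standbys=[nn2])
    print(cluster.serve("ls /"))
    nn1.healthy = False  # simulate a failure of the active Namenode
    print(cluster.serve("ls /"))
```

From the client's point of view, the query is still answered after the failure; only the Namenode that answers it changes.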
Working
Imagine a cluster with two Namenodes. For the standby Namenode to take over successfully in case of failure, it must mirror exactly the changes the active Namenode makes to its namespace state. This is done by deploying JournalNodes. Like a journal, the JournalNodes keep a record of every change the active Namenode makes to its namespace, and the standby Namenode reads those records and applies them to its own namespace. Each change is recorded on a majority of the JournalNodes rather than on a single one, because JournalNodes themselves are prone to failure. As long as a majority remains reachable, the record survives the loss of individual JournalNodes.
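The majority rule can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the actual HDFS journal protocol: the names `JournalNode`, `log_edit`, and `replay` are hypothetical, edits are plain strings keyed by a transaction id starting at 1, and there is a single writer.

```python
from typing import Dict, List


class JournalNode:
    """Toy JournalNode that stores edits keyed by transaction id."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.up = True
        self.edits: Dict[int, str] = {}

    def record(self, txid: int, edit: str) -> bool:
        if not self.up:
            return False  # an unreachable node cannot acknowledge the write
        self.edits[txid] = edit
        return True


def majority(journal_nodes: List[JournalNode]) -> int:
    return len(journal_nodes) // 2 + 1


def log_edit(journal_nodes: List[JournalNode], txid: int, edit: str) -> bool:
    """Active side: the edit counts as logged only if a majority acknowledge it."""
    acks = sum(1 for jn in journal_nodes if jn.record(txid, edit))
    return acks >= majority(journal_nodes)


def replay(journal_nodes: List[JournalNode]) -> List[str]:
    """Standby side: collect, in order, the edits held by a majority of nodes."""
    applied: List[str] = []
    txid = 1
    while True:
        copies = [jn.edits[txid] for jn in journal_nodes
                  if jn.up and txid in jn.edits]
        if len(copies) < majority(journal_nodes):
            return applied
        applied.append(copies[0])
        txid += 1


if __name__ == "__main__":
    jns = [JournalNode(f"jn{i}") for i in range(3)]
    log_edit(jns, 1, "mkdir /data")
    jns[2].up = False  # one JournalNode fails; a majority is still available
    log_edit(jns, 2, "create /data/a.txt")
    print(replay(jns))
```

With three JournalNodes, a majority is two, so the sketch keeps accepting edits after one JournalNode fails and rejects them once two are down.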
How do we define the ...