High Availability

High availability is a desirable characteristic of a distributed system. It is defined as the ability of a system or system component to remain continuously operational for a long period of time. For example, Amazon’s ubiquitous S3 storage is designed for 99.99% availability over a given year.
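To make an availability figure concrete, it helps to convert the percentage into a yearly downtime budget. The snippet below is plain arithmetic rather than anything specific to S3 or HDFS; the function name `allowed_downtime_minutes` is made up for this illustration, and a 365-day year is assumed.

```python
# Illustrative arithmetic only: converts an availability percentage into
# the maximum downtime it permits over one (non-leap) year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the downtime budget, in minutes per year, for a given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99):
        print(f"{pct}% -> {allowed_downtime_minutes(pct):.1f} minutes/year")
```

Under these assumptions, 99.99% availability leaves a budget of roughly 52.6 minutes of downtime per year.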

To achieve high availability for HDFS, we need more than one instance of the Namenode, so that failures and software/hardware upgrades do not cause downtime. In an HA setup, one Namenode serves client queries and is known as the active Namenode. The rest are known as standby Namenodes. If the active Namenode experiences a failure, a standby Namenode takes over.
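The active/standby arrangement can be sketched as a toy model. This is not Hadoop's API: the classes `Namenode` and `HaCluster`, the `healthy` flag, and the rule "promote the first healthy standby" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Namenode:
    """Hypothetical stand-in for a Namenode process."""
    name: str
    state: str = "standby"  # either "active" or "standby"
    healthy: bool = True


class HaCluster:
    """Toy cluster: exactly one Namenode is active, the rest are standbys."""

    def __init__(self, namenodes):
        self.namenodes = list(namenodes)
        self.namenodes[0].state = "active"

    def active(self):
        """Return the healthy active Namenode, or None if there is none."""
        for nn in self.namenodes:
            if nn.state == "active" and nn.healthy:
                return nn
        return None

    def fail_over(self):
        """Demote a failed active and promote the first healthy standby."""
        for nn in self.namenodes:
            if nn.state == "active" and not nn.healthy:
                nn.state = "standby"
        if self.active() is None:
            for nn in self.namenodes:
                if nn.healthy:
                    nn.state = "active"
                    break
        return self.active()

    def serve(self, query):
        """Route a client query to the active Namenode, failing over if needed."""
        nn = self.active() or self.fail_over()
        if nn is None:
            raise RuntimeError("no healthy Namenode available")
        return f"{nn.name} served {query}"


if __name__ == "__main__":
    cluster = HaCluster([Namenode("nn1"), Namenode("nn2")])
    print(cluster.serve("ls /"))
    cluster.namenodes[0].healthy = False  # simulate a failure of the active
    print(cluster.serve("ls /"))
```

The sketch deliberately ignores how a failure is detected and how the standby gets its state; it only shows that clients keep being served as long as one healthy Namenode remains.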

Working

Imagine a cluster with two Namenodes. For the standby Namenode to take over successfully in case of failure, it must replay exactly the changes the active Namenode makes to its namespace state. This is achieved by deploying JournalNodes. Like a journal, the JournalNodes keep a record of every change the active Namenode makes to its namespace, and the standby reads this record and applies the same changes to its own namespace. We need more than one JournalNode because JournalNodes themselves are prone to failure. Because this is a distributed system, a change counts as recorded once a majority of the JournalNodes have written it.
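The majority rule can be illustrated with a small in-memory simulation. It is a sketch under stated assumptions, not the HDFS implementation: `JournalNode`, `QuorumJournal`, `NamenodeReplica`, the `("mkdir", path)` / `("delete", path)` edit format, and the namespace modelled as a set of paths are all hypothetical.

```python
class JournalNode:
    """Hypothetical stand-in for a JournalNode: a log of edits keyed by txid."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.edits = {}  # txid -> (action, path)

    def append(self, txid, edit):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.edits[txid] = edit


class QuorumJournal:
    """Accepts an edit only if a majority of JournalNodes record it."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.majority = len(self.nodes) // 2 + 1
        self.next_txid = 1
        self.committed = set()  # txids acknowledged by a majority

    def log_edit(self, edit):
        txid = self.next_txid
        self.next_txid += 1
        acks = 0
        for node in self.nodes:
            try:
                node.append(txid, edit)
                acks += 1
            except ConnectionError:
                pass  # tolerate individual JournalNode failures
        if acks < self.majority:
            raise RuntimeError(
                f"edit {txid} acknowledged by {acks}/{len(self.nodes)} JournalNodes"
            )
        self.committed.add(txid)
        return txid

    def read_edits(self, after_txid=0):
        """Return committed (txid, edit) pairs newer than after_txid, in order."""
        found = {}
        for node in self.nodes:
            if node.up:
                for txid, edit in node.edits.items():
                    if txid > after_txid and txid in self.committed:
                        found[txid] = edit
        return sorted(found.items())


def apply_edit(namespace, edit):
    """Apply one edit to a namespace modelled as a set of paths."""
    action, path = edit
    if action == "mkdir":
        namespace.add(path)
    elif action == "delete":
        namespace.discard(path)


class NamenodeReplica:
    def __init__(self):
        self.namespace = set()
        self.last_applied = 0

    def write(self, journal, edit):
        """Active path: journal the edit first, then apply it locally."""
        self.last_applied = journal.log_edit(edit)
        apply_edit(self.namespace, edit)

    def catch_up(self, journal):
        """Standby path: replay journaled edits not yet applied."""
        for txid, edit in journal.read_edits(self.last_applied):
            apply_edit(self.namespace, edit)
            self.last_applied = txid
```

With three JournalNodes the majority is two, so in this model the active can keep journaling while one JournalNode is down, and a standby that calls `catch_up` ends with the same namespace as the active. If two of the three are down, `log_edit` raises instead of accepting an edit that only a minority has recorded.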

How do we define the ...
