MapReduce: Evaluation
Let's see how well the MapReduce design meets its requirements.
Evaluation
The use cases of our system span a variety of problems, ranging from processing raw data to gaining insights from derived data, but we can still evaluate how well our design fulfills the requirements for general use.
The MapReduce library inherently handles the mechanisms of automatic parallelization, data splitting, load balancing, and fault tolerance. Let's see how our design fulfills the additional non-functional requirements.
Fault tolerance
Since we deal with large datasets, fault tolerance is critical. Faults can happen at any stage and in any component. Let's look at each case in detail.
- Worker failure: Let's consider the scenario of a failed worker. The manager marks any worker that does not respond to its periodic calls as failed. Once the manager declares a worker failed, it reschedules all of that worker's completed and in-progress tasks to other available workers. (If the actual disk or server fails and is unreachable, and the Reduce tasks haven't yet fetched the completed map outputs, the manager also needs to reschedule those completed map tasks. If only the worker process fails, the manager only needs to reschedule the in-progress work.) When the reassigned work completes, the manager notifies all the reducers working on the failed mapper's data and gives them the new mapper's address to fetch it from. The manager also deletes the failed mapper's processed data from local disk to free up space. In case of a reducer failure, the manager reassigns its failed tasks to another reducer and provides the new reducer with all the information required for data fetching. A simplified sketch of this detection-and-rescheduling logic appears after this list.
- Manager failure: Remember that when a MapReduce job starts, one worker takes on the role of the manager. Each user-spawned job has its own manager, so the failure of a manager has only a limited impact (on that specific user's job). We should note, however, that many MapReduce jobs are long-running (for example, processing a crawl of the WWW), and a manager failure can hurt them badly. The MapReduce library does not deal with the manager's failure and leaves it to the end users. Our current implementation stops the MapReduce job once the manager fails. However, we can implement a fail-safe option by making the manager save snapshots of its state periodically and revert to the latest snapshot in case of a failure, thereby reducing the impact of such failures.
- Bad records: As discussed earlier, our design handles bad records by skipping them during re-execution.
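The snippet below is a minimal, illustrative sketch of how a manager might detect failed workers through missed heartbeats and reschedule their tasks, as described in the worker-failure case above. It is not the actual library code; the names (`ManagerState`, `HEARTBEAT_TIMEOUT`, `reschedule`) and the bookkeeping structures are assumptions made for illustration.

```python
import time

HEARTBEAT_TIMEOUT = 10  # assumed: seconds without a response before a worker is declared failed


class ManagerState:
    """Minimal bookkeeping a manager needs to detect and recover from worker failures."""

    def __init__(self):
        self.last_heartbeat = {}  # worker_id -> timestamp of the last successful ping
        self.tasks = {}           # task_id -> {"worker": ..., "phase": "map"|"reduce",
                                  #             "state": "in_progress"|"completed"}

    def record_heartbeat(self, worker_id):
        self.last_heartbeat[worker_id] = time.time()

    def detect_failed_workers(self):
        now = time.time()
        return [w for w, ts in self.last_heartbeat.items()
                if now - ts > HEARTBEAT_TIMEOUT]

    def reschedule(self, failed_worker, idle_workers):
        """Reassign tasks owned by a failed worker.

        Completed map tasks are re-run as well, because their output lived on the
        failed worker's local disk; completed reduce output already sits in the
        global file system and is kept.
        """
        for task in self.tasks.values():
            if task["worker"] != failed_worker:
                continue
            must_rerun = (task["state"] == "in_progress" or
                          (task["phase"] == "map" and task["state"] == "completed"))
            if must_rerun and idle_workers:
                task["worker"] = idle_workers.pop()
                task["state"] = "in_progress"
                # Reducers reading from the failed mapper are notified of the new
                # worker's address once the re-run map task finishes (not shown here).
```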
By handling all these situations, our system ensures fault tolerance. Let’s go through the semantics in case of failures.
Semantics in case of failures
We can further divide these semantics based on the deterministic or non-deterministic nature of the Map and Reduce functions.
Deterministic functions
When the user-defined Map and Reduce functions are deterministic, the output of our distributed implementation is the same as that of a non-faulting sequential execution of the entire program. This provides a strong baseline for users to predict and understand the behavior of their program.
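To make the distinction concrete, here is a small, hypothetical example: a deterministic word-count Map and Reduce pair, followed by a non-deterministic Map function whose output would differ across re-executions after a failure. The function signatures are illustrative, not the library's actual API.

```python
# Deterministic: output depends only on the input key/value pair.
def word_count_map(doc_name, contents):
    for word in contents.split():
        yield (word, 1)

def word_count_reduce(word, counts):
    yield (word, sum(counts))

# Non-deterministic: the result changes from run to run, so re-executing this task
# after a failure could produce output that no sequential execution would produce.
import random
import time

def sampled_map(doc_name, contents):
    for word in contents.split():
        if random.random() < 0.5:       # depends on this run's random choices
            yield (word, time.time())   # depends on when the task happens to run
```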
The program achieves this property by relying on the atomic nature of the commits made by the Map and Reduce tasks. Both kinds of tasks write their output to private temporary files (on local disk and in GFS, respectively) and rename them only once the output is fully generated.
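Before we look at each task type in detail, the sketch below illustrates this write-to-a-temporary-file-then-rename commit pattern on a local filesystem; GFS provides a similar atomic rename for reduce outputs. The helper name and the tab-separated record format are assumptions made for illustration.

```python
import os
import tempfile

def commit_task_output(records, final_path):
    """Write task output to a private temporary file, then atomically rename it.

    The rename happens only after the whole output is generated, so readers either
    see no file at all or a complete one; partial output from a failed or duplicate
    task execution is never observed.
    """
    directory = os.path.dirname(final_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            for key, value in records:
                tmp.write(f"{key}\t{value}\n")
        os.rename(tmp_path, final_path)   # atomic on POSIX filesystems
    except Exception:
        os.remove(tmp_path)               # discard partial output on failure
        raise
```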
- A Map task writes its output to