Detailed Design of Spark
Let's learn how Spark utilizes its driver and workers.
Spark can read input data from any Hadoop-supported storage source, such as HDFS or HBase, as well as the local file system.
Cluster manager
A Spark cluster can run multiple applications at once. When a user starts an application while others are already running on the same cluster of machines, each application needs resources allocated for its tasks. This is where the cluster manager comes in: the driver uses the cluster manager (an external service) to acquire resources on a cluster of machines for the application. The cluster manager also monitors the cluster, detecting failed workers and replacing them with new ones, which greatly reduces the fault-handling complexity that would otherwise have to be built into Spark itself.
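As a sketch of how an application is pointed at a cluster manager: the master URL passed to `spark-submit` selects which manager the driver talks to. The host names and ports below are placeholders, and `app.py` is a hypothetical application script.

```shell
# Spark standalone cluster manager (placeholder host/port)
spark-submit --master spark://master-host:7077 app.py

# Apache Mesos (placeholder host/port)
spark-submit --master mesos://master-host:5050 app.py

# Hadoop YARN (master location is read from the Hadoop configuration)
spark-submit --master yarn app.py
```

In each case the driver registers with the named cluster manager, which then grants it executors on the worker machines.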
The cluster managers that Spark can use include Apache Mesos, Hadoop YARN, and Spark’s own standalone cluster manager. The one allocation option available on all of them is static partitioning of resources: each application is given a fixed maximum amount of resources and holds on to them for the duration of its execution. Under static partitioning, the following resource allocations can be controlled:
The number of executors an application gets
The number of cores per executor
The executor memory
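The three allocations above map to `spark-submit` flags. This is a sketch with placeholder values; note that `--num-executors` applies on YARN and Kubernetes, while the standalone manager caps total cores via `--total-executor-cores` instead.

```shell
# Request 4 executors, each with 2 cores and 4 GiB of memory (placeholder values).
spark-submit \
  --master yarn \
  --num-executors 4 \
  --executor-cores 2 \
  --executor-memory 4g \
  app.py
```

With static partitioning, these resources are held by the application from launch until it finishes, even if some executors sit idle.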