Architecture
This lesson describes the architecture of Spark.
Architecture
Spark is a distributed, parallel data-processing framework that bears many similarities to the traditional MapReduce framework. Like MapReduce, Spark uses a master-slave architecture, in which one process, the master, coordinates and distributes work among the slave processes. In Spark, the master and slave processes are formally called, respectively:
- Driver
- Executor
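To make the coordination pattern concrete, here is a minimal toy sketch in plain Python. It does not use Spark or any Spark API: the function names `driver` and `executor_task`, the partitioning scheme, and the use of threads as stand-ins for separate executor processes are all assumptions made purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def executor_task(partition):
    """Hypothetical 'executor' work: sum the numbers in one partition."""
    return sum(partition)


def driver(data, num_executors=3):
    """Hypothetical 'driver': split the data, hand out work, combine results."""
    # Split the input into one partition per worker (round-robin slicing).
    partitions = [data[i::num_executors] for i in range(num_executors)]

    # Distribute the partitions among the workers and collect partial results.
    # Threads stand in for executor processes here; this is an analogy only.
    with ThreadPoolExecutor(max_workers=num_executors) as pool:
        partial_sums = list(pool.map(executor_task, partitions))

    # Combine the partial results into the final answer.
    return sum(partial_sums)


if __name__ == "__main__":
    print(driver(list(range(1, 101))))  # prints 5050
```

The point of the sketch is the division of roles: one coordinating function decides how the work is split and gathers the results, while the workers only process the piece they are given.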
Driver
The driver is the master process that manages the execution of a Spark job. It is responsible for maintaining the overall state of the Spark application, responding to a user’s program or input, and analyzing, distributing, and scheduling work ...