Running a PySpark Project using SPA


In this lesson, we will set up a Docker job for PySpark in the Single Page Application (SPA) widget.

Docker job for the Single Page Application widget

Let’s see what each field in the above job means:

Select Docker job type

This is the Docker job type selection, where we choose what kind of Docker job we are creating.

Value: Live

Job name

This is the job's name, used for reference. You can use any name you like for this job.

Value: PySparkSPA

Input file name

The name of the input file to run in the Live widget.

Value: main.py

Run script

This script runs every time we execute the code in the Live widget. It is mandatory.

Value: echo "Hello World"

Application port

The port on which the application will run.

Value: 8080

Start script

This script runs when the Live widget is executed for the very first time. It is mandatory.

Value: cd usercode && python3 main.py

Select Docker job

After creating the Docker job for the SPA widget, select it as shown below.

Let’s run PySpark code in the SPA widget

from pyspark.sql import SparkSession
from dotenv import load_dotenv


def create_spark_session():
    """Create a Spark session."""
    # Load environment variables from a .env file, if one is present
    _ = load_dotenv()
    return (
        SparkSession
        .builder
        .appName("helloworld")  # application name shown in the Spark UI
        .master("local[5]")  # run locally with 5 worker threads
        .getOrCreate()
    )


spark = create_spark_session()
print('Session Started')
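Once the session starts, we can sanity-check it by building a tiny DataFrame. This is a minimal sketch, not part of the lesson's main.py; the sample rows and column names are illustrative only.

# Build a small DataFrame from in-memory data and display it
data = [("Alice", 34), ("Bob", 45)]
df = spark.createDataFrame(data, ["name", "age"])
df.show()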

Note: We used the SPA widget because PySpark sometimes needs extra resources, and resources can be increased only in the SPA widget.
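To see how the master("local[5]") setting maps to compute inside the widget, we can query the session's default parallelism. A minimal sketch, assuming the spark session created above is still active:

# local[5] runs Spark with 5 worker threads, so the default
# parallelism should report 5 in this configuration
print(spark.sparkContext.defaultParallelism)

# Stop the session when finished to release resources
spark.stop()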
