A PySpark Primer

An overview of PySpark.

What is PySpark?

PySpark is the Python API for Apache Spark, and it is well suited to both exploratory analysis and building machine learning pipelines. The core data type in PySpark is the Spark dataframe, which is similar to a pandas dataframe but is designed to execute in ...