Spark's Java Main Abstraction: The DataFrame

Get introduced to Spark's main abstraction in this lesson.

What is a DataFrame?

A DataFrame is both a logical container of data and an API, purpose-built as a higher-level abstraction over RDDs, the older Spark abstraction (JavaRDDs in the case of the Java API).

In the Spark context, “logical container” means a placeholder for data that Spark loads and distributes across an actual physical cluster, where the worker nodes process it.
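To make the "logical container" idea more concrete, here is a minimal sketch in plain Java. It is a toy model, not Spark's API: the names `LogicalContainerSketch`, `ToyFrame`, `distribute`, and `sum` are hypothetical and exist only for this illustration, and a parallel stream stands in for the worker nodes. It shows a single handle over data that is split into partitions, with each partition processed independently before the partial results are combined.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: this is NOT Spark's API. All names here are hypothetical.
final class LogicalContainerSketch {

    // The "logical container": one handle over rows that are physically
    // split into several partitions.
    static final class ToyFrame {
        private final List<List<Integer>> partitions;

        private ToyFrame(List<List<Integer>> partitions) {
            this.partitions = partitions;
        }

        // "Load and distribute": deal the rows round-robin into partitions.
        static ToyFrame distribute(List<Integer> rows, int numPartitions) {
            List<List<Integer>> parts = new ArrayList<>();
            for (int i = 0; i < numPartitions; i++) {
                parts.add(new ArrayList<>());
            }
            for (int i = 0; i < rows.size(); i++) {
                parts.get(i % numPartitions).add(rows.get(i));
            }
            return new ToyFrame(parts);
        }

        int numPartitions() {
            return partitions.size();
        }

        // Each partition is summed on its own (the parallel stream plays the
        // role of the worker nodes), then the partial sums are combined.
        long sum() {
            return partitions.parallelStream()
                    .mapToLong(p -> p.stream().mapToLong(Integer::longValue).sum())
                    .sum();
        }
    }

    public static void main(String[] args) {
        ToyFrame frame = ToyFrame.distribute(Arrays.asList(1, 2, 3, 4, 5, 6, 7), 3);
        System.out.println(frame.numPartitions() + " partitions, sum = " + frame.sum());
    }
}
```

The caller only ever works with the `ToyFrame` handle and never touches an individual partition, which is the sense in which the container is logical rather than physical. In real Spark Java code, the corresponding handle is a `Dataset<Row>` obtained from a `SparkSession`.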

The ...
