Spark SQL Engine

Get an introduction to the Spark SQL engine and its two subcomponents, the Catalyst optimizer and Project Tungsten.

Overview

Spark SQL allows developers to programmatically issue ANSI SQL:2003–compatible queries on structured data with a schema. Spark SQL first shipped as an alpha component in Spark 1.0 and graduated from alpha in version 1.3, alongside the DataFrame API. Since then, several higher-level functionalities have been built upon it. Among other things, the engine:

  • Generates optimized query plans and compact JVM code for final execution.

  • Serves as a bridge to external tools using database ODBC/JDBC connectors.

  • Reads and writes structured files in formats such as JSON, CSV, and Avro, and converts the data into temporary tables.

  • Connects to the Apache Hive metastore and tables. ...