Union, UnionByName, and DropDuplicates
Get introduced to Union, UnionByName, and DropDuplicates transformations in this lesson.
Union
The union transformation allows us to combine two DataFrames, thus producing a new one containing the rows from both.
This operation has the following characteristics:
- The schemas of both DataFrames have to be identical. This doesn't deviate much from the classical SQL UNION operation available in RDBMSs.
- Duplicate records are preserved in the final result, as with SQL's UNION ALL rather than plain UNION.
We are going to first present a graphical representation of this transformation, which illustrates an interesting property that makes union an attractive transformation in specific scenarios.