Transformations (II): FlatMap and Distinct
Get introduced to the second set of basic transformations.
FlatMap
The FlatMap operation is a long-standing staple of functional programming. It can be tricky to understand conceptually. There are two key things to know about what FlatMap does:
- Being a map transformation in nature, it applies a function to each element of a collection. This is no different from the plain map() function described before.
- If the result is a collection of collections of elements (say, a List of Lists or an array of arrays), whether because the input was already nested or because the function returns a collection for each element, it flattens the results into a single collection.
So fundamentally, both map and flatMap transform objects based on a function, but they differ in how the results are assembled. The former yields exactly one output element per input element, while the latter merges the nested collections it produces into a single flat collection.
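The difference can be sketched outside Spark in a few lines of plain Python. This is only an illustration under stated assumptions: Python has no built-in flatMap, so the `flat_map` helper below is a hypothetical stand-in written for this example, and the sample sentences are made up.

```python
from itertools import chain


def flat_map(func, items):
    """Hypothetical helper: apply func to each item, then flatten one level."""
    return list(chain.from_iterable(func(item) for item in items))


# Made-up sample data: two lines of text.
lines = ["to be or", "not to be"]

# map: one output element per input element, so the result stays nested
# (a list of word lists).
mapped = list(map(str.split, lines))

# flat_map: the same function, but the per-line word lists are merged
# into a single flat list of words.
flattened = flat_map(str.split, lines)

print(mapped)     # [['to', 'be', 'or'], ['not', 'to', 'be']]
print(flattened)  # ['to', 'be', 'or', 'not', 'to', 'be']
```

Note that the helper flattens only one level: if the function returned lists of lists, the inner lists would be left as they are.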
In Spark, the concept is similar, with a few differences, so let’s start by visualizing it graphically and then practice it in a code example.