Introduction to the Flow Library
Get an overview of the Flow library in Elixir and then create a new mix project containing this dependency.
The Enum and Stream modules
Since Elixir is a functional language and all data is immutable, most Elixir developers quickly get accustomed to using functions like map, filter, and reduce on a daily basis. These and other data-processing functions, found in the Enum and Stream modules, are essential to functional programming and help us transform data in various ways.
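As a brief illustration of the functions mentioned above, the sketch below runs the same map, filter, and reduce steps once with Enum (eager) and once with Stream (lazy). The input range and the variable names are made up for this example.

```elixir
numbers = 1..10

# Eager: every Enum call traverses the collection and builds a new list.
eager_sum =
  numbers
  |> Enum.map(fn n -> n * 2 end)
  |> Enum.filter(fn n -> rem(n, 4) == 0 end)
  |> Enum.reduce(0, fn n, acc -> n + acc end)

# Lazy: Stream only describes the work; nothing runs until
# Enum.reduce/3 consumes the stream at the end of the pipeline.
lazy_sum =
  numbers
  |> Stream.map(fn n -> n * 2 end)
  |> Stream.filter(fn n -> rem(n, 4) == 0 end)
  |> Enum.reduce(0, fn n, acc -> n + acc end)

IO.inspect({eager_sum, lazy_sum})
# prints {60, 60}
```

Both pipelines produce the same result. The difference is when the work happens, and in either case it happens in a single process.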
The limitations of Enum and Stream
However, as the amount of data we have to process grows, so does the time it takes to finish the work. We already have a few tools at our disposal to run code concurrently, but implementing frequently used functions like reduce and group_by in parallel is going to be challenging. Thankfully, there is already a solution available to us on the Hex registry.
Enter Flow
In this chapter, we'll learn about Flow, a powerful library with a simple API that makes processing large collections of data a breeze. The Flow library uses GenStage under the hood, so operations run in parallel in separate stage processes, and back-pressure is handled for us.
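To give a first impression of the API, here is a minimal sketch that runs a map and filter pipeline through Flow in a standalone script. It assumes Elixir 1.12 or later and network access so that Mix.install/1 can fetch the package. The version requirement and the input range are assumptions chosen for illustration.

```elixir
# Assumption: fetch Flow for a one-off script (requires Elixir 1.12+).
Mix.install([{:flow, "~> 1.0"}])

result =
  1..100
  # Turn the enumerable into a flow so later steps run in stage processes.
  |> Flow.from_enumerable()
  |> Flow.map(fn n -> n * 2 end)
  |> Flow.filter(fn n -> rem(n, 3) == 0 end)
  # A flow is enumerable, so Enum functions trigger the actual work.
  # Sorting is needed because Flow does not guarantee output order.
  |> Enum.sort()

IO.inspect(Enum.take(result, 3))
```

The pipeline has the same shape as an Enum or Stream pipeline. The main visible differences are the Flow.from_enumerable/1 call at the start and the loss of ordering in the output.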
Compare Flow with Enum and Stream
First, we'll introduce Flow and compare it to the commonly used Enum and Stream modules by analyzing airport data from around the world. We'll see how easy it is to convert existing code to run concurrently and work with large datasets. Then, we'll look at how to run reduce operations concurrently and handle infinite and slow streams of data. Finally, we'll revisit the scraper project and integrate Flow with an already running GenStage pipeline. This will give us some extra flexibility when solving problems. Let's get started!
Create a new mix project
Before we begin, let's scaffold a new project to work on. We'll build a simple utility to help us analyze airport data by country and see which countries and territories have the largest number of working airports. Let's call this application airports:
mix new airports
Include flow as a dependency
Now, let's edit mix.exs to add flow as a dependency. We'll also need a CSV parser, so we add both packages to the dependencies list.
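A minimal sketch of what the deps function in mix.exs could look like after this change. The text above does not name the CSV parser, so nimble_csv is an assumption here, as are both version requirements.

```elixir
# In mix.exs, inside the project module generated by `mix new airports`.
defp deps do
  [
    # Parallel data processing, built on GenStage.
    {:flow, "~> 1.0"},
    # Assumption: NimbleCSV as the CSV parser; any CSV library would do.
    {:nimble_csv, "~> 1.1"}
  ]
end
```

After saving the file, running mix deps.get in the project directory fetches the new packages.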