Parallelism
This lesson explains the various Airflow settings that can affect the parallelism for a DAG and the Airflow cluster.
In this lesson, we’ll examine the various aspects that affect parallelism at the DAG level and across the entire system.
Parallelism
The parallelism configuration parameter limits the number of task instances that can run at the same time across the entire Airflow installation, regardless of which DAG they belong to. By default, this value is set to 32 in the [core] section of airflow.cfg.
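To make the setting concrete, here is a minimal sketch that reads parallelism from a configuration fragment using Python's standard configparser module. The fragment and its value of 8 are purely illustrative, and the helper env_var_name is a hypothetical name introduced here; it only spells out Airflow's AIRFLOW__{SECTION}__{KEY} convention for overriding a setting through an environment variable.

```python
import configparser

# Illustrative fragment of an airflow.cfg file; the value 8 is an
# assumption chosen for the example, not a recommendation.
SAMPLE_CFG = """
[core]
parallelism = 8
"""


def env_var_name(section: str, key: str) -> str:
    """Hypothetical helper: build the environment variable name that
    overrides a given airflow.cfg setting (AIRFLOW__SECTION__KEY)."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


# Disable interpolation so literal '%' characters in a real config
# file would not be treated as configparser placeholders.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SAMPLE_CFG)

parallelism = parser.getint("core", "parallelism")
override_name = env_var_name("core", "parallelism")

print(f"parallelism = {parallelism}")
print(f"override with: {override_name}")
```

In a real deployment, we would edit the value in airflow.cfg or set the corresponding environment variable rather than parse the file ourselves.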
Pool
Pools are one way to limit the number of tasks that run at any given time. A pool named default_pool exists by default and has 128 slots. Each slot can be used to run one task, so a pool with no free slots forces further tasks to wait. We can change the number of slots for a given pool and also create new pools, e.g., using the UI. A task can be ...
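The slot idea can be illustrated with a small toy model. This is a sketch only: ToyPool is a hypothetical class written for this lesson and is not Airflow's own pool implementation. It assumes each task takes exactly one slot and that waiting tasks start in the order they were queued.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyPool:
    """Illustrative stand-in for a pool; not part of Airflow."""
    name: str
    slots: int
    running: List[str] = field(default_factory=list)
    queued: List[str] = field(default_factory=list)

    def submit(self, task_id: str) -> bool:
        """Start the task if a slot is free; otherwise queue it."""
        if len(self.running) < self.slots:
            self.running.append(task_id)
            return True
        self.queued.append(task_id)
        return False

    def finish(self, task_id: str) -> None:
        """Free the task's slot and start the oldest queued task, if any."""
        self.running.remove(task_id)
        if self.queued:
            self.running.append(self.queued.pop(0))


# A pool with two slots receives three tasks: the third has to wait.
pool = ToyPool(name="toy_pool", slots=2)
started = [pool.submit(t) for t in ("extract", "transform", "load")]

# Once "extract" finishes, its slot is handed to the queued "load" task.
pool.finish("extract")
```

With two slots, the first two tasks start immediately and the third waits until a slot is released, which is the throttling effect a pool provides.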