Introduction
Learn how to orchestrate ETL pipelines using Apache Airflow.
Orchestration in ETL pipelines is the process of coordinating and managing the tasks a pipeline executes. The more complex our pipelines are, the more crucial it is to add an orchestration layer.
As we have seen throughout the course, the extract, transform, and load tasks can differ vastly from one ETL pipeline to another, and each pipeline serves a different purpose. Pipelines are also usually composed of multiple tasks that must run in sequence, and every one of them must succeed for the whole pipeline to succeed. If one task fails, we should know about it and perhaps run the pipeline again.
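To make the idea of sequential tasks, failure reporting, and reruns concrete, here is a minimal plain-Python sketch. It is not how Airflow works internally; the names `run_pipeline`, `max_attempts`, and `warehouse` are hypothetical and exist only for this illustration. Each task receives the previous task's output, a failing task is retried a fixed number of times, and the pipeline stops with an error naming the task if the retries are exhausted.

```python
from typing import Any, Callable

# A task is a (name, function) pair; the function takes the previous result.
Task = tuple[str, Callable[[Any], Any]]


def run_pipeline(tasks: list[Task], max_attempts: int = 2) -> Any:
    """Run tasks in order, passing each result to the next task.

    Each task is tried up to max_attempts times. If it still fails,
    the pipeline stops and an error naming the failed task is raised.
    """
    data = None
    for name, func in tasks:
        for attempt in range(1, max_attempts + 1):
            try:
                data = func(data)
                print(f"{name}: succeeded on attempt {attempt}")
                break
            except Exception as exc:
                print(f"{name}: failed on attempt {attempt} ({exc})")
                if attempt == max_attempts:
                    raise RuntimeError(
                        f"pipeline stopped at task '{name}'"
                    ) from exc
    return data


warehouse: list[int] = []  # stand-in for a real load target (assumption)


def load(rows: list[int]) -> int:
    """Append rows to the stand-in target and return the row count."""
    warehouse.extend(rows)
    return len(rows)


pipeline: list[Task] = [
    ("extract", lambda _: [1, 2, 3]),
    ("transform", lambda rows: [r * 10 for r in rows]),
    ("load", load),
]

loaded = run_pipeline(pipeline)
```

Even this small sketch has to handle ordering, retries, and error reporting by hand, and it still says nothing about scheduling or monitoring. That bookkeeping is what an orchestration layer takes over.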
In a real-world environment, we might have to manage and maintain dozens of pipelines simultaneously, each running on its own schedule and serving data to a different part of the organization.
The solution to successfully managing this is ...