ETL Pipeline Exercise: Extracting Data
Let’s extract media data from a PostgreSQL database.
A case study
Suppose we’re data engineers working for a digital company and we’re tasked with creating an ETL pipeline.
Our company, “Fakebook,” has created a social media application that is used worldwide. The application constantly generates data, which is stored and managed in the company’s production database.
The company wants to process and analyze the data collected by the application to generate insights and identify usage patterns. However, running these analyses directly on the production database would put a heavy load on it. For this reason, the company has decided to separate the computing and storage of the data and perform all the analysis in a separate repository ...