ETL Pipeline Exercise: Transform

Learn how to implement the transform step of the social media pipeline using Apache Airflow.

To continue our pipeline implementation, we’ll now focus on transforming the extracted data. According to the business requirements and the schema of the data warehouse, there are a few issues we need to fix with our extracted data. They are:

  1. To change the month format of all date columns from numerical to text (for example, from 08 to Aug)

  2. To remove tabs and newlines from the comment_text and post_text columns

  3. To bin the number of followers into three categories, ...
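
As a rough illustration of the first two fixes, here is a minimal sketch in plain Python. It rests on several assumptions that are not part of the lesson: dates arrive as "YYYY-MM-DD" strings, the column name post_date is hypothetical, and each run of tabs or line breaks is replaced with a single space so that adjacent words do not merge. The function names are illustrative as well.

```python
import re
from datetime import datetime

# Assumption: dates arrive as "YYYY-MM-DD" strings; the real format may differ.
INPUT_DATE_FORMAT = "%Y-%m-%d"

# Explicit abbreviations keep the output independent of the system locale.
MONTH_ABBR = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def month_to_text(value: str) -> str:
    """Rewrite a date such as 2023-08-15 as 2023-Aug-15."""
    parsed = datetime.strptime(value, INPUT_DATE_FORMAT)
    return f"{parsed.year:04d}-{MONTH_ABBR[parsed.month - 1]}-{parsed.day:02d}"


def strip_tabs_and_newlines(text: str) -> str:
    """Replace each run of tabs or line breaks with a single space."""
    return re.sub(r"[\t\r\n]+", " ", text).strip()


def transform_row(row: dict, date_columns: list[str], text_columns: list[str]) -> dict:
    """Return a cleaned copy of one extracted record."""
    cleaned = dict(row)
    for column in date_columns:
        cleaned[column] = month_to_text(cleaned[column])
    for column in text_columns:
        cleaned[column] = strip_tabs_and_newlines(cleaned[column])
    return cleaned


# Hypothetical record; "post_date" is an assumed column name.
row = {
    "post_date": "2023-08-15",
    "post_text": "Hello\tworld\nagain",
    "comment_text": "Nice\r\npost",
}
result = transform_row(row, ["post_date"], ["post_text", "comment_text"])
print(result)
```

In an Airflow pipeline, logic like this would typically live inside the callable of the transform task, applied to every extracted record before loading.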