ETL Example—Load
Learn how to load data into a PostgreSQL database from the shell.
We'll cover the following...
The last step in the ETL pipeline is to load the cleaned and processed data stored in clean_data.csv
into a PostgreSQL database.
Connecting to the database
First, we create a Bash script called load_data.sh
that will switch users from “root” to “postgres,” start a Postgres database, and run an SQL script to load the data. The SQL script will be stored in a file called load_data.sql
.
#!/bin/bash# Setting the project location as an environment variableexport WORKDIR='/usercode/ETL_Lottery'# Setting Postgres bin and data directories as environment variablesexport POSTGRES_BIN='/usr/lib/postgresql/16/bin'export POSTGRES_DATA='/var/lib/postgresql/data'# Create and start a new PostgreSQL database cluster.# Connect to Postgres and run the file: load_data.sqlsu - postgres -c "${POSTGRES_BIN}/initdb -D ${POSTGRES_DATA}${POSTGRES_BIN}/pg_ctl -D ${POSTGRES_DATA} startpsql postgres -f ${WORKDIR}/load_data.sql"
Deploying a PostgreSQL database: load_data.sh
...