Ingest with pandas

Learn how to ingest data into the data warehouse using pandas.

pandas is an efficient, easy-to-use Python library for data analysis and manipulation. It provides functions for data ingestion as well as data cleaning, data wrangling, and data visualization.

Data ingestion consists of two steps: importing data from the source and loading it into the destination. pandas natively reads and writes a range of data formats, such as CSV, JSON, XML, and SQL databases. Its optimized data structures make the ingestion process easy and fast. pandas is also integrated with many other libraries, such as google-cloud and PySpark, which makes it possible to ingest data into many more destinations.
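The two steps can be sketched with a small, self-contained example. It is only an illustration under stated assumptions: the CSV content, the `orders` table name, and the column names are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical source data; in practice this would be a file path or URL.
csv_source = io.StringIO(
    "order_id,customer,amount\n"
    "1,alice,19.99\n"
    "2,bob,5.50\n"
    "3,carol,42.00\n"
)

# Step 1: import data from the source into a DataFrame.
orders = pd.read_csv(csv_source)

# Step 2: load the DataFrame into the destination.
# An in-memory SQLite database stands in for the warehouse (assumption).
conn = sqlite3.connect(":memory:")
orders.to_sql("orders", conn, if_exists="replace", index=False)

# Read the rows back to confirm that the load succeeded.
loaded = pd.read_sql_query("SELECT * FROM orders ORDER BY order_id", conn)
print(loaded)

conn.close()
```

For a real warehouse, the reading step stays the same and only the connection and loading call change, depending on the client library the destination provides.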

The pandas DataFrame

The DataFrame is the core data structure in pandas. It holds two-dimensional data together with its row and column labels, similar to a SQL table or an Excel sheet. Once source data is converted into a DataFrame, ...

The pandas DataFrame format
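To make the two-dimensional, labeled structure concrete, here is a minimal sketch. The column names, row labels, and values are hypothetical and chosen only to show how labels address the data.

```python
import pandas as pd

# Hypothetical records, used only to show the structure of a DataFrame.
df = pd.DataFrame(
    {
        "name": ["nut", "bolt", "washer"],
        "quantity": [120, 75, 300],
    },
    index=["a", "b", "c"],  # row labels
)

print(df.shape)             # number of rows and columns
print(list(df.columns))     # column labels
print(list(df.index))       # row labels
print(df.loc["b", "name"])  # label-based lookup of a single cell
```

Columns behave like the fields of a SQL table, while the index plays the role of a row identifier, so a single cell can be addressed by its row label and column label together.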