Concatenate
Explore how to use the pandas concat() function to combine DataFrames both vertically and horizontally. Learn to manage indexes with options like ignore_index and hierarchical indexing, and understand set operations for joining data. This lesson enables effective data stitching to enhance your data analysis workflows.
The concept of concatenation
To concatenate is to link or stitch objects together in a chain. In pandas, we concatenate pandas objects with one another, e.g., one DataFrame with another DataFrame. The type of concatenation depends on the axis along which the objects are linked, i.e., either row-wise or column-wise.
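As a minimal sketch of how the axis determines the result, the snippet below calls concat() on two toy DataFrames. The frames left and right and their columns are hypothetical and exist only for illustration; they are not the lesson's dataset.

```python
import pandas as pd

# Hypothetical toy frames; the column names and values are illustrative only
left = pd.DataFrame({"make": ["Toyota", "Honda"], "model_year": [2004, 2007]})
right = pd.DataFrame({"make": ["Ford", "Kia"], "model_year": [2001, 2010]})

# axis=0 (the default) links row-wise: rows of `right` go below rows of `left`
stacked = pd.concat([left, right], axis=0)

# axis=1 links column-wise: columns of `right` go beside columns of `left`,
# with rows aligned on the index
side_by_side = pd.concat([left, right], axis=1)

print(stacked.shape)       # (4, 2)
print(side_by_side.shape)  # (2, 4)
```

Row-wise linking adds rows and keeps the column count, while column-wise linking adds columns and keeps the row count.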
For this lesson, we’ll use a mock dataset of cars insured by a motor insurance company to demonstrate how concatenation works.
Row-wise concatenation
In row-wise concatenation, we stack pandas objects on top of one another (i.e., vertically).
Let’s say we have two DataFrames, df_A and df_B. df_A comprises data on cars with a model year of 2005 or earlier, while df_B comprises data on cars with a model year of 2006 or later.
To create a complete dataset (from df_A and df_B) with all the car model years present, the logical thing is to stack one DataFrame on top of the other, i.e., ...
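The stacking described above might look like the following sketch. The contents of df_A and df_B here are made-up stand-ins (the make and model_year columns are assumptions, not the lesson's actual data), chosen only so that the example runs on its own.

```python
import pandas as pd

# Hypothetical stand-ins for the lesson's data; columns and values are made up
df_A = pd.DataFrame({"make": ["Toyota", "Ford"], "model_year": [2003, 2005]})
df_B = pd.DataFrame({"make": ["Honda", "Kia"], "model_year": [2006, 2012]})

# Stack df_A on top of df_B; each frame keeps its original index labels,
# so the labels 0 and 1 appear twice in the result
df_all = pd.concat([df_A, df_B])
print(df_all.index.tolist())  # [0, 1, 0, 1]

# ignore_index=True discards the old labels and renumbers the rows from 0
df_all = pd.concat([df_A, df_B], ignore_index=True)
print(df_all.index.tolist())  # [0, 1, 2, 3]
```

Whether to pass ignore_index=True depends on whether the original row labels carry meaning. If they are just default positions, renumbering avoids duplicate labels in the combined DataFrame.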