Joins in DataFrames
Take a look at the theory behind joining DataFrames.
Joins
Databases support several types of joins. The four common ones are inner, outer, left, and right. A DataFrame has two methods that support these operations, join and merge. The merge method is generally preferred.
Note: The join method is meant for joining based on the index rather than columns. In practice, joining is usually based on columns instead of index values. If we want the join method to join based on column values, we need to set that column as the index first:
df1.set_index('name').join(df2.set_index('name'))
It’s easier to just use the merge method.
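To make the note concrete, here is a minimal sketch comparing the two approaches. The DataFrames df1 and df2 and their contents are made up for illustration; only the name column comes from the snippet above. Because join performs a left join by default, the merge call passes how='left' so the two results are comparable.

```python
import pandas as pd

# Hypothetical sample data; only the 'name' column is taken from the text.
df1 = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "age": [31, 45, 27]})
df2 = pd.DataFrame({"name": ["Ann", "Bob", "Dee"], "city": ["Oslo", "Lima", "Rome"]})

# join aligns on the index, so 'name' is moved into the index on both sides.
# join defaults to a left join: every row of df1 is kept.
joined = df1.set_index("name").join(df2.set_index("name"))

# merge finds the shared 'name' column by itself; how="left" mirrors join's default.
merged = df1.merge(df2, how="left")

print(joined)
print(merged)
```

Both results hold the same three rows, with a missing city for Cy. The difference is that joined carries name as its index, while merged keeps it as a regular column.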
The default join type for the merge method is an inner join. The merge method looks for column names that the two DataFrames have in common and aligns the rows on the values in those columns. Rows whose values in those columns appear in both DataFrames are kept, along with the remaining columns from both DataFrames. Rows whose values appear in only one DataFrame are discarded.
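The default behavior described above can be sketched as follows. The two DataFrames, left and right, are hypothetical sample data, not taken from the lesson. The last call passes how='outer' only to contrast it with the default.

```python
import pandas as pd

# Hypothetical sample data: 'name' is the only column the two frames share.
left = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "age": [31, 45, 27]})
right = pd.DataFrame({"name": ["Ann", "Bob", "Dee"], "city": ["Oslo", "Lima", "Rome"]})

# No arguments besides the other frame: merge joins on the shared column
# and performs an inner join.
inner = left.merge(right)
print(inner)

# For contrast, an outer join keeps names that appear in only one frame.
outer = left.merge(right, how="outer")
print(outer)
```

In this sketch, only Ann and Bob appear in both DataFrames, so the inner join keeps those two rows and drops Cy and Dee. The outer join keeps all four names and fills the missing values with NaN.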