Joins in DataFrames

Learn the theory behind joining DataFrames.

Joins

Databases support several types of joins. The four most common are inner, outer, left, and right. A DataFrame has two methods that support these operations: join and merge. The merge method is generally preferred because it joins on columns directly.
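As a rough illustration of the four join types, the sketch below passes each one to the `how` parameter of merge and counts the resulting rows. The DataFrames, their column names, and their values are hypothetical sample data, not taken from the lesson.

```python
import pandas as pd

# Hypothetical sample data (assumed for illustration only).
df1 = pd.DataFrame({"name": ["Ann", "Bob", "Cid"], "age": [31, 45, 27]})
df2 = pd.DataFrame({"name": ["Ann", "Bob", "Dee"], "city": ["Oslo", "Rome", "Lima"]})

# Merge on the shared 'name' column once per join type and record the row count.
row_counts = {}
for how in ["inner", "outer", "left", "right"]:
    result = df1.merge(df2, on="name", how=how)
    row_counts[how] = len(result)
    print(how, len(result))
```

With this data, the inner join keeps only the names present in both DataFrames, the outer join keeps every name from either side, and the left and right joins keep every name from df1 and df2 respectively.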

Note: The join method is meant for joining based on the index rather than columns. In practice, joining is usually based on columns instead of index values.

If we want the join method to join based on column values, we need to set that column as the index first:

df1.set_index('name').join(df2.set_index('name'))
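A minimal, self-contained sketch of that pattern follows. The contents of df1 and df2 are assumed sample data, since the lesson does not define them. Note that join performs a left join by default, so every row of df1 is kept.

```python
import pandas as pd

# Hypothetical sample data (assumed for illustration only).
df1 = pd.DataFrame({"name": ["Ann", "Bob", "Cid"], "age": [31, 45, 27]})
df2 = pd.DataFrame({"name": ["Ann", "Bob", "Dee"], "city": ["Oslo", "Rome", "Lima"]})

# join aligns on the index, so move 'name' into the index on both sides first.
joined = df1.set_index("name").join(df2.set_index("name"))
print(joined)
```

Here "Cid" has no match in df2, so its city value is missing (NaN), and "Dee" is dropped because it does not appear in df1.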

The merge method avoids this extra step because it joins on columns directly.

The default join type for the merge method is an inner join. When no join columns are specified, merge uses the column names that the two DataFrames have in common and aligns rows on the values in those columns. Rows whose values match in both DataFrames are kept, along with the remaining columns from both DataFrames. Rows whose values in the aligned columns appear in only one DataFrame are discarded.
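The default behavior described above can be sketched as follows. The two DataFrames are hypothetical sample data that share only the 'name' column, so merge picks that column automatically.

```python
import pandas as pd

# Hypothetical sample data (assumed for illustration only).
df1 = pd.DataFrame({"name": ["Ann", "Bob", "Cid"], "age": [31, 45, 27]})
df2 = pd.DataFrame({"name": ["Ann", "Bob", "Dee"], "city": ["Oslo", "Rome", "Lima"]})

# No 'on' or 'how' given: merge uses the common column 'name' and an inner join.
merged = df1.merge(df2)
print(merged)
```

Only "Ann" and "Bob" appear in both DataFrames, so only those two rows survive. "Cid" and "Dee" each appear on one side only and are discarded.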
