Merging in a DataFrame
Let's see how to merge DataFrame objects.
pandas provides a SQL-style join function, merge(). It is a high-performance, in-memory join operation. If you are familiar with SQL, you know that joining two tables requires an ON clause; in merge(), the on parameter plays that role.
Two DataFrames are merged on one or more key columns that must exist in both DataFrames; rows are matched where the key values are equal.
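As a minimal sketch of merging on a shared column, the example below joins two small DataFrames on a key column. The frames, column names (emp_id, name, salary), and values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data: both frames share the key column "emp_id".
employees = pd.DataFrame({
    "emp_id": [1, 2, 3],
    "name": ["Ann", "Bob", "Cara"],
})
salaries = pd.DataFrame({
    "emp_id": [2, 3, 4],
    "salary": [5000, 6000, 7000],
})

# on names the key column used to match rows.
# With the default how="inner", only keys present in both frames are kept.
merged = pd.merge(employees, salaries, on="emp_id")
print(merged)
```

Only emp_id 2 and 3 appear in both frames, so the result keeps those two rows and combines the name and salary columns.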
The how parameter of merge() determines which keys are included in the resulting table; the default is inner. If a key combination is missing from the left or the right table, the corresponding values in the joined table are filled with NaN. The table below summarizes the how options and their SQL equivalents.
how option | SQL equivalent | Description |
---|---|---|
left | LEFT OUTER JOIN | Use keys from left frame only |
right | RIGHT OUTER JOIN | Use keys from right frame only |
outer | FULL OUTER JOIN | Use union of keys from both frames |
inner | INNER JOIN | Use intersection of keys from both frames |
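The four options in the table can be compared side by side with a short sketch. The frames and the column names (key, left_val, right_val) are hypothetical; the only merge() parameters used are on and how.

```python
import pandas as pd

# Hypothetical sample data: key "a" exists only on the left, "d" only on the right.
left = pd.DataFrame({"key": ["a", "b", "c"], "left_val": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "right_val": [20, 30, 40]})

# Run the same merge once per how option and keep each result.
results = {
    how: pd.merge(left, right, on="key", how=how)
    for how in ["left", "right", "outer", "inner"]
}

for how, df in results.items():
    # Show which keys survive and how many cells were filled with NaN.
    print(how, sorted(df["key"]), "NaN cells:", int(df.isna().sum().sum()))
```

With this data, left keeps keys a, b, c; right keeps b, c, d; outer keeps all four keys; and inner keeps only b and c. Rows whose key is missing on one side get NaN in that side's value column.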