Merging DataFrames
Learn how to merge DataFrames using Python.
Introduction
Merging DataFrames involves combining two or more DataFrames into one. We combine DataFrames using the merge() function and the concat() function to create a new, richer dataset that we can use for further analysis, such as descriptive analysis or machine learning.
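As a minimal sketch of the concat() side of this, the snippet below stacks the rows of two small DataFrames that share the same columns. It assumes the pandas library; the sample values and the variable names (first, second, stacked) are made up for illustration and are not part of the lesson's dataset.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
first = pd.DataFrame({"ID": [1, 2], "Company": ["Acme Inc", "XYZ Corporation"]})
second = pd.DataFrame({"ID": [3, 4], "Company": ["ABC Company", "Def Corp"]})

# Stack the rows of both frames on top of each other.
# ignore_index=True renumbers the row labels of the result from 0.
stacked = pd.concat([first, second], ignore_index=True)

print(stacked)
```

Here concat() simply appends rows and does not match records on any column, which is the main difference from merge().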
Merging using a common column
We use the merge() function to combine data from two DataFrames based on a common column. To combine more than two, we chain merge() calls. This function is particularly useful when working with large datasets because it matches and combines the records for us, so we don't have to do it manually.
It's important to note that, by default, the final DataFrame will only include records in which the values of the common column match. Records that exist in only one table will be excluded.
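The matching behavior described above can be sketched with two small tables that share an ID column. This assumes the pandas library; the tables (jobs, pay) and their values are hypothetical and only loosely modeled on the lesson's file. The how="outer" call is included for contrast, to show one way of keeping the unmatched records.

```python
import pandas as pd

# Hypothetical tables, invented for illustration only.
jobs = pd.DataFrame({
    "ID": [1, 2, 3],
    "Company": ["Acme Inc", "XYZ Corporation", "ABC Company"],
    "Position": ["Manager", "CEO", "CTO"],
})
pay = pd.DataFrame({
    "ID": [2, 3, 4],
    "Salary": [100000, 80000, 70000],
})

# Default merge (how="inner"): keep only IDs present in both tables.
inner = pd.merge(jobs, pay, on="ID")

# Outer merge: keep every ID; values missing from one side become NaN.
outer = pd.merge(jobs, pay, on="ID", how="outer")

print(inner)
print(outer)
```

In this sketch, ID 1 appears only in jobs and ID 4 only in pay, so both are dropped from inner but kept in outer with missing values.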
ID,Company,Position,Salary,Start Date
1,Acme Inc,Manager,50000,01/01/2020
2,XYZ Corporation,CEO,100000,01/01/2021
3,ABC Company,CTO,80000,01/01/2019
4,Def Corp,Developer,70000,01/01/2018
5,MNO Enterprises,Sales Representative,40000,01/01/2017
6,PQR Inc,HR Manager,60000,01/01/2016
7,STU Enterprises,Marketing Manager,50000,01/01/2015
8,VWX Group,Accountant,40000,01/01/2014
9,YZ Companies,Technical Writer,60000,01/01/2013
10,123 Enterprises,Project Manager,70000,01/01/2012
11,223 Treva,Technical Manager,170000,01/01/2012
12,133 Leakey,Project Manager,80000,01/01/2012
13,213 Yuno,Operations Director,250000,01/01/2012
14,134 Hons,Project Manager,90000,01/01/2012
We see a forward slash displayed in the first line of the output because we have further columns in our file to ...