The compare method in pandas shows the differences between two DataFrames. It compares the two DataFrames element by element and presents the differing values side by side.
The compare method can only compare DataFrames that have the same shape and identical row and column labels.
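As a minimal sketch of this restriction, the snippet below compares two frames whose row labels differ. The names left and right and their data are made up for illustration. In this case, pandas raises a ValueError instead of trying to align the frames.

```python
import pandas as pd

# Two frames with the same column but different row labels (illustrative data).
left = pd.DataFrame({"Age": [10, 15]}, index=[0, 1])
right = pd.DataFrame({"Age": [10, 15]}, index=[0, 2])

try:
    left.compare(right)
    raised = False
except ValueError as err:
    # compare refuses frames that are not identically labeled.
    raised = True
    print("compare failed:", err)
```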
DataFrame.compare(other, align_axis=1, keep_shape=False, keep_equal=False)
The compare method accepts the following parameters:
other: The DataFrame to compare against.
align_axis: The axis along which the comparison is aligned. With 0, the differences are stacked in rows. With 1, the default value, they are placed in columns.
keep_shape: A boolean parameter. Setting this to True prevents any row or column from being dropped. With the default value, False, compare drops the rows and columns in which all elements are the same in both DataFrames.
keep_equal: Another boolean parameter. Setting this to True shows equal values in the result. With the default value, False, compare shows the positions that hold the same value in both DataFrames as NaN.
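To illustrate align_axis, here is a small sketch that runs the same comparison with both settings. The two-row frames are illustrative data, and the variable names by_columns and by_rows are assumptions made for this example.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["dom", "chibuge"], "Age": [10, 15]})
df1 = pd.DataFrame({"Name": ["dom", "abhi"], "Age": [11, 17]})

# Default (align_axis=1): "self" and "other" form a second column level.
by_columns = df.compare(df1)

# align_axis=0: "self" and "other" form a second row-index level instead.
by_rows = df.compare(df1, align_axis=0)

print(by_columns)
print(by_rows)
```

With align_axis=0, each differing row appears twice, once for each DataFrame, which can be easier to read when the frames have many columns.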
import pandas as pd

data = [['dom', 10], ['chibuge', 15], ['celeste', 14]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

data1 = [['dom', 11], ['abhi', 17], ['celeste', 14]]
df1 = pd.DataFrame(data1, columns=['Name', 'Age'])

print("Dataframe 1 -- \n")
print(df)
print("-"*5)

print("Dataframe 2 -- \n")
print(df1)
print("-"*5)

print("Dataframe difference -- \n")
print(df.compare(df1))
print("-"*5)

print("Dataframe difference keeping equal values -- \n")
print(df.compare(df1, keep_equal=True))
print("-"*5)

print("Dataframe difference keeping same shape -- \n")
print(df.compare(df1, keep_shape=True))
print("-"*5)

print("Dataframe difference keeping same shape and equal values -- \n")
print(df.compare(df1, keep_shape=True, keep_equal=True))
We import the pandas module.
We create a DataFrame, df, from the list called data. df has two columns: Name and Age.
We create a DataFrame, df1, from the list called data1. df1 has two columns: Name and Age.
We print df and df1.
We use compare to obtain the difference between the two DataFrames, df and df1.
We use compare to obtain the difference between the two DataFrames, df and df1, while setting keep_equal to True. We can see that equal values are not omitted in the printed difference.
We use compare to obtain the difference between the two DataFrames, df and df1, while setting keep_shape to True. We see that the row with the same values in both DataFrames is not omitted in the printed difference.
We use compare to obtain the difference between the two DataFrames, df and df1, while setting keep_shape and keep_equal to True. We see that neither the row with the same values in both DataFrames nor the individual equal values are omitted in the printed difference.
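The effect of the two flags can also be checked programmatically. The sketch below rebuilds the same two DataFrames and inspects the results. The variable names default, same_shape, and with_equal are assumptions introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame([["dom", 10], ["chibuge", 15], ["celeste", 14]],
                  columns=["Name", "Age"])
df1 = pd.DataFrame([["dom", 11], ["abhi", 17], ["celeste", 14]],
                   columns=["Name", "Age"])

default = df.compare(df1)
same_shape = df.compare(df1, keep_shape=True)
with_equal = df.compare(df1, keep_equal=True)

# Row 2 ("celeste", 14) is identical in both frames, so it is dropped by default.
print(default.shape)     # (2, 4)

# keep_shape=True keeps that row, filled with NaN.
print(same_shape.shape)  # (3, 4)

# "dom" is equal in both frames: NaN by default, shown with keep_equal=True.
print(default.loc[0, ("Name", "self")])
print(with_equal.loc[0, ("Name", "self")])
```

Each original column expands into a self and an other column, which is why two columns become four in the result.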