Assertion Functions
Discover how to use assertion functions to test data integrity in pandas.
Assertion functions
An important part of ensuring data integrity in analysis and modeling processes is the use of assertions. Assertions allow us to set up checks to confirm that our code behaves as expected. The pandas library provides a module named testing that comes with assertion functions for comparing pandas objects with one another.
It’s useful for unit tests and data quality checks so that we can catch errors early before they cause problems down the line. There are numerous assertion functions available, but we’ll focus on the commonly used ones:
Overview of Commonly Used Assertion Functions
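Assertion Function | Description |
assert_frame_equal() | Checks that left and right DataFrames are equal |
assert_series_equal() | Checks that left and right Series are equal |
assert_index_equal() | Checks that left and right indexes are equal |
assert_extension_array_equal() | Checks that left and right ExtensionArrays are equal |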
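As a rough illustration of the Series and Index checks, here is a minimal sketch using assert_series_equal() and assert_index_equal() from pandas.testing. The data and variable names are made up for this example. It also shows that a dtype difference counts as an inequality by default, and that the check_dtype parameter can relax this.

```python
import pandas as pd
from pandas.testing import assert_series_equal, assert_index_equal

# Illustrative data (not part of the lesson's own examples)
s_left = pd.Series([1, 2, 3], name="A")
s_right = pd.Series([1, 2, 3], name="A")
assert_series_equal(s_left, s_right)  # passes silently

idx_left = pd.Index(["x", "y", "z"])
idx_right = pd.Index(["x", "y", "z"])
assert_index_equal(idx_left, idx_right)  # passes silently

# Same values, but float dtype instead of integer dtype
s_float = pd.Series([1.0, 2.0, 3.0], name="A")
try:
    assert_series_equal(s_left, s_float)
    dtype_mismatch_detected = False
except AssertionError:
    dtype_mismatch_detected = True

print(dtype_mismatch_detected)

# check_dtype=False compares the values only and ignores the dtype
assert_series_equal(s_left, s_float, check_dtype=False)
```

Whether to relax the dtype check depends on the use case. For data quality checks, an unexpected dtype is often exactly the kind of problem we want to catch.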
DataFrame equality
The following example shows how assert_frame_equal() can be used to check the equality of two DataFrames:
# Import pandas and the assertion function
import pandas as pd
from pandas.testing import assert_frame_equal
# Generate pair of DataFrames (equal) to check
df_left = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df_right = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
# Check equality for df pair
print(' === Assertion Check === ')
assert_frame_equal(df_left, df_right)
The example above shows that when the DataFrames are equal, assert_frame_equal() produces no output of its own; only the header from our print() call appears. This silence indicates that the assertion check has passed. On the other hand, if the DataFrames differ in their values, we’ll get an AssertionError, as shown in the example below:
# Import pandas and the assertion function
import pandas as pd
from pandas.testing import assert_frame_equal
# Generate pair of DataFrames (unequal) to check
df_left = pd.DataFrame({'A': [9999, 2], 'B': [3, 4]})  # Change value 1 to 9999
df_right = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
# Check equality for df pair
print(' === Assertion Check === ')
assert_frame_equal(df_left, df_right)
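In a data quality check, we may prefer to record a failure rather than stop the script. One possible approach, sketched below, wraps the check in try/except. The helper frames_match() is a hypothetical name introduced only for this illustration and is not part of pandas.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Same unequal pair as in the example above
df_left = pd.DataFrame({'A': [9999, 2], 'B': [3, 4]})
df_right = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})


def frames_match(left, right):
    """Hypothetical helper: return (True, '') on success or (False, error message)."""
    try:
        assert_frame_equal(left, right)
    except AssertionError as err:
        # Keep the message so it can be logged or reported later
        return False, str(err)
    return True, ""


ok, message = frames_match(df_left, df_right)
print(ok)
print(message)
```

The exact wording of the message can vary between pandas versions, so it is safer to log it than to match against its text.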
The good thing about these assertion checks is that the AssertionError displays clear information on where the inequality ...