Missing Data Detection and Calculations
Understand how to detect missing data and perform calculations involving them.
Detecting missing data
Before we can manage the missing (or null) values in our data, we first need to be able to detect them accurately. In pandas, we have several methods and functions at our disposal to detect missing values.
In the previous lesson, we learned that NaN isn't considered equal to any value, including itself. This means that trying to find missing data in Series or DataFrame objects by comparing the values with np.nan (e.g., using operators like == or >=) will not work.
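To see why comparisons fail, here is a minimal sketch. The Series s and its values are made up for illustration and are not part of the lesson's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value in the middle
s = pd.Series([1.0, np.nan, 3.0])

# Comparing against np.nan never matches, not even at the NaN position
eq_result = s == np.nan
ge_result = s >= np.nan

print(eq_result.tolist())  # [False, False, False]
print(ge_result.tolist())  # [False, False, False]

# NaN is not even equal to itself
print(np.nan == np.nan)    # False
```

Every comparison returns False, so a filter built this way would silently miss the missing entry.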
As such, we should instead use the functions in pandas to detect missing values across the different array data types, namely isnull() and notnull().
Note: Both isnull() and notnull() are described as functions in pandas, though they can also be used as methods with pandas objects such as a Series or DataFrame (e.g., df.isnull()).
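The function and method forms can be compared in a short sketch. The DataFrame df below, with its column names and values, is a hypothetical toy example and not the lesson's own dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy DataFrame with one missing value per column
df = pd.DataFrame({
    "age": [34, np.nan, 51],
    "city": ["Oslo", "Lima", None],
})

# Function form: pass the object to pd.isnull()
null_mask = pd.isnull(df)

# Method form: call isnull() on the object itself
same_mask = df.isnull()

# notnull() is the element-wise inverse of isnull()
present_mask = df.notnull()

# Summing the Boolean mask counts the missing values per column
missing_per_column = df.isnull().sum()

print(null_mask)
print(missing_per_column)
```

Both forms return a Boolean mask with the same shape as the input, so either one can be used for filtering or counting.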
Suppose we have a mock dataset of patient information, as shown below: