Handling Missing Data

Learn how to handle missing data in pandas.

Missing data

Filling in missing data is another common operation. It matters because many machine learning algorithms do not work when values are missing. It’s also prudent to know how much data is missing so that we get the full story from our data.

The “cylinders” column in our dataset has missing values. Remember our trick for counting the items that have some property? We can use it here to count the missing entries. We convert the column to booleans (using .isna), then call .sum on the result. Because each True counts as 1, the sum is the number of missing entries:

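# Select the "cylinders" column as a Series
cyl = df.cylinders
# isna() marks missing entries as True; sum() counts them
print(cyl.isna().sum())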
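To try the same idea without the lesson's dataset, here is a minimal, self-contained sketch. The small DataFrame and its values are made up for illustration and are not the course data. The names missing_mask, missing_count, and missing_fraction are hypothetical helpers introduced only for this example. It also shows that taking .mean of the boolean mask gives the fraction of entries that are missing, which can be a handy way to judge how much data is absent.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; these values are made up for illustration
# and are not taken from the course dataset.
df = pd.DataFrame({
    "make": ["a", "b", "c", "d", "e"],
    "cylinders": [4.0, np.nan, 6.0, np.nan, 8.0],
})

cyl = df.cylinders

# isna() returns a boolean Series: True where the entry is missing
missing_mask = cyl.isna()

# True counts as 1, so sum() gives the number of missing entries
missing_count = missing_mask.sum()

# The mean of the booleans is the fraction of entries that are missing
missing_fraction = missing_mask.mean()

print(missing_count)     # 2
print(missing_fraction)  # 0.4
```

With two of the five toy entries missing, the count is 2 and the fraction is 0.4.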

From the “cylinders” Series alone, it’s hard to determine why these values are missing. Typically we’ll have more ...