Missing Data
Learn about the sources of, and potential remedies for, missing data.
Causes of missing data
Missing data is a common occurrence when applying machine learning to business data. While there are many reasons for missing data, the following are the most common:
The data is collected via a manual process and is prone to errors (e.g., data being tracked in a spreadsheet).
Multiple datasets are joined together (e.g., joining database tables can produce missing values).
A particular feature is considered optional in the data source (e.g., an IT system).
Datasets are acquired from external sources (e.g., datasets acquired from governments often have missing values).
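To make the join case from the list above concrete, here is a minimal sketch in plain Python. The tables, column names, and the helpers `left_join` and `count_missing` are all hypothetical and exist only for this illustration; a real project would more likely use a database or a dataframe library for the join.

```python
# Hypothetical tables: every name and value here is made up for illustration.
customers = [
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Ben"},
    {"customer_id": 3, "name": "Cho"},
]
# Only customers 1 and 3 have a recorded segment.
segments = {1: "retail", 3: "wholesale"}


def left_join(rows, lookup, key, new_column):
    """Left join: keep every row, and use None where the lookup has no match."""
    return [{**row, new_column: lookup.get(row[key])} for row in rows]


def count_missing(rows):
    """Count None values per column."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


joined = left_join(customers, segments, "customer_id", "segment")
missing = count_missing(joined)
print(missing)  # {'customer_id': 0, 'name': 0, 'segment': 1}
```

Customer 2 has no entry in `segments`, so the joined row carries a missing `segment` value even though neither input table contained a missing value on its own.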
Because missing data is so common, strategies for dealing with it are critical for crafting the most valuable machine learning models. ...