Duplicates
This lesson will focus on how to deal with data that has duplicates.
Duplicates
Repeated data rows in a dataset are called duplicates. They can arise in a number of ways. The most common are:
- The same data is entered twice by accident, for example when the same article is scraped twice or an online order is booked twice.
- A user presses the submit button twice while data is being collected through an online form or survey.
- Data is collected from multiple sources that contain overlapping records.
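To make the idea of a repeated row concrete, here is a minimal sketch in plain Python. The booking records and the names `rows`, `duplicated_rows`, and `deduplicated` are hypothetical and chosen only for illustration. The sketch also assumes that two rows count as duplicates only when every field matches exactly.

```python
from collections import Counter

# Hypothetical booking records: (booking_id, customer, product).
rows = [
    (101, "alice", "course-a"),
    (102, "bob", "course-b"),
    (101, "alice", "course-a"),  # e.g. the submit button was pressed twice
    (103, "carol", "course-a"),
    (102, "bob", "course-b"),    # e.g. the same record came from a second source
]

# Count how often each full row occurs; a count above 1 marks a duplicate.
counts = Counter(rows)
duplicated_rows = [row for row, n in counts.items() if n > 1]

# Keep the first occurrence of each row, preserving the original order.
deduplicated = list(dict.fromkeys(rows))

print("Rows that occur more than once:", duplicated_rows)
print("Row count before and after:", len(rows), "->", len(deduplicated))
```

In practice, duplicates are often near-matches rather than exact copies (different casing, extra whitespace, differing timestamps), so a real dataset may need normalization or a comparison on a subset of key fields before a check like this one is useful.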