Solution Review: Cleaning NYC Property Sales
Explore techniques for cleaning NYC property sales data including changing categorical values, removing missing and duplicate entries, and filtering outliers. This lesson helps you prepare datasets for more reliable analysis by improving data quality and consistency.
1. Change values
In this task, we had to change the values in the BOROUGH column according to the following rule:
1 --> Manhattan
2 --> Bronx
3 --> Brooklyn
4 --> Queens
5 --> Staten Island
Looking at the problem statement, we can see that every category needs the same kind of code, so we handle the categories one by one.
To change all instances of a specific value in a column, we first need to find the rows where that value is present. To do this, we write our condition in line 4. The expression df['BOROUGH'] == 1 gives us a Boolean Series with one True/False value for each row. It is True for rows where the value of the BOROUGH column is ...
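The condition-then-replace idea can be sketched as follows. This is a minimal illustration, not the lesson's original solution code: the small DataFrame is a hypothetical stand-in for the NYC property sales data, the names borough_names and is_manhattan are made up for the example, and the cast to object is an assumption added so the column can hold numbers and strings while the replacement is in progress.

```python
import pandas as pd

# Hypothetical sample standing in for the real NYC property sales data
df = pd.DataFrame({"BOROUGH": [1, 2, 3, 4, 5, 1, 3]})

# The condition yields a Boolean Series: True where BOROUGH equals 1
is_manhattan = df["BOROUGH"] == 1

# Rule from the task statement: borough code -> borough name
borough_names = {
    1: "Manhattan",
    2: "Bronx",
    3: "Brooklyn",
    4: "Queens",
    5: "Staten Island",
}

# Assumption: cast to object so strings can be written into a numeric column
df["BOROUGH"] = df["BOROUGH"].astype(object)

# Handle each category one by one: build the condition, then overwrite matching rows
for code, name in borough_names.items():
    condition = df["BOROUGH"] == code
    df.loc[condition, "BOROUGH"] = name

print(df)
```

Using df.loc with the condition on the rows and the column name on the columns writes the new value only into the rows where the condition is True, which is why each category can be replaced independently of the others.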