Outliers
This lesson explains what outliers are, why they occur, and how to remove them.
What is an outlier?
Dealing with outliers is another part of data cleaning. First, how do you define an outlier? The answer often depends on domain knowledge and other context, but a simple starting point is a box plot, which draws values that lie far from the bulk of the data as individual points:
Box Plot of Hours Per Week
The plot above was generated with this command:
bbox = train_df['hoursperweek'].plot(kind="box")
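To make the box plot's notion of "far from the bulk" concrete, here is a minimal sketch of the 1.5 × IQR rule, which is the default whisker setting in the Matplotlib box plot that pandas draws. The `hours` Series and its values are made up for illustration and only stand in for `train_df['hoursperweek']`; the variable names are assumptions, not part of the lesson's code.

```python
import pandas as pd

# Hypothetical stand-in for train_df['hoursperweek']; the values are invented.
hours = pd.Series(
    [40, 40, 38, 45, 50, 40, 35, 60, 40, 99, 1, 42],
    name="hoursperweek",
)

# Quartiles and interquartile range (the height of the box in a box plot).
q1 = hours.quantile(0.25)
q3 = hours.quantile(0.75)
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the quartiles; points outside them are the
# ones a default box plot draws as individual markers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Boolean mask of candidate outliers, then the flagged values themselves.
is_outlier = (hours < lower_fence) | (hours > upper_fence)
flagged = hours[is_outlier]

print(f"fences: [{lower_fence}, {upper_fence}]")
print(flagged)
```

Values flagged this way are candidates for review, not automatic deletions. Whether a very high or very low number of hours per week is an error or a genuine observation is the kind of question that needs domain knowledge.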
Detection of an outlier