Robust Scaling
Understand how outliers can affect data and implement robust scaling.
Chapter Goals:
- Learn how to scale data without being affected by outliers
A. Data outliers
An important aspect of data that we have to deal with is outliers. In general terms, an outlier is a data point that lies significantly farther from the other data points than they lie from one another. For example, if we had watermelons weighing 5, 4, 6, 7, and 20 pounds, the 20-pound watermelon would be an outlier.
A 2-D data plot with the outlier data points circled. Note that the outliers in this plot are exaggerated; in real life, outliers are not usually this far from the non-outlier data.
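To make the watermelon example concrete, here is a minimal sketch that flags the 20-pound watermelon programmatically. The 1.5 × IQR cutoff is a common rule of thumb and an assumption of this sketch, not a definition from this lesson. The variable names are illustrative only.

```python
import statistics

# Watermelon weights (in pounds) from the example above.
weights = [5, 4, 6, 7, 20]

# Quartiles of the data; "inclusive" treats the list as the full population.
q1, median, q3 = statistics.quantiles(weights, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range: spread of the middle 50% of the data

# Assumed rule of thumb: a point is an outlier if it lies more than
# 1.5 * IQR below the first quartile or above the third quartile.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [w for w in weights if w < lower or w > upper]

print(outliers)  # [20]

# The outlier pulls the mean upward, while the median is unaffected by it.
print(statistics.mean(weights), statistics.median(weights))
```

The median and the interquartile range depend only on the middle of the data, so the 20-pound watermelon barely affects them. The mean, by contrast, shifts noticeably toward the outlier.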
The ...