Scaling

This lesson explains how scaling helps when cleaning data.

Introduction to scaling

The scale of your features matters for many machine learning algorithms. Having income values that range from 100 to 100,000 and ages that range from 0 to 100 can cause issues because the two columns differ so much in scale: methods that rely on distances or on gradient-based optimization tend to be dominated by the feature with the larger range. To deal with this, it is standard to rescale the data. There are many ways to do this, but the two most common are:

  • Standard scaling
  • Min/Max scaling
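
To make the scale problem concrete, here is a minimal sketch using two made-up (income, age) rows. The values and variable names are hypothetical and chosen only for illustration. It measures how much of the squared Euclidean distance between the rows comes from income alone.

```python
import math

# Hypothetical (income, age) rows, made up for illustration.
person_a = (50_000, 25)
person_b = (51_000, 65)

income_gap = abs(person_a[0] - person_b[0])  # 1,000 in income units
age_gap = abs(person_a[1] - person_b[1])     # 40 years

# Euclidean distance on the raw, unscaled features.
distance = math.dist(person_a, person_b)

# Fraction of the squared distance that comes from the income column.
income_share = income_gap ** 2 / distance ** 2

print(f"distance: {distance:.1f}")
print(f"share of squared distance due to income: {income_share:.1%}")
```

Even though a 40-year age gap is large and a 1,000 income gap is small relative to each column's range, income accounts for almost all of the distance here. Rescaling both columns to comparable ranges removes that imbalance.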

Types of scaling

Standard scaling

Standard scaling subtracts the feature's mean from each value and divides the result by the feature's standard deviation. This centers the feature on zero and gives it unit variance.
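
The transformation can be written as z = (x - mean) / std. Below is a minimal sketch in plain Python, assuming the population standard deviation is used (scikit-learn's StandardScaler makes the same choice). The helper name `standard_scale` and the sample income values are hypothetical, not part of the lesson.

```python
from statistics import fmean, pstdev


def standard_scale(values):
    """Return values rescaled to zero mean and unit variance."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation (assumption)
    if std == 0:
        # A constant column has no spread, so it cannot be standardized.
        raise ValueError("cannot standard-scale a constant feature")
    return [(v - mean) / std for v in values]


# Hypothetical income column, made up for illustration.
incomes = [100, 20_000, 50_000, 100_000]
scaled_incomes = standard_scale(incomes)

print([round(z, 3) for z in scaled_incomes])
print("mean:", round(fmean(scaled_incomes), 6))
print("std:", round(pstdev(scaled_incomes), 6))
```

After scaling, the column's mean is zero and its standard deviation is one (up to floating-point rounding), regardless of the original units. In practice, the mean and standard deviation are usually computed on the training data only and then reused to transform any new data.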