Numerical Features
Learn about processing numerical features.
Machine learning models accept numerical features, but feeding raw feature values directly to a model can be harmful and lead to bad predictions. In this section, we will learn about the pre-processing of numerical features.
Scaling
Scaling refers to bringing a feature into an acceptable range. Sometimes we get features that span a very wide range of values. Consider a scenario where we have to predict whether a person likes discounts on a toy item. We are given two features: age and total spending. Age ranges from 18 to 65, and total spending ranges from 30 to 20,000. If we use an algorithm that operates directly on the raw values of these features, it will be heavily biased toward total spending. Consider the following equation:
Discount_Score = P × AGE + Q × SPEND + R
We train the model, and it calculates the values of P, Q, and R as 2, 1, and 40, respectively. If the discount score is greater than 4000, the person will like the toy item discount.
E1: Now Age = 30, Spend = 4000 => Discount_Score = 30 × 2 + 4000 × 1 + 40 = 4100.
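The arithmetic in E1 can be checked with a short Python sketch. It uses the coefficients, threshold, and feature ranges stated above. The function names `discount_score` and `min_max_scale` are hypothetical, and min-max scaling is only one assumed way of bringing both features into a comparable range.

```python
# Coefficients and threshold taken from the example in the text.
P, Q, R = 2, 1, 40
THRESHOLD = 4000


def discount_score(age, spend):
    """Linear score from the text: P * age + Q * spend + R."""
    return P * age + Q * spend + R


def min_max_scale(value, low, high):
    """Map a value from [low, high] to [0, 1].

    Min-max scaling is an assumed choice here, used only for illustration.
    """
    return (value - low) / (high - low)


age, spend = 30, 4000
score = discount_score(age, spend)
print(score)              # 4100
print(score > THRESHOLD)  # True

# Share of the score that comes from each feature term.
age_share = P * age / score
spend_share = Q * spend / score
print(f"age term: {age_share:.1%}, spend term: {spend_share:.1%}")

# After scaling with the ranges from the text, both features lie in [0, 1].
scaled_age = min_max_scale(age, 18, 65)
scaled_spend = min_max_scale(spend, 30, 20000)
print(round(scaled_age, 3), round(scaled_spend, 3))
```

With the raw values, the spend term accounts for almost the entire score, so age has little influence on whether the threshold is crossed. After scaling, both inputs fall between 0 and 1 and neither dominates purely because of its units.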
...