In machine learning, feature scaling is a preprocessing step that transforms the values of different input features so that they are on a similar scale. The purpose is to bring the features to a common range so that no single feature dominates the others during the learning process.
Let's explore a real-world example of why feature scaling is an important concept in the machine-learning world. Imagine that we are working on a machine learning project to predict house prices. The features involved are dimensions in square feet, number of bedrooms, number of bathrooms, and neighborhood. To keep the problem simple, let's focus on just two features: square footage and number of bedrooms. The data table is shown below:
| House | Dimensions (square feet) | Bedrooms |
|-------|--------------------------|----------|
| A     | 15000                    | 4        |
| B     | 2000                     | 3        |
| C     | 3421                     | 4        |
Since this is a regression problem, let's assume, as a rough sketch, the following modeling equation:
price = w1 * Dimensions + w2 * Bedrooms + b
By looking at the values, we can see that `Dimensions` will dominate the model's learning process because its values are much larger than those of `Bedrooms`. This can cause problems during training. Thus, we need to bring the features to a common range so that no feature dominates another simply because of the magnitude of its values.
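To see the imbalance numerically, here is a small sketch. The weights `w1 = w2 = 1` are hypothetical, chosen only for illustration; with equal weights, the `Dimensions` term dwarfs the `Bedrooms` term:

```python
# Hypothetical equal weights, just to illustrate the imbalance
w1, w2, b = 1.0, 1.0, 0.0

# Values for house B from the table above
dimensions, bedrooms = 2000, 3

contribution_dims = w1 * dimensions  # 2000.0
contribution_beds = w2 * bedrooms    # 3.0
price = contribution_dims + contribution_beds + b

# Dimensions accounts for over 99% of the predicted value
print(contribution_dims, contribution_beds)
```

Unless the learned weight `w2` becomes enormous, the bedroom count barely influences the prediction, which is exactly the problem scaling addresses.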
Feature scaling can be performed using a number of techniques, some of which are listed below:
- Min-max normalization
- Standardization
- Log scaling
- Absolute maximum scaling
This feature scaling technique transforms a feature into a range from 0 to 1. The formula for this technique is:

x' = (x − min(x)) / (max(x) − min(x))

In the formula, `x` is the feature, and `min(x)` and `max(x)` return the minimum and maximum values of the feature, respectively.
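As a quick illustration, here is a minimal NumPy sketch of min-max normalization applied to the `Dimensions` column from the table above (the helper name `min_max_scale` is ours, not part of any library):

```python
import numpy as np

def min_max_scale(x):
    """Scale a 1-D feature into the [0, 1] range."""
    return (x - x.min()) / (x.max() - x.min())

dimensions = np.array([15000.0, 2000.0, 3421.0])
scaled = min_max_scale(dimensions)
print(scaled)  # the minimum maps to 0, the maximum maps to 1
```

After scaling, the smallest value (2000) becomes 0 and the largest (15000) becomes 1, with every other value falling proportionally in between.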
This technique transforms the features so that their distribution has a mean of 0 and a standard deviation of 1. The formula for the technique is:

x' = (x − x̄) / σ

In the formula, x̄ is the mean of the feature `x`, and σ is its standard deviation.
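Here is a minimal sketch of standardization applied to the `Bedrooms` column (the helper name `standardize` is ours, for illustration only):

```python
import numpy as np

def standardize(x):
    """Transform a 1-D feature to mean 0 and standard deviation 1."""
    return (x - x.mean()) / x.std()

bedrooms = np.array([4.0, 3.0, 4.0])
z = standardize(bedrooms)
print(z.mean(), z.std())  # approximately 0 and 1
```

The transformed values are centered on zero, so features measured on very different scales end up statistically comparable.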
The log scaling technique compresses a wide range of values into a small range by taking the logarithm of the feature values. The formula is given below:

x' = log(x)

In the formula, `x` is the feature value.
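A minimal sketch using NumPy's natural logarithm is shown below. Note that log is undefined for zero or negative values; variants such as `log1p` (which computes log(1 + x)) are commonly used when zeros may occur:

```python
import numpy as np

dimensions = np.array([15000.0, 2000.0, 3421.0])
log_scaled = np.log(dimensions)  # natural log compresses the wide range
print(log_scaled)
```

The raw values span 2000 to 15000, but their logs span only about 7.6 to 9.6, a much narrower range.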
The absolute maximum scaling technique scales the data by dividing every value of a feature by the feature's maximum absolute value. The formula for applying this technique is:

x' = x / max(|x|)

In the formula, `x` is the feature value.
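Here is a minimal sketch of absolute maximum scaling on the `Dimensions` column (the helper name `max_abs_scale` is ours, for illustration):

```python
import numpy as np

def max_abs_scale(x):
    """Divide every value by the maximum absolute value of the feature."""
    return x / np.max(np.abs(x))

dimensions = np.array([15000.0, 2000.0, 3421.0])
abs_scaled = max_abs_scale(dimensions)
print(abs_scaled)  # all values fall in the range [-1, 1]
```

Unlike min-max normalization, this technique preserves the sign of the data and does not shift it, so zeros stay at zero.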
Sklearn provides functions to perform feature scaling. In the coding example below, we take a simple 2-dimensional array and apply min-max scaling, standard scaling, log scaling, and absolute maximum scaling.
```python
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import FunctionTransformer
from sklearn.preprocessing import MaxAbsScaler
import numpy as np

# Sample data
data = np.array([[1500, 3], [2500, 4], [1800, 2], [2200, 3]])

# Each row is a sample: [square footage, bedrooms];
# each column is scaled independently.

# Initialize the scaling techniques
minmax_scaler = MinMaxScaler()
standard_scaler = StandardScaler()
log_transform = FunctionTransformer(np.log1p, validate=True)
maxabs_scaler = MaxAbsScaler()

# Fit and transform the data
scaled_data1 = minmax_scaler.fit_transform(data)
scaled_data2 = standard_scaler.fit_transform(data)
scaled_data3 = log_transform.fit_transform(data)
scaled_data4 = maxabs_scaler.fit_transform(data)

# Display the original data and the result of
# each scaling technique
print("Original data:\n", data)
print("\nScaled data (Min-Max Scaling):\n", scaled_data1)
print("\nScaled data (Z-Score Scaling):\n", scaled_data2)
print("\nScaled data (Log Scaling):\n", scaled_data3)
print("\nScaled data (Max Abs Scaling):\n", scaled_data4)
```
Lines 1–4: We import `StandardScaler`, `MinMaxScaler`, `FunctionTransformer`, and `MaxAbsScaler` from `sklearn`'s preprocessing module.
Line 8: We create a 2-dimensional NumPy array and fill it with dummy data on which we will apply scaling.
Lines 14–17: We create an instance of each scaler and store it in its own variable.
Lines 20–23: Using the `fit_transform()` method provided by each instance, we transform `data`.
Lines 27–31: We display the result of the transformation applied by each scaling technique.
Feature scaling is a data pre-processing step that acts as a bridge between the raw data and a successful machine learning model. By utilizing the appropriate scaling technique and integrating it into our preprocessing pipeline, we can empower our models to better understand and learn from our data, ultimately leading to accurate and valuable predictions.
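As a closing sketch, one common way to integrate scaling into a preprocessing pipeline is sklearn's `Pipeline`, which fits the scaler on the training data and applies it automatically before the model sees the features. The `LinearRegression` model and the price values below are assumptions for illustration:

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression

# Dummy house data: [square footage, bedrooms] and hypothetical prices
X = np.array([[1500, 3], [2500, 4], [1800, 2], [2200, 3]])
y = np.array([300_000, 450_000, 320_000, 400_000])

# Scaling happens inside the pipeline, so it is always
# applied consistently at both fit and predict time
model = Pipeline([
    ("scale", StandardScaler()),
    ("regress", LinearRegression()),
])
model.fit(X, y)
print(model.predict(X))
```

Bundling the scaler with the model this way also prevents a common mistake: fitting the scaler on the test data, which would leak information into evaluation.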