What is feature scaling?

In machine learning, feature scaling is a preprocessing step that ensures that the values of different input features are transformed so that they are on a similar scale. The purpose is to bring the features to a common range so that no feature dominates the other in the learning process.

Real-world example

Let's explore a real-world example of why feature scaling matters in machine learning. Imagine that we are working on a machine learning project to predict house prices. The features involved are dimensions in square feet, number of bedrooms, number of bathrooms, and neighborhood. To keep the problem simple, let's focus on just two features: square footage and number of bedrooms. The data table is shown below:

House    Dimensions (square feet)    Bedrooms
A        15000                       4
B        2000                        3
C        3421                        4

Because this is a regression problem, let's assume, as a rough sketch, that the model has the following linear form:

price = w1 * Dimensions + w2 * Bedrooms + b

Looking at the values, we can see that Dimensions will dominate the model's learning process because its values are several orders of magnitude larger than those of Bedrooms. This can cause problems during training. We therefore need to bring the features to a common range so that no feature dominates another simply because of its numeric magnitude.
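To make the imbalance concrete, here is a minimal sketch that plugs the table values into the modeling equation. The weights w1 = w2 = 1 and the bias b = 0 are assumptions chosen purely for illustration; they are not learned values.

```python
# Feature values from the table above: (square feet, bedrooms)
houses = {"A": (15000, 4), "B": (2000, 3), "C": (3421, 4)}

# Assumption for illustration only: equal weights and no bias
w1, w2, b = 1.0, 1.0, 0.0

shares = {}
for name, (dimensions, bedrooms) in houses.items():
    dimensions_term = w1 * dimensions
    bedrooms_term = w2 * bedrooms
    total = dimensions_term + bedrooms_term + b
    # Fraction of the unscaled prediction that comes from Dimensions
    shares[name] = dimensions_term / total
    print(f"House {name}: Dimensions contributes {shares[name]:.2%}")
```

With unscaled inputs, the Dimensions term makes up more than 99% of the sum for every house, so the Bedrooms term barely registers.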

Feature scaling techniques

Feature scaling can be performed using a number of techniques, some of which are listed below:

  • Min-max normalization

  • Z-score scaling (standardization)

  • Log scaling

  • Absolute maximum scaling

Min-max normalization

This feature scaling technique transforms a feature into the range from 0 to 1. The formula for this technique is:

x' = (x - min(x)) / (max(x) - min(x))

In the formula, x is a feature value, x' is its scaled value, and min(x) and max(x) return the minimum and maximum values of the feature, respectively.
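As a minimal sketch in plain Python, min-max normalization subtracts the feature's minimum from each value and divides by the feature's range. The square footage values come from the house table; the variable names are only illustrative.

```python
# Square footage of houses A, B, and C from the example table
square_feet = [15000, 2000, 3421]

lowest = min(square_feet)
highest = max(square_feet)

# x' = (x - min(x)) / (max(x) - min(x))
minmax_scaled = [(x - lowest) / (highest - lowest) for x in square_feet]
print(minmax_scaled)
```

The largest value maps to 1, the smallest to 0, and everything else falls in between. Note that the division fails if all values are identical, because the range is then zero.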

Z-score scaling (standardization)

This technique transforms a feature so that it has a mean of 0 and a standard deviation of 1. The formula for the technique is:

z = (x - μ) / σ

In the formula, μ is the mean of the feature x, and σ is its standard deviation.
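The following sketch standardizes the square footage column using only Python's standard library. It uses the population standard deviation, which is also what sklearn's StandardScaler computes; the variable names are illustrative.

```python
import statistics

# Square footage of houses A, B, and C from the example table
square_feet = [15000, 2000, 3421]

mean = statistics.fmean(square_feet)
std = statistics.pstdev(square_feet)  # population standard deviation

# z = (x - mean) / std
standardized = [(x - mean) / std for x in square_feet]
print(standardized)
```

After the transformation, the values have a mean of 0 and a standard deviation of 1, up to floating-point rounding.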

Log scaling

The log scaling technique compresses a wide range of values into a narrower one by taking the logarithm of the feature values. The formula is given below:

x' = log(x)

In the formula, x is the feature value, which must be positive.
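Here is a small sketch of log scaling on the square footage column, assuming the natural logarithm. It also computes log(1 + x), the log1p variant, which remains defined when a feature contains zeros.

```python
import math

# Square footage of houses A, B, and C from the example table
square_feet = [15000, 2000, 3421]

# x' = log(x), valid only for positive values
log_scaled = [math.log(x) for x in square_feet]

# log(1 + x), which also handles x = 0
log1p_scaled = [math.log1p(x) for x in square_feet]

print(log_scaled)
print(log1p_scaled)
```

The original values span 13,000 square feet, while their logarithms differ by only about 2.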

Absolute maximum scaling

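The absolute maximum scaling technique scales the data by dividing every value of a feature by the maximum absolute value of that feature. The formula for applying this technique is:

x' = x / max(|x|)

In the formula, x is the feature value, and max(|x|) is the largest absolute value of the feature. The scaled values lie in the range from -1 to 1.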
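The sketch below applies absolute maximum scaling to a hypothetical temperature feature. This feature is not part of the house example; it was chosen because it contains a negative value, which shows why the absolute value matters.

```python
# Hypothetical feature containing a negative value (illustration only)
temperatures = [-10.0, 5.0, 20.0]

# Largest magnitude in the feature, ignoring sign
max_abs = max(abs(x) for x in temperatures)

# x' = x / max(|x|)
maxabs_scaled = [x / max_abs for x in temperatures]
print(maxabs_scaled)  # [-0.5, 0.25, 1.0]
```

The signs are preserved, and every scaled value lies between -1 and 1.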

Coding example

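The scikit-learn (sklearn) library provides classes to perform feature scaling. In the coding example below, we take a simple 2-dimensional array and apply standard scaling, log scaling, absolute maximum scaling, and min-max scaling. For log scaling, the example wraps NumPy's log1p function, which computes log(1 + x), in a FunctionTransformer.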

from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import FunctionTransformer
from sklearn.preprocessing import MaxAbsScaler
import numpy as np

# Sample data
data = np.array([[1500, 3],
                 [2500, 4],
                 [1800, 2],
                 [2200, 3]])

# Initialize the scaling techniques
minmax_scaler = MinMaxScaler()
standard_scaler = StandardScaler()
log_transform = FunctionTransformer(np.log1p, validate=True)
maxabs_scaler = MaxAbsScaler()

# Fit and transform the data
scaled_data1 = minmax_scaler.fit_transform(data)
scaled_data2 = standard_scaler.fit_transform(data)
scaled_data3 = log_transform.transform(data)
scaled_data4 = maxabs_scaler.fit_transform(data)

# Display the results

print("Original data:\n", data)
print("\nScaled data (Min-Max Scaling):\n", scaled_data1)
print("\nScaled data (Z-Score Scaling):\n", scaled_data2)
print("\nScaled data (Log Scaling):\n", scaled_data3)
print("\nScaled data (Max Abs Scaling):\n", scaled_data4)
Coding example to implement feature scaling using sklearn

Code explanation

  • Lines 1–4: We import the StandardScaler, MinMaxScaler, FunctionTransformer, and MaxAbsScaler from sklearn's preprocessing module.

  • Lines 8–11: We create a 2-dimensional NumPy array and fill it with dummy data on which we will apply scaling.

  • Lines 14–17: We create an instance of each scaling technique and store each one in its own variable.

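  • Lines 20–23: We transform the data using the fit_transform() method of each scaler. For the log transformation, we call transform() directly because FunctionTransformer is stateless and has nothing to learn from the data.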

  • Lines 27–31: We display the result of the transformation applied by each scaling technique.

Wrapping up

Feature scaling is a data preprocessing step that prepares raw data for model training. By choosing an appropriate scaling technique and integrating it into our preprocessing pipeline, we can keep any single feature from dominating the others because of its magnitude, which helps our models learn from the data and can lead to more accurate predictions.
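One possible way to integrate scaling into a preprocessing pipeline is sklearn's make_pipeline, sketched below. The prices are made-up values for illustration only, and the choice of StandardScaler with LinearRegression is an assumption rather than a recommendation for every problem.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Sample features: [square feet, bedrooms]
X_train = np.array([[1500, 3],
                    [2500, 4],
                    [1800, 2],
                    [2200, 3]])

# Hypothetical prices, invented purely for this sketch
y_train = np.array([300_000, 480_000, 330_000, 420_000])

# The pipeline fits the scaler on the training data,
# then trains the regressor on the scaled features
model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(X_train, y_train)

# New data is scaled with the statistics learned during fit
X_new = np.array([[2000, 3]])
predicted_price = model.predict(X_new)
print(predicted_price)
```

Bundling the scaler with the model means the same scaling parameters learned from the training data are reused on new data automatically.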

Copyright ©2025 Educative, Inc. All rights reserved