Ordinal encoding is a technique used to convert categorical data into numerical values.
Consider a scenario where the categorical variable represents colors such as red, green, and blue. These categories can be mapped to numerical values like 1, 2, and 3 using ordinal encoding.
| Colors | Encoded colors |
| --- | --- |
| Red | 1 |
| Green | 2 |
| Blue | 3 |
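The mapping in the table can be sketched by hand before involving any encoder class. The snippet below is a minimal illustration, assuming pandas is installed; the `color_codes` dictionary, the `df_manual` DataFrame, and the extra `Green` row are illustrative choices rather than part of the original example.

```python
import pandas as pd

# Hypothetical hand-written mapping that mirrors the table above
color_codes = {"Red": 1, "Green": 2, "Blue": 3}

# Small sample with one repeated category to show the lookup is per row
df_manual = pd.DataFrame({"Colors": ["Red", "Green", "Blue", "Green"]})

# Series.map looks up each category in the dictionary
df_manual["Encoded colors"] = df_manual["Colors"].map(color_codes)

print(df_manual)
```

A hand-written dictionary like this gives full control over which number each category receives, at the cost of having to maintain the mapping manually.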
To perform ordinal encoding in Python, we typically follow the steps below.
The first step is to install the scikit-learn library, which provides the OrdinalEncoder class:
pip install -U scikit-learn
The -U flag is used to upgrade a package to the latest version available.
The next step is to import the required libraries.
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder
In this step, we create a simple DataFrame, as shown below. We can also load our own dataset instead.
colors = {'Colors': ['Red', 'Green', 'Blue']}
df = pd.DataFrame(colors)
We then initialize an instance of the OrdinalEncoder class and store it in the encoder variable as follows:
encoder = OrdinalEncoder()
In this step, we pass the Colors column to the fit_transform method to perform ordinal encoding, as shown below:
df['Colors_Encoded'] = encoder.fit_transform(df[['Colors']])
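To see which integer each category actually received, the fitted encoder can be inspected. The sketch below assumes the encoder's default settings, under which scikit-learn sorts the categories it finds and assigns zero-based codes, so the result differs from the 1, 2, 3 values in the conceptual table earlier. The names `demo_df`, `demo_encoder`, `encoded`, and `learned` are illustrative.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

demo_df = pd.DataFrame({"Colors": ["Red", "Green", "Blue"]})
demo_encoder = OrdinalEncoder()

# fit_transform returns a 2-D float array with one column per input column
encoded = demo_encoder.fit_transform(demo_df[["Colors"]])

# categories_ holds the learned categories for each column, in code order
learned = [str(category) for category in demo_encoder.categories_[0]]

print(learned)          # ['Blue', 'Green', 'Red']
print(encoded.ravel())  # [2. 1. 0.]
```

With the defaults, Blue maps to 0, Green to 1, and Red to 2, because the categories are ordered alphabetically rather than by their order of appearance.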
Note: The OrdinalEncoder class can encode multiple columns simultaneously.
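Encoding several columns at once might look like the following sketch. The `Sizes` column and the names `multi_df`, `multi_encoder`, and `result` are hypothetical additions for illustration and do not appear in the original example.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical two-column dataset; 'Sizes' is not part of the original example
multi_df = pd.DataFrame({
    "Colors": ["Red", "Green", "Blue"],
    "Sizes": ["Small", "Large", "Medium"],
})

multi_encoder = OrdinalEncoder()
columns = ["Colors", "Sizes"]

# Each column is encoded independently with its own set of categories
encoded_values = multi_encoder.fit_transform(multi_df[columns])

# Wrap the resulting array in a DataFrame so the codes keep readable names
encoded_df = pd.DataFrame(
    encoded_values,
    columns=[f"{name}_Encoded" for name in columns],
    index=multi_df.index,
)
result = pd.concat([multi_df, encoded_df], axis=1)
print(result)
```

Building a separate DataFrame for the encoded values avoids overwriting the original columns, which keeps the labels available for comparison.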
The following code shows how we can use the OrdinalEncoder class in Python:
# Import necessary libraries
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Create a sample DataFrame
colors = {'Colors': ['Red', 'Green', 'Blue']}
df = pd.DataFrame(colors)

# Print the original DataFrame
print("Original DataFrame Before Ordinal Encoding:")
print(df)

# Initialize the OrdinalEncoder
encoder = OrdinalEncoder()

# Fit and transform the 'Colors' column using ordinal encoding
df['Colors_Encoded'] = encoder.fit_transform(df[['Colors']])

# Display the DataFrame with the encoded column
print("\nDataFrame after Ordinal Encoding:")
print(df)
Lines 2–3: We import the required libraries, including pandas for data manipulation and the OrdinalEncoder class from the scikit-learn library for ordinal encoding.
Line 6: We create a sample DataFrame (df) with a categorical column named Colors.
Line 14: We initialize the OrdinalEncoder class.
Line 17: We fit the encoder on the Colors column and transform it using ordinal encoding. The transformed values are stored in a new column named Colors_Encoded.
Lines 20–21: We display the DataFrame after applying ordinal encoding to observe the changes.
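The walkthrough above relies on the encoder's default ordering. If a specific order is wanted, such as Red, Green, Blue from the earlier table, one option is to pass the order through the `categories` parameter. The sketch below assumes that approach; the names `ordered_df`, `ordered_encoder`, `codes`, and `restored` are illustrative, and adding 1 to the codes is only there to match the table's 1, 2, 3 labeling.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

ordered_df = pd.DataFrame({"Colors": ["Red", "Green", "Blue"]})

# One list of categories per encoded column, in the desired order
ordered_encoder = OrdinalEncoder(categories=[["Red", "Green", "Blue"]])

codes = ordered_encoder.fit_transform(ordered_df[["Colors"]])

# Codes are zero-based, so add 1 to match a 1, 2, 3 labeling
ordered_df["Colors_Encoded"] = codes.ravel().astype(int) + 1

# inverse_transform maps the zero-based codes back to the original labels
restored = ordered_encoder.inverse_transform(codes).ravel()

print(ordered_df)
print(restored)
```

Specifying the order explicitly matters most when the categories have a natural ranking, since the default alphabetical order rarely matches it.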