One-hot encoding is the representation of categorical variables as binary vectors. In Python, there are several ways to perform one-hot encoding on categorical data:
manual conversion
using scikit-learn
using Keras
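As a point of comparison, the first option (manual conversion) can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: the helper name one_hot_manual and the variable index_of are hypothetical, and total_colors is assumed to list every possible category.

```python
# Manual one-hot encoding with plain Python (no external libraries).
colors = ["red", "green", "yellow", "red", "blue"]

# Assumption: this list contains every category that can occur.
total_colors = ["red", "green", "blue", "black", "yellow"]

# Map each category to its position in the full list.
index_of = {color: i for i, color in enumerate(total_colors)}


def one_hot_manual(values, vocabulary_index):
    """Hypothetical helper: one binary vector per value, with a single 1."""
    width = len(vocabulary_index)
    encoded = []
    for value in values:
        vector = [0] * width
        vector[vocabulary_index[value]] = 1  # mark the category's slot
        encoded.append(vector)
    return encoded


manual_encoding = one_hot_manual(colors, index_of)
for color, vector in zip(colors, manual_encoding):
    print(color, vector)
```

Each vector has as many positions as there are categories, and exactly one position is set to 1.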
Let’s have a look at how one-hot encoding can be performed in Keras.
The Keras API provides a to_categorical() utility function that one-hot encodes integer class labels. When num_classes is not given, the function infers the number of classes as the largest label in the data plus one. If the data contains the highest class index, to_categorical() can therefore be called directly; otherwise, pass the total number of classes through the num_classes parameter so that every vector has the correct length.
The code snippet below illustrates the usage of the to_categorical() function:
    from keras.utils import to_categorical

    # Categorical data to be converted to numeric data
    colors = ["red", "green", "yellow", "red", "blue"]

    # Universal list of colors
    total_colors = ["red", "green", "blue", "black", "yellow"]

    # Map each color to an integer
    mapping = {color: index for index, color in enumerate(total_colors)}

    # Integer representation of the data (the original list is left unchanged)
    color_ids = [mapping[color] for color in colors]

    # Pass num_classes so the vector length always matches the full color list
    one_hot_encode = to_categorical(color_ids, num_classes=len(total_colors))
    print(one_hot_encode)
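In the snippet above, the sample data happens to contain the highest index ("yellow", 4), so the inferred width is already 5. The sketch below shows what changes when that is not the case. Note that one_hot_sketch is a hypothetical pure-Python stand-in written for illustration, not Keras code; it only mimics the rule that the class count defaults to the largest label plus one.

```python
def one_hot_sketch(labels, num_classes=None):
    """Hypothetical stand-in that mimics the class-count inference rule."""
    if num_classes is None:
        # Assumed default: width is the largest label seen plus one.
        num_classes = max(labels) + 1
    return [
        [1.0 if i == label else 0.0 for i in range(num_classes)]
        for label in labels
    ]


# Labels drawn from five classes, but indices 3 and 4 never occur.
labels = [0, 1, 0, 2]

inferred = one_hot_sketch(labels)                 # width guessed from data
explicit = one_hot_sketch(labels, num_classes=5)  # width stated up front

print(len(inferred[0]))
print(len(explicit[0]))
```

With the width inferred from the data, the vectors are too short for the full set of classes, which is the situation the num_classes parameter is meant to avoid.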