Feature Columns

Learn about feature columns and how they're used to extract data features.

Chapter Goals:

  • Learn about feature columns and how they’re used
  • Implement a function that creates a list of feature columns

A. Overview

Before we get into using a dataset of parsed protocol buffers, we need to first discuss feature columns. In TensorFlow, a feature column is how we specify what kind of data a feature contains. In this chapter, we’ll focus on the two most common types of feature data: numeric and categorical data.

Feature columns are how we convert raw data into an input layer for a machine learning model. Once we have a list of feature columns, we can use them to combine tf.Tensor and tf.SparseTensor feature data into a single input layer. We’ll discuss this in more detail in the next chapter.

B. Numeric features

For numeric features, we create a feature column using tf.feature_column.numeric_column. The function takes the feature name as its only required argument.

import tensorflow as tf

# Numeric feature column for a 1-D, 5-element float feature
nc = tf.feature_column.numeric_column(
    'GPA', shape=5, dtype=tf.float32)
print(nc)

In the example above, nc represents a numeric feature column for the feature called 'GPA'. We used the shape keyword argument to specify that the feature must be 1-D and contain 5 elements. We also set the feature’s datatype to tf.float32.
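
To check how these arguments are stored, we can inspect the fields of the returned column. This is a minimal sketch, assuming a TensorFlow version that still ships the tf.feature_column module (it is deprecated in recent releases).

```python
import tensorflow as tf

nc = tf.feature_column.numeric_column(
    'GPA', shape=5, dtype=tf.float32)

# The integer shape is stored as a 1-D shape tuple.
print(nc.key)    # GPA
print(nc.shape)  # (5,)
print(nc.dtype == tf.float32)  # True
```

Passing shape=5 is therefore shorthand for a 1-D shape with 5 elements.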

Other less commonly used keyword arguments for the function are default_value and normalizer_fn.
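
As a rough sketch of how these two arguments are passed, the example below sets both. The helper scale_gpa, the 4.0 scale it assumes, and the fill value 0.0 are hypothetical, chosen only for illustration.

```python
import tensorflow as tf

def scale_gpa(x):
    # Hypothetical normalizer: assumes GPA is on a 0.0-4.0 scale.
    return x / 4.0

nc = tf.feature_column.numeric_column(
    'GPA',
    shape=5,
    dtype=tf.float32,
    default_value=0.0,        # assumed fill value, for illustration only
    normalizer_fn=scale_gpa)  # stored on the column, not called here

print(nc.normalizer_fn is scale_gpa)  # True
```

Creating the column only records the function. It is applied later, when the column is used to transform feature data.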

The default_value keyword argument sets the default value for the feature column if the ...