Categorical Manipulation

Explore the handling of categorical data in pandas.

We'll cover the following...

Categorical data
Frequency counts
Benefits of categories
Conversion to ordinal categories
The cat accessor
Category gotchas
Generalization
Summary

So far, we have dealt with numeric and date data. Another common form is textual data, and categorical data is a subset of it: text whose values repeat across rows.

Categorical data

Categories are labels that describe data. Values are often repeated. When the values have an intrinsic order, such as the shirt sizes small, medium, and large, they are called ordinal values. Unordered values, such as colors, are called nominal values. We can also convert numerical data to categories by binning it.
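
The distinction between ordinal and nominal values, and the idea of binning, can be sketched in pandas roughly as follows. The shirt sizes, colors, and mileage numbers are made-up illustration data, and the bin edges and labels are arbitrary assumptions rather than anything taken from the fuel economy dataset.

```python
import pandas as pd

# Ordinal: shirt sizes have an intrinsic order, so we declare it explicitly.
sizes = pd.Series(["small", "large", "medium", "small", "large"])
size_type = pd.CategoricalDtype(
    categories=["small", "medium", "large"], ordered=True
)
ordinal = sizes.astype(size_type)

# Ordered categories sort and compare by the declared order, not alphabetically.
print(ordinal.sort_values().tolist())
print(ordinal.max())

# Nominal: colors have no order, so a plain category dtype is enough.
colors = pd.Series(["red", "blue", "red", "green"], dtype="category")
print(colors.cat.ordered)

# Binning: pd.cut turns numbers into labeled (and ordered) categories.
# The edges 0, 15, 30, 50 and the labels are illustrative assumptions.
mpg = pd.Series([12, 18, 25, 33, 41])
binned = pd.cut(mpg, bins=[0, 15, 30, 50], labels=["low", "mid", "high"])
print(binned.tolist())
```

Declaring the order up front is what lets operations such as sorting, `min`, and `max` follow small, medium, large instead of alphabetical order.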

We’ll start by looking at the categorical values found in the fuel economy dataset. The make column holds categorical information.
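
As a rough illustration of what makes such a column categorical, the sketch below uses a small hand-written Series as a stand-in for the make column. The values are hypothetical and are not read from the actual fuel economy dataset.

```python
import pandas as pd

# Hypothetical stand-in for the make column of the fuel economy dataset.
make = pd.Series(
    ["Ford", "Toyota", "Ford", "Honda", "Toyota", "Ford"], name="make"
)

# Many rows but few distinct labels: the hallmark of categorical data.
print(len(make), make.nunique())

# Converting to the category dtype stores each distinct label once.
cat_make = make.astype("category")
print(cat_make.cat.categories.tolist())
```

A column with far fewer distinct values than rows is usually a good candidate for the category dtype.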
