Indexing Operations

Learn how to manipulate indexes of a Series or a DataFrame.

Indexing is an overloaded term in the pandas world. Both a Series and a DataFrame have an index (the labels down the left side, one per row). Both types also support the Python indexing operator ([]). On top of that, both have attributes (loc and iloc) that we can index against with that same operator.

Prepping the data and renaming the index

To help explain the various operations, we’ll take the automobile mileage Series holding the city miles per gallon values and use each car’s make as the index. We do this because some operations work on index position while others work on index label. If both are integers, it is easy to confuse the two. String labels make the difference obvious.
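To see why integer labels can be confusing, here is a minimal sketch. The mileage numbers and the makes are made up for illustration and are not taken from the lesson's dataset. The first Series has integer labels that don't match the row positions, so loc and iloc return different values for the same 0:

```python
import pandas as pd

# Hypothetical mileage values; the integer labels deliberately differ
# from the row positions.
mpg = pd.Series([19, 9, 23], index=[2, 0, 1])

by_label = mpg.loc[0]      # looks up the row whose label is 0
by_position = mpg.iloc[0]  # looks up the first row

print(by_label, by_position)

# The same values with (made-up) string labels: label lookups and
# position lookups can no longer be mistaken for each other.
mpg_named = pd.Series([19, 9, 23], index=["Alfa Romeo", "Ferrari", "Dodge"])
print(mpg_named.loc["Ferrari"], mpg_named.iloc[0])
```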

We’ll use the rename method to change the index labels. We can pass in a dictionary to map the previous index label to the new label:

city2 = city_mpg.rename(make.to_dict())
print(city2)

The rename method returns a new Series with the original values but new index labels. The original Series is not modified. If we pass in a scalar value instead of a mapping, rename leaves the index intact and changes the name attribute of the Series it returns.
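If the mileage data isn't loaded, we can try the same call on a small stand-in. In this sketch, city_mpg and make are tiny hypothetical Series (three made-up rows and an assumed name of city08), not the real dataset:

```python
import pandas as pd

# Hypothetical stand-ins for the lesson's city_mpg and make Series.
city_mpg = pd.Series([19, 9, 23], name="city08")
make = pd.Series(["Alfa Romeo", "Ferrari", "Dodge"], name="make")

# to_dict maps each old index label to a make:
# {0: 'Alfa Romeo', 1: 'Ferrari', 2: 'Dodge'}
mapping = make.to_dict()

city2 = city_mpg.rename(mapping)
print(city2)

# rename returned a new Series; the original keeps its integer index.
print(city_mpg.index)
```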

To view the index, we can access the index attribute:

print(city2.index)

Besides a dictionary, the rename method accepts a Series, a function that takes an old label and returns a new label, or a scalar (a hashable sequence such as a tuple is treated like a scalar). When we pass in a Series whose index matches the index of the Series being renamed, the values of the Series that we passed in become the new index labels:

city2 = city_mpg.rename(make)
print(city2)
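The Series and function forms can be sketched side by side. As before, city_mpg and make here are small made-up stand-ins, and str.upper is just one convenient example of a function that maps an old label to a new one:

```python
import pandas as pd

# Hypothetical stand-ins; both share the default integer index 0, 1, 2.
city_mpg = pd.Series([19, 9, 23])
make = pd.Series(["Alfa Romeo", "Ferrari", "Dodge"])

# Passing a Series: each old label is looked up in make's index and
# replaced by the corresponding value.
by_series = city_mpg.rename(make)

# Passing a function: it is called on every label.
by_function = by_series.rename(str.upper)

print(by_series.index)
print(by_function.index)
```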

Careful, though! If we pass a scalar value (a single string) into rename, the index will stay the same, but the name attribute of the Series will update:

print(city2.rename('citympg'))
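A quick way to check this behavior is to compare the index and the name before and after the call. The Series below is a made-up stand-in for city2, and the starting name city08 is an assumption:

```python
import pandas as pd

# Hypothetical stand-in for city2.
city2 = pd.Series(
    [19, 9, 23],
    index=["Alfa Romeo", "Ferrari", "Dodge"],
    name="city08",
)

renamed = city2.rename("citympg")

print(renamed.name)                       # the new name
print(renamed.index.equals(city2.index))  # the labels are untouched
print(city2.name)                         # the original is unchanged
```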

Resetting the index

Sometimes we need a unique index to perform an operation. If we want to replace the index with unique, monotonically increasing integers that start at zero, we can use the reset_index method. By default, this method returns a DataFrame and moves the current index into a new column:

print(city2.reset_index())
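Here is a small sketch of the default behavior, using a made-up stand-in for city2 with a repeated make so that the index is not unique. Because this index has no name, pandas calls the new column index. The value column takes the Series name, assumed here to be city08:

```python
import pandas as pd

# Hypothetical stand-in for city2 with a duplicated label.
city2 = pd.Series(
    [19, 9, 19],
    index=["Alfa Romeo", "Ferrari", "Alfa Romeo"],
    name="city08",
)
print(city2.index.is_unique)

df = city2.reset_index()
print(type(df).__name__)  # a DataFrame, not a Series
print(df.columns.tolist())
print(df.index.is_unique)
```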

To drop the current index ...