Indexing Operations
Learn how to manipulate indexes of a Series or a DataFrame.
Indexing is an overloaded term in the pandas world. Both a Series and a DataFrame have an index (the labels down the left side for each row). In addition, both types support the Python indexing operator ([]). But that’s not all. They both have attributes (loc and iloc) that we can index against (using the Python indexing operator).
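As a minimal sketch of these three access styles, the snippet below builds a small Series whose name (mpg), labels, and values are made up for illustration and are not the lesson's data. It reads the same row through the indexing operator, through loc, and through iloc.

```python
import pandas as pd

# Hypothetical stand-in data; names and values are invented for illustration.
mpg = pd.Series([19, 9, 23], index=["Alfa Romeo", "Ferrari", "Dodge"], name="city08")

by_operator = mpg["Ferrari"]      # Python indexing operator with a label
by_loc = mpg.loc["Ferrari"]       # loc looks rows up by index label
by_iloc = mpg.iloc[1]             # iloc looks rows up by integer position
row_labels = list(mpg.index)      # the index holds the row labels

print(by_operator, by_loc, by_iloc)
print(row_labels)
```

All three lookups reach the same row here because "Ferrari" is the label at position 1.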
Prepping the data and renaming the index
To help explain the various operations, we’ll take the automobile mileage data Series with the city miles per gallon values and insert each car’s make as the index. This is because many operations work on the index position while others work on the index label. If these are both integer values, it can be a little confusing, but it becomes clearer if the index has string labels.
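To see why integer labels can be confusing, consider this small sketch. The values and labels are hypothetical, and the integer index deliberately starts at 1 so that label and position disagree.

```python
import pandas as pd

# Hypothetical values; the integer labels deliberately do not start at zero.
int_labeled = pd.Series([19, 9, 23], index=[1, 2, 3])

by_label = int_labeled.loc[1]       # label 1 is the first row
by_position = int_labeled.iloc[1]   # position 1 is the second row

# With string labels, the two kinds of lookup cannot be mistaken for each other.
str_labeled = pd.Series([19, 9, 23], index=["Alfa Romeo", "Ferrari", "Dodge"])
str_by_label = str_labeled.loc["Ferrari"]
str_by_position = str_labeled.iloc[1]

print(by_label, by_position)
print(str_by_label, str_by_position)
```

The same number, 1, returns different rows depending on whether it is read as a label or as a position, which is the ambiguity that string labels remove.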
We’ll use the rename method to change the index labels. We can pass in a dictionary to map the previous index label to the new label:
city2 = city_mpg.rename(make.to_dict())
print(city2)
The rename method returns a new Series with the original values but new index labels. If we pass in a scalar value instead, the returned Series gets a new name attribute and its index is left intact.
To view the index, we can access the index attribute:
print(city2.index)
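The lesson's data is not shown here, so the following sketch uses tiny stand-ins for city_mpg and make (three invented rows) to show the dictionary-based rename end to end. The variable mapping is introduced only for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for city_mpg and make; the real data has many more rows.
city_mpg = pd.Series([19, 9, 23], name="city08")
make = pd.Series(["Alfa Romeo", "Ferrari", "Dodge"], name="make")

mapping = make.to_dict()           # maps old label (0, 1, 2) to the car's make
city2 = city_mpg.rename(mapping)   # returns a new Series; city_mpg is unchanged

print(city2.index)
print(city_mpg.index)
```

The original Series keeps its integer index, which is why the result is assigned to a new variable.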
The rename method also accepts a Series, a scalar, a function that takes an old label and returns a new label, or a sequence. When we pass in a Series whose index values match those of the original, the values from the Series that we passed in are used as the new index labels:
city2 = city_mpg.rename(make)
print(city2)
Careful, though! If we pass a scalar value (a single string) into rename, the index will stay the same, but the name attribute of the Series will update:
print(city2.rename('citympg'))
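The contrast between a function argument and a scalar argument can be sketched as follows. The Series below is a hypothetical three-row stand-in for city2, and str.upper is just one convenient example of a label-transforming function.

```python
import pandas as pd

# Hypothetical stand-in for city2 with invented values.
city2 = pd.Series([19, 9, 23], index=["Alfa Romeo", "Ferrari", "Dodge"], name="city08")

upper = city2.rename(str.upper)      # function: applied to every index label
renamed = city2.rename("citympg")    # scalar: only the name attribute changes

print(upper.index)
print(renamed.name, list(renamed.index))
```

In both cases city2 itself is untouched, since rename returns a new Series.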
Resetting the index
Sometimes we need a unique index to perform an operation. If we want to replace the index with unique, monotonically increasing integers that start at zero, we can use the reset_index method. By default, this method will return a DataFrame, moving the current index into a new column:
print(city2.reset_index())
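A small sketch of the default behavior, again using a hypothetical three-row stand-in for city2. Because the index in this stand-in has no name, pandas calls the new column index; the values column takes the name of the Series.

```python
import pandas as pd

# Hypothetical stand-in for city2 with invented values.
city2 = pd.Series([19, 9, 23], index=["Alfa Romeo", "Ferrari", "Dodge"], name="city08")

frame = city2.reset_index()          # the old index becomes an ordinary column

result_type = type(frame).__name__
columns = list(frame.columns)
new_index = list(frame.index)

print(result_type)
print(columns)
print(new_index)
```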
To drop the current index ...