Conversion Methods

Learn how to perform data type conversions on a Series.

Sometimes we’ll need to change the type of the data. This might be because the source format doesn’t include type information, or because a different type gives us more manipulation options or uses less memory.

Automatic conversion

Pandas 1.0 introduced a new conversion method, convert_dtypes. It tries to convert a Series to a type that supports pd.NA. In the case of our city_mpg Series, it changes the type from int64 to Int64:

print(city_mpg.convert_dtypes())
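
As a minimal, self-contained sketch of this behavior, the snippet below uses a small made-up Series (city_mpg_demo, a hypothetical stand-in for city_mpg with invented values) and compares the dtype before and after the call. The second Series, with_missing, is also an assumption added for illustration; it shows how a float Series that only holds whole numbers and a missing value can end up as a nullable integer.

```python
import pandas as pd

# Hypothetical stand-in for city_mpg; the values are invented for illustration.
city_mpg_demo = pd.Series([19, 9, 23, 150], dtype="int64", name="city_mpg_demo")

converted = city_mpg_demo.convert_dtypes()
print(city_mpg_demo.dtype)  # int64 (NumPy-backed, cannot hold pd.NA)
print(converted.dtype)      # Int64 (nullable extension type)

# A float Series with whole-number values and one missing entry.
with_missing = pd.Series([19.0, None, 23.0])
converted_missing = with_missing.convert_dtypes()
print(converted_missing.dtype)  # Int64, with the gap stored as pd.NA
print(converted_missing)
```

Note the capital "I" in Int64: it names the pandas nullable integer type, while lowercase int64 is the plain NumPy type.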

We found that convert_dtypes is a little too magical for us. We prefer more explicit control over what happens to our data.

To specify a type for a Series, we can use the astype method. Our city mileage can be held in a 16-bit integer. However, an 8-bit integer won’t work, since the maximum value for that signed type is 127 and we have some cars with a value of 150:

print(city_mpg.astype('Int16'))

Now try the code below. It should raise an error, because the largest value a signed 8-bit integer can hold is 127 and our data goes up to 150:

print(city_mpg.astype('Int8'))
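
One way to check in advance whether a type is wide enough is to compare the data against the limits that NumPy reports through np.iinfo. The sketch below is only an illustration under stated assumptions: city_mpg_demo is a hypothetical stand-in for city_mpg with invented values, and fits is a helper written for this example, not a pandas function.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for city_mpg; the values are invented for illustration.
city_mpg_demo = pd.Series([19, 9, 23, 150], dtype="int64")


def fits(series, np_int_type):
    """Return True if every value lies within the range of np_int_type."""
    info = np.iinfo(np_int_type)
    return bool(series.between(info.min, info.max).all())


fits_int8 = fits(city_mpg_demo, np.int8)    # 150 exceeds the int8 maximum of 127
fits_int16 = fits(city_mpg_demo, np.int16)  # 150 is well below 32767
print(fits_int8, fits_int16)

# The 16-bit nullable type can hold every value.
as_int16 = city_mpg_demo.astype("Int16")
print(as_int16.dtype)

# The 8-bit nullable type cannot, so the conversion is expected to fail.
try:
    city_mpg_demo.astype("Int8")
    int8_failed = False
except (TypeError, ValueError) as err:
    int8_failed = True
    print(f"Int8 conversion failed: {err}")
```

Checking the range first makes the choice of type explicit, which fits the preference for explicit control stated earlier.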

Using the correct type can save significant ...