Conversion Methods
Learn how to perform data type conversions on a Series.
Sometimes we need to change the type of the data. The source format might not include type information, or a different type might give us more manipulation options or lower memory usage.
Automatic conversion
In pandas 1.0, a new conversion method was introduced, namely convert_dtypes. This tries to convert a Series to a type that supports pd.NA. In the case of our city_mpg Series, it will change the type from int64 to Int64 like this:
print(city_mpg.convert_dtypes())
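As a self-contained sketch of what this conversion buys us, the snippet below uses a small made-up stand-in for city_mpg (the values are hypothetical, not the lesson's dataset). It compares the dtype before and after convert_dtypes, and then masks out a value to show how each type copes with missing data.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's city_mpg Series (values are made up).
city_mpg = pd.Series([19, 9, 23, 150, 14], name="city_mpg")

converted = city_mpg.convert_dtypes()
print(city_mpg.dtype, "->", converted.dtype)

# Masking a value introduces missing data.
# The NumPy-backed int64 Series falls back to float64 (NaN),
# while the nullable Int64 Series stays integer and uses pd.NA.
masked_numpy = city_mpg.where(city_mpg < 100)
masked_nullable = converted.where(converted < 100)
print(masked_numpy.dtype)
print(masked_nullable.dtype)
```

The lowercase int64 is the NumPy type, while the capitalized Int64 is the pandas nullable extension type.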
We found that convert_dtypes is a little too magical for us. We prefer more explicit control over what happens to our data.
To specify a type for a Series, we can try to use the astype method. Our city mileage can be held in a 16-bit integer. However, an 8-bit integer won’t work, since the maximum value for that signed type is 127 and we have some cars with a value of 150:
print(city_mpg.astype('Int16'))
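One way to check these limits before casting is NumPy's iinfo, which reports the range of an integer type. The sketch below again assumes a small made-up city_mpg Series; the fits dictionary is a hypothetical helper for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the lesson's city_mpg Series (values are made up).
city_mpg = pd.Series([19, 9, 23, 150, 14], name="city_mpg")

# Record whether every value falls inside each candidate type's range.
fits = {}
for name in ("int8", "int16"):
    info = np.iinfo(name)
    fits[name] = bool(info.min <= city_mpg.min() and city_mpg.max() <= info.max)
    print(f"{name}: range [{info.min}, {info.max}], fits={fits[name]}")

# 16 bits are enough for these values, so the cast preserves them.
small = city_mpg.astype("Int16")
print(small.dtype)
```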
Now, try the code below. It should raise an error because 150 exceeds 127, the maximum value of a signed 8-bit integer:
print(city_mpg.astype('Int8'))
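To inspect the failure without stopping a script, we can wrap the cast in try/except. This is a minimal sketch on a made-up city_mpg Series backed by NumPy int64; the exact exception class and message are assumptions that may vary between pandas versions, so the except clause is deliberately broad.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's city_mpg Series (values are made up).
city_mpg = pd.Series([19, 9, 23, 150, 14], name="city_mpg")

error = None
try:
    city_mpg.astype("Int8")
except (TypeError, ValueError, OverflowError) as exc:
    # 150 does not fit in a signed 8-bit integer, so the cast is rejected.
    error = exc

print(type(error).__name__, error)
```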
Using the correct type can save significant ...