Stacking and Flattening Data
Explore stacking, unstacking, and flattening data.
Unstacking
We’ve used the unstack method previously but not discussed it. It’s a powerful method (along with its complement, stack) for reshaping our data.
At a high level, unstack moves an index level into the columns. Usually, we use this operation on multi-index data, moving one of the index levels into the columns (creating hierarchical columns if the data already has columns).
The stack method does the reverse, moving a column level into the index.
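To make the two operations concrete, here is a minimal sketch on a tiny hand-made series. The country and answer labels and the counts are hypothetical values chosen for illustration; they are not taken from the survey data.

```python
import pandas as pd

# Hypothetical toy data (not from the survey): counts keyed by two index levels.
index = pd.MultiIndex.from_tuples(
    [('US', 'Yes'), ('US', 'No'), ('India', 'Yes'), ('India', 'No')],
    names=['country', 'answer'],
)
counts = pd.Series([3, 5, 2, 4], index=index)

# unstack moves the innermost index level ('answer') into the columns,
# turning the series into a dataframe with one row per country.
wide = counts.unstack()
print(wide)

# stack does the reverse: the column level goes back into the index,
# and we get a series with a two-level index again.
restacked = wide.stack()
print(restacked)
```

By default, unstack moves the innermost index level. Passing a level name or position, such as counts.unstack('country'), moves that level instead.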
import pandas as pd

url = 'https://github.com/mattharrison/datasets/raw/master/data/'\
    '2020-jetbrains-python-survey.csv'
jb = pd.read_csv(url)
# tweak_jb is not defined in this lesson; it is assumed to be available from earlier
jb2 = tweak_jb(jb)
print(jb2)
We’ll create a hierarchical index, or multi-index, by grouping with multiple columns. Let’s count the responses to the are_you_datascientist column for each country:
print(jb2.groupby(['country_live', 'are_you_datascientist']).size())
Notice that the result is a series with a multi-index. This result is useful but a little hard to scan through. It would be easier if we had countries in the index and each of the responses to ...