Stack and Unstack
Understand how to use pandas stack and unstack methods to reshape MultiIndex DataFrames by pivoting column and row levels. Learn to handle missing data and customize operations with parameters like level, dropna, and fill_value for cleaner, more flexible data analysis.
Overview of stacking and unstacking
The pandas methods stack() and unstack() are conceptually similar to melt() and pivot(), respectively: stack() makes a DataFrame longer, and unstack() makes it wider. Given the similarity, when do we use stack() and unstack()? They are designed primarily for MultiIndex DataFrames, where they reshape the data by moving the index levels we specify between the column axis and the row axis.
That said, both stack() and unstack() also work on single-indexed DataFrames, although pivot() and melt() are more common and often more intuitive in that case. To recap, a MultiIndex (also known as hierarchically indexed) DataFrame is one in which more than one level of labels identifies the rows, the columns, or both.
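As a minimal sketch of the single-indexed case, the snippet below stacks and then unstacks a small DataFrame. The fruit names and prices are made-up values used only for illustration.

```python
import pandas as pd

# Hypothetical single-indexed data: one row label per fruit, one column level.
prices = pd.DataFrame(
    {"unit_price": [1.5, 0.5], "unit_cost": [1.0, 0.3]},
    index=pd.Index(["apple", "banana"], name="fruit_name"),
)

# stack() moves the only column level into the row index, so the result is a
# Series indexed by (fruit_name, original column label).
stacked = prices.stack()

# unstack() moves the innermost row level back to the columns.
restored = stacked.unstack()

print(stacked)
print(restored)
```

Because there is only one column level to move, stack() here returns a Series rather than a DataFrame.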
The accompanying image shows a MultiIndex DataFrame in which the store_name and fruit_name index levels serve as row identifiers, while two column levels serve as column identifiers: the year level (2021 and 2022) and a second level containing unit_price, unit_cost, and kg.
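A DataFrame with this layout can be sketched as follows. The store names, fruit names, and numbers are placeholders, the second column level is given the hypothetical name metric, and year is assumed to be the outer column level.

```python
import numpy as np
import pandas as pd

# Row identifiers: two index levels (placeholder store and fruit names).
row_index = pd.MultiIndex.from_product(
    [["store_a", "store_b"], ["apple", "banana"]],
    names=["store_name", "fruit_name"],
)

# Column identifiers: a year level and a second level that we call "metric"
# here (an assumed name; the lesson does not name this level).
col_index = pd.MultiIndex.from_product(
    [[2021, 2022], ["unit_price", "unit_cost", "kg"]],
    names=["year", "metric"],
)

# Placeholder values 0..23, just to make every cell distinguishable.
df = pd.DataFrame(
    np.arange(24).reshape(4, 6), index=row_index, columns=col_index
)

# Move the "year" column level into the row index: 4 rows become 8.
stacked = df.stack(level="year")

# Move the "fruit_name" row level into the columns: 6 columns become 12.
unstacked = df.unstack(level="fruit_name")

print(stacked.shape, unstacked.shape)
```

Passing level by name, as above, selects which level moves. Without it, both methods act on the innermost level of the axis they read from.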
Stack
The stack() method reshapes a DataFrame by pivoting a level of the column labels (i.e., the column names, which sit outside the row index) into the row index, where it becomes the innermost index level. Suppose ...