The GroupBy Method
Let’s learn about grouping DataFrames using the groupby method.
The groupby method is one of the most important functionalities in pandas. It allows us to group data, call aggregate functions, and combine the results in three steps, namely split, apply, and combine.
Before we move on to learning this hands-on, let’s understand how the three steps (split, apply, and combine) work on data whose groups are marked by different colors.
- Split: The data contained in a pandas object (for example, a Series or DataFrame) is split into groups based on one or more keys that we provide. The splitting is performed along a particular axis of the object. For example, a DataFrame can be grouped on its rows (axis=0) or its columns (axis=1).
- Apply: Once splitting is done, a function is applied to each group independently, producing a new value.
- Combine: Finally, the results of all the function applications are combined into a resultant object. The form of the resulting object will usually depend on what’s being done to the data.
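The three steps above can be sketched in code. This is a minimal illustration, not the lesson's own example: the DataFrame, its column names (color, value), and the choice of sum as the applied function are all assumptions made up for demonstration. It first performs split, apply, and combine by hand, then shows the single groupby call that does the same work.

```python
import pandas as pd

# Hypothetical sample data (not from the lesson): one key column and one value column
df = pd.DataFrame({
    "color": ["red", "blue", "red", "green", "blue", "green"],
    "value": [10, 20, 30, 40, 50, 60],
})

# Split: partition the rows into one sub-DataFrame per distinct key in "color"
groups = {key: group for key, group in df.groupby("color")}

# Apply: run the same function (here, a sum) on each group independently
sums = {key: group["value"].sum() for key, group in groups.items()}

# Combine: assemble the per-group results into a single Series indexed by key
manual = pd.Series(sums, name="value").rename_axis("color")

# The same three steps expressed as one chained groupby call
result = df.groupby("color")["value"].sum()

print(manual)
print(result)
```

Both Series hold one summed value per color. In practice, the chained form is what we would normally write; the manual version is only there to make each step visible.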
The figure below helps to visualize the GroupBy operation.