Pandas DataFrame Operations - Grouping and Sorting

6. Grouping

Things get really interesting when we group rows by certain criteria and then aggregate their data.

Say we want to group our dataset by director to see how much revenue (sum) each director earned at the box office, and also look at the average rating (mean) for each director. We can do this by using the groupby operation on the column of interest, followed by the appropriate aggregate (sum/mean), like so:

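# Group the dataset by director and sum every numeric column (including revenue) per director.
# numeric_only=True skips text columns, which recent pandas versions would otherwise try to sum.
movies_df.groupby('Director').sum(numeric_only=True)
# Group the dataset by director and compute the average rating of each director
movies_df.groupby('Director')[['Rating']].mean()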

As we can see, Pandas collapsed all rows that share the same ‘Director’ name into a single row. And since we used sum() for aggregation, it added together the values in each numerical column. Each value in the result now represents the sum of that column for that director.
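To make the grouping behavior concrete, here is a minimal, self-contained sketch on a tiny made-up table. The DataFrame `toy_df`, its values, and the column name `Revenue` are assumptions for illustration only; they are not the lesson's `movies_df`, whose revenue column may be named differently. The last step uses named aggregation, which is one way (not necessarily the lesson's) to get the sum and the mean in a single call.

```python
import pandas as pd

# Hypothetical toy data: the values are invented, and 'Revenue' is an
# assumed column name standing in for the dataset's revenue column.
toy_df = pd.DataFrame({
    "Title": ["Film A", "Film B", "Film C", "Film D"],
    "Director": ["Director X", "Director Y", "Director X", "Director Y"],
    "Rating": [8.0, 6.0, 7.0, 9.0],
    "Revenue": [100.0, 50.0, 200.0, 150.0],
})

# Total revenue per director: rows sharing a 'Director' value collapse into one.
revenue_by_director = toy_df.groupby("Director")["Revenue"].sum()

# Average rating per director.
rating_by_director = toy_df.groupby("Director")["Rating"].mean()

# Both aggregates in one call, using named aggregation
# (output column name = (input column, aggregation function)).
summary = toy_df.groupby("Director").agg(
    total_revenue=("Revenue", "sum"),
    mean_rating=("Rating", "mean"),
)
print(summary)
```

In the result, the group key (`Director`) becomes the index, so a single director's totals can be looked up with `summary.loc["Director X"]`.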

For example, we can see that the director Aamir Khan ...