GroupBy—Further Useful Features
Discover the other handy GroupBy features.
Overview of further useful features
Having covered the standard functions across the three categories (aggregate, transform, and filter), let's explore the other valuable features associated with the groupby() method. Although these features are less well known, they are helpful in many use cases.
Group selection
We can use the get_group() method on the GroupBy object to return the rows belonging to a specific group. For example, suppose we would like to obtain the data for customers who are students (i.e., rows with the Yes value in the Student column):
# Generate GroupBy object
grouped = df.groupby('Student')

# Retrieve the student group (i.e., Yes value in Student column)
student_group = grouped.get_group(name='Yes')
print(student_group)
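To make the behavior of get_group() concrete, here is a minimal, self-contained sketch. The small DataFrame is hypothetical (its values are made up and only borrow the lesson's column names); it stands in for the lesson's df. The sketch also checks that the returned group contains the same rows as an ordinary boolean mask.

```python
import pandas as pd

# Hypothetical toy data standing in for the lesson's df (values are made up)
df = pd.DataFrame({
    "Student": ["Yes", "No", "Yes", "No"],
    "Married": ["No", "Yes", "Yes", "No"],
    "Income": [20.5, 80.1, 35.0, 55.3],
    "Balance": [300, 900, 450, 0],
})

# Group on a single column and pull out one group by its key
grouped = df.groupby("Student")
student_group = grouped.get_group("Yes")
print(student_group)

# The same rows can be selected with a boolean mask on the original DataFrame
masked = df[df["Student"] == "Yes"]
print(student_group.equals(masked))
```

The group keeps the original row index, so it can be lined up with the source DataFrame afterwards.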
We can also use get_group() on GroupBy objects grouped on multiple columns, in which case the group name is a tuple with one value per grouping column. For example, we can retrieve the group of customers who are students and aren't married:
# Generate GroupBy object
grouped = df.groupby(['Student', 'Married'])

# Retrieve single students (i.e., Student Yes, Married No)
single_student_group = grouped.get_group(name=('Yes', 'No'))
print(single_student_group)
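When grouping on several columns, it can help to see which group names exist before calling get_group(). The sketch below uses a hypothetical toy DataFrame (made-up values) and the groups attribute to list the keys. It also shows that asking for a key that isn't present raises a KeyError; the ("Maybe", "No") key is an invented example of a missing group.

```python
import pandas as pd

# Hypothetical toy data standing in for the lesson's df (values are made up)
df = pd.DataFrame({
    "Student": ["Yes", "No", "Yes", "No"],
    "Married": ["No", "Yes", "Yes", "No"],
    "Income": [20.5, 80.1, 35.0, 55.3],
    "Balance": [300, 900, 450, 0],
})

grouped = df.groupby(["Student", "Married"])

# The keys of .groups are the available group names (tuples for multiple columns)
available_keys = list(grouped.groups)
print(available_keys)

# Retrieve one group with a tuple: (Student value, Married value)
single_student_group = grouped.get_group(("Yes", "No"))
print(single_student_group)

# Requesting a group that doesn't exist raises KeyError
missing_key_raised = False
try:
    grouped.get_group(("Maybe", "No"))  # invented key that is not in the data
except KeyError:
    missing_key_raised = True
print(missing_key_raised)
```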
If we want to filter the output of the GroupBy object to specific columns (e.g., Income and Balance), we can subset it with the standard square brackets.
# Generate GroupBy object
grouped = df.groupby(['Student', 'Married'])[['Student', 'Married', 'Income', 'Balance']]

# Retrieve single students (i.e., Student Yes, Married No)
single_student_group = grouped.get_group(name=('Yes', 'No'))
print(single_student_group)
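As a variation on the snippet above, the column selection doesn't have to include the grouping columns. The following sketch assumes a hypothetical toy DataFrame with an extra Age column (made-up values, not part of the lesson's snippets) and keeps only Income and Balance in the retrieved group.

```python
import pandas as pd

# Hypothetical toy data standing in for the lesson's df (values are made up)
df = pd.DataFrame({
    "Student": ["Yes", "No", "Yes", "No"],
    "Married": ["No", "Yes", "Yes", "No"],
    "Age": [21, 45, 24, 38],
    "Income": [20.5, 80.1, 35.0, 55.3],
    "Balance": [300, 900, 450, 0],
})

# Select columns with a list inside square brackets (note the double brackets)
grouped = df.groupby(["Student", "Married"])[["Income", "Balance"]]

# The retrieved group carries only the selected columns
subset = grouped.get_group(("Yes", "No"))
print(subset.columns.tolist())
print(subset)
```

Selecting columns up front keeps later steps, such as aggregations on the same GroupBy object, limited to the columns of interest.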
Iterate through groups
A GroupBy object is iterable, so we can loop over it directly. Each iteration yields a pair of objects: the group name and the subset DataFrame for that group. For example, we can iterate over a GroupBy object that is grouped based on Ethnicity to obtain the name of the ...
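The iteration pattern described in this section can be sketched as follows. This is a minimal example on a hypothetical toy DataFrame (made-up values); the sizes dictionary is an illustrative helper, not something from the lesson.

```python
import pandas as pd

# Hypothetical toy data standing in for the lesson's df (values are made up)
df = pd.DataFrame({
    "Ethnicity": ["Asian", "Caucasian", "Asian", "African American"],
    "Balance": [300, 900, 450, 0],
})

# Each iteration yields (group name, DataFrame containing that group's rows)
sizes = {}
for name, group in df.groupby("Ethnicity"):
    print(name)
    print(group)
    sizes[name] = len(group)  # record how many rows each group has

print(sizes)
```

By default, groupby() sorts the group keys, so the groups are visited in sorted order of the Ethnicity values.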