GroupBy—Aggregate

Learn how to apply a wide range of aggregate functions to grouped data.

Applying functions to groups

After splitting our data into the desired groups, the next step is to apply a function to each group independently. Because every group is processed on its own, the same function can produce a different result for each group, and we can pick whichever function suits our data requirements.

Beyond performing the common mean and sum aggregations on grouped data, we can apply various other functions to the groups. These functions can be classified into three categories:

  • Aggregate: Calculate a summary statistic for each group, e.g., compute mean and standard deviation values.

  • Transform: Perform group-wise computations and return an output object with the same shape as the input, e.g., scale the values in each group using that group’s minimum and maximum values.

  • Filter: Remove groups based on a group-wise computation that gives a boolean (True/False) output, e.g., discard groups whose mean value falls below a threshold we define.
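The three categories can be sketched in pandas as follows. This is a minimal illustration rather than the lesson's own code: the DataFrame is built by hand from the Ethnicity and Balance values of the five preview rows, and the variable names and the threshold of 500 are assumptions chosen only for the example.

```python
import pandas as pd

# Five rows copied from the dataset preview; only two columns are kept.
df = pd.DataFrame({
    "Ethnicity": ["Caucasian", "Asian", "Asian", "Asian", "Caucasian"],
    "Balance": [333, 903, 580, 964, 331],
})

grouped = df.groupby("Ethnicity")["Balance"]

# Aggregate: one row of summary statistics per group.
summary = grouped.agg(["mean", "std"])

# Transform: min-max scale within each group; one output value per input row.
scaled = grouped.transform(lambda s: (s - s.min()) / (s.max() - s.min()))

# Filter: keep only the rows of groups whose mean balance is at least 500
# (the threshold is an arbitrary, illustrative choice).
kept = df.groupby("Ethnicity").filter(lambda g: g["Balance"].mean() >= 500)

print(summary)
print(scaled)
print(kept)
```

Note how the output sizes differ: `summary` has one row per group, `scaled` has one value per original row, and `kept` contains only the original rows that belong to groups passing the condition.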

We’ll continue working with the credit card dataset for the following examples.

Preview of the Credit Card Dataset

ID  Income   Limit  Rating  Cards  Age  Education  Gender  Student  Married  Ethnicity  Balance
1   14.891   3606   283     2      34   11         Male    No       Yes      Caucasian  333
2   106.025  6645   483     3      82   15         Female  Yes      Yes      Asian      903
3   104.593  7075   514     4      71   11         Male    No       No       Asian      580
4   148.924  9504   681     3      36   11         Female  No       No       Asian      964
5   55.882   4897   357     2      68   16         Male    No       No       Caucasian  331

Note: The "ID" column, whose values start at 1, holds the identification numbers of customers. It is different from the actual index of a DataFrame, which begins at 0. ...
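The difference between the "ID" column and the DataFrame index can be seen in a small sketch. It assumes a hand-built DataFrame with the first three preview rows and only two columns; the names `preview` and `by_id` are hypothetical.

```python
import pandas as pd

# First three rows of the preview, restricted to two columns for brevity.
preview = pd.DataFrame({
    "ID": [1, 2, 3],
    "Balance": [333, 903, 580],
})

# The default index is positional and starts at 0.
print(preview.index.tolist())

# The customer identifiers live in a regular column and start at 1.
print(preview["ID"].tolist())

# Optional: promote "ID" to the index if lookups by customer ID are more convenient.
by_id = preview.set_index("ID")
print(by_id.loc[1, "Balance"])
```

Whether to keep the default index or call `set_index("ID")` is a design choice; grouping operations work the same either way, since they group by column values rather than by row position.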