Overview of `groupby()`

The groupby() method groups rows (or, less commonly, columns) together based on criteria we define, such as the values in one or more columns. It’s often used with aggregate functions (e.g., sum, mean) or transformation functions (e.g., normalization) to summarize or rescale values within each group.
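To make the difference between the two kinds of functions concrete, here is a minimal sketch. The DataFrame, its column names (region, sales), and its values are assumptions made up for illustration. An aggregation returns one value per group, while a transformation returns one value per original row.

```python
import pandas as pd

# Hypothetical data: column names and values are illustrative only.
df = pd.DataFrame({
    "region": ["north", "north", "south", "south"],
    "sales": [10.0, 30.0, 20.0, 60.0],
})

# Aggregation: collapses each group to a single value (one row per region).
total_by_region = df.groupby("region")["sales"].sum()

# Transformation: keeps the original shape (one value per input row).
# Here each sale is divided by the total for its own region.
share_of_region = df.groupby("region")["sales"].transform(lambda s: s / s.sum())

print(total_by_region)
print(share_of_region)
```

The aggregated result is indexed by the group keys, whereas the transformed result keeps the index of the original rows, so it can be assigned back to the DataFrame as a new column.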

The groupby() method is commonly used in real-world data analysis because it helps us create useful high-level data summaries. For example, a bank may want to determine each customer’s average account balance over the last three months. Grouping the records by customer and applying a mean aggregation gives us the answer.

A groupby() operation combines the following three steps to group large datasets together and perform computations on them:

Split data into groups based on the criteria we specify.
Apply a function to each group independently.
Combine results in a data structure.
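The three steps above can be sketched by hand and compared with the equivalent one-line groupby() call. This is only an illustration under assumed data (the team and score columns are hypothetical); pandas does not literally build Python dictionaries internally.

```python
import pandas as pd

# Hypothetical data: column names and values are illustrative only.
df = pd.DataFrame({
    "team": ["a", "b", "a", "b", "a"],
    "score": [1, 4, 3, 6, 5],
})

# 1. Split: iterating over a GroupBy object yields (key, sub-DataFrame) pairs.
groups = {key: group for key, group in df.groupby("team")}

# 2. Apply: compute the mean score of each group independently.
means = {key: group["score"].mean() for key, group in groups.items()}

# 3. Combine: collect the per-group results into a single Series.
manual = pd.Series(means, name="score").rename_axis("team")

# The usual one-liner performs the same split, apply, and combine steps.
direct = df.groupby("team")["score"].mean()

print(manual)
print(direct)
```

Both results hold the same per-team means. In practice the one-line form is preferable, since it is shorter and avoids the explicit Python loop.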

In the retail bank example above, we first split the records by unique customer, then compute the average bank balance for each customer over the three-month period. Finally, we return the output as a DataFrame where each row represents a unique customer.
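As a minimal sketch of the bank example, groupby() with mean() could look like the following. The table and its column names (customer_id, month, balance) are assumptions for illustration and do not come from a real dataset.

```python
import pandas as pd

# Hypothetical month-end balances for two customers over three months.
balances = pd.DataFrame({
    "customer_id": ["C001", "C001", "C001", "C002", "C002", "C002"],
    "month": ["2023-01", "2023-02", "2023-03"] * 2,
    "balance": [1000.0, 1200.0, 1400.0, 500.0, 700.0, 300.0],
})

# Split by customer, apply mean() to each customer's balances, and combine
# the results. as_index=False keeps customer_id as a regular column, so the
# output is a DataFrame with one row per customer.
avg_balance = balances.groupby("customer_id", as_index=False)["balance"].mean()

print(avg_balance)
```

Without as_index=False, the same call would return a Series indexed by customer_id instead of a DataFrame.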
