The groupby Function

Learn how to use the Pandas groupby function for data analysis.

We'll cover the following...

Data analysis
How to use the groupby function

The lessons up to this point have covered data cleaning, manipulation, and processing with Pandas. Pandas is a great library for data analysis as well. In this chapter, we’ll go over Pandas functions that can be used to analyze tabular data.

Data analysis

Data analysis can be defined as the process of inferring insights, discovering useful information, and drawing conclusions from the data at hand. It’s mainly done to support a decision-making process or to explore the data before creating a machine learning model.

One of the most commonly used functions in data analysis is the groupby function. It groups observations (rows) according to the distinct values in a given column. Let’s say we have a DataFrame that contains the sales information about the products in a retail store. Each product belongs to a product group, which is indicated in the product_group column. By using the groupby function, we can group the products based on the product groups they belong to. Then, we can calculate a wide range of aggregations, such as average product price, daily total sales, and so on.
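
The grouping described above can be sketched as follows. This is a minimal illustration, not the lesson's original code: the `grocery` DataFrame and its values are made up, and the column names `product_code`, `price`, and `sales_quantity` are assumptions based on the description in the text.

```python
import pandas as pd

# Hypothetical sample data; values and most column names are assumptions.
grocery = pd.DataFrame({
    "product_code": [1001, 1002, 1003, 1004],
    "product_group": ["fruit", "fruit", "dairy", "dairy"],
    "price": [2.0, 4.0, 1.5, 3.5],
    "sales_quantity": [10, 30, 20, 40],
})

# Group the rows by product group, then average every remaining column.
group_means = grocery.groupby("product_group").mean()
print(group_means)
```

The result has one row per distinct value of `product_group`, with the group labels as the index.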

Once the groups are formed and the mean function is applied, Pandas calculates the mean value of every numerical column, so the result also shows the average sales quantities. The average product code is meaningless because the product code only serves as an identifier. Note that since Pandas 2.0, the mean function no longer skips text columns automatically; if the DataFrame contains any, pass numeric_only=True.
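
As a hedged sketch of how the unwanted columns might be handled, the example below restricts the aggregation to numerical columns and then drops the identifier. The `product_description` column and all values are hypothetical, added only to show the effect of a text column.

```python
import pandas as pd

# Hypothetical sample data, including one text column (an assumption).
grocery = pd.DataFrame({
    "product_code": [1001, 1002, 1003, 1004],
    "product_description": ["apple", "banana", "milk", "yogurt"],
    "product_group": ["fruit", "fruit", "dairy", "dairy"],
    "price": [2.0, 4.0, 1.5, 3.5],
    "sales_quantity": [10, 30, 20, 40],
})

# numeric_only=True limits the mean to numerical columns,
# so the text column is left out of the result.
numeric_means = grocery.groupby("product_group").mean(numeric_only=True)

# The average of an identifier carries no meaning, so drop it.
useful_means = numeric_means.drop(columns="product_code")
print(useful_means)
```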

If we’re only interested in the average price, we can select the relevant columns before applying the groupby function. For instance, we can first select the product_group and price columns from the grocery DataFrame and then group the rows by the product_group column. Finally, the mean function is applied to see the average price for each product group.
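
The column selection described above might look like the sketch below. The `grocery` data is hypothetical and the line layout does not reproduce the lesson's original code block.

```python
import pandas as pd

# Hypothetical sample data; values and most column names are assumptions.
grocery = pd.DataFrame({
    "product_code": [1001, 1002, 1003, 1004],
    "product_group": ["fruit", "fruit", "dairy", "dairy"],
    "price": [2.0, 4.0, 1.5, 3.5],
    "sales_quantity": [10, 30, 20, 40],
})

# Keep only the grouping column and the column to aggregate,
# then group by product group and take the mean.
avg_price = grocery[["product_group", "price"]].groupby("product_group").mean()
print(avg_price)
```

Selecting the columns first keeps the output focused on the price and avoids aggregating columns such as the product code.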
