How to use the cumsum() function on a DataFrame in pandas

What is a DataFrame?

A DataFrame is a commonly used two-dimensional data structure in pandas. It is a table made up of labeled rows and columns.

A DataFrame can be formed as shown below.

A sample DataFrame
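
As a rough sketch of the row-and-column layout described above, a small DataFrame can be built from a dictionary of lists. The column names and values here are hypothetical and serve only as an illustration.

```python
import pandas as pd

# Hypothetical sample data, used only to show the row/column layout
sample = pd.DataFrame({
    'name': ['Ann', 'Bob', 'Cara'],
    'age': [28, 35, 41],
})

print(sample.shape)          # (3, 2): three rows, two columns
print(list(sample.columns))  # ['name', 'age']
```

Each dictionary key becomes a column, and each position in the lists becomes a row.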

Requirements

The examples require the pandas library, which is imported as shown below.


import pandas as pd

Code

Example

Below is a DataFrame of countries that have been placed in different groups, each with its own a_score and b_score.

Both scores are imaginary values for the purpose of this example.

import pandas as pd
a_score = [4, 5, 7, 8, 2, 3, 1, 6, 9, 10]
b_score = [1, 2, 3, 4, 5, 6, 7, 10, 8, 9]
country = ['Pakistan', 'USA', 'Canada', 'Brazil', 'India', 'Belgium', 'Malaysia', 'Peru', 'England', 'Scotland']
groups = ['A','A','B','A','B','B','C','A','C','C']
df = pd.DataFrame({'group':groups, 'country':country, 'a_score':a_score, 'b_score':b_score})
print(df)

cumsum() function

The cumsum() function computes the cumulative sum along an axis: each element of the result is the sum of that element and all the elements before it.
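
To make the idea of a running total concrete, here is a minimal sketch that applies cumsum() to a single column of values. The variable names `s` and `running_total` are illustrative and do not appear in the example above.

```python
import pandas as pd

# Hypothetical values chosen only for illustration
s = pd.Series([1, 2, 3, 4])

# Each element becomes the sum of itself and all elements before it
running_total = s.cumsum()
print(running_total.tolist())  # [1, 3, 6, 10]
```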

Syntax

The function prototype is as follows.


df.cumsum(axis=0, skipna=True)

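Parameters

axis: The axis along which the cumulative sum is taken. A value of 0 (or 'index') accumulates down each column, and a value of 1 (or 'columns') accumulates across each row.

skipna: Whether to exclude missing (NaN) values from the running total.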
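
The effect of the axis argument can be sketched on a small numeric DataFrame. The frame `scores` below is a shortened, hypothetical version of the score columns used in this article.

```python
import pandas as pd

# Small numeric frame; the column names mirror the article's example
scores = pd.DataFrame({'a_score': [4, 5, 7], 'b_score': [1, 2, 3]})

# axis=0: running total down each column
down = scores.cumsum(axis=0)
print(down['a_score'].tolist())  # [4, 9, 16]
print(down['b_score'].tolist())  # [1, 3, 6]

# axis=1: running total across the columns of each row
across = scores.cumsum(axis=1)
print(across['a_score'].tolist())  # [4, 5, 7]
print(across['b_score'].tolist())  # [5, 7, 10]
```

With axis=1 the first column is unchanged, and every later column holds the sum of itself and the columns to its left.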

Return value

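The function returns a DataFrame of the same shape as the original, in which each value is replaced by the cumulative sum up to that position.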

Example

The following example takes the cumulative sum of b_score within each group of the DataFrame we formed above, so the running total restarts for every group.

import pandas as pd
a_score = [4, 5, 7, 8, 2, 3, 1, 6, 9, 10]
b_score = [1, 2, 3, 4, 5, 6, 7, 10, 8, 9]
country = ['Pakistan', 'USA', 'Canada', 'Brazil', 'India', 'Belgium', 'Malaysia', 'Peru', 'England', 'Scotland']
groups = ['A','A','B','A','B','B','C','A','C','C']
df = pd.DataFrame({'group':groups, 'country':country, 'a_score':a_score, 'b_score':b_score})
# Group the rows by 'group', then take the running total of b_score within each group
df['cumsum_b'] = df.groupby('group')['b_score'].cumsum()
print(df)
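
To see how the grouped running total differs from a plain one, the sketch below computes both on the same b_score values. It keeps only the group and b_score columns, and the names `grouped` and `overall` are illustrative.

```python
import pandas as pd

groups = ['A', 'A', 'B', 'A', 'B', 'B', 'C', 'A', 'C', 'C']
b_score = [1, 2, 3, 4, 5, 6, 7, 10, 8, 9]
df = pd.DataFrame({'group': groups, 'b_score': b_score})

# Running total that restarts for every group
grouped = df.groupby('group')['b_score'].cumsum()
# Running total over the whole column, ignoring the groups
overall = df['b_score'].cumsum()

print(grouped.tolist())  # [1, 3, 3, 7, 8, 14, 7, 17, 15, 24]
print(overall.tolist())  # [1, 3, 6, 10, 15, 21, 28, 38, 46, 55]
```

The results keep the original row order. The last value reached in each group (17 for A, 14 for B, 24 for C) equals that group's total b_score.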
