How to obtain the cumulative sum over a pandas DataFrame axis

Overview

The cumsum() function of a DataFrame object is used to obtain the cumulative sum along one of its axes.

Note: The axis here is the direction along which the sum accumulates. A value of 0 (or 'index') indicates the axis running vertically downwards across the rows, so each column is summed from top to bottom. A value of 1 (or 'columns') indicates the axis running horizontally across the columns, so each row is summed from left to right.

Syntax

DataFrame.cumsum(axis=None, skipna=True, *args, **kwargs)
Syntax for the cumsum() function in pandas

Parameters

  • axis: This specifies the axis along which the sum accumulates: the index (designated as 0 or 'index') or the columns (designated as 1 or 'columns'). This is an optional parameter. If it is omitted, the sum runs down the index (axis 0).
  • skipna: This takes a boolean value indicating whether null values are to be excluded. This is an optional parameter that defaults to True.
  • *args, **kwargs: These additional arguments have no effect but may be accepted for compatibility with NumPy. These are optional.
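The effect of skipna is easiest to see on data that contains a missing value. The following is a minimal sketch, assuming a hypothetical one-column DataFrame named df_nan that is not part of the original example:

```python
import pandas as pd

# Hypothetical one-column DataFrame with a missing value in the middle
df_nan = pd.DataFrame({"A": [1.0, None, 3.0]})

# skipna=True (the default): the NaN stays in its position,
# but the running sum of the later rows ignores it
skipped = df_nan.cumsum()

# skipna=False: every value from the first NaN onward becomes NaN
propagated = df_nan.cumsum(skipna=False)

print(skipped)
print(propagated)
```

With the default setting, the last row holds 4.0 (1.0 + 3.0). With skipna=False, the last two rows are both NaN.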

Return value

This function returns a Series or DataFrame object of the same size as the input, containing the cumulative sum along the axis.
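As a quick illustration of the return type, the sketch below calls cumsum() on a DataFrame and on one of its columns. The names frame, frame_result, and series_result are placeholders chosen for this illustration only:

```python
import pandas as pd

# Hypothetical two-column DataFrame used only for this illustration
frame = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# Calling cumsum() on a DataFrame gives back a DataFrame of the same shape
frame_result = frame.cumsum()

# Calling cumsum() on a single column (a Series) gives back a Series
series_result = frame["A"].cumsum()

print(type(frame_result).__name__, frame_result.shape)
print(type(series_result).__name__, series_result.tolist())
```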

Example

# A code to illustrate the cumsum() function in pandas

# importing the pandas library
import pandas as pd

# creating a DataFrame
df = pd.DataFrame([[5,10,4,15,3],
                   [1,7,5,9,0.5],
                   [3,11,13,14,12]],
                  columns=list('ABCDE'))
# printing the DataFrame
print(df)

# obtaining the cumulative sum vertically, down the rows
print(df.cumsum(axis="index"))

# obtaining the cumulative sum horizontally, across the columns
print(df.cumsum(axis="columns"))

Explanation

  • Line 4: We import the pandas library.
  • Lines 7–10: We create a DataFrame, df.
  • Line 12: We print the DataFrame, df.
  • Line 15: We use the cumsum() function to obtain the cumulative sum running downwards across the rows (axis 0). We print the result to the console.
  • Line 18: We use the cumsum() function to obtain the cumulative sum running horizontally across the columns (axis 1). We print the result to the console.
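The two cumsum() calls in the example can be checked by hand. The sketch below rebuilds the same DataFrame and compares a few of the results against running totals worked out manually. The variable names down_rows and across_cols are introduced here for illustration:

```python
import pandas as pd

# Same data as in the example above
df = pd.DataFrame([[5, 10, 4, 15, 3],
                   [1, 7, 5, 9, 0.5],
                   [3, 11, 13, 14, 12]],
                  columns=list('ABCDE'))

# Axis 0: each column accumulates from top to bottom
down_rows = df.cumsum(axis="index")
# Column A: 5, 5 + 1 = 6, 6 + 3 = 9
print(down_rows["A"].tolist())
# Column E: 3, 3 + 0.5 = 3.5, 3.5 + 12 = 15.5
print(down_rows["E"].tolist())

# Axis 1: each row accumulates from left to right
across_cols = df.cumsum(axis="columns")
# First row: 5, 15, 19, 34, 37
print(across_cols.iloc[0].tolist())
# Last row: 3, 14, 27, 41, 53
print(across_cols.iloc[2].tolist())
```

In both cases, the last entry along the chosen axis equals the plain total of that column or row.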
