What is DataFrame.max in polars?

Polars is a fast data manipulation library written in Rust. It is designed for high-performance operations on large datasets and is often faster than pandas for such workloads. It is particularly well suited to working with tabular data.

Importing the library

First, let's import the polars library.

import polars as pl
Importing polars library

The DataFrame.max() method

The DataFrame.max() method computes the maximum value of each column in a DataFrame. It returns a new DataFrame with a single row that holds one maximum per column. Each column’s maximum is computed independently of the values in the other columns.

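By default, the DataFrame.max() method ignores missing values (null) during the computation. If a column contains missing values, the maximum is computed over the remaining values. NaN, which polars treats as a floating-point value rather than as a missing value, is likewise skipped by default in recent versions.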

Note: The DataFrame.max() method is not limited to numeric columns. For a string column, it returns the lexicographically largest value, which is why column D in the example below yields 'foo'.

Code

import polars as pl

# Create a DataFrame with mixed data types
data = {'A': [1, 2, 3], 'B': [4, None, 6],
        'C': [7, 8, 9], 'D': ['foo', 'bar', 'baz']}
df = pl.DataFrame(data)

# Compute the maximum values for each column
max_values = df.max()

print(max_values)

Code explanation

Line 1: We import the polars library as pl.

Lines 4–6: We create the DataFrame df, which contains a mix of numeric and non-numeric columns.

Line 9: We call the DataFrame.max() method, which returns a DataFrame containing the maximum value of each column: the numeric columns A, B (excluding the missing value None), and C, plus the lexicographically largest string in the non-numeric column D.

Line 11: We print the max_values DataFrame that contains the maximum values [3, 6, 9, 'foo'] for the corresponding columns.

Copyright ©2024 Educative, Inc. All rights reserved