How to draw a box plot in Altair

Altair is a declarative Python library for statistical visualization, built on the Vega-Lite grammar. You describe how data columns map to visual properties, and Altair produces interactive or static charts. It works directly with pandas DataFrames, supports many chart types, and offers extensive customization, which makes it a common choice for exploratory data analysis and for communicating results in data science and analytics projects.

The box plot in Altair

A box plot is a graphical representation of the distribution of a continuous variable through its quartiles. It consists of a box spanning the interquartile range (IQR), from the first quartile to the third, with a line inside the box marking the median. Whiskers usually extend from the box to the most extreme values that are not treated as outliers, and the outliers themselves are drawn as individual points. Box plots are useful for visually summarising the spread, central tendency, and skewness of the data and for identifying potential outliers.
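To make these quantities concrete, here is a minimal sketch that computes them with pandas for a small made-up series. The values are illustrative only, and the 1.5 × IQR rule for whiskers is the common Tukey convention, assumed here. A plotting library may interpolate quartiles slightly differently, so treat the numbers as an approximation of what a chart would show.

```python
import pandas as pd

# Illustrative values (not from the article); 30 is a deliberate outlier
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 30])

# Quartiles and interquartile range (pandas uses linear interpolation by default)
q1 = values.quantile(0.25)
median = values.quantile(0.5)
q3 = values.quantile(0.75)
iqr = q3 - q1

# Tukey fences: points beyond them are usually drawn as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = values[(values < lower_fence) | (values > upper_fence)]

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print(f"fences=({lower_fence}, {upper_fence}), outliers={outliers.tolist()}")
```

The box would span Q1 to Q3, the inner line would sit at the median, and any value outside the fences would appear as a separate point.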

Example

We can call the mark_boxplot() method on a Chart object to draw a box plot in Altair. Here’s a basic example:

import altair as alt
import pandas as pd
import os

# Sample data
data = pd.DataFrame({
    'category': ['A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C'],
    'value': [1, 2, 3, 4, 5, 6, 7, 8, 9]
})

# Create box plot with customized properties
boxplot = alt.Chart(data).mark_boxplot(
    color='skyblue',  # Set the color of the boxes to sky blue
    size=20,          # Set the size of the boxes to 20
    opacity=0.7       # Set the opacity of the boxes to 0.7
).encode(
    x='category:O',   # One box per category (ordinal)
    y='value:Q'       # Distribution of value (quantitative)
)

# Save the chart as HTML, then print the file contents (cat requires a Unix-like shell)
boxplot.save('chart.html')
os.system('cat chart.html')

Explanation

  • Lines 1–3: We import Altair and other necessary libraries.

  • Lines 5–9: We create a pandas DataFrame named data with category and value columns.

  • Lines 11–19: We create an Altair chart from data and call mark_boxplot() to render it as a box plot, setting the box color to skyblue, the size to 20, and the opacity to 0.7. In encode(), we map category to the x-axis and value to the y-axis. The :O and :Q suffixes mark the fields as ordinal and quantitative data, respectively.

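  • Line 22: We call boxplot.save('chart.html') to export the chart to an HTML file named chart.html.

  • Line 23: We print the contents of chart.html to the console with the cat command, which is available on Unix-like systems. When running locally, open chart.html in a browser to view the chart.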

Copyright ©2025 Educative, Inc. All rights reserved