
Introduction to Choropleth Maps

Oct 27, 2023
content
Geospatial analysis
GeoPandas library
Choropleth maps
Categorical choropleth map
Case Study: Visualizing the global COVID-19 vaccination rate
Conclusion


A choropleth map colors or shades each region according to the value of a variable, making it easy to compare geographical data region by region. Thanks to steady progress in geospatial analysis tooling, many map plotting techniques are now available. In this blog, we will study choropleth maps and plot them using the GeoPandas Python library.

Geospatial analysis#

Geospatial analysis examines, interprets, and manipulates geographic data and information associated with specific locations on the Earth’s surface. It consists of various techniques and methodologies for understanding spatial patterns, relationships, and trends within geographic data.

GeoPandas library#

GeoPandas is an open-source Python library that extends the capabilities of pandas, a widely used data manipulation library, to handle geospatial data more efficiently. It provides a user-friendly and powerful interface for working with geospatial datasets.

GeoPandas helps read, write, visualize, and analyze geographic data in various formats, such as shapefiles, GeoJSON, and other formats supported by the Geospatial Data Abstraction Library (GDAL). It integrates with existing data analysis workflows, which makes it easier to combine geospatial data with non-spatial data and to perform complex spatial operations.

We can install this library with just one command:

pip install geopandas
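Once the installation finishes, a quick way to check it is to build a tiny GeoDataFrame by hand. The sketch below is an illustration rather than part of the walkthrough: the region names, values, and rectangular shapes are made up, and it assumes shapely is available (it is installed as a GeoPandas dependency).

```python
import geopandas as gpd
from shapely.geometry import box

# Two made-up rectangular "regions" (hypothetical data, for illustration only).
regions = gpd.GeoDataFrame(
    {"name": ["West", "East"], "value": [10, 25]},
    geometry=[box(0, 0, 2, 1), box(2, 0, 3, 1)],  # box(minx, miny, maxx, maxy)
)

# A GeoDataFrame behaves like a pandas DataFrame with an extra geometry column.
print(regions[["name", "value"]])

# Geometry-aware attributes are available directly, e.g. the area of each
# polygon (in squared coordinate units, since no CRS is set here).
areas = regions.geometry.area.tolist()
print(areas)  # [2.0, 1.0]
```

The same column selection and filtering used with pandas should work here, with the geometry column carried along.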

Let’s learn about choropleth maps, their types, and their advantages.

Choropleth maps#

The word choropleth combines two Greek words: choros, meaning region, and plethos, meaning multitude. A choropleth map uses different colors or shading patterns to show how a specific variable varies across geographic areas, such as countries, states, provinces, counties, or other administrative divisions.

Let’s create our first choropleth map in Python:

import geopandas as gpd
import matplotlib.pyplot as plt

world_map = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
fig, ax = plt.subplots(figsize=(10, 6))
variable = "pop_est"
cmap = "viridis"
world_map.plot(column=variable, cmap=cmap, linewidth=0.8, ax=ax, edgecolor='0.8', legend=True)
ax.set_ylabel('Latitude')
ax.set_xlabel('Longitude')
plt.show()
  • Line 1: We import the geopandas library.

  • Line 2: We import the pyplot module from the matplotlib library, which is used for creating plots and visualizations.

  • Line 4: We load the built-in Natural Earth dataset naturalearth_lowres, which contains low-resolution geometries and attributes of countries, into the variable world_map. The gpd.datasets.get_path function returns the path to the bundled file, and gpd.read_file reads it and returns a GeoDataFrame, a specialized data structure in GeoPandas for handling geospatial data. Note that the gpd.datasets module was deprecated in GeoPandas 0.14 and removed in 1.0, so this call requires an earlier release.

  • Line 6: We define the variable to be displayed on the choropleth map. In this case, it is set to "pop_est", representing the population estimate attribute in the dataset.

  • Line 7: We define the colormap to be used for the choropleth map. Here, "viridis" is a perceptually uniform sequential colormap provided by Matplotlib.

  • Line 8: We plot the choropleth map using the GeoDataFrame world_map.

    • column=variable: Specifies the column in the GeoDataFrame that contains the data to be visualized. Here, it is set to the value of the variable, which is "pop_est".

    • cmap=cmap: Sets the colormap for the choropleth map.

    • linewidth=0.8: Sets the width of the boundary lines between the polygons on the map.

    • ax=ax: Specifies the axes on which to plot the map. In this case, it uses the ax created earlier in Line 5 with plt.subplots.

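    • edgecolor='0.8': Sets the color of the boundary lines between the polygons. The string '0.8' is a Matplotlib grayscale value, which gives a light gray.

    • legend=True: Adds a legend to the plot. For a numeric column such as "pop_est", the legend is a colorbar.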

The code produces the following map:

World choropleth map: population estimate
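The colorbar produced by legend=True has no title by default. A minimal sketch of labelling it through the legend_kwds argument of plot() follows. It uses four made-up rectangular regions instead of the Natural Earth data so that it runs without any dataset; the region names, values, and output file name are assumptions for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs headless
import matplotlib.pyplot as plt
import geopandas as gpd
from shapely.geometry import box

# Four hypothetical regions laid out in a 2x2 grid, each with a made-up value.
grid = gpd.GeoDataFrame(
    {"region": ["A", "B", "C", "D"], "pop_est": [5, 20, 35, 50]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(0, 1, 1, 2), box(1, 1, 2, 2)],
)

fig, ax = plt.subplots(figsize=(6, 5))
grid.plot(
    column="pop_est",
    cmap="viridis",
    edgecolor="0.8",
    linewidth=0.8,
    ax=ax,
    legend=True,
    # For numeric columns, legend_kwds is forwarded to the colorbar.
    legend_kwds={"label": "Population estimate", "orientation": "horizontal"},
)
ax.set_title("Toy choropleth")

# Hypothetical output location; in a notebook, plt.show() would be used instead.
out_path = os.path.join(tempfile.gettempdir(), "toy_choropleth.png")
fig.savefig(out_path)
```

The same legend_kwds dictionary should carry over unchanged to the world map call, since only the data differs.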

Categorical choropleth map#

To create a categorical choropleth map with a legend, we need categorical data, such as a region or category for each country. For this example, we’ll use the Natural Earth dataset again and add a new column called Category that holds the category of each country.

Here’s the code to create a categorical choropleth map with a legend:

import geopandas as gpd
import matplotlib.pyplot as plt

world_map = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
categories = {
    'United States of America': 'North America',  # name as spelled in the dataset
    'Canada': 'North America',
    'Brazil': 'South America',
    'China': 'Asia',
    'India': 'Asia',
    'Australia': 'Oceania',
    'France': 'Europe',
    'South Africa': 'Africa',
}

world_map['Category'] = world_map['name'].map(categories)
fig, ax = plt.subplots(figsize=(20, 10))
variable = 'Category'
cmap = 'Set1'
world_map.plot(column=variable, cmap=cmap, linewidth=0.8, ax=ax, edgecolor='0.8', legend=True)
ax.set_ylabel('Latitude')
ax.set_xlabel('Longitude')
plt.show()
  • Lines 5–14: We define a dictionary named categories. It maps country names to their respective categories or regions.

  • Line 16: We add a new column Category to the world_map GeoDataFrame. It maps the values in the name column (country names) to the categories defined in the categories dictionary and assigns the corresponding category to each country.

The output of this code will be the following map:

World categorical choropleth map
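Because the dictionary covers only a handful of countries, every other row ends up without a category: Series.map() returns NaN for any name that is not a key, and GeoPandas does not draw rows with missing values by default. A small pandas-only sketch of checking for, and labelling, unmapped names before plotting follows; the short list of names and the "Other" label are assumptions for illustration.

```python
import pandas as pd

# A few country names as they might appear in a "name" column (illustrative subset).
names = pd.Series(["Canada", "Brazil", "Fiji", "India"], name="name")

categories = {"Canada": "North America", "Brazil": "South America", "India": "Asia"}

# map() looks each name up in the dictionary; names without a key become NaN.
category = names.map(categories)
unmapped = names[category.isna()].tolist()
print(unmapped)  # ['Fiji']

# One option: give unmapped countries an explicit label so they still get a color.
category_filled = category.fillna("Other")
print(category_filled.tolist())
```

As an alternative to filling the gaps, plot() also accepts a missing_kwds argument (for example, missing_kwds={"color": "lightgrey"}) that styles rows with missing values.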

Remember, these maps are versatile tools that can uncover insights with just a glance and can be employed in a wide range of real-world scenarios.

Let's explore an application scenario for choropleth maps.

Case Study: Visualizing the global COVID-19 vaccination rate#

In response to the COVID-19 pandemic, governments worldwide launched vaccination campaigns to curb the spread of the virus. To illustrate how the progress of such campaigns could be assessed, we generate a choropleth map of hypothetical (randomly generated) COVID-19 vaccination rates by country. Let’s have a look at the code:

import geopandas as gpd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Load the built-in Natural Earth country boundaries
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))

# Create dummy vaccination data for all countries
np.random.seed(42) # For reproducibility
vaccination_data = {
    'Country': world['name'],
    'Vaccination Rate (%)': np.random.randint(0, 100, len(world)) # Generate random values
}
vaccination_df = pd.DataFrame(vaccination_data)
merged_data = world.merge(vaccination_df, left_on='name', right_on='Country')
fig, ax = plt.subplots(figsize=(12, 8))
variable = 'Vaccination Rate (%)'
cmap = 'YlGnBu'
merged_data.plot(column=variable, cmap=cmap, linewidth=0.8, ax=ax, edgecolor='0.8', legend=True)
ax.set_ylabel('Latitude')
ax.set_xlabel('Longitude')
plt.show()
  • Line 10: We set the random seed for reproducibility, ensuring that random numbers generated are consistent across runs.

  • Lines 11–14: We create a dictionary with two keys: Country and Vaccination Rate (%). The Country key gets its values from the name column of the world GeoDataFrame. The Vaccination Rate (%) key is populated with random integers from 0 to 99 (the upper bound of np.random.randint() is exclusive), one for each row of world.
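The exclusive upper bound of np.random.randint() is easy to overlook, so a short check may help. The sketch below draws a larger sample than the case study (the size of 1000 is arbitrary) and shows one way to make 100 a possible value.

```python
import numpy as np

np.random.seed(42)  # same seed as in the case study code
rates = np.random.randint(0, 100, 1000)  # integers from 0 to 99 inclusive
print(rates.min(), rates.max())

# To allow 100 as a value, raise the exclusive upper bound to 101.
rates_inclusive = np.random.randint(0, 101, 1000)
print(rates_inclusive.min(), rates_inclusive.max())
```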

The output of this code will be the following map:

Hypothetical COVID-19 vaccination rate by country
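In this case study, the merge cannot miss because the Country column is copied from world['name']. With rates from a real source, country names often differ in spelling, and an inner merge would silently drop those rows. A pandas-only sketch of a left merge that lists unmatched countries follows; both small tables and their values are made up for illustration.

```python
import pandas as pd

# Stand-in for the map table (geometry omitted to keep the sketch pandas-only).
world = pd.DataFrame({"name": ["France", "United States of America", "Kenya"]})

# Hypothetical external table whose naming convention differs for one country.
rates = pd.DataFrame(
    {"Country": ["France", "United States", "Kenya"],
     "Vaccination Rate (%)": [10, 20, 30]}  # placeholder values, not real data
)

merged = world.merge(
    rates,
    left_on="name",
    right_on="Country",
    how="left",             # keep every country from the map table
    indicator=True,         # adds a _merge column: "both" or "left_only"
    validate="one_to_one",  # raise if a name is duplicated on either side
)

# Countries that found no partner in the rates table.
unmatched = merged.loc[merged["_merge"] == "left_only", "name"].tolist()
print(unmatched)  # ['United States of America']
```

The same keyword arguments should apply when world is a GeoDataFrame, since its merge() follows the pandas interface.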

Conclusion#

This blog has introduced geospatial analysis, the GeoPandas library, and choropleth maps. We plotted numeric and categorical choropleth maps and walked through a scenario in which they apply.

If you want to learn more about choropleth maps, look no further! Check out the exciting new courses available on the Educative platform:

  1. Interactive Dashboards and Data Apps with Plotly and Dash

  2. Introduction to Data Science with Python

  3. Introduction to Geospatial Analysis with Python and GeoPandas


Written By:
Nimra Zaheer