Python’s pandas library is used to handle regular tabular data, while the geopandas library is an extension of pandas that provides functions for working with geospatial data, like maps and coordinates.
Key takeaways:
GeoPandas simplifies spatial data processing with built-in calculations such as area and boundary, and with plots such as choropleth and layered maps.
GeoPandas supports multiple data formats, such as JSON and SHP files, making it versatile for handling different types of spatial data.
Spatial data in GeoPandas is stored in GeoDataFrames, which represent geographical shapes like polygons, lines, and points for plotting locations and areas.
CRS (Coordinate Reference System) defines how the two-dimensional, flat map in GeoPandas relates to real places on Earth. The to_crs() method changes the coordinate reference system, which is needed for accurate area calculations when creating maps.
GeoPandas’ plot() method allows you to visualize population density using color maps, boundary lines, and legends for better data representation.
Customized visualizations can be easily created by combining GeoPandas with libraries like Matplotlib for enhanced map clarity and presentation.
The goal of GeoPandas is to make spatial data processing easier in Python. It provides high-level functions such as the calculation of area or boundary and basic, choropleth, layered, or interactive plots for multiple geometries and shapes. These capabilities are particularly valuable for visualizing population density, a key factor in urban planning, resource allocation, and understanding demographic patterns. For example, such visualizations help identify overpopulated areas needing infrastructure improvements or sparsely populated regions where resources may be underutilized. GeoPandas can read multiple data formats such as JSON or SHP files. It reads spatial data in the form of GeoSeries or GeoDataFrames representing complex polygons, linestrings, and points to plot geographical areas, paths, or locations.
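To make the area and boundary calculations mentioned above more concrete, here is a minimal plain-Python sketch of what they mean for a single polygon in planar coordinates. It does not use GeoPandas; the function names polygon_area and polygon_perimeter and the sample square are illustrative assumptions, and GeoPandas performs the equivalent work through its own geometry engine.

```python
import math


def polygon_area(ring):
    """Planar area of a simple polygon via the shoelace formula.

    `ring` is a list of (x, y) vertices; the first vertex need not be repeated.
    """
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(ring):
    """Length of the polygon's boundary (sum of its edge lengths)."""
    n = len(ring)
    return sum(math.dist(ring[i], ring[(i + 1) % n]) for i in range(n))


# Hypothetical 2 x 2 square in projected (planar) coordinates
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(polygon_area(square))       # area enclosed by the square
print(polygon_perimeter(square))  # length of its boundary
```

Both results are only meaningful when the coordinates are planar, which is why the coordinate reference system matters for real map data.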
GeoPandas also enables reprojecting geographical data between coordinate reference systems (CRS), which is an integral part of spatial data processing. To give you a flavor of plots in GeoPandas, a boundary plot, which draws only the outlines of the geometries, can be created with the boundary.plot() method.
To create a world population density map, we will use a GeoJSON file containing global population data as a dataset. The dataset contains various attributes for each region, including:
NAME: The name of the country or region.
ISO_3_CODE and ISO_2_CODE: ISO country codes for identifying regions.
AREA: The total area of the region.
NAME_1: Subregion or additional name information.
POP2005: Population estimates from the year 2005.
REGION: The larger region to which the country belongs.
GMI_CNTRY: Country classification based on Gross National Income (GNI).
NAME_12: Another naming variant.
geometry: Geospatial data representing the region’s boundaries.
Note that a single geometry can consist of multiple polygons, and each polygon can contain hundreds of points. The geometry column stores these shapes as Polygon or MultiPolygon objects.
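As a rough illustration of how one geometry can hold several polygons, the sketch below builds two hand-written GeoJSON-style geometry dictionaries and counts their polygons and coordinate pairs. The coordinates and the helper name summarize_geometry are made up for illustration and are not taken from the dataset.

```python
def summarize_geometry(geometry):
    """Return (number of polygons, total number of coordinate pairs).

    Handles GeoJSON 'Polygon' and 'MultiPolygon' geometries only.
    """
    if geometry["type"] == "Polygon":
        polygons = [geometry["coordinates"]]
    elif geometry["type"] == "MultiPolygon":
        polygons = geometry["coordinates"]
    else:
        raise ValueError(f"Unsupported geometry type: {geometry['type']}")
    # Each polygon is a list of rings; each ring is a list of [x, y] pairs
    n_points = sum(len(ring) for polygon in polygons for ring in polygon)
    return len(polygons), n_points


# Hypothetical country: a mainland plus one small island
mainland = [[[0, 0], [4, 0], [4, 3], [0, 3], [0, 0]]]
island = [[[6, 1], [7, 1], [7, 2], [6, 1]]]
single = {"type": "Polygon", "coordinates": mainland}
multi = {"type": "MultiPolygon", "coordinates": [mainland, island]}

print(summarize_geometry(single))
print(summarize_geometry(multi))
```

Real country outlines follow the same nesting, only with far more coordinate pairs per ring.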
Since the DataFrame already contains the population attribute, only the area needs to be calculated to find the population density of each country. To calculate areas accurately, the geometry is reprojected to EPSG:6933, an equal-area projection whose units are meters.
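To see why the projection matters, consider a minimal sketch that computes the ground area of a 1° × 1° cell at two latitudes. It assumes a perfectly spherical Earth with a mean radius of 6371.0088 km, which is a simplification, and the function name cell_area_km2 is illustrative. Both cells span the same one square degree, yet their real areas differ, so areas computed directly from longitude and latitude values are not meaningful.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean radius; spherical-Earth assumption


def cell_area_km2(lat_south, lat_north, lon_west, lon_east):
    """Area of a latitude/longitude cell on a sphere, in km^2.

    Uses A = R^2 * (lon_east - lon_west) * (sin(lat_north) - sin(lat_south)),
    with all angles converted from degrees to radians.
    """
    d_lon = math.radians(lon_east - lon_west)
    band = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    return EARTH_RADIUS_KM ** 2 * d_lon * band


equator_cell = cell_area_km2(0, 1, 0, 1)
northern_cell = cell_area_km2(60, 61, 0, 1)
print(f"Cell at the equator: {equator_cell:,.0f} km^2")
print(f"Cell at 60 degrees north: {northern_cell:,.0f} km^2")
print(f"Ratio: {northern_cell / equator_cell:.2f}")
```

Under this assumption, the cell near 60° N covers roughly half the ground area of the cell at the equator.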
The following code calculates and plots population density using GeoPandas.
# Import relevant libraries
import geopandas as gpd
import matplotlib.pyplot as plt

# Read and process dataset
world_pop = gpd.read_file('https://raw.githubusercontent.com/MinnPost/simple-map-d3/master/example-data/world-population.geo.json')
world_pop['POP2005'] = world_pop['POP2005'].astype(float)
world_pop['area'] = world_pop.to_crs(6933).area.astype(float) * 0.000001  # m² to km²
world_pop['density'] = world_pop['POP2005'].div(world_pop['area'])
world_pop.head()

# Create population density map
ax = world_pop.plot(cmap='Blues', linewidth=0.2, scheme='quantiles',
                    edgecolor='gray', column='density',
                    legend=True, figsize=(10, 10),
                    legend_kwds={"loc": "center left", "bbox_to_anchor": (1, 0.5)})
ax.set_title('World Population Density Map')
plt.show()

Let’s understand the code above:
Lines 1–3: Import geopandas and matplotlib.
Lines 6–7: Read the data using the read_file() function of GeoPandas, then convert the POP2005 column, which holds the world population in the year 2005, to the float data type.
Line 8: Use the to_crs() method and the area attribute of GeoPandas to calculate the areas of countries after reprojecting to EPSG:6933. It also converts the area from m² to km².
Lines 9–10: Calculate density by dividing the population by the area, then show the first few rows of the DataFrame.
Lines 13–16: Create the density plot using the plot() method of GeoPandas. It uses the following arguments:
cmap sets the color map.
linewidth sets the width of the boundaries between countries.
scheme divides the density attribute into intervals; this option requires the mapclassify package to be installed.
edgecolor sets the color of the boundary lines.
column names the attribute used to color the regions, here density.
legend is set to True to include the legend in the figure.
figsize defines the size of the figure.
legend_kwds defines the location and anchor point of the legend.
Lines 17–18: Set the title on the axes returned by plot() using ax.set_title(), then display the figure with plt.show(). Setting the title on the returned axes ensures it appears on the map itself, because plot() creates its own figure when no axes are passed in.
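To illustrate what a quantile scheme does, the sketch below splits a handful of made-up density values into five classes with equal counts, using only the standard library. This is a simplified stand-in and not the classifier GeoPandas uses internally (GeoPandas hands the scheme argument to the mapclassify package, whose cut points may differ slightly); the sample values and variable names are assumptions.

```python
from bisect import bisect_left
from collections import Counter
from statistics import quantiles

# Hypothetical densities in people per km^2 (not taken from the dataset)
densities = [2.0, 15.0, 33.0, 48.0, 80.0, 120.0, 150.0, 240.0, 410.0, 1100.0]

# Four cut points divide the values into five quantile classes
cut_points = quantiles(densities, n=5)

# Class index 0 is the least dense group, 4 the densest
classes = [bisect_left(cut_points, d) for d in densities]
counts = Counter(classes)

for d, c in zip(densities, classes):
    print(f"{d:>7.1f} -> class {c}")
print(dict(counts))
```

Because each class receives the same number of regions, a quantile scheme keeps a few extremely dense places from compressing every other region into a single color.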