Line Distance Measurement
Discover how to efficiently calculate distances and create insightful plots using geopy and Plotly.
Now that we know how to import and filter the data, derive extra features from coordinates, and plot the locations on a map, we can focus on calculating the distance between the stores.
Note: Please zoom in on the dots in the output of the widgets below by clicking the plus icon in the top right corner to view the locations clearly.
```python
import pandas as pd
import plotly.graph_objects as go

df = pd.read_csv('MoscowMcD.csv')

# Plot the data
fig = go.Figure(data=go.Scattergeo(
    lon=df['lon'],
    lat=df['lat'],
    text=df['Store'],
    mode='markers',
    marker_color='red',
))

fig.update_layout(
    title='Stores in Moscow<br>(Zoom and hover for store names)',
    geo_scope='europe',
)

fig.write_html("index.html")
```
Since we know how to plot store locations on a map, it’s easy to draw lines between them. The add_trace command connects the store locations using lines as mode, with width set to 1 and color set to red.
```python
import pandas as pd
import plotly.graph_objects as go

df = pd.read_csv('MoscowMcD.csv')

fig = go.Figure()

# go.Scattergeo draws the stores as a scatter plot on a map
fig.add_trace(go.Scattergeo(
    lon=df['lon'],
    lat=df['lat'],
    # Show only the store name when hovering over a marker
    hoverinfo='text',
    text=df['Store'],
    # 'markers' mode represents each store by a marker on the map
    mode='markers',
    marker=dict(
        size=4,
        color='rgb(255, 0, 0)',
        line=dict(width=3, color='rgba(68, 68, 68, 0)'),
    ),
))

# Connect each store to the next one in the DataFrame with a line
for i in range(len(df) - 1):
    fig.add_trace(
        go.Scattergeo(
            lon=[df['lon'].iloc[i], df['lon'].iloc[i + 1]],
            lat=[df['lat'].iloc[i], df['lat'].iloc[i + 1]],
            mode='lines',
            line=dict(width=1, color='red'),
            # Opacity scales with the longitude of the segment's first store,
            # so segments that start at lower longitudes are more transparent
            opacity=float(df['lon'].iloc[i]) / float(df['lon'].max()),
        )
    )

# Set the title, the map projection, and the land and country-border colors
fig.update_layout(
    title_text='Stores in Moscow<br>(Zoom and hover for store names)',
    showlegend=False,
    geo=dict(
        scope='europe',
        projection_type='azimuthal equal area',
        showland=True,
        landcolor='rgb(243, 243, 243)',
        countrycolor='rgb(204, 204, 204)',
    ),
)

fig.write_html("index.html")
```
But does it make sense to draw lines between all stores?
Even though we could continue connecting all stores for one specific route to give us a visual representation of the route, the problem is that we can’t easily convert these lines into distance metrics, such as miles or kilometers. Let’s find another way to calculate the distances between stores.
```python
import pandas as pd
import plotly.graph_objects as go

df = pd.read_csv("MoscowMcD.csv")

fig = go.Figure()

# Plot the stores as markers
fig.add_trace(go.Scattergeo(
    lon=df['lon'],
    lat=df['lat'],
    hoverinfo='text',
    text=df['Store'],
    mode='markers',
    marker=dict(
        size=4,
        color='rgb(255, 0, 0)',
        line=dict(width=3, color='rgba(68, 68, 68, 0)'),
    ),
))

# Build one closed route: visit the stores in DataFrame order,
# then return to the first store
route_lon = df['lon'].tolist() + [df['lon'].iloc[0]]
route_lat = df['lat'].tolist() + [df['lat'].iloc[0]]

# A single trace is enough to draw the whole route
fig.add_trace(go.Scattergeo(
    lon=route_lon,
    lat=route_lat,
    mode='lines',
    line=dict(width=1, color='red'),
))

fig.update_layout(
    title_text='Stores in Moscow<br>(Zoom and hover for store names)',
    showlegend=False,
    geo=dict(
        scope='europe',
        projection_type='azimuthal equal area',
        showland=True,
        landcolor='rgb(243, 243, 243)',
        countrycolor='rgb(204, 204, 204)',
    ),
)

fig.write_html("index.html")
```
First of all, we have to decide which distance measure we want to use. We can choose between surface and straight-line distance. Surface distance is measured along the earth’s surface rather than through it. Great-circle distance (the shortest path on a sphere) and geodesic distance (the shortest path on an ellipsoidal model of the earth) are two such measures. Both follow the curvature of the earth but treat the surface as smooth, so they don’t account for obstacles such as mountains, rivers, and other geographical features. Euclidean (straight-line) distance, on the other hand, ignores the curvature of the earth altogether. Between the same two points, the surface distance is therefore never shorter than the straight-line distance, and a path that also followed the ups and downs of the actual ground would be longer still.
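To make the difference between the two measures concrete, here is a minimal sketch that uses only the Python standard library. It assumes a perfectly spherical earth with a mean radius of 6371.0088 km and uses the haversine formula. The function names and the two coordinate pairs are made up for illustration and aren't taken from the dataset. geopy ships its own great-circle and geodesic implementations in geopy.distance, so this is only meant to show what such a calculation involves.

```python
import math

# Assumption: the earth is a sphere with this mean radius (in kilometers)
EARTH_RADIUS_KM = 6371.0088


def central_angle(lat1, lon1, lat2, lon2):
    """Angle in radians between two points, seen from the earth's center (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point rounding
    return 2 * math.asin(min(1.0, math.sqrt(a)))


def great_circle_km(lat1, lon1, lat2, lon2):
    """Surface distance: length of the arc along the sphere."""
    return EARTH_RADIUS_KM * central_angle(lat1, lon1, lat2, lon2)


def chord_km(lat1, lon1, lat2, lon2):
    """Straight-line distance: length of the chord through the sphere."""
    return 2 * EARTH_RADIUS_KM * math.sin(central_angle(lat1, lon1, lat2, lon2) / 2)


# Hypothetical coordinates (latitude, longitude), for illustration only
point_a = (55.75, 37.60)
point_b = (55.80, 37.70)

print("Great-circle distance (km):", great_circle_km(*point_a, *point_b))
print("Straight-line chord (km):  ", chord_km(*point_a, *point_b))
```

For two points within one city, the two values are almost identical. The gap only becomes noticeable over long distances, where the curvature of the earth matters.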
One-way distance between two locations
To start with, we select two locations from the dataset. With the isin method, we keep only the rows of the DataFrame whose Store column is StoreA or StoreB.
```python
import pandas as pd

df = pd.read_csv('MoscowMcD.csv')

StoreAB = df[df["Store"].isin(["StoreA", "StoreB"])].reset_index()
print(StoreAB)
```
Checking intermediate results regularly helps us catch errors early, so let's take a quick look at the DataFrame in the output of the widget above.
In this case, the ...