Line Distance Measurement

Discover how to efficiently calculate distances and create insightful plots using geopy and Plotly.

Now that we know how to import and filter the data, derive extra features from coordinates, and plot the locations on a map, we can focus on calculating the distance between the stores.

Note: Please zoom in on the dots in the output of the widgets below by clicking the plus icon in the top right corner to view the locations clearly.

import pandas as pd
import plotly.graph_objects as go
df = pd.read_csv('MoscowMcD.csv')
# Plot the data
fig = go.Figure(data=go.Scattergeo(
        lon = df['lon'],
        lat = df['lat'],
        text = df['Store'],
        mode = 'markers',
        marker_color = 'red',
        ))

fig.update_layout(
        title = 'Stores in Moscow<br>(Zoom and hover for store names)',
        geo_scope='europe',
    )

fig.write_html("index.html")
An interactive map of the stores

Since we know how to plot store locations on a map, it’s easy to draw lines between them. The add_trace method adds a Scattergeo trace whose mode is set to lines, with the line width set to 1 and the color set to red.

import pandas as pd
import plotly.graph_objects as go

# Load the store locations (columns used below: Store, lat, lon)
df = pd.read_csv('MoscowMcD.csv')

fig = go.Figure()

# go.Scattergeo draws the stores as a scatter plot on a map
fig.add_trace(go.Scattergeo(
    # locationmode only controls how entries in `locations` are matched to regions;
    # it has no effect here because the points are given as lon/lat coordinates
    locationmode = 'ISO-3',
    lon = df['lon'],
    lat = df['lat'],
    # Show only the store name when hovering over a marker
    hoverinfo = 'text',
    text = df['Store'],
    # mode = 'markers' represents each store by a marker on the map
    mode = 'markers',
    marker = dict(
        size = 4,
        color = 'rgb(255, 0, 0)',
        line = dict(
            width = 3,
            color = 'rgba(68, 68, 68, 0)'
        )
    )))
# Draw a single line between two stores (the rows at index 1 and 2)
fig.add_trace(
    go.Scattergeo(
        # The first and last points of the line are the two store coordinates
        lon = [df['lon'][1], df['lon'][2]],
        lat = [df['lat'][1], df['lat'][2]],
        mode = 'lines',
        line = dict(width = 1, color = 'red'),
    )
)
# Finally, the update_layout function is used to specify the title of the plot, the map projection, and the colors of the land and country borders
fig.update_layout( 
    title_text = 'Stores in Moscow<br>(Zoom and hover for store names)',
    showlegend = False,
    geo = dict(
        scope = 'europe',
        projection_type = 'azimuthal equal area',
        showland = True,
        landcolor = 'rgb(243, 243, 243)',
        countrycolor = 'rgb(204, 204, 204)',
    ),
)

fig.write_html("index.html")
The add_trace command connects the store locations with a line

But does it make sense to draw lines between all stores?

Even though we could continue connecting all stores for one specific route to get a visual representation of that route, the problem is that we can’t easily convert these lines into distances in units such as miles or kilometers. Let’s find another way to calculate the distances between stores.

import pandas as pd
import plotly.graph_objects as go

df = pd.read_csv("MoscowMcD.csv")

fig = go.Figure()

fig.add_trace(go.Scattergeo(
    locationmode = 'ISO-3',
    lon = df['lon'],
    lat = df['lat'],
    hoverinfo = 'text',
    text = df['Store'],
    mode = 'markers',
    marker = dict(
        size = 4,
        color = 'rgb(255, 0, 0)',
        line = dict(
            width = 3,
            color = 'rgba(68, 68, 68, 0)'
        )
    )))

# Build one closed route: visit the first 13 stores in order, then return to the first one
route = df.iloc[:13]
route_lon = route['lon'].tolist() + [route['lon'].iloc[0]]
route_lat = route['lat'].tolist() + [route['lat'].iloc[0]]

# A single trace is enough to draw the whole route
fig.add_trace(
    go.Scattergeo(
        lon = route_lon,
        lat = route_lat,
        mode = 'lines',
        line = dict(width = 1, color = 'red'),
    )
)

fig.update_layout(
    title_text = 'Stores in Moscow<br>(Zoom and hover for store names)',
    showlegend = False,
    geo = dict(
        scope = 'europe',
        projection_type = 'azimuthal equal area',
        showland = True,
        landcolor = 'rgb(243, 243, 243)',
        countrycolor = 'rgb(204, 204, 204)',
    ),
)

fig.write_html("index.html")
One possible closed route

First, we have to decide which distance measure to use. We can choose between surface distance and straight-line distance. Surface distance is measured along the surface of the earth, so it is closer to the distance that actually has to be covered when moving from one location to another. Great-circle distance, which follows the curvature of a spherical model of the earth, is one such measure; geodesic distance applies the same idea to a more accurate ellipsoidal model. Neither of them takes terrain or roads into account. Euclidean (straight-line) distance, on the other hand, ignores the curvature of the earth altogether. The surface distance is therefore never shorter than the straight-line distance, although for two stores in the same city the difference is very small.
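
To make the difference between the two measures concrete, here is a minimal sketch that computes a great-circle distance with the haversine formula and the corresponding straight-line (chord) distance through the earth. It assumes a perfectly spherical earth with a mean radius of 6371.0088 km, and the two coordinates are made up for illustration; they are not taken from MoscowMcD.csv. The function names are hypothetical helpers, not part of any library.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean radius of a spherical earth


def central_angle(lat1, lon1, lat2, lon2):
    """Angle in radians between two points as seen from the earth's center (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # min() guards against rounding pushing the square root slightly above 1
    return 2 * math.asin(min(1.0, math.sqrt(a)))


def great_circle_km(lat1, lon1, lat2, lon2):
    """Distance along the surface of the sphere."""
    return EARTH_RADIUS_KM * central_angle(lat1, lon1, lat2, lon2)


def chord_km(lat1, lon1, lat2, lon2):
    """Straight-line distance through the sphere between the same two points."""
    return 2 * EARTH_RADIUS_KM * math.sin(central_angle(lat1, lon1, lat2, lon2) / 2)


# Hypothetical (lat, lon) pairs in central Moscow, for illustration only
point_a = (55.7558, 37.6173)
point_b = (55.7652, 37.6050)

print("great-circle km:", great_circle_km(*point_a, *point_b))
print("straight-line km:", chord_km(*point_a, *point_b))
```

At city scale the two numbers agree to many decimal places, so the choice matters mostly for long distances. The geopy library offers ready-made versions of the surface measures (geopy.distance.great_circle and geopy.distance.geodesic); because geodesic uses an ellipsoidal model, its results differ slightly from the spherical sketch above.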

One-way distance between two locations

To start with, we select two locations from the dataset. With the isin method, we select the rows of the DataFrame whose Store column is either StoreA or StoreB.

import pandas as pd
df = pd.read_csv('MoscowMcD.csv')
StoreAB = df[df["Store"].isin(["StoreA","StoreB"])].reset_index()
print(StoreAB)
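
The same selection pattern can be tried without the CSV file. The sketch below builds a small made-up DataFrame that reuses the column names Store, lat, and lon; the store names and coordinates are hypothetical stand-ins, not the real data. It also passes drop=True to reset_index, which is an optional variation on the code above: without it, pandas keeps the old row labels in an extra column named index.

```python
import pandas as pd

# Hypothetical stand-in for MoscowMcD.csv; the coordinates are made up
df = pd.DataFrame({
    "Store": ["StoreA", "StoreB", "StoreC", "StoreD"],
    "lat": [55.76, 55.75, 55.73, 55.79],
    "lon": [37.60, 37.62, 37.58, 37.65],
})

# isin returns a boolean mask that is True for rows whose Store value is in the list
mask = df["Store"].isin(["StoreB", "StoreD"])

# Keep only the matching rows and renumber them from 0 without keeping the old index
selected = df[mask].reset_index(drop=True)
print(selected)
```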

Checking intermediate results regularly helps catch avoidable errors early, so let's take a quick look at the DataFrame in the output of the widget above.

In this case, the ...