Traversing Data

Follow step-by-step instructions to calculate the distance between all locations with OpenStreetMap.

In the previous step, we calculated the distance between two places. To find the shortest total distance, however, we need the distance between every pair of locations. Rather than repeat the previous step manually for each pair, we’ll use a loop.

Distance between all locations

To get started, let’s calculate the distances between three stores: A, B, and C. With islice, we iterate over the rows returned by df.iterrows(), starting at row position counterFixed (0 here, so no rows are skipped). Inside the loop, the filter df.index != i drops the current row, which ensures that the distance from a store to itself is not calculated.
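To make the two building blocks concrete, here is a minimal offline sketch. The small DataFrame and its coordinates are made up for illustration (they are not taken from MoscowMcD.csv), and counterFixed is set to 1 only to make the skipping visible.

```python
from itertools import islice

import pandas as pd

# Hypothetical stand-in for the three stores; the coordinates are made up.
df = pd.DataFrame({
    "Store": ["StoreA", "StoreB", "StoreC"],
    "lat": [55.75, 55.76, 55.77],
    "lon": [37.61, 37.62, 37.63],
})

counterFixed = 1  # skip the first row; the lesson uses 0, which skips nothing

# islice(iterable, start, None) yields items from position `start` to the end.
visited = [i for i, _ in islice(df.iterrows(), counterFixed, None)]
print(visited)  # [1, 2]

# df.index != 1 is a boolean mask that is False only for the row with index 1.
others = df[df.index != 1]
print(others["Store"].tolist())  # ['StoreA', 'StoreC']
```

With counterFixed = 0, as in the lesson, visited would contain all three row indexes.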

import pandas as pd
import requests
import json
from itertools import islice

df = pd.read_csv('MoscowMcD.csv')
df = df[df["Store"].isin(["StoreA","StoreB","StoreC"])].reset_index()
counterFixed = 0
toggle = 0

def get_distance(point1: dict, point2: dict):
    url = f"""http://router.project-osrm.org/route/v1/car/{point1["lon"]},{point1["lat"]};{point2["lon"]},{point2["lat"]}?overview=false&alternatives=false"""
    r = requests.get(url)
    # Get the distance from the returned values
    route = json.loads(r.content)["routes"][0]
    return (route["distance"], route["duration"])

# For three stores there are 3*2 combinations because we don’t need the distance between one store and itself
listDist = []
for i, r in islice(df.iterrows(), counterFixed, None):
    point1 = {"lat": r["lat"], "lon": r["lon"]}
    for j, o in df[df.index != i].iterrows():
        point2 = {"lat": o["lat"], "lon": o["lon"]}
        dist, duration = get_distance(point1, point2)
        listDist.append((i, j, duration, dist))
        toggle = 1

distancesDf = pd.DataFrame(listDist, columns=["From", "To", "Duration(s)", "Distance(m)"])
distancesDf = distancesDf.merge(df[["Store"]], left_on = "From", right_index=True).rename(columns={"Store":"StartLocation"})
distancesDf = distancesDf.merge(df[["Store"]], left_on = "To", right_index=True).rename(columns={"Store":"Destination"})
print(distancesDf)
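The two merge calls at the end of the listing translate the From and To index numbers back into store names. The following sketch isolates that step on made-up data; the stores and pairs frames, and all duration and distance values in them, are hypothetical and only serve to show how left_on and right_index work together.

```python
import pandas as pd

# Hypothetical store table; its default index (0, 1, 2) plays the role of df.index.
stores = pd.DataFrame({"Store": ["StoreA", "StoreB", "StoreC"]})

# Made-up durations and distances, for illustration only.
pairs = pd.DataFrame(
    [(0, 1, 300.0, 2500.0), (1, 2, 420.0, 3100.0)],
    columns=["From", "To", "Duration(s)", "Distance(m)"],
)

named = (
    pairs
    # Match the "From" values against the index of `stores`.
    .merge(stores[["Store"]], left_on="From", right_index=True)
    .rename(columns={"Store": "StartLocation"})
    # Repeat for "To"; the earlier rename avoids a column-name clash.
    .merge(stores[["Store"]], left_on="To", right_index=True)
    .rename(columns={"Store": "Destination"})
)
print(named[["From", "To", "StartLocation", "Destination"]])
```

Renaming after each merge matters: without it, the second merge would bring in a second column called Store and pandas would have to disambiguate the two with suffixes.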

Iterate and calculate

The outer loop visits each of the three stores, and the inner loop calculates the distance from that store to each of the two other stores. This results in 3 × 2 = 6 routing requests.

for i, r in islice(df.iterrows(), counterFixed, None):
    point1 = {"lat": r["lat"], "lon": r["lon"]}
    for j, o in df[df.index != i].iterrows():
        point2 = {"lat": o["lat"], "lon": o["lon"]}
        dist, duration = get_distance(point1, point2)
        listDist.append((i, j, duration, dist))
        toggle = 1
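To try the loop structure without sending requests to the routing server, the sketch below swaps get_distance for a hypothetical straight_line_distance helper based on the haversine formula. This is an assumption made purely for offline experimentation: it returns great-circle distances, not road distances, and the coordinates are made up.

```python
import math

import pandas as pd


def straight_line_distance(point1: dict, point2: dict) -> float:
    """Great-circle distance in meters (haversine); NOT a road distance."""
    radius_m = 6_371_000  # mean Earth radius in meters
    phi1 = math.radians(point1["lat"])
    phi2 = math.radians(point2["lat"])
    dphi = phi2 - phi1
    dlmb = math.radians(point2["lon"] - point1["lon"])
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


# Hypothetical stand-in for the three stores; the coordinates are made up.
df = pd.DataFrame({
    "Store": ["StoreA", "StoreB", "StoreC"],
    "lat": [55.75, 55.76, 55.77],
    "lon": [37.61, 37.62, 37.63],
})

pairs = []
for i, r in df.iterrows():
    point1 = {"lat": r["lat"], "lon": r["lon"]}
    # Every row except the current one, so no store is paired with itself.
    for j, o in df[df.index != i].iterrows():
        point2 = {"lat": o["lat"], "lon": o["lon"]}
        pairs.append((i, j, straight_line_distance(point1, point2)))

print(len(pairs))  # 6
print([(i, j) for i, j, _ in pairs])
# [(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]
```

Both directions of each pair are kept on purpose. The stand-in is symmetric, but road routes can differ by direction (one-way streets, for example), which is a reason to request A to B and B to A separately.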

The result will be packed into a DataFrame with the From, To, Duration(s), and Distance(m) column headers. At this point, we have the index numbers of these stores. To also receive the store names (like StoreA, etc.), we can combine this DataFrame with the original to ...