Traversing Data
Follow step-by-step instructions to calculate the distance between all locations with OpenStreetMap.
So far, we have calculated the distance between two places. To find the shortest total distance, however, we need the distances between all pairs of locations. Rather than repeating the previous step manually for every pair, we’ll use a loop.
Distance between all locations
To get started, let’s first calculate the distances between three stores: A, B, and C. With islice, we iterate over the df.iterrows() iterable, starting at counterFixed. Using df.index != i, we ensure that the distance from any store to itself is not calculated.
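The two mechanisms described above can be tried without any network calls. The following is a minimal sketch on a hypothetical toy DataFrame: the coordinates are placeholders, not real store locations, and the pairs list is an illustrative name that does not appear in the lesson.

```python
from itertools import islice

import pandas as pd

# Hypothetical toy data; the coordinates are placeholders, not real locations.
df = pd.DataFrame(
    {
        "Store": ["StoreA", "StoreB", "StoreC"],
        "lat": [1.0, 2.0, 3.0],
        "lon": [10.0, 20.0, 30.0],
    }
)

counterFixed = 0  # row position at which the outer loop starts

pairs = []
# islice(iterable, start, None) skips the first `start` rows and yields the rest.
for i, r in islice(df.iterrows(), counterFixed, None):
    # df.index != i is a boolean mask that drops the current row,
    # so a store is never paired with itself.
    for j, o in df[df.index != i].iterrows():
        pairs.append((r["Store"], o["Store"]))

print(pairs)
```

With counterFixed set to 0, every store acts as a start location once. A larger value would skip that many rows in the outer loop only.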
import pandas as pd
import requests
import json
from itertools import islice

df = pd.read_csv('MoscowMcD.csv')
df = df[df["Store"].isin(["StoreA","StoreB","StoreC"])].reset_index()

counterFixed = 0
toggle = 0

def get_distance(point1: dict, point2: dict):
    url = f"""http://router.project-osrm.org/route/v1/car/{point1["lon"]},{point1["lat"]};{point2["lon"]},{point2["lat"]}?overview=false&alternatives=false"""
    r = requests.get(url)
    # Get the distance from the returned values
    route = json.loads(r.content)["routes"][0]
    return (route["distance"], route["duration"])

# For three stores there are 3*2 combinations because we don’t need the distance between one store and itself
listDist = []
for i, r in islice(df.iterrows(), counterFixed, None):
    point1 = {"lat": r["lat"], "lon": r["lon"]}
    for j, o in df[df.index != i].iterrows():
        point2 = {"lat": o["lat"], "lon": o["lon"]}
        dist, duration = get_distance(point1, point2)
        listDist.append((i, j, duration, dist))
    toggle = 1

distancesDf = pd.DataFrame(listDist, columns=["From", "To", "Duration(s)", "Distance(m)"])
distancesDf = distancesDf.merge(df[["Store"]], left_on = "From", right_index=True).rename(columns={"Store":"StartLocation"})
distancesDf = distancesDf.merge(df[["Store"]], left_on = "To", right_index=True).rename(columns={"Store":"Destination"})
print(distancesDf)
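To see what get_distance does without contacting the routing server, its two steps can be separated: building the request URL and reading the first route from the parsed response. The sketch below assumes the same URL pattern as the listing above. The helper names build_route_url and parse_route are hypothetical, and sample_payload is made-up data that only mimics the two fields the lesson reads.

```python
def build_route_url(point1: dict, point2: dict) -> str:
    """Build the route URL; note that coordinates are passed as lon,lat."""
    return (
        "http://router.project-osrm.org/route/v1/car/"
        f"{point1['lon']},{point1['lat']};{point2['lon']},{point2['lat']}"
        "?overview=false&alternatives=false"
    )


def parse_route(payload: dict) -> tuple:
    """Return (distance, duration) of the first route in a parsed response."""
    route = payload["routes"][0]
    return (route["distance"], route["duration"])


# Made-up payload for illustration; the numbers are not real measurements.
sample_payload = {"routes": [{"distance": 1200.5, "duration": 300.0}]}

url = build_route_url({"lat": 2.0, "lon": 1.0}, {"lat": 4.0, "lon": 3.0})
dist, duration = parse_route(sample_payload)
print(url)
print(dist, duration)
```

Keeping the parsing separate from the HTTP request makes it possible to check the longitude-first ordering and the field names before sending any requests.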
Iterate and calculate
The loop iterates over the three selected stores and, for each one, calculates the distance to each of the other two stores.
for i, r in islice(df.iterrows(), counterFixed, None):
    point1 = {"lat": r["lat"], "lon": r["lon"]}
    for j, o in df[df.index != i].iterrows():
        point2 = {"lat": o["lat"], "lon": o["lon"]}
        dist, duration = get_distance(point1, point2)
        listDist.append((i, j, duration, dist))
    toggle = 1
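The nested loop visits every ordered pair of distinct stores, which is why three stores produce 3*2 entries. As a quick check, the same set of pairs can be listed with itertools.permutations. This is an alternative way to enumerate the pairs, not the lesson's own approach, and the variable names are illustrative.

```python
from itertools import permutations

stores = ["StoreA", "StoreB", "StoreC"]

# Ordered pairs of distinct stores: (A, B) and (B, A) both appear,
# because the driving distance may differ by direction.
ordered_pairs = list(permutations(stores, 2))

print(len(ordered_pairs))  # 6, that is 3 * 2
for start, destination in ordered_pairs:
    print(start, "->", destination)
```

In general, n stores require n*(n-1) routing requests, so the number of calls grows quickly as more locations are added.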
The result will be packed into a DataFrame with the From, To, Duration(s), and Distance(m) column headers. At this point, we have the index numbers of these stores. To also receive the store names (like StoreA, etc.), we can combine this DataFrame with the original to ...
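The two merge calls at the end of the full listing can be illustrated on toy data. The sketch below assumes a small stores table whose index matches the From and To numbers. The duration and distance values are made up for illustration, and the variable names stores, distances, and named are hypothetical.

```python
import pandas as pd

# Hypothetical lookup table: the index (0, 1, 2) identifies each store.
stores = pd.DataFrame({"Store": ["StoreA", "StoreB", "StoreC"]})

# Made-up results in the same shape as the lesson's distance table.
distances = pd.DataFrame(
    [(0, 1, 60.0, 500.0), (1, 2, 90.0, 800.0)],
    columns=["From", "To", "Duration(s)", "Distance(m)"],
)

# Match the "From" number against the index of the stores table,
# then rename the joined column so it describes the start location.
named = distances.merge(
    stores[["Store"]], left_on="From", right_index=True
).rename(columns={"Store": "StartLocation"})

# Repeat for the "To" number to obtain the destination name.
named = named.merge(
    stores[["Store"]], left_on="To", right_index=True
).rename(columns={"Store": "Destination"})

print(named)
```

Renaming after each merge matters here: without it, the second merge would bring in a second Store column and pandas would add suffixes to tell the two apart.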