I have a dataframe of 1100 rows with moving data: things like origin cities and countries as well as destination cities and countries.
The process I'm working through involves taking city names (e.g., Portland, Oregon) and sending them to the Nominatim search endpoint (https://nominatim.openstreetmap.org/search/) to pull out the latitude and longitude.
I found a pretty good one-off example on Stack Overflow:
import requests
import urllib.parse
address = 'Portland, Oregon'
url = 'https://nominatim.openstreetmap.org/search/' + urllib.parse.quote(address) +'?format=json'
response = requests.get(url).json()
print(response[0]["lat"])
print(response[0]["lon"])
This works great even when I have non-city entries (e.g., Texas, United States or Bavaria, Germany).
The issue I'm running into now is that I can't quite get the code to run down my list of locations in my dataframe column and pull out the info I need.
Here is my code:
segment1 = 'https://nominatim.openstreetmap.org/search/'
segment3 = '?format=json'
df1['json_location_data'] = df1.apply(lambda x: requests.get(segment1 + urllib.parse.quote(str(df1['Origin'])) + segment3).json())
I'm getting an error that reads:
ValueError: Expected a 1D array, got an array with shape (1100, 17)
I'm not sure how to fix this error, so I created a reproducible example here:
import pandas as pd
import requests
import urllib.parse
locations = ['Portland, Oregon', 'Seattle, Washington','New York, New York','Texas, United States']
df = pd.DataFrame(locations, columns=['locations'])
segment1 = 'https://nominatim.openstreetmap.org/search/'
segment3 = '?format=json'
df['json_location_data'] = df.apply(lambda x: requests.get(segment1 + urllib.parse.quote(str(df['locations'])) + segment3).json())
This works without producing any errors, but returns a column with all NAs.
How can I solve this issue and get the desired data?
Here's a version that works. Note that I'm extracting only the latitude and longitude from the rather large structure that gets returned.
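The core of the fix is to call `apply` on the single column, so that the function receives one address at a time. In the question's code, `df.apply` with the default axis calls the lambda once per column, and the lambda ignores its argument `x` and stringifies the whole `df['locations']` column instead. What follows is a minimal sketch of that idea, not a definitive implementation. The helper names (`build_url`, `fetch_json`, `lat_lon`, `add_coordinates`), the injectable `fetch` parameter, the conversion to `float`, and the `(None, None)` fallback for empty results are all assumptions made for illustration. The URL is built exactly as in the question.

```python
import urllib.parse

import pandas as pd

# Same URL pieces as in the question.
BASE_URL = "https://nominatim.openstreetmap.org/search/"
SUFFIX = "?format=json"


def build_url(address):
    """Build the search URL for a single address string."""
    return BASE_URL + urllib.parse.quote(str(address)) + SUFFIX


def fetch_json(url):
    """Default fetcher (hypothetical helper): one HTTP GET, decoded as JSON."""
    import requests  # imported lazily so the other helpers can run offline

    return requests.get(url, timeout=10).json()


def lat_lon(address, fetch=fetch_json):
    """Return (lat, lon) of the first match, or (None, None) if nothing matched."""
    results = fetch(build_url(address))
    if not results:
        return (None, None)
    return (float(results[0]["lat"]), float(results[0]["lon"]))


def add_coordinates(df, column, fetch=fetch_json):
    """Return a copy of df with 'lat' and 'lon' columns looked up from `column`."""
    out = df.copy()
    # Series.apply passes each cell value (one address) to the function,
    # unlike DataFrame.apply, which passes a whole column or row.
    coords = out[column].apply(lambda address: lat_lon(address, fetch))
    out["lat"] = [c[0] for c in coords]
    out["lon"] = [c[1] for c in coords]
    return out
```

With the reproducible example from the question, this would be called as `df = add_coordinates(df, 'locations')`. Passing a different `fetch` function makes it possible to try the logic without hitting the network. For the full 1100 rows, note that the public Nominatim server has a usage policy that limits request rates, so a short pause between requests is probably worthwhile.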
Output: