How to create new pandas column while checking for NoneType


I am working on creating a dataframe of location data using Nominatim to pull out longitude and latitude from city names. Here is my code so far:

from geopy.geocoders import Nominatim
import pandas as pd

geolocator = Nominatim(user_agent="measurements", timeout=3)
locations = ['Portland, Oregon', 'Seattle, Washington','New York, New York','Columbia, South Carolina']
df = pd.DataFrame(locations, columns=['locations'])

df['latlon'] = df['locations'].apply(lambda x: geolocator.geocode(x, addressdetails=True, language='en'))

I want to pull out the latitude and longitude from the 2nd column but have been having issues parsing the Location(...) data properly. So, I've edited my code above to parse it more directly:

df['lat'] = df['locations'].apply(lambda x: geolocator.geocode(x, addressdetails=True, language='en').latitude)
df['lon'] = df['locations'].apply(lambda x: geolocator.geocode(x, addressdetails=True, language='en').longitude)

This all works for my reproducible example above, but I am running into a problem where I get the following error:

AttributeError: 'NoneType' object has no attribute 'latitude'

How can I rewrite the second chunk of code so that it checks whether the geocode result is None and keeps evaluating the remaining rows when it is?


There are 2 best solutions below

lant On BEST ANSWER

The error occurs when geolocator.geocode finds no match and returns None. Check the result for None before reading its attributes:

def extract_coordinates(location):
    # Geocode once per row; geocode() returns None when nothing matches.
    geocode_result = geolocator.geocode(location, addressdetails=True, language='en')
    if geocode_result is not None:
        latitude = geocode_result.latitude
        longitude = geocode_result.longitude
        return latitude, longitude
    else:
        return None, None


# Expand the (latitude, longitude) tuples into two new columns.
df[['latitude', 'longitude']] = df['locations'].apply(extract_coordinates).apply(pd.Series)

print(df)
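
The same None check can also be applied to results that have already been fetched, which avoids geocoding every location twice as the lat/lon lambdas in the question do. The following is only a minimal sketch: coords_or_none is a hypothetical helper name, and SimpleNamespace objects with made-up coordinates stand in for geopy results so that it runs without network access.

```python
from types import SimpleNamespace


def coords_or_none(result):
    """Return (latitude, longitude), or (None, None) if the lookup failed."""
    if result is None:
        return (None, None)
    return (result.latitude, result.longitude)


# Stand-ins for geocode() results (assumption: illustrative values only).
# The second entry mimics a location that could not be found.
results = [SimpleNamespace(latitude=45.52, longitude=-122.67), None]

pairs = [coords_or_none(r) for r in results]
lats, lons = zip(*pairs)  # split the pairs into two parallel sequences
```

With the question's DataFrame, the equivalent would presumably be to apply the check to the existing column, for example df['lat'] = df['latlon'].apply(lambda loc: loc.latitude if loc is not None else None), and likewise for longitude.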
Corralien On

You can extract latitude and longitude from the existing latlon column with the str accessor, which indexes into each element:

latlon = pd.DataFrame.from_records(df['latlon'].str[1], columns=['lat', 'lon'])
df = pd.concat([df, latlon], axis=1)

Output:

>>> df
                  locations                    latlon        lat         lon
0          Portland, Oregon  (Portland, Multnomah ...  45.520247 -122.674194
1       Seattle, Washington  (Seattle, King County...  47.603832 -122.330062
2        New York, New York  (New York, United Sta...  40.712728  -74.006015
3  Columbia, South Carolina  (South Carolina, Elm ...  38.889744  -77.040861
4        Nether, Underworld                      None        NaN         NaN
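
This works because .str[1] takes element 1 of each value, and a geopy Location can, as far as I know, be indexed like the tuple (address, (latitude, longitude)). The sketch below shows that indexing step on its own; plain tuples with illustrative coordinates stand in for Location objects (an assumption made so it runs offline), and coordinate_pair is a hypothetical helper name.

```python
# Plain tuples stand in for geopy Location objects (assumption), using the
# same (address, (latitude, longitude)) layout. Values are illustrative.
results = [
    ("Portland, Oregon", (45.52, -122.67)),
    None,  # what a failed lookup looks like
]


def coordinate_pair(result):
    """Return element 1 of a result, or (None, None) when it is missing."""
    return result[1] if result is not None else (None, None)


rows = [coordinate_pair(r) for r in results]
lat, lon = rows[0]  # unpack the first coordinate pair
```

The str accessor handles the missing row on its own by yielding NaN, so the explicit None check is only needed when indexing by hand like this.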