I am trying to tease out the dates on which I was in a certain area (within a mile or so), using Google Location History data in a pandas DataFrame.
First convert to latitude from latitudeE7:
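import json
from pandas import json_normalize

with open(Takeout_google_location_history) as f:
    data = json.loads(f.read())
df = json_normalize(data['locations'])
# The E7 columns store degrees multiplied by 1e7; divide to get plain degrees.
df['latitudeE7'] = df['latitudeE7'].div(10000000.0)
df['longitudeE7'] = df['longitudeE7'].div(10000000.0)
df.head()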
Then calculate the distance:
import haversine as hs
from haversine import Unit
loc1 = (31.393300,-99.070050)
df['diff'] = hs.haversine(loc1,(df['latitudeE7'],df['longitudeE7']),unit=Unit.MILES)
df.head()
This raises the following error:
~\Anaconda2\envs\notebook\lib\site-packages\haversine\haversine.py in
haversine(point1, point2, unit)
92 lat1 = radians(lat1)
93 lng1 = radians(lng1)
---> 94 lat2 = radians(lat2)
95 lng2 = radians(lng2)
96
~\Anaconda2\envs\notebook\lib\site-packages\pandas\core\series.py in wrapper(self)
183 if len(self) == 1:
184 return converter(self.iloc[0])
--> 185 raise TypeError(f"cannot convert the series to {converter}")
186
187 wrapper.__name__ = f"__{converter.__name__}__"
TypeError: cannot convert the series to <class 'float'>
I am not sure what to do with the data to make it a float.
I have tried:
df['latitudeE7'] = df['latitudeE7'].div(10000000.0).astype(float)
I also tried a hand-written distance function:
import math
def distance(origin, destination):
    lat1, lon1 = origin
    lat2, lon2 = destination
    radius = 6371 # km
    dlat = math.radians(float(lat2) - lat1)
    dlon = math.radians(float(lon2) - lon1)
    a = (math.sin(dlat / 2) * math.sin(dlat / 2) +
         math.cos(math.radians(lat1)) * math.cos(math.radians(lat2)) *
         math.sin(dlon / 2) * math.sin(dlon / 2))
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    d = radius * c
    return d
I still get the same error:
~\AppData\Local\Temp/ipykernel_22916/3664391511.py in distance(origin, destination)
26 radius = 6371 # km
27
---> 28 dlat = math.radians(float(lat2) - lat1)
29 dlon = math.radians(float(lon2) - lon1)
30 a = (math.sin(dlat / 2) * math.sin(dlat / 2) +
~\Anaconda2\envs\notebook\lib\site-packages\pandas\core\series.py in wrapper(self)
183 if len(self) == 1:
184 return converter(self.iloc[0])
--> 185 raise TypeError(f"cannot convert the series to {converter}")
186
187 wrapper.__name__ = f"__{converter.__name__}__"
TypeError: cannot convert the series to <class 'float'>
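You cannot pass a pd.Series directly to the haversine function. It expects each point to be a (latitude, longitude) pair of scalar values, and math.radians (like float) cannot convert a whole Series at once, which is what the TypeError is reporting. The distance has to be computed one row at a time instead.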
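As a minimal sketch of the row-by-row approach, the example below uses DataFrame.apply with axis=1 so that the distance function only ever sees scalars. The small DataFrame is hypothetical sample data standing in for the Takeout file, and distance_miles is a hand-written stand-in (with an approximate Earth radius) rather than the haversine package; a call such as hs.haversine(loc1, (row['latitudeE7'], row['longitudeE7']), unit=Unit.MILES) should slot into the same lambda.

```python
import math

import pandas as pd

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius (assumption)


def distance_miles(origin, destination):
    """Great-circle distance between two (lat, lon) pairs given in scalar degrees."""
    lat1, lon1 = origin
    lat2, lon2 = destination
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
         * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


loc1 = (31.393300, -99.070050)

# Hypothetical sample rows; the columns are assumed to hold plain degrees already.
df = pd.DataFrame({
    "latitudeE7": [31.393300, 31.400000, 32.000000],
    "longitudeE7": [-99.070050, -99.070050, -99.070050],
})

# axis=1 passes one row at a time, so every coordinate is a scalar float.
df["diff"] = df.apply(
    lambda row: distance_miles(loc1, (row["latitudeE7"], row["longitudeE7"])),
    axis=1,
)
print(df)
```

Row-wise apply is easy to read but calls the Python function once per row, so it can be slow on a full location history.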
Reference:
The issue you have seems related to the following post: understanding math errors in pandas dataframes
[EDIT]
If the number of rows is large, haversine_vector will be the better method in terms of speed.
Code:
# Preparation:
# Speed test 1 (Use haversine)
# Speed test 2 (Use haversine_vector)
Reference:
haversine_vector: document
haversine_vector: implementation
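To illustrate why a vectorized calculation helps, here is a rough sketch that computes all distances in one NumPy expression and then pulls out the dates within a mile of the reference point. This is a hand-written NumPy version of the haversine formula, not the library's haversine_vector, and the sample rows and the timestampMs column name are assumptions about how the Takeout data looks.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius (assumption)


def haversine_miles_np(origin, lat, lon):
    """Distance in miles from one (lat, lon) origin to arrays of lat/lon degrees."""
    lat1, lon1 = np.radians(origin[0]), np.radians(origin[1])
    lat2 = np.radians(np.asarray(lat, dtype=float))
    lon2 = np.radians(np.asarray(lon, dtype=float))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * np.arcsin(np.sqrt(a))


loc1 = (31.393300, -99.070050)

# Hypothetical sample rows; timestampMs is assumed to be epoch milliseconds as text.
df = pd.DataFrame({
    "timestampMs": ["1609459200000", "1609545600000", "1609632000000"],
    "latitudeE7": [31.393300, 31.400000, 32.000000],
    "longitudeE7": [-99.070050, -99.070050, -99.070050],
})

# One call handles every row; no Python-level loop is involved.
df["diff"] = haversine_miles_np(loc1, df["latitudeE7"], df["longitudeE7"])

# Keep rows within one mile and reduce their timestamps to unique calendar dates (UTC).
nearby = df[df["diff"] <= 1.0]
dates = pd.to_datetime(nearby["timestampMs"].astype("int64"), unit="ms").dt.date.unique()
print(dates)
```

The one-mile threshold and the UTC date conversion are illustrative choices; a local time zone may matter if visits fall close to midnight.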