How to get temperature measurements for location and time values in a dataframe?


I have a pandas dataframe consisting of geo-locations and a time in the past.

import pandas as pd

geo_time = pd.read_csv(r'geo_time.csv')
print(geo_time)

> +---------+---------+---------+-------------------+ 
  | latitude|longitude| altitude|              start|
  +---------+---------+---------+-------------------+ 
  |  48.2393|  11.5713|      520|2020-03-12 13:00:00|
  +---------+---------+---------+-------------------+ 
  |  35.5426| 139.5975|        5|2020-07-31 18:00:00|
  +---------+---------+---------+-------------------+ 
  |  49.2466|-123.2214|        5|2020-06-23 11:00:00|
  +---------+---------+---------+-------------------+ 
  ...

I want to add the temperature at each of these locations and times as a new column, using the Meteostat library for Python.

The library has a "Point" class. For a single location, it works like this:

location = Point(40.416775, -3.703790, 660)

You can now pass this to the "Hourly" class, which gives you a dataframe of different climatic variables. Normally you pass "start" and "end" to get values for every hour in that range, but passing "start" twice gives you only one row for the desired time. The output below is just an example of what the dataframe looks like.

data = Hourly(location, start, start).fetch()
print (data)

>                      temp  dwpt  rhum  prcp  ...  wpgt    pres  tsun  coco
time                                         ...                          
2020-01-10 01:00:00 -15.9 -18.8  78.0   0.0  ...   NaN  1028.0   NaN   0.0

What I want to do now is use the values from the dataframe "geo_time" as parameters for these classes to get a temperature for every row. My naive idea was the following:

geo_time['location'] = Point(geo_time['latitude'], geo_time['longitude'], geo_time['altitude'])

data = Hourly(geo_time['location'], geo_time['start'], geo_time['start'])

Afterwards, I would add the "temp" column from "data" to "geo_time".

Does anyone have an idea how to solve this problem, or know whether Meteostat is even capable of doing this?

Thanks in advance!

Answer by Laurent (best answer)

With the dataframe you provided:

import pandas as pd
from meteostat import Hourly, Point

df = pd.DataFrame(
    {
        "latitude": [48.2393, 35.5426, 49.2466],
        "longitude": [11.5713, 139.5975, -123.2214],
        "altitude": [520, 5, 5],
        "start": ["2020-03-12 13:00:00", "2020-07-31 18:00:00", "2020-06-23 11:00:00"],
    }
)

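"Point" and "Hourly" take single values rather than whole columns, so they have to be built row by row. Here is one way to do it with the pandas to_datetime and apply methods: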

df["start"] = pd.to_datetime(df["start"], format="%Y-%m-%d %H:%M:%S")

df["temp"] = df.apply(
    lambda x: Hourly(
        Point(x["latitude"], x["longitude"], x["altitude"]),
        x["start"],
        x["start"],
    )
    .fetch()["temp"]
    .values[0],
    axis=1,
)

Then:

print(df)
# Output
   latitude  longitude  altitude               start  temp
0   48.2393    11.5713       520 2020-03-12 13:00:00  16.8
1   35.5426   139.5975         5 2020-07-31 18:00:00  24.3
2   49.2466  -123.2214         5 2020-06-23 11:00:00  14.9
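
One caveat with the apply call above: `.values[0]` raises an IndexError when Meteostat returns no rows for a given location and time. A minimal sketch of a more defensive variant follows, assuming the same `Hourly`/`Point` interface used in the question. The helper names `first_temp` and `temp_for_row` and the `fetch` parameter are hypothetical, introduced here only so the lookup can be swapped out or tried offline.

```python
import math


def first_temp(frame):
    """Return the first "temp" value of a result, or NaN if there is none."""
    column = frame.get("temp")  # .get works for a DataFrame and for a plain dict
    values = [] if column is None else list(column)
    return float(values[0]) if values else math.nan


def temp_for_row(row, fetch=None):
    """Look up the temperature for one row (latitude, longitude, altitude, start).

    `fetch` is a hypothetical hook taking (lat, lon, alt, when) and returning
    a DataFrame-like result; when omitted, Meteostat is queried.
    """
    if fetch is None:
        # Imported lazily so the helper can be exercised without Meteostat
        # installed; the real lookup needs network access.
        from meteostat import Hourly, Point

        def fetch(lat, lon, alt, when):
            # Passing the same timestamp as start and end yields a single hour.
            return Hourly(Point(lat, lon, alt), when, when).fetch()

    result = fetch(row["latitude"], row["longitude"], row["altitude"], row["start"])
    return first_temp(result)


# Usage with the dataframe above, after "start" has been converted with
# pd.to_datetime:
# df["temp"] = df.apply(temp_for_row, axis=1)
```

Rows without an observation then end up as NaN in the "temp" column instead of aborting the whole apply, and they can be filtered afterwards with `df["temp"].isna()`.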