I am using rain as an instrumental variable, so I need to attach the historical probability of rain, given location and time, to each row. I prefer Python, since I clean most of my data in Python.
| County | State | Date | Rain |
|---|---|---|---|
| Fulton | SC | 2019-1-1 | ? |
| Chatham | GA | 2017-9-3 | ? |
I am probably looking for a Python library and some code to look up each date and create the column. Any help would be appreciated! Thank you!
The obvious answer is that a probability does not exist in historical / observed datasets. Probabilities come from probabilistic weather forecasts. Once the weather has passed, you can only say whether it rained or not, i.e. 1 or 0.
But from a data science perspective there are alternatives. For example, you can build a similarity tree or an Analog Ensemble to estimate the probability of rain for certain weather patterns. For that you need more information about the weather and the weather regime. In the end, your information will be independent of the specific date: the probability will be a function of, for example, the day of the year.
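To make the "function of the day of the year" idea concrete, here is a minimal sketch that estimates a climatological rain frequency per location and calendar day, then joins it onto the rows. This is not an Analog Ensemble, just the simplest frequency estimate. It assumes you already have a table of daily observations per county; the `precip_mm` column, the 0.1 mm threshold, the function names, and all numbers in the toy `history` table are made up for illustration and are not real measurements.

```python
import pandas as pd

KEYS = ["County", "State", "month", "day"]


def _with_month_day(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with helper columns for the calendar month and day."""
    out = df.copy()
    dates = pd.to_datetime(out["Date"])
    out["month"] = dates.dt.month
    out["day"] = dates.dt.day
    return out


def rain_climatology(history: pd.DataFrame, threshold_mm: float = 0.1) -> pd.DataFrame:
    """Share of observed years with rain, per location and calendar day.

    Assumes `history` has County, State, Date and a hypothetical
    `precip_mm` column holding the observed daily precipitation.
    """
    obs = _with_month_day(history)
    # Observed rain is binary: 1.0 if it rained that day, else 0.0.
    obs["Rain"] = (obs["precip_mm"] >= threshold_mm).astype(float)
    # The mean of the 0/1 indicator over years is the empirical frequency.
    return obs.groupby(KEYS, as_index=False)["Rain"].mean()


def add_rain_probability(rows: pd.DataFrame, climatology: pd.DataFrame) -> pd.DataFrame:
    """Attach the climatological rain frequency to each row as `Rain`."""
    out = _with_month_day(rows).drop(columns="Rain", errors="ignore")
    out = out.merge(climatology, on=KEYS, how="left")
    return out.drop(columns=["month", "day"])


# Toy history: invented values, only to show the mechanics.
history = pd.DataFrame({
    "County": ["Fulton"] * 4 + ["Chatham"] * 4,
    "State": ["SC"] * 4 + ["GA"] * 4,
    "Date": ["2016-01-01", "2017-01-01", "2018-01-01", "2019-01-01",
             "2014-09-03", "2015-09-03", "2016-09-03", "2017-09-03"],
    "precip_mm": [0.0, 0.0, 1.2, 0.0, 5.0, 0.0, 2.0, 0.0],
})

rows = pd.DataFrame({
    "County": ["Fulton", "Chatham"],
    "State": ["SC", "GA"],
    "Date": ["2019-01-01", "2017-09-03"],
})

result = add_rain_probability(rows, rain_climatology(history))
print(result)
```

With this toy history, Fulton gets 0.25 (rain in one of four years) and Chatham gets 0.5. In practice you would want many more years per location, and probably a window of a few days around each calendar day, so that the estimate is not driven by a handful of observations.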