I have the following Spark table:
| lon | lat | Rainf | Evap | AvgSurfT |
|---|---|---|---|---|
| -124.9375 | 48.8125 | 52.83326 | 35.82973 | 286.1314 |
| -124.9375 | 48.9375 | 38.92641 | 46.2698 | 288.2968 |
| -124.9375 | 49.0625 | 28.72708 | 43.29089 | 287.6732 |
| -124.9375 | 49.1875 | 22.0683 | 45.7691 | 288.7706 |
| -124.9375 | 49.3125 | 19.8993 | 54.68368 | 291.3871 |
What I need is to calculate a column `avgRainf` that contains the average of `Rainf` over all points within 50 km of each point.
So we take the first row's longitude and latitude, scan the whole table for rows whose distance from the first row is less than 50 km, average `Rainf` over those rows, and put the result in a new column named `avgRainf` in the first row. Then we repeat this for every row.
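To make the intended computation concrete, here is a plain-Python sketch of the same logic on the sample rows above (not a Spark solution — just what I want each row's `avgRainf` to be). It assumes haversine distance with an Earth radius of 6371 km; a different distance formula would shift which points fall inside 50 km:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres (Earth radius ~6371 km)."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

# (lon, lat, Rainf) from the sample table
rows = [
    (-124.9375, 48.8125, 52.83326),
    (-124.9375, 48.9375, 38.92641),
    (-124.9375, 49.0625, 28.72708),
    (-124.9375, 49.1875, 22.0683),
    (-124.9375, 49.3125, 19.8993),
]

avg_rainf = []
for lon, lat, _ in rows:
    # every row within 50 km, including the row itself
    neighbours = [r for (lon2, lat2, r) in rows
                  if haversine_km(lon, lat, lon2, lat2) <= 50]
    avg_rainf.append(sum(neighbours) / len(neighbours))
```

In Spark terms this is an all-pairs comparison (each row against every other row), which is why I suspect a plain window specification can't express it: windows partition or order by column values, not by a pairwise distance predicate.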
I assume Spark window functions could do this somehow, but I can't get it to work.