I have the following Spark table:
| lon | lat | Rainf | Evap | AvgSurfT |
|---|---|---|---|---|
| -124.9375 | 48.8125 | 52.83326 | 35.82973 | 286.1314 |
| -124.9375 | 48.9375 | 38.92641 | 46.2698 | 288.2968 |
| -124.9375 | 49.0625 | 28.72708 | 43.29089 | 287.6732 |
| -124.9375 | 49.1875 | 22.0683 | 45.7691 | 288.7706 |
| -124.9375 | 49.3125 | 19.8993 | 54.68368 | 291.3871 |
What I need is to calculate a column `avgRainf` that contains the average of `Rainf` over all points within 50 km of each point.
So we take the first row's longitude and latitude, scan the whole table for rows whose distance from the first row is less than 50 km, average `Rainf` over those rows, and put the result in a new column named `avgRainf` in the first row. Then we repeat this for every row.
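To make the intended computation concrete, here is a plain-Python sketch of the same logic on the sample rows above (not a Spark solution — just what I want each row's `avgRainf` to be). It assumes haversine distance with an Earth radius of 6371 km; a different distance formula would shift which points fall inside 50 km:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres (Earth radius ~6371 km)."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

# (lon, lat, Rainf) from the sample table
rows = [
    (-124.9375, 48.8125, 52.83326),
    (-124.9375, 48.9375, 38.92641),
    (-124.9375, 49.0625, 28.72708),
    (-124.9375, 49.1875, 22.0683),
    (-124.9375, 49.3125, 19.8993),
]

avg_rainf = []
for lon, lat, _ in rows:
    # every row within 50 km, including the row itself
    neighbours = [r for (lon2, lat2, r) in rows
                  if haversine_km(lon, lat, lon2, lat2) <= 50]
    avg_rainf.append(sum(neighbours) / len(neighbours))
```

In Spark terms this is an all-pairs comparison (each row against every other row), which is why I suspect a plain window specification can't express it: windows partition or order by column values, not by a pairwise distance predicate.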
I assume Spark window functions could do this somehow, but I can't get it to work.