Lag values of rows of a column on rows of another column

44 Views Asked by At

Please can anyone help me navigate through this logic, I have tried using the lag function, but it doesn't correctly with my window.partition("component').Orderby("x"). Thanks!

conditions: if x value is 0 return current_date on new column inb_date_assigned if x > max of cum_inb return null else

  • return inb_date[1], where cum_inb = 32 on x until x= 22, because 22 <= 36 (cum_inb)[1],
  • return inb_date[2], where cum_inb = 68 on x = 52 because 52 <= 68 but greater than 36 (cum_inb)[1], continue till, where x <= max(cum_inb). otherwise return null

my df:

x   |        inb_date      |   cum_inb  
0   |      2024-03-25      |      32      
6   |      2024-05-01      |      64    
14  |      2024-08-07      |      104   
22  |       null           |      null      
32  |       null           |      null      
62  |       null           |      null      
82  |       null           |      null      
118 |       null           |      null      
120 |       null           |      null  

expected result: to get values in cum_inb and inb_date_assigned:

x   | inb_date    | cum_inb   |  inb_date_assigned   
0   | 2024-03-25  |     32    | today's date   
6   | 2024-05-01  |     32    | 2024-03-25   
14  | 2024-08-07  |     32    | 2024-03-25   
22  |  null       |     32    | 2024-03-25   
52  |  null       |     64    | 2024-05-01   
82  |  null       |     104   | 2024-08-07      
118 |  null       |     null  | null  
120 |  null       |     null  | null  

I tried something like this but not working:

window_specx = Window.partitionBy("component").orderBy("x")

# Define the current_date as a constant column

current_date_col = F.current_date()

# Apply the conditions directly within withColumn

df_result = final_df.withColumn("inb_date_assigned", 
F.when(F.col("x") == 0, current_date_col)
.when((F.col("cum_inb_qty").isNotNull()) & (F.col("cum_inb_qty") - F.col("x") \>= 0),
F.last("inbound_date", ignorenulls=True).over(window_specx)
).otherwise(None))
0

There are 0 best solutions below