I'm trying to mirror this pandas behavior in Polars:
AGG_DEFAULTS = {
'open': 'first',
'high': 'max',
'low': 'min',
'close': 'last',
'volume': 'sum',
'transactions': 'sum',
}
df = df.resample('30min').agg(AGG_DEFAULTS).dropna()
The closest I've come is something like:
result = (
    df.upsample("time", every="30m")
      .groupby("time")
      .agg([
          pl.col("open").first().alias("first_open"),
          pl.col("high").max().alias("max_high"),
          pl.col("low").min().alias("min_low"),
          pl.col("close").last().alias("last_close"),
          pl.col("volume").sum().alias("sum_volume"),
          pl.col("transactions").sum().alias("sum_transactions"),
      ])
)
However, this doesn't work: each aggregated row just takes its values from the first row of the upsampled data instead of summarizing a 30-minute window.
This is how to accomplish it:
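One way to match the pandas semantics is to bucket rows into 30-minute windows with group_by_dynamic rather than upsampling and then grouping on the raw timestamps. The sketch below is an assumption-laden example: it assumes the datetime column is named "time" and that the frame is (or can be) sorted by it; in older Polars releases the method is spelled groupby_dynamic.

import polars as pl

# Sketch: assumes df has a datetime column "time".
# group_by_dynamic requires the frame to be sorted by the index column.
result = (
    df.sort("time")
      .group_by_dynamic("time", every="30m")   # 30-minute windows, like resample('30min')
      .agg(
          pl.col("open").first(),
          pl.col("high").max(),
          pl.col("low").min(),
          pl.col("close").last(),
          pl.col("volume").sum(),
          pl.col("transactions").sum(),
      )
      .drop_nulls()                            # mirrors pandas .dropna()
)

Because the expressions are not aliased, the output keeps the original column names (open, high, low, close, ...), which matches what the pandas agg(AGG_DEFAULTS) call produces.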