Upsample polars and aggregate

I'm trying to mirror this behavior from pandas in polars:

AGG_DEFAULTS = {
    'open': 'first',
    'high': 'max',
    'low': 'min',
    'close': 'last',
    'volume': 'sum',
    'transactions': 'sum',
}

df = df.resample('30min').agg(AGG_DEFAULTS).dropna()
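
For reference, here is a minimal sketch of what that pandas call produces. The 10-minute bars and all of their values are made up for illustration, and the frame is assumed to have a DatetimeIndex:

```python
import pandas as pd

AGG_DEFAULTS = {
    "open": "first",
    "high": "max",
    "low": "min",
    "close": "last",
    "volume": "sum",
    "transactions": "sum",
}

# Hypothetical 10-minute bars starting at 09:30 (values are invented).
index = pd.date_range("2023-01-02 09:30", periods=6, freq="10min")
df = pd.DataFrame(
    {
        "open": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
        "high": [10.5, 11.5, 12.5, 13.5, 14.5, 15.5],
        "low": [9.5, 10.5, 11.5, 12.5, 13.5, 14.5],
        "close": [10.2, 11.2, 12.2, 13.2, 14.2, 15.2],
        "volume": [100, 200, 300, 400, 500, 600],
        "transactions": [1, 2, 3, 4, 5, 6],
    },
    index=index,
)

# Six 10-minute rows collapse into two 30-minute rows (09:30 and 10:00).
resampled = df.resample("30min").agg(AGG_DEFAULTS).dropna()
print(resampled)
```

Each output row takes the first open, highest high, lowest low, last close, and summed volume and transactions of its 30-minute window.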

The closest I've come is something like:

result = (df.upsample("time", every="30m")
       .groupby("time")
       .agg([
          pl.col("open").first().alias("first_open"),
          pl.col("high").max().alias("max_high"),
          pl.col("low").min().alias("min_low"),
          pl.col("close").last().alias("last_close"),
          pl.col("volume").sum().alias("sum_volume"),
          pl.col("transactions").sum().alias("sum_transactions")
       ])
)

However, this doesn't work. The aggregate row uses the values from the first row of the upsampled data.

There is 1 best solution below

T3metrics On

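upsample is meant for adding rows on a regular time grid, not for combining them, and a plain groupby on time then makes one group per distinct timestamp, so nothing gets aggregated across a window. To group rows into 30-minute windows, use groupby_dynamic (renamed group_by_dynamic in newer Polars releases):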

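import polars as pl

results = (
    df
    .sort('time')  # the dynamic group-by expects a sorted index column
    .groupby_dynamic('time', every='30m')  # one group per 30-minute window
    .agg([
        pl.col('open').first(),
        pl.col('high').max(),
        pl.col('low').min(),
        pl.col('close').last(),
        pl.col('volume').sum(),
        pl.col('transactions').sum(),
    ])
)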