polars datetime 5 minutes floor


I have a polars dataframe with a timestamp column of type datetime[ns], holding values such as 2023-03-08 11:13:07.831. I want to floor each timestamp to the start of its 5-minute interval using polars' own, efficient operations.

Right now I do:

import arrow
import polars as pl

def timestamp_5minutes_floor(ts: int) -> int:
    return int(arrow.get(ts).timestamp() // 300000 * 300000)

df.with_columns([
    pl.col("timestamp").apply(lambda x: timestamp_5minutes_floor(x)).alias("ts_floor")
    ])

This is slow. How can I improve it?

Timus On BEST ANSWER

You could use .dt.truncate, which runs natively in polars instead of calling a Python function for every row. With the sample dataframe

import polars as pl

df = pl.DataFrame({
    "ts": ["2023-03-08 11:01:07.831", "2023-03-08 18:09:01.007"]
}).select(pl.col("ts").str.strptime(pl.Datetime, "%Y-%m-%d %H:%M:%S%.3f"))
┌─────────────────────────┐
│ ts                      │
│ ---                     │
│ datetime[ms]            │
╞═════════════════════════╡
│ 2023-03-08 11:01:07.831 │
│ 2023-03-08 18:09:01.007 │
└─────────────────────────┘

this

df = df.select(pl.col("ts").dt.truncate("5m"))

results in

┌─────────────────────┐
│ ts                  │
│ ---                 │
│ datetime[ms]        │
╞═════════════════════╡
│ 2023-03-08 11:00:00 │
│ 2023-03-08 18:05:00 │
└─────────────────────┘