How to get the timeslots from the start time and end time in the pandas dataframe

496 Views Asked by At

I have a pandas dataframe where it has start_time, end_time and booking duration.

Please find below example dataframe

id Start_time End_time Duration
1 2013-11-20 09:00:00 2013-11-20 09:30:00 0 days 0:30:00
2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 3:10:00
3 2013-11-20 10:00:00 2013-11-20 11:00:00 0 days 1:00:00
4 2013-11-20 08:00:00 2013-11-20 09:40:00 0 days 1:40:00

I am trying to get the timeslots from the above dataframe

Expected output:

id Start_time End_time Duration Timeslots
1 2013-11-20 09:00:00 2013-11-20 09:30:00 0 days 0:30:00 9 - 10
2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 3:10:00 9-10
2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 3:10:00 10-11
2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 3:10:00 11-12
3 2013-11-20 10:00:00 2013-11-20 11:00:00 0 days 1:00:00 10 - 11
4 2013-11-20 08:00:00 2013-11-20 09:40:00 0 days 1:40:00 8-9
4 2013-11-20 08:00:00 2013-11-20 09:40:00 0 days 1:40:00 9-10

What I have tried till now

I can get the slots from start_time and end_time but i am missing the expected output

id Start_time End_time Duration TimeSlot
1 2013-11-20 09:00:00 2013-11-20 09:30:00 0 days 0:30:00 9-9:30
2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 3:10:00 9-12:10
3 2013-11-20 10:00:00 2013-11-20 11:00:00 0 days 1:00:00 10-11
4 2013-11-20 08:00:00 2013-11-20 09:40:00 0 days 1:40:00 8 - 9:40

Can anyone give some hints please

1

There are 1 best solutions below

0
Corralien On BEST ANSWER

Try:

def get_slots(row):
    dti = pd.date_range(row['Start_time'].floor('H'), 
                        row['End_time'].ceil('H'), freq='H')
    return [f"{s.hour:02}-{e.hour:02}" for s, e in zip(dti, dti[1:])]
    
out = df.assign(Timeslots=df.apply(get_slots, axis=1)).explode('Timeslots')
print(out)

# Output:
   id          Start_time            End_time        Duration Timeslots
0   1 2013-11-20 09:00:00 2013-11-20 09:30:00 0 days 00:30:00     09-10
1   2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 03:10:00     09-10
1   2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 03:10:00     10-11
1   2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 03:10:00     11-12
1   2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 03:10:00     12-13
2   3 2013-11-20 10:00:00 2013-11-20 11:00:00 0 days 01:00:00     10-11
3   4 2013-11-20 08:00:00 2013-11-20 09:40:00 0 days 01:40:00     08-09
3   4 2013-11-20 08:00:00 2013-11-20 09:40:00 0 days 01:40:00     09-10

Setup to be reproducible:

import pandas as pd
from pandas import Timestamp, Timedelta

data = {
'id': [1, 2, 3, 4],
'Start_time': [Timestamp('2013-11-20 09:00:00'), Timestamp('2013-11-20 09:00:00'), 
               Timestamp('2013-11-20 10:00:00'), Timestamp('2013-11-20 08:00:00')], 
'End_time': [Timestamp('2013-11-20 09:30:00'), Timestamp('2013-11-20 12:10:00'),
             Timestamp('2013-11-20 11:00:00'), Timestamp('2013-11-20 09:40:00')],
'Duration': [Timedelta('0 days 00:30:00'), Timedelta('0 days 03:10:00'),
             Timedelta('0 days 01:00:00'), Timedelta('0 days 01:40:00')]
}

df = pd.DataFrame(data)