Convert a Pandas Series from Timedelta to Microseconds


I have a Pandas Timedelta column that may be created like this:

import pandas as pd
tdelta_ser = pd.date_range(start='00:00:00', periods=3, freq='700ms') - pd.date_range(start='00:00:00', periods=3, freq='500ms') 
tdiff_df = pd.DataFrame(tdelta_ser, columns=['TimeDiff'])
print(tdiff_df)

                TimeDiff
0        0 days 00:00:00
1 0 days 00:00:00.200000
2 0 days 00:00:00.400000

I'm looking for a concise one-liner that produces a new column with this time delta converted to microseconds, without assuming that the internal dtype of the pandas Timedelta column is int64 nanoseconds.

Desired result


                TimeDiff  DiffUsec
0        0 days 00:00:00         0
1 0 days 00:00:00.200000    200000
2 0 days 00:00:00.400000    400000

I tried several methods. The most concise is the one below, but it assumes that the Timedelta column is stored internally as int64 nanoseconds, and it needs a scaling factor of 1000 to get microseconds. It also returns floats rather than integers.

tdiff_df['DiffUsec'] = tdiff_df['TimeDiff'].astype('int64') / 1000
print(tdiff_df)

                TimeDiff  DiffUsec
0        0 days 00:00:00       0.0
1 0 days 00:00:00.200000  200000.0
2 0 days 00:00:00.400000  400000.0

There are 3 best solutions below

2
Suraj Shourie On

Since you created your data with a millisecond frequency, you can get the milliseconds field as a new column:

tdiff_df['TimeDiff'].dt.components.milliseconds

Output:

0      0
1    200
2    400
Name: milliseconds, dtype: int64

If your timedelta has a different resolution, such as microseconds or nanoseconds, you can get those fields from the dt.components attribute:

print(tdiff_df['TimeDiff'].dt.components)

Output:

   days  hours  minutes  seconds  milliseconds  microseconds  nanoseconds
0     0      0        0        0             0             0            0
1     0      0        0        0           200             0            0
2     0      0        0        0           400             0            0
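
One thing to keep in mind is that each column of dt.components holds only that unit's part of the value, not a running total. A minimal sketch of this, using a made-up sample value (1 second, 200 milliseconds, 500 microseconds) that is not part of the question's data:

```python
import pandas as pd

# Hypothetical sample: a single delta of 1,200,500 microseconds.
sample = pd.Series(pd.to_timedelta([1_200_500], unit='us'))

# Each field is the remainder for that unit only.
parts = sample.dt.components
print(parts[['seconds', 'milliseconds', 'microseconds']])
```

Here the milliseconds field is 200 and the microseconds field is 500, so no single field is the total number of microseconds.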
1
user3046211 On

Another way is to use the total_seconds() method, which returns the total duration of each Timedelta in seconds. Multiplying the result by 1,000,000 (the number of microseconds in a second) gives the desired microsecond value. This also handles fractions of a millisecond, as you pointed out.

import pandas as pd

tdelta_ser = pd.date_range(start='00:00:00', periods=3, freq='700ms') - pd.date_range(start='00:00:00', periods=3, freq='500ms') 
tdiff_df = pd.DataFrame(tdelta_ser, columns=['TimeDiff'])

tdiff_df['DiffUsec'] = (tdiff_df['TimeDiff'].dt.total_seconds() * 1e6).astype('int64')
print(tdiff_df)

which results in

                TimeDiff  DiffUsec
0        0 days 00:00:00         0
1 0 days 00:00:00.200000    200000
2 0 days 00:00:00.400000    400000
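
Because total_seconds() returns floats and astype('int64') truncates toward zero, it may be safer to round before casting. A small sketch of that variant, assuming rounding to the nearest whole microsecond is acceptable (the intermediate name usec_float is only for illustration):

```python
import pandas as pd

tdiff_df = pd.DataFrame({'TimeDiff': pd.to_timedelta([0, 200, 400], unit='ms')})

# Round to the nearest whole microsecond before the integer cast, so a float
# that lands just below an integer is not truncated to the value beneath it.
usec_float = tdiff_df['TimeDiff'].dt.total_seconds() * 1e6
tdiff_df['DiffUsec'] = usec_float.round().astype('int64')
print(tdiff_df)
```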
2
Gerard G On

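After a bunch more tries, here's a hidden gem that was buried in the Pandas docs. Note that .dt.microseconds returns only the sub-second part of each value (0 to 999999), so it equals the total only when every delta is shorter than one second, as in this example.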

tdiff_df['DiffUsec'] = tdiff_df['TimeDiff'].dt.microseconds
print(tdiff_df)
                TimeDiff  DiffUsec
0        0 days 00:00:00         0
1 0 days 00:00:00.200000    200000
2 0 days 00:00:00.400000    400000
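
For deltas of one second or more, one possible alternative is floor division by a one-microsecond Timedelta, which does not rely on how the column is stored internally. A minimal sketch, using a made-up sample_df (not the question's data) with one value above one second:

```python
import pandas as pd

# Hypothetical sample: 0.2 s and 1.2 s.
sample_df = pd.DataFrame({'TimeDiff': pd.to_timedelta([200, 1200], unit='ms')})

# Sub-second part only: both rows give 200000.
sample_df['SubSecondUsec'] = sample_df['TimeDiff'].dt.microseconds

# Whole duration expressed in microseconds, as integers.
sample_df['TotalUsec'] = sample_df['TimeDiff'] // pd.Timedelta(microseconds=1)
print(sample_df)
```

The two columns agree for the first row and differ for the second, where the total is 1200000.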