How to convert a string to a DatetimeWithNanoseconds format with python?


I have multiple strings representing timestamps, for example 19551231 (%Y%m%d) or 20210216154500 (%Y%m%d%H%M%S). As you can see, the format may vary.

I'm looking for a way to convert all these different strings to a single DatetimeWithNanoseconds format.

I know I can convert a timestamp to DatetimeWithNanoseconds using integers like this: DatetimeWithNanoseconds(2020, 6, 22, 17, 1, 30, nanosecond=0).

Does that mean I have to manually parse every string to extract the relevant integers? Is there a better way to do this, something like strptime, which uses a format string such as %Y%m%d to describe the layout of the input?


0
cuzureau On BEST ANSWER

I learned that a datetime object exposes its components as attributes, for example date.hour (and likewise year, month, etc.).

Knowing this, converting a string to DatetimeWithNanoseconds takes these 2 easy steps:

  1. Convert the string to a datetime object:
import datetime
date = '19551231'
date = datetime.datetime.strptime(date, '%Y%m%d')
  2. Convert to DatetimeWithNanoseconds:
# assumes DatetimeWithNanoseconds is already imported, e.g. from google.api_core.datetime_helpers
nano = DatetimeWithNanoseconds(date.year, date.month, date.day, date.hour, date.minute, date.second, nanosecond=0)
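Since the layout varies with the input, the two steps above can be combined with a small lookup from string length to strptime layout. This is only a minimal sketch: the helper name parse_stamp and the FORMATS_BY_LENGTH table are hypothetical, and it assumes the 8- and 14-character layouts from the question are the only ones that occur. It stops at a plain datetime, whose fields can then be passed to DatetimeWithNanoseconds as in step 2.

```python
import datetime as dt

# Hypothetical lookup table: string length -> strptime layout.
FORMATS_BY_LENGTH = {
    8: '%Y%m%d',
    14: '%Y%m%d%H%M%S',
}

def parse_stamp(stamp: str) -> dt.datetime:
    """Parse a compact timestamp whose layout is implied by its length."""
    try:
        layout = FORMATS_BY_LENGTH[len(stamp)]
    except KeyError:
        raise ValueError(f'unsupported timestamp length: {stamp!r}') from None
    return dt.datetime.strptime(stamp, layout)
```

For a date-only input such as '19551231', the hour, minute and second of the result default to zero, so the constructor call in step 2 works unchanged for both layouts.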
3
J_H On

You offered 8- and 14-character example timestamps. It appears you want to tack on 9 or more zeros, converting them to uniform 23-character timestamps (14 digits of date and time plus 9 digits of nanoseconds). At that point it would be straightforward to put the result in RFC 3339 format and call from_rfc3339() to obtain a DatetimeWithNanoseconds.

Consider using a simple while loop:

def pad(ts: str) -> str:
    while len(ts) < 23:
        ts += '0'
    return ts

A better way to accomplish the same thing:

def pad(ts: str) -> str:
    return ts + '0' * (23 - len(ts))

EDIT

You will want a couple of helpers here. Each one is unit testable, and offers a very simple API.

First one turns everything into uniform 23-char human-readable timestamps as I mentioned above.

Second would take the first 14 characters and turn it into integer seconds since epoch. Then tack on the nanoseconds. I have something like this in mind:

import datetime as dt

def to_nanosec(stamp: str):
    assert 23 == len(stamp), stamp
    d = dt.datetime.strptime(stamp[:14], '%Y%m%d%H%M%S')
    return 1e9 * d.timestamp() + int(stamp) % 1e9

Equivalently that 2nd term could be … + int(stamp[14:])

Prefer int(1e9), or 1_000_000_000, if returning an int is important.
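If an exact integer matters, one option is to avoid floats entirely, since a float cannot represent every nanosecond count at this magnitude. The sketch below is a variant under stated assumptions, not a drop-in replacement: the name to_nanosec_int is hypothetical, and it treats the stamp as UTC, whereas .timestamp() on a naive datetime uses the local time zone.

```python
import datetime as dt

EPOCH = dt.datetime(1970, 1, 1)  # naive, interpreted as UTC (assumption)
NS_PER_SEC = 1_000_000_000

def to_nanosec_int(stamp: str) -> int:
    """Integer nanoseconds since the epoch for a 23-digit stamp, assumed UTC."""
    assert 23 == len(stamp), stamp
    d = dt.datetime.strptime(stamp[:14], '%Y%m%d%H%M%S')
    # Floor-dividing two timedeltas yields an int, so no float is involved.
    seconds = (d - EPOCH) // dt.timedelta(seconds=1)
    return seconds * NS_PER_SEC + int(stamp[14:])
```

Dates before 1970, such as the 1955 example, simply produce a negative count.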

You certainly could break out character ranges and put punctuation like : colon and Z between them prior to calling from_rfc3339(), but .strptime() might be more convenient here.
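For the from_rfc3339() route, the character ranges can be sliced out of the padded stamp and joined with the RFC 3339 punctuation. A minimal sketch, assuming the input is already a 23-digit stamp in UTC; the helper name to_rfc3339 is hypothetical, and the final call to from_rfc3339() is only mentioned in a comment because it needs the library that provides DatetimeWithNanoseconds.

```python
def to_rfc3339(stamp: str) -> str:
    """Format a 23-digit stamp (YYYYMMDDHHMMSS + 9 nanosecond digits) as RFC 3339."""
    assert 23 == len(stamp) and stamp.isdigit(), stamp
    date = f'{stamp[0:4]}-{stamp[4:6]}-{stamp[6:8]}'
    time = f'{stamp[8:10]}:{stamp[10:12]}:{stamp[12:14]}'
    # 'Z' marks UTC, which is an assumption about the source data.
    return f'{date}T{time}.{stamp[14:]}Z'

padded = '20210216154500'.ljust(23, '0')  # same effect as appending zeros
print(to_rfc3339(padded))
# The resulting string could then be passed to DatetimeWithNanoseconds.from_rfc3339().
```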


It's worth noting that NumPy's datetime64 type supports nanosecond precision.