I am trying to run profile_report in ydata_profiling and I get the following error:
TypeError: descriptor 'to_pydatetime' for 'pandas._libs.tslibs.timestamps._Timestamp' objects doesn't apply to a 'datetime.date' object.
I figured out that the pandas DataFrame I am using has some date fields, and those are indeed what cause this error. Running print(df['effective_date'].apply(type)) shows datetime.date as the type of the values in this field. I tried to convert the column to the timestamp data type but ran into some challenges.
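For reference, a minimal sketch of the conversion described above, assuming the column holds only datetime.date values. The sample data is made up; only the column name effective_date comes from the question. A column of datetime.date objects is stored with dtype object, and pd.to_datetime turns it into a datetime64 column.

```python
import datetime as dt

import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame(
    {"effective_date": [dt.date(2023, 1, 15), dt.date(2023, 6, 30)]}
)

# Each value is a plain datetime.date, so the column dtype is object.
print(df["effective_date"].apply(type).unique())
print(df["effective_date"].dtype)

# Convert the whole column to pandas' native datetime64 type.
df["effective_date"] = pd.to_datetime(df["effective_date"])
print(df["effective_date"].dtype)
```

After the conversion every value is a pandas Timestamp, which is the type that to_pydatetime expects.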
Why does the data type need to be a timestamp? to_pydatetime seems to convert a timestamp into a datetime, which my data already is. Is there a way to opt out of profile_report trying to convert my data into something it already is?
I managed to get around this by forcing datetime64 on every column that the source database (Snowflake in my case) stores as DATE or TIMESTAMP (any variant).
However, I'm running into the same error against a different database (Oracle), even after trying to force all columns to datetime64. I tried both errors='coerce' and errors='ignore' in pd.to_datetime().
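One possible explanation, and it is only an assumption, is that some object column still holds datetime.date values that the conversion never reached. Rather than relying on the database's column types, the sketch below inspects the values themselves. The helper name convert_date_columns and the sample frame are hypothetical.

```python
import datetime as dt

import pandas as pd


def convert_date_columns(df):
    """Return (copy of df, list of converted column names).

    Converts every object column whose non-null values are all
    datetime.date (or datetime.datetime) instances to datetime64.
    """
    out = df.copy()
    converted = []
    for col in out.columns:
        if not pd.api.types.is_object_dtype(out[col]):
            continue  # already typed (numeric, datetime64, ...)
        non_null = out[col].dropna()
        if non_null.empty:
            continue
        # datetime.datetime subclasses datetime.date, so this covers both.
        if non_null.map(lambda v: isinstance(v, dt.date)).all():
            # Values that cannot be converted become NaT instead of raising.
            out[col] = pd.to_datetime(out[col], errors="coerce")
            converted.append(col)
    return out, converted


# Hypothetical frame standing in for a query result.
raw = pd.DataFrame(
    {
        "effective_date": [dt.date(2023, 1, 15), None],
        "amount": [1.5, 2.0],
        "label": ["a", "b"],
    }
)
fixed, converted = convert_date_columns(raw)
print(converted)
print(fixed.dtypes)
```

Comparing the returned list of converted columns with the columns you expected to convert can show whether a date column was missed. Because errors="coerce" replaces unconvertible values with NaT, it is worth checking for unexpected nulls afterwards.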
Debugging is difficult because of the way the ydata-profiling library works: it does not start the computations that raise the error until you actually write the profile report file. When the exception is hit, the debugger doesn't step into the profiler code and instead keeps dropping me back at the line where the file is created.
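When the debugger will not step into library code, the traceback attached to the exception still records every frame, including the ones inside the library. The sketch below uses only the standard library. The helper frames_of_failure is hypothetical, and write_report is a stand-in for the real call that writes the report, so that the example runs on its own.

```python
import traceback


def frames_of_failure(action):
    """Run action(); if it raises, return (exception, frames).

    frames is a list of (filename, lineno, function) tuples ordered
    from the outermost call to the innermost one that raised.
    """
    try:
        action()
    except Exception as exc:
        frames = [
            (f.filename, f.lineno, f.name)
            for f in traceback.extract_tb(exc.__traceback__)
        ]
        return exc, frames
    return None, []


# Stand-ins for the report-writing call and the library code beneath it.
def _summarize(value):
    raise TypeError("stand-in for the to_pydatetime failure")


def write_report():
    _summarize("2023-01-15")


exc, frames = frames_of_failure(write_report)
for filename, lineno, func in frames:
    print(f"{filename}:{lineno} in {func}")
print(type(exc).__name__, exc)
```

In a real session you would pass a callable that triggers the report write instead of write_report. The last frame printed is where the error was actually raised, and pdb.post_mortem(exc.__traceback__) opens the standard debugger in that frame so you can inspect the offending value. If your debugger has an option that limits stepping to your own code, turning it off may also help.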