The dataframe I have has a column without timezones. But due to DST changes I had some problems datamanipulation. These problems resulted from having 'missing' rows when the clock got changed foreward and 'duplicate' rows when the clock changes back.
I wrote a function to add the timezone which automaticaly takes DST into acount.
def fix_dutch_clock_shift(df):
df_fixed = df.copy()
pd.to_datetime(df_fixed['datetime'])
amsterdam_tz = pytz.timezone('Europe/Amsterdam')
for index, row in df_fixed.iterrows():
dt = row['datetime']
dt = amsterdam_tz.localize(dt)
df_fixed.at[index, 'datetime'] = dt
return df_fixed
This worked perfectly for the clock changing foreward changing the timezone from +01:00 to +02:00 at the correct datetime, but for the clock going backward I still got duplicates.
Datetime, Long, Short
2022-10-30 02:00:00+02:00,61.58,61.58
2022-10-30 02:15:00+02:00,49.98,49.98
2022-10-30 02:30:00+02:00,26.72,26.72
2022-10-30 02:45:00+02:00,-111.32,-111.32
2022-10-30 02:00:00+02:00,-111.32,-111.32
2022-10-30 02:15:00+02:00,-130.19,-130.19
2022-10-30 02:30:00+02:00,-6.69,-6.69
2022-10-30 02:45:00+02:00,-130.19,-130.19
But I want the timezone to shift back from +02:00 to +01:00 the second time it hits 2022-10-30 02:00.
Datetime, Long, Short
2022-10-30 02:00:00+02:00,61.58,61.58
2022-10-30 02:15:00+02:00,49.98,49.98
2022-10-30 02:30:00+02:00,26.72,26.72
2022-10-30 02:45:00+02:00,-111.32,-111.32
2022-10-30 02:00:00+01:00,-111.32,-111.32
2022-10-30 02:15:00+01:00,-130.19,-130.19
2022-10-30 02:30:00+01:00,-6.69,-6.69
2022-10-30 02:45:00+01:00,-130.19,-130.19
IIUC, your datetime data has a UTC offset (+02:00) over the whole year, which should be +01:00 or +02:00, depending on DST being active or not.
You can potentially achieve this by removing the offset first, then setting the correct time zone: