Assign Week Number Column to Dataframe with Defined Dict in Python

22 Views Asked by At

I have been trying to get this to work and cannot find a solution. I have data that looks like this in dataframe (df):

index   plant_name  business_name           power_kwh   mos_time        day month   year
0       PROVIDENCE HEIGHTS  UNITED STATES   7805.7  2023-02-25 08:00:00 56  2   2023
1       PROVIDENCE HEIGHTS  UNITED STATES   9943.7  2023-02-25 07:00:00 56  2   2023
2       PROVIDENCE HEIGHTS  UNITED STATES   9509.8  2023-02-25 06:00:00 56  2   2023
3       PROVIDENCE HEIGHTS  UNITED STATES   8333    2023-02-25 05:00:00 56  2   2023
2993    PROVIDENCE HEIGHTS  UNITED STATES   14560   2022-10-25 17:00:00 298 10  2022
2994    PROVIDENCE HEIGHTS  UNITED STATES   9260.4  2022-10-25 16:00:00 298 10  2022
2995    PROVIDENCE HEIGHTS  UNITED STATES   7327.7  2022-10-25 15:00:00 298 10  2022
2996    PROVIDENCE HEIGHTS  UNITED STATES   5579.1  2022-10-25 14:00:00 298 10  2022
2997    PROVIDENCE HEIGHTS  UNITED STATES   4507    2022-10-25 13:00:00 298 10  2022
13993   PROVIDENCE HEIGHTS  UNITED STATES   1655.3  2021-07-19 14:00:00 200 7   2021
13994   PROVIDENCE HEIGHTS  UNITED STATES   1686.1  2021-07-19 13:00:00 200 7   2021
13995   PROVIDENCE HEIGHTS  UNITED STATES   2243.7  2021-07-19 12:00:00 200 7   2021
13996   PROVIDENCE HEIGHTS  UNITED STATES   3577.9  2021-07-19 11:00:00 200 7   2021
33995   PROVIDENCE HEIGHTS  UNITED STATES   2220.2  2019-04-05 20:00:00 95  4   2019
33996   PROVIDENCE HEIGHTS  UNITED STATES   2266.7  2019-04-05 19:00:00 95  4   2019
33997   PROVIDENCE HEIGHTS  UNITED STATES   2292.4  2019-04-05 18:00:00 95  4   2019
33998   PROVIDENCE HEIGHTS  UNITED STATES   2197    2019-04-05 17:00:00 95  4   2019

The dict that I need to use to assign week numbers looks like this:

weeks = {
    1: [1, 2, 3, 4, 5],
    2: [6, 7, 8, 9],
    3: [10, 11, 12, 13],
    4: [14, 15, 16, 17],
    5: [18, 19, 20, 21, 22],
    6: [23, 24, 25, 26],
    7: [27, 28, 29, 30, 31],
    8: [32, 33, 34, 35],
    9: [36, 37, 38, 39],
    10: [40, 41, 42, 43, 44],
    11: [45, 46, 47, 48],
    12: [49, 50, 51, 52]
}

And, my answer looks like this with the added "week" column on the far right column. thank you for any help here.

index   plant_name  business_name           power_kwh   mos_time    day month   year    week
0       PROVIDENCE HEIGHTS  UNITED STATES   7805.7  2023-02-25 08:00:00 56  2   2023    8
1       PROVIDENCE HEIGHTS  UNITED STATES   9943.7  2023-02-25 07:00:00 56  2   2023    8
2       PROVIDENCE HEIGHTS  UNITED STATES   9509.8  2023-02-25 06:00:00 56  2   2023    8
3       PROVIDENCE HEIGHTS  UNITED STATES   8333    2023-02-25 05:00:00 56  2   2023    8
2993    PROVIDENCE HEIGHTS  UNITED STATES   14560   2022-10-25 17:00:00 298 10  2022    43
2994    PROVIDENCE HEIGHTS  UNITED STATES   9260.4  2022-10-25 16:00:00 298 10  2022    43
2995    PROVIDENCE HEIGHTS  UNITED STATES   7327.7  2022-10-25 15:00:00 298 10  2022    43
2996    PROVIDENCE HEIGHTS  UNITED STATES   5579.1  2022-10-25 14:00:00 298 10  2022    43
2997    PROVIDENCE HEIGHTS  UNITED STATES   4507    2022-10-25 13:00:00 298 10  2022    43
13993   PROVIDENCE HEIGHTS  UNITED STATES   1655.3  2021-07-19 14:00:00 200 7   2021    30
13994   PROVIDENCE HEIGHTS  UNITED STATES   1686.1  2021-07-19 13:00:00 200 7   2021    30
13995   PROVIDENCE HEIGHTS  UNITED STATES   2243.7  2021-07-19 12:00:00 200 7   2021    30
13996   PROVIDENCE HEIGHTS  UNITED STATES   3577.9  2021-07-19 11:00:00 200 7   2021    30
33995   PROVIDENCE HEIGHTS  UNITED STATES   2220.2  2019-04-05 20:00:00 95  4   2019    14
33996   PROVIDENCE HEIGHTS  UNITED STATES   2266.7  2019-04-05 19:00:00 95  4   2019    14
33997   PROVIDENCE HEIGHTS  UNITED STATES   2292.4  2019-04-05 18:00:00 95  4   2019    14
33998   PROVIDENCE HEIGHTS  UNITED STATES   2197    2019-04-05 17:00:00 95  4   2019    14

I have tried things like the function below but get the ValueError: Week number not found in dictionary:

weeks = {
    1: [1, 2, 3, 4, 5],
    2: [6, 7, 8, 9],
    3: [10, 11, 12, 13],
    4: [14, 15, 16, 17],
    5: [18, 19, 20, 21, 22],
    6: [23, 24, 25, 26],
    7: [27, 28, 29, 30, 31],
    8: [32, 33, 34, 35],
    9: [36, 37, 38, 39],
    10: [40, 41, 42, 43, 44],
    11: [45, 46, 47, 48],
    12: [49, 50, 51, 52]
}
def get_week_number(day, month):
    for num, days in weeks.items():
        if day in days and num <= 12:
            if month == num:
                return num
    raise ValueError("Week number not found in dictionary.")

# add a column for week number
df['week'] = ncData.apply(lambda row: get_week_number(row['day'], row['month']), axis=1)
1

There are 1 best solutions below

3
Timeless On

Why not simply use isocalendar().week ?

df["mos_time"] = pd.to_datetime(df["mos_time"])

df["week"] = df["mos_time"].dt.isocalendar().week

Output :

print(df)

               plant_name  business_name  power_kwh            mos_time  day  month  year  week
0      PROVIDENCE HEIGHTS  UNITED STATES     7805.7 2023-02-25 08:00:00   56      2  2023     8
1      PROVIDENCE HEIGHTS  UNITED STATES     9943.7 2023-02-25 07:00:00   56      2  2023     8
2      PROVIDENCE HEIGHTS  UNITED STATES     9509.8 2023-02-25 06:00:00   56      2  2023     8
3      PROVIDENCE HEIGHTS  UNITED STATES     8333.0 2023-02-25 05:00:00   56      2  2023     8
2993   PROVIDENCE HEIGHTS  UNITED STATES    14560.0 2022-10-25 17:00:00  298     10  2022    43
2994   PROVIDENCE HEIGHTS  UNITED STATES     9260.4 2022-10-25 16:00:00  298     10  2022    43
2995   PROVIDENCE HEIGHTS  UNITED STATES     7327.7 2022-10-25 15:00:00  298     10  2022    43
2996   PROVIDENCE HEIGHTS  UNITED STATES     5579.1 2022-10-25 14:00:00  298     10  2022    43
2997   PROVIDENCE HEIGHTS  UNITED STATES     4507.0 2022-10-25 13:00:00  298     10  2022    43
13993  PROVIDENCE HEIGHTS  UNITED STATES     1655.3 2021-07-19 14:00:00  200      7  2021    29
13994  PROVIDENCE HEIGHTS  UNITED STATES     1686.1 2021-07-19 13:00:00  200      7  2021    29
13995  PROVIDENCE HEIGHTS  UNITED STATES     2243.7 2021-07-19 12:00:00  200      7  2021    29
13996  PROVIDENCE HEIGHTS  UNITED STATES     3577.9 2021-07-19 11:00:00  200      7  2021    29
33995  PROVIDENCE HEIGHTS  UNITED STATES     2220.2 2019-04-05 20:00:00   95      4  2019    14
33996  PROVIDENCE HEIGHTS  UNITED STATES     2266.7 2019-04-05 19:00:00   95      4  2019    14
33997  PROVIDENCE HEIGHTS  UNITED STATES     2292.4 2019-04-05 18:00:00   95      4  2019    14
33998  PROVIDENCE HEIGHTS  UNITED STATES     2197.0 2019-04-05 17:00:00   95      4  2019    14