Group reoccurring elements from multiple lists to make just one path to a file

Let's assume I have lists like these:

years=['2023','2023','2023','2021']

fruits=['banana','orange','banana','orange']

days=['448','558','338','965']

These lists can be zipped because they are all in matching order (i.e. 2023, banana, and 448 belong together). I use 'years' and 'fruits' to determine the path to a file (not a text file; the original is a trajectory file), from which I want to extract frames based on 'days':

for v, n, o in zip(years, fruits, days):
    path_to_file = "{}/y{}/file".format(v, n)
    # ...package-specific code for working with trajectories continues here,
    # up to the part where I extract the frames based on 'days'
    for i in file.timestep[o]:
        write(some_data)

As a result, I will have files like: 2023-banana (containing data from 448), 2023-orange (containing data from 558), 2023-banana (containing data from 338), etc.

Now we can see that the 1st and 3rd positions in 'years' and 'fruits' form the same pair (i.e. both are 2023 + banana), so I open this file (2023 + banana) several times to extract the 'days' (448 and 338) individually. This is inefficient, because I could extract all the 'days' at once (this is allowed, and I can store them in one output file).

So my question is whether there is a way to group such reoccurring elements (which produce the same path) from multiple lists, so that I don't have to open the same file several times but just once. The desired outcome would be to visit file 2023 + banana just once and do the operation just once. Thanks a lot.

TheHungryCub On BEST ANSWER

You can use a dictionary to group the day values by each unique (year, fruit) combination, then process each unique combination once.

from collections import defaultdict

years = ['2023', '2023', '2023', '2021']
fruits = ['banana', 'orange', 'banana', 'orange']
days = ['448', '558', '338', '965']

# Dictionary mapping each (year, fruit) pair to its list of day values
file_paths = defaultdict(list)

# Collect the day values that belong to the same year and fruit
for year, fruit, day_value in zip(years, fruits, days):
    file_paths[(year, fruit)].append(day_value)

# Process each group
for (year, fruit), day_values in file_paths.items():
    path_to_file = "{}/y{}/file".format(year, fruit)
    # Open the file once and process all of its day values.
    # open() is only a placeholder here: for a trajectory file,
    # use the reader from your trajectory package instead.
    with open(path_to_file, 'r') as file:
        for day_value in day_values:
            # Process day value
            print(f"Processing {day_value} from {path_to_file}")
            # Example: for i in file.timestep[day_value]:
            #         write(some_data)
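
To check the grouping on its own, before any file is opened, the same idea can be sketched as a small helper. The function name `group_days_by_path` is hypothetical, and converting the day strings with `int()` assumes they are meant as frame indices; the path template is copied from the question. A plain dict keeps insertion order, so paths come out in the order they first appear in the lists.

```python
from collections import defaultdict


def group_days_by_path(years, fruits, days):
    """Map each unique file path to the list of frame indices requested from it."""
    grouped = defaultdict(list)
    for year, fruit, day in zip(years, fruits, days):
        # Path template taken from the question.
        path = "{}/y{}/file".format(year, fruit)
        # Assumption: the day strings are frame indices, so convert them to int.
        grouped[path].append(int(day))
    return dict(grouped)


years = ['2023', '2023', '2023', '2021']
fruits = ['banana', 'orange', 'banana', 'orange']
days = ['448', '558', '338', '965']

frames_by_path = group_days_by_path(years, fruits, days)
for path, frames in frames_by_path.items():
    # Each path is listed exactly once, together with all of its frames.
    print(path, frames)
```

With the lists from the question this gives three paths instead of four, and `2023/ybanana/file` carries both 448 and 338, so that file only has to be opened once.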