My problem is as follows: I have a large array 'B' that I wish to operate on in parallel. My function looks like:
    def test(a, B):
        entry = B[a, a]
        # insert complicated math here
        result = B[a, a]
        return result
I understand that, without the extra argument, I can simply use process_map as follows:

    parallel_results_tqdm = process_map(test, agrid, max_workers=4, chunksize=1)
where agrid is the list of 'a' values I wish to loop over. This works if the variable B is a global. However, I wish to run the operation on different arrays, so B has to be provided to the function as an argument.
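For reference, here is a minimal runnable sketch of that global-variable version (the 100x100 array, its deterministic contents, and the one-argument test are illustrative choices of mine; process_map comes from tqdm.contrib.concurrent):

    import numpy as np
    from tqdm.contrib.concurrent import process_map

    # Module-level array: visible to the worker processes as a global.
    B = np.arange(100 * 100, dtype=np.float64).reshape(100, 100)

    def test(a):
        entry = B[a, a]
        # insert complicated math here
        return entry

    if __name__ == "__main__":
        agrid = range(100)
        parallel_results_tqdm = process_map(test, agrid, max_workers=4, chunksize=1)

Note that under the 'spawn' start method each worker re-imports the module and rebuilds its own copy of B, which is exactly the duplication the rest of this question is trying to avoid.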
One solution, which does not work in my case, is to pass my inputs as a list of tuples:

    agrid = [(0, B), (1, B), (2, B), (3, B), ...]

However, B is large, so cloning it into every tuple causes a MemoryError. Is there a way to pass process_map the extra argument without cloning B like this?
I suggest instead making the large array B shared between the processes, so that each worker reads the same buffer rather than receiving its own copy. To do that you can use multiprocessing.Array.
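A minimal sketch of that pattern, assuming a read-only float64 array (the init_worker name and the 4x4 test array are illustrative; as far as I know tqdm's process_map does not expose its pool's initializer, so this uses multiprocessing.Pool directly):

    import multiprocessing as mp
    import numpy as np

    def init_worker(shared_arr, shape):
        # Runs once per worker: wrap the shared buffer in a NumPy view
        # (no copy) and stash it in a module-level global.
        global B
        B = np.frombuffer(shared_arr, dtype=np.float64).reshape(shape)

    def test(a):
        entry = B[a, a]
        # insert complicated math here
        return entry

    if __name__ == "__main__":
        n = 4
        # lock=False gives a raw shared buffer, which is fine because
        # the workers only read from it.
        shared_arr = mp.Array('d', n * n, lock=False)
        B = np.frombuffer(shared_arr, dtype=np.float64).reshape(n, n)
        B[:] = np.arange(n * n, dtype=np.float64).reshape(n, n)

        agrid = range(n)
        with mp.Pool(processes=4, initializer=init_worker,
                     initargs=(shared_arr, (n, n))) as pool:
            results = pool.map(test, agrid, chunksize=1)
        print(results)

Prints (for example):

    [0.0, 5.0, 10.0, 15.0]

The key point is that shared_arr is allocated once in shared memory and handed to each worker at pool start-up, so every process reads the same buffer instead of receiving a pickled copy of B.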