Multiprocessing: passing arguments to tqdm's process_map


My problem is as follows. I have a large array 'B' that I wish to operate on in parallel. My function looks like this:

def test(a,B):
    entry = B[a,a]
    #insert complicated math here
    result = B[a,a]
    return result

I understand that when the function takes no extra argument, I can simply use process_map as follows:

parallel_results_tqdm = process_map(test, agrid, max_workers=4, chunksize=1)

Here agrid is the list of 'a' values I wish to loop over. This works if the variable B is a global. However, I want to run the operation on different arrays and package it as a function.

One solution that would not work is to pass a list of my inputs as tuples, such that

agrid = [(0,B),(1,B),(2,B),(3,B)....]

However, in my case B is large, so copying it this way causes a MemoryError. Is there a way to pass process_map an extra argument without copying it like this?

Andrej Kesely

Instead of passing the large array B to every task, I suggest sharing a single copy of it between the processes. One way to do that is multiprocessing.Array, for example:

import ctypes
from multiprocessing import Array
from time import sleep

import numpy as np
from tqdm.contrib.concurrent import process_map

N = 1_000

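# Set in the main process below. Workers only see it when they are started
# by fork (the child inherits the parent's memory); under spawn it stays None.
B = None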


def test(a):
    arr = np.frombuffer(B.get_obj()).reshape((N, N))

    # complicated function
    sleep(1)

    # if you want to write to the shared array, take the lock of the
    # multiprocessing.Array (the NumPy view has no get_lock method):
    # with B.get_lock():
    #     arr[a, a] = ...

    return arr[a, a]


if __name__ == "__main__":
    B = Array(ctypes.c_double, N * N)
    arr = np.frombuffer(B.get_obj()).reshape((N, N))
    arr[:] = np.random.uniform(size=(N, N))

    parallel_results_tqdm = process_map(
        test,
        [1, 2, 3, 4],
        max_workers=2,
        chunksize=1,
    )

    print(parallel_results_tqdm)

Prints (for example):

100%|██████████████████████████████████████████████| 4/4 [00:02<00:00,  2.00it/s]
[0.0814044002819776, 0.37765868633809085, 0.9714163615430239, 0.6476169398743046]