Using concurrent.futures.ProcessPoolExecutor when we do not have access to the __main__ module


I am running a large Python project with multiple files. I want to use a concurrent.futures.ProcessPoolExecutor to execute a compute-bound piece of code.

The project is organized as: main.py -> execute.py -> helper.py

I run main.py, which calls a function in execute.py, which in turn calls a function (func()) in helper.py. I want to speed up func(), which is currently the bottleneck.

I can use a ThreadPoolExecutor to speed the code up. However, because of Python's GIL, and because the code block to be optimized is compute-heavy, I would like to use a ProcessPoolExecutor in the hope of a larger speedup.

Now, here is the problem: it seems that, unlike a ThreadPoolExecutor, a ProcessPoolExecutor needs access to the __main__ module (see the first point in https://superfastpython.com/processpoolexecutor-common-errors/).

Hence, when I use a ProcessPoolExecutor inside func(), I get the following error: "concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending".

Is there any way I can leverage multiprocessing here (and create a process pool instead of a thread pool)? I cannot change the project structure (the way the files are organized in the project).

Any help would be appreciated!
