How to pass unpicklable variables to multiprocessing pool?


I've been searching for days, but I can't figure out how to pass unpicklable variables to a multiprocessing Pool. Is it even possible?

I'm new to multiprocessing and may be missing a few concepts. I tried to adapt some basic examples to my case, where I'm trying to parallelize the extraction of an Abaqus result file. For this extraction to work, I need to pass each worker a "step" object, which is an unpicklable Abaqus object, as well as zip files open for writing. I've tried several functions of the multiprocessing library, but pickling looks like a showstopper for me.

Another constraint is that I have to use Python 2: I must use the Python distribution embedded in Abaqus, and they've switched to Python 3 only in the very latest version, which I'm not using yet.

Could someone share an example of multiprocessing.Pool with unpicklable variables? I don't mind moving away from the multiprocessing library if another one lets me parallelize the workers in my case. Just keep in mind that I can't modify the Python distribution in Abaqus and may not have access to exotic libraries...

EDIT: Here is a simplified version of my approach:

from odbAccess import *
import multiprocessing as mp

def extraction(step):
    # Do something with "step"
    pass

if __name__ == '__main__':
    odb = openOdb("path_to_my_odb_file")
    pool = mp.Pool(2)

    pool.map(extraction, odb.steps.values())
    pool.close()
    pool.join()

I get the error:

pickle.PicklingError: Can't pickle <type 'OdbStep'>: it's not found as __builtin__.OdbStep

There are 2 best solutions below

0
Avatar36 On BEST ANSWER

I've switched packages and am now using threading instead of multiprocessing. The syntax is more or less the same and there are far fewer limitations.
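For reference, here is a minimal sketch of the thread-based approach, assuming the thread pool in multiprocessing.pool (ThreadPool, which mirrors the Pool interface and is available in Python 2.7). The Step class is a hypothetical stand-in for an unpicklable object such as OdbStep; it is not Abaqus code. Threads share the parent's memory, so the arguments are never pickled.

```python
import threading
from multiprocessing.pool import ThreadPool


class Step(object):
    """Hypothetical stand-in for an unpicklable object such as OdbStep."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()  # lock objects cannot be pickled


def extraction(step):
    # The worker thread receives the very same object, not a pickled copy.
    with step.lock:
        return step.name.upper()


steps = [Step("step-1"), Step("step-2")]
pool = ThreadPool(2)
try:
    # map() returns the results in the same order as the inputs.
    results = pool.map(extraction, steps)
finally:
    pool.close()
    pool.join()
```

Keep in mind that, because of the GIL, threads will not speed up CPU-bound pure-Python work, and whether the Abaqus odb API is safe to call from several threads is not established here.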

2
Louis Lac On

Try to identify what strictly needs to be shared between concurrent workers (processes, in your case). In general, you want to avoid sharing mutable state (such as a writable file) between concurrent workers, since this requires locking and synchronisation on your side.

Try to share only read-only data between processes or threads, and have each worker return its processed data instead of mutating shared state. You can then gather the results from the concurrent workers in one thread or process (the main process, for instance) and write them to a file non-concurrently.
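This pattern can be sketched as follows. It is only an illustration under assumptions: extract_step and write_archive are hypothetical names, the string built in the worker stands in for the real extraction, and whether several processes can open the same odb file at once would need to be checked against the Abaqus documentation. Each worker receives only picklable values (a path and a step name) and returns plain data; the main process alone writes the zip file.

```python
import io
import zipfile
import multiprocessing as mp


def extract_step(task):
    """Worker: takes picklable inputs and returns picklable data."""
    odb_path, step_name = task
    # Assumption: in Abaqus, the worker would open the file itself here
    # (for example with openOdb) and look the step up by name, instead of
    # receiving the unpicklable step object from the parent process.
    data = "results of %s from %s" % (step_name, odb_path)
    return step_name, data


def write_archive(results):
    """Main process only: write every result into one zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for step_name, data in results:
            archive.writestr(step_name + ".txt", data)
    return buf.getvalue()


if __name__ == "__main__":
    # The step names are placeholders; only strings cross the process boundary.
    tasks = [("path_to_my_odb_file", name) for name in ("Step-1", "Step-2")]
    pool = mp.Pool(2)
    try:
        results = pool.map(extract_step, tasks)
    finally:
        pool.close()
        pool.join()
    archive_bytes = write_archive(results)
```

Since no file handle is shared between workers, no locking is needed around the zip file.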

If some data is unpicklable, passing it to worker processes is likely the wrong approach, and the design needs more careful thought.

You can post a minimal reproducible example so that we can see your current code and help you further.