How to implement single program multiple data (SPMD) in Python


I read the multiprocessing docs for Python and found that tasks can be assigned to different CPU cores. I'd like to run the following code (as a start) in parallel.

from multiprocessing import Process
import os

def do(a):
    for i in range(a):
        print(i)

if __name__ == "__main__":
    proc1 = Process(target=do, args=(3,))
    proc2 = Process(target=do, args=(6,))
    proc1.start()   
    proc2.start()

Now I get the output 0 1 2 and then 0 ... 5, but I want the output interleaved as 0 0 1 1 2 2 ..., i.e. I want proc1 and proc2 to run in parallel (not one after the other).
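For what it's worth, with the standard library alone the interleaving becomes visible if each iteration pauses briefly. A minimal sketch of the same two-process setup (the time.sleep delay and the join calls are additions, not part of the original code):

```python
from multiprocessing import Process
import time

def do(a):
    for i in range(a):
        time.sleep(1)  # slow each iteration so the interleaving is visible
        print(i)

if __name__ == "__main__":
    proc1 = Process(target=do, args=(3,))
    proc2 = Process(target=do, args=(6,))
    proc1.start()
    proc2.start()
    proc1.join()  # wait for both processes to finish before exiting
    proc2.join()
```

Without the delay, the first process usually finishes its whole loop before the second gets scheduled, which is why the output looks sequential.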


There are 2 best solutions below

Mike McKerns

So you can have your code execute in parallel just by using map. I am using a delay (with time.sleep) to slow the code down so it prints as you want it to. If you don't use the sleep, the first process will finish before the second starts… and you get 0 1 2 0 1 2 3 4 5.

>>> from pathos.multiprocessing import ProcessingPool as Pool
>>> import time
>>> p = Pool()
>>> 
>>> def do(a):
...   for i in range(a):
...     time.sleep(1)
...     print(i)
... 
>>> _ = p.map(do, [3,6])
0
0
1
1
2
2
3
4
5
>>> 

I'm using the multiprocessing fork pathos.multiprocessing because I'm the author and I was too lazy to put the code in a file. pathos enables you to do multiprocessing in the interpreter, but otherwise it's basically the same as multiprocessing.
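If you'd rather avoid the extra dependency, the same map-based approach works with the standard library's multiprocessing.Pool, as long as the worker function is defined at module level in a file (a sketch along the same lines, not the answerer's original code):

```python
from multiprocessing import Pool
import time

def do(a):
    # print 0..a-1, pausing so overlapping workers interleave their output
    for i in range(a):
        time.sleep(1)
        print(i)

if __name__ == "__main__":
    with Pool() as p:
        p.map(do, [3, 6])  # each argument runs in its own worker process
```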

gopi

You can also use the library pp. I prefer pp over multiprocessing because it allows parallel processing across different CPUs on the network. A function (func) can be applied to a list of inputs (args) with a few lines:

job_server = pp.Server(ncpus=num_local_procs, ppservers=nodes)
jobs = [job_server.submit(func, (arg,)) for arg in args]
result = [job() for job in jobs]

You can also check out more examples at: https://github.com/gopiks/mappy/blob/master/map.py
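For completeness, the same submit-then-collect pattern is also available in the standard library via concurrent.futures; a sketch using a hypothetical square function:

```python
from concurrent.futures import ProcessPoolExecutor

def square(x):
    return x * x

if __name__ == "__main__":
    with ProcessPoolExecutor() as executor:
        # submit one job per argument, then collect the results in order
        futures = [executor.submit(square, arg) for arg in [1, 2, 3]]
        result = [f.result() for f in futures]
        print(result)  # [1, 4, 9]
```

Unlike pp, this only distributes work across the cores of one machine, but it needs no third-party install.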