Changes made in a function do not change the object in multiprocessing


In the code below, function one computes len(test.a) + 1 and appends it to test.a. The append does happen inside the function, but once the function returns, the object is back to how it was before. When I print(a1.a) I expect to get [1, 2, 3], but I am getting [1, 2].

import multiprocessing

class abc():
    def __init__(self):
        self.a=[1,2]
        self.b=[1,2,3]
def one(test):
    a=len(test.a)+1
    test.a.append(a)
    print('test.a==',test.a)
    return test

if __name__ == '__main__':
    with multiprocessing.Manager() as manager:
        a1=abc()
        a2=abc()

        b1 = manager.list([a1])
        b2 = manager.list([a2])
        # print('type(a1)==', type(a1[0]))
        p1 = multiprocessing.Process(target=one, args=[b1[0]])
        p2 = multiprocessing.Process(target=one, args=[b2[0]])

        p1.start()
        p2.start()
        p1.join()
        p2.join()


        print(a1.a)




output is:
test.a== [1, 2, 3]
test.a== [1, 2, 3]
[1, 2]

There are 2 best solutions below

Frank Yellin On BEST ANSWER

From the documentation:

If standard (non-proxy) list or dict objects are contained in a referent, modifications to those mutable values will not be propagated through the manager because the proxy has no way of knowing when the values contained within are modified. However, storing a value in a container proxy (which triggers a setitem on the proxy object) does propagate through the manager and so to effectively modify such an item, one could re-assign the modified value to the container proxy:

When you create the managed lists b1 and b2, the manager keeps track of changes to these lists only. You can append elements to the lists, delete them, or assign to them, and the manager sees it. But the manager does not keep track of the objects stored in the list. If you modify one of them in place, the change does not get propagated.
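
The behaviour quoted from the documentation can be reproduced in a single process. The following is only a minimal sketch: the function name demo_reassignment and the variable names are made up for illustration, and plain nested lists are used instead of the abc class. The last part assumes Python 3.6 or later, where the documentation states that proxies can be nested inside one another.

```python
import multiprocessing


def demo_reassignment():
    """Return snapshots showing which mutations reach the manager."""
    with multiprocessing.Manager() as manager:
        # A plain (non-proxy) list stored inside a managed list.
        shared = manager.list([[1, 2]])

        # shared[0] hands back a copy of the inner list, so appending
        # to that copy never reaches the manager process.
        shared[0].append(3)
        after_in_place = list(shared[0])

        # Fetch, modify, store back: the __setitem__ call on the proxy
        # is what sends the new value to the manager.
        inner = shared[0]
        inner.append(3)
        shared[0] = inner
        after_reassign = list(shared[0])

        # Contrast: when the inner list is itself a proxy, its methods
        # go through the manager and no re-assignment is needed.
        inner_proxy = manager.list([1, 2])
        nested = manager.list([inner_proxy])
        nested[0].append(3)
        after_nested = list(inner_proxy)

    return after_in_place, after_reassign, after_nested


if __name__ == "__main__":
    print(demo_reassignment())
```

The first snapshot should still show the original inner list, while the other two should include the appended value.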

Booboo On

You actually have two problems:

First, to amplify what Frank Yellin has said, you need to let the managed list know that one of its elements has been updated. In your code it does not know that the first element of the passed list has had an item appended to it. To let the managed list know that the first element has had an update, re-assign the first element back into the list as in the code below. To do that, you need to pass the actual list to worker function one.

The second issue is that the main process creates an instance of class abc and places it in list b1 (and does likewise for list b2). The managed list and its contents actually live in the process that is created when you execute the call to Manager(). What the main process and the processes running one hold are proxies for these actual lists. When you execute a method on a proxy, it sends the operation to the manager's process, which performs it on the actual list. So even if we now pass the list b1 itself to one (and an index so it knows which element it should be updating), the instance of abc that one gets to modify is a copy of the instance that exists in the main process. If you print out a1.a in the main process, it will not have changed, but the copy of a1 in the managed list will have changed.

class abc():
    def __init__(self):
        self.a = [1,2]
        self.b = [1,2,3]

def one(the_list, index):
    # Indexing the proxy returns a copy of the stored `abc` instance,
    # not the instance that the main process has:
    test = the_list[index]
    a = len(test.a) + 1
    test.a.append(a)
    print('test.a==', test.a)
    # Re-assign the element so the manager sees the update:
    the_list[index] = test
    # Returning test would accomplish nothing; Process discards the return value:
    #return test

if __name__ == '__main__':
    import multiprocessing

    with multiprocessing.Manager() as manager:
        a1 = abc()
        a2 = abc()

        b1 = manager.list([a1])
        b2 = manager.list([a2])
        # print('type(a1)==', type(a1[0]))
        p1 = multiprocessing.Process(target=one, args=(b1, 0))
        p2 = multiprocessing.Process(target=one, args=(b2, 0))

        p1.start()
        p2.start()
        p1.join()
        p2.join()

        # Our copies of a1 and a2 have not been updated
        #print(a1.a)
        # However, the lists have been updated:
        print(b1[0].a)

Prints:

test.a== [1, 2, 3]
test.a== [1, 2, 3]
[1, 2, 3]
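
To see why a1.a itself never changes, it can help to look at the copy in isolation. Values stored in or read from a managed list are pickled on their way to and from the manager's process. The sketch below only imitates that round trip with pickle, with no manager involved, and uses types.SimpleNamespace as a stand-in for the abc instance; both choices are assumptions made to keep the example small.

```python
import pickle
from types import SimpleNamespace

# Stand-in for the `abc` instance created in the main process.
original = SimpleNamespace(a=[1, 2])

# Imitate the trip through the manager: serialise, then deserialise.
worker_copy = pickle.loads(pickle.dumps(original))

# The worker appends to its own copy only.
worker_copy.a.append(len(worker_copy.a) + 1)

print(worker_copy is original)  # False: two distinct objects
print(original.a)               # [1, 2]
print(worker_copy.a)            # [1, 2, 3]
```

This is why the answer reads the result back with b1[0].a rather than a1.a: the object the main process originally created is never touched.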

Note that if you were passing the same list, say b1, to both processes, you would need to do the update under control of a lock. Function one fetches the element, modifies it, and stores it back, which is not an atomic operation, so without a lock one of the two updates could be lost. In the version below I have also changed some variable names to what I believe are more descriptive:

class abc():
    def __init__(self):
        self.a = [1,2]

def one(the_list, index, lock):
    # Hold the lock across the whole fetch-modify-store sequence:
    with lock:
        # Better names for the variables:
        abc_instance = the_list[index]
        lnth = len(abc_instance.a) + 1
        abc_instance.a.append(lnth)
        print('abc_instance.a:', abc_instance.a)
        # Re-assign the element so the manager sees the update:
        the_list[index] = abc_instance

if __name__ == '__main__':
    import multiprocessing

    with multiprocessing.Manager() as manager:
        a = abc()

        the_list = manager.list([a])
        lock = multiprocessing.Lock()
        p1 = multiprocessing.Process(target=one, args=(the_list, 0, lock))
        p2 = multiprocessing.Process(target=one, args=(the_list, 0, lock))

        p1.start()
        p2.start()
        p1.join()
        p2.join()

        print(the_list[0].a)

Prints:

abc_instance.a: [1, 2, 3]
abc_instance.a: [1, 2, 3, 4]
[1, 2, 3, 4]