Handling multiprocessing.Manager() server process termination

I'm writing my first multiprocessing application.

When the OS kills the server process spawned by multiprocessing.Manager(), whether because of an OOM condition or because I manually send it SIGKILL, some but not all of my child processes also terminate. My application is left in a bad state, and systemd does not restart it as I would like.

I tried to implement a watchdog child process to watch over its siblings and the manager's server process; however, it terminates along with the manager process.

It seems my options are either:

  1. Perform the watchdog function in the parent process, or:
  2. Have my child watchdog process send a keepalive to systemd via the sd_notify() API and configure systemd to restart the service absent this signal.
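
For option 2, a minimal sketch of the keepalive is shown here. It assumes the service runs under systemd with WatchdogSec= set in the unit, and it implements the notification by writing a datagram to the socket named in the NOTIFY_SOCKET environment variable rather than calling the C sd_notify() function directly. The function name sd_notify below is my own helper, not a standard-library call.

```python
import os
import socket


def sd_notify(state: str) -> bool:
    """Send a state string such as "WATCHDOG=1" to systemd's notify socket.

    Returns False when NOTIFY_SOCKET is unset, which normally means the
    process was not started by systemd with notifications enabled.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    addr = os.fsencode(path)
    if addr.startswith(b"@"):
        # A leading "@" denotes a Linux abstract-namespace socket,
        # which is addressed with a leading NUL byte instead.
        addr = b"\0" + addr[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode("utf-8"), addr)
    return True
```

The watchdog would call sd_notify("WATCHDOG=1") at an interval comfortably shorter than WatchdogSec=, and only while every process it monitors is still healthy. Because the sender is a child rather than the main process, the unit would probably also need NotifyAccess=all, plus a Restart= setting such as on-watchdog or on-failure; check these against the systemd version in use.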

Are there other options worth considering? Which is best, and why?

There is 1 solution below

Entropy On

Unfortunately, there is no clean way to remove orphaned child processes when the parent of the process tree receives SIGKILL. SIGKILL cannot be caught or handled; it is intended to forcibly terminate a process. However, you can install a custom handler for SIGINT so that your code exits more gracefully:

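#!/usr/bin/env python
import signal
import sys
import time
import multiprocessing
from functools import partial  # optional; a lambda works too (see below), but partial is cleaner

def worker_function():
    # Restore Python's default SIGINT behavior. With the fork start method the
    # child inherits the parent's handler, so KeyboardInterrupt would never be raised here.
    signal.signal(signal.SIGINT, signal.default_int_handler)
    print("Worker process is running...")
    try:
        while True:
            # Your worker logic goes here
            time.sleep(1)  # placeholder; avoids spinning the CPU in a busy loop
    except KeyboardInterrupt:
        print("Worker process received Ctrl+C signal.")
    finally:
        print("Worker process is exiting.")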

def signal_handler(sig, frame, processes):
    print('Main process received Ctrl+C signal. Terminating child processes...')
    for process in processes:
        process.terminate()
        process.join()
    print('Main process is exiting.')
    sys.exit(0)

if __name__ == "__main__":
    processes = []  # Holds references to the child processes
    # A lambda that closes over `processes` also works, without functools:
    # signal.signal(signal.SIGINT, (lambda sig, frame: signal_handler(sig, frame, processes)))
    signal.signal(signal.SIGINT, partial(signal_handler, processes=processes))
    print('Press Ctrl+C to exit')

    try:
        for i in range(3):
            process = multiprocessing.Process(target=worker_function)
            processes.append(process)
            process.start()
        for process in processes:
            process.join()
    except KeyboardInterrupt:
        signal_handler(signal.SIGINT, None, processes)
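
Since SIGKILL cannot be intercepted, the death of a child (or of the manager's server process) has to be noticed from outside that process. As a rough sketch of the parent-side watchdog from option 1 in the question, the parent could poll its children and exit with a non-zero status as soon as one is gone, so that a unit configured with Restart=on-failure restarts the whole service. The names find_dead and supervise are hypothetical helpers, and the polling interval is an arbitrary choice. The manager's server process is itself a child of the process that created the manager, so it should show up in multiprocessing.active_children(), but that is worth verifying on your Python version.

```python
import sys
import time


def find_dead(processes):
    """Return the watched processes that are no longer alive."""
    return [p for p in processes if not p.is_alive()]


def supervise(processes, poll_interval=1.0):
    """Block until any watched process dies, stop the rest, and return 1."""
    while True:
        dead = find_dead(processes)
        if dead:
            for p in dead:
                print(f"{p.name} died (exitcode={p.exitcode})", file=sys.stderr)
            # Stop the survivors so nothing is left running in a bad state.
            for p in processes:
                if p.is_alive():
                    p.terminate()
            for p in processes:
                p.join()
            return 1
        time.sleep(poll_interval)
```

In the script above, the join loop in the main block could then be replaced with sys.exit(supervise(processes)). A process killed by a signal reports a negative exitcode (for example -9 for SIGKILL), which the log line makes visible.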