I'm writing my first multiprocessing application.
When the OS kills the server process spawned by multiprocessing.Manager(), whether because of an OOM condition or because I manually send it SIGKILL, some but not all of my child processes also terminate. My application is left in a bad state, and systemd does not restart it as I would like.
I tried to implement a watchdog child process to watch over its siblings and the manager's server process; however, it terminates along with the manager process.
It seems my options are either:
- Perform the watchdog function in the parent process, or:
- Have my child watchdog process send a keepalive to systemd via the sd_notify() API and configure systemd to restart the service absent this signal.
Are there other options worth considering? Which is best, and why?
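For reference, here is a minimal sketch of how the two options listed above could be combined: the parent process checks its children and only then sends the keepalive. It talks to systemd directly over the datagram socket named in the NOTIFY_SOCKET environment variable instead of using a binding library. The helper names (sd_notify, all_alive, watchdog_tick) are made up for illustration, and the sketch assumes the unit file sets Type=notify, a WatchdogSec= interval, and a Restart= policy such as on-failure. If the keepalive were sent from a child rather than the main process, NotifyAccess=all would presumably be needed as well.

```python
import os
import socket


def sd_notify(message: str) -> bool:
    """Send a state string to systemd.

    Returns False when NOTIFY_SOCKET is unset, i.e. when the process
    was not started by systemd with notification enabled.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), path)
    return True


def all_alive(processes) -> bool:
    """True if every process object reports that it is still running."""
    return all(p.is_alive() for p in processes)


def watchdog_tick(processes) -> bool:
    """Send one keepalive, but only while every watched process is alive.

    Once a process has died the keepalive is withheld, so systemd's
    watchdog timeout expires and the Restart= policy takes over.
    """
    if all_alive(processes):
        sd_notify("WATCHDOG=1")
        return True
    return False
```

The parent would call watchdog_tick(children) in its main loop at an interval comfortably shorter than WatchdogSec= (half of it is a common choice). Whether the manager's server process can be included in the watched list depends on how you obtain a handle to it, which this sketch leaves open.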
Unfortunately, there is no clean way to remove orphaned child processes when you SIGKILL the parent of the process tree. SIGKILL forcibly terminates a process and cannot be caught or handled, so the killed process gets no chance to clean up after itself. However, you can install a custom handler for SIGINT so that your code exits more gracefully...
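As a rough illustration of the custom signal handling mentioned above, the sketch below installs one handler for SIGINT and SIGTERM that stops the children before the parent exits. The function names shutdown and install_handlers are hypothetical, and the 5-second grace period is an arbitrary assumption. It does nothing for SIGKILL or an OOM kill, since those never reach a handler.

```python
import signal
import sys


def shutdown(children, timeout=5.0):
    """Stop every child: ask politely first, then force the stragglers."""
    for child in children:
        child.terminate()  # multiprocessing.Process.terminate sends SIGTERM
    for child in children:
        child.join(timeout)
        if child.is_alive():
            child.kill()  # last resort: SIGKILL (Python 3.7+)
            child.join()


def install_handlers(children):
    """Make SIGINT and SIGTERM in the parent stop the children, then exit."""

    def handle(signum, frame):
        shutdown(children)
        sys.exit(0)

    # signal.signal may only be called from the main thread.
    signal.signal(signal.SIGINT, handle)
    signal.signal(signal.SIGTERM, handle)
```

You would call install_handlers(children) in the parent after starting the multiprocessing.Process objects, so that with the fork start method the children do not inherit the handler, and then block in the parent's main loop.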