I am trying to terminate a whole supervision tree from a supervised worker process. Here is my supervision tree:

                      +--------------------------+
                      |                          |
             +--------+ Sup1: Dynamic Supervisor +---------+
             |        |                          |         |
             |        +-------------+------------+         |
             |                      |                      |
             |                      |                      |
             v                      v                      v
    +------------------+   +------------------+   +------------------+
    |                  |   |                  |   |                  |
    | Job1: Supervisor |   | Job2: Supervisor |   | Job3: Supervisor |
    |                  |   |                  |   |                  |
    +------------------+   +-+-------------+--+   +------------------+
                             |             |
                             |             |
                             |             |
                             |             |
                             v             v
            +-------------------+   +--------------+
            |                   |   |              |
            | Progress Monitor: |   | Work: Worker |
            | Worker            |   |              |
            |                   |   +--------------+
            +-------------------+

Process life cycle:
- A `Job` is started via: `DynamicSupervisor.start_child(__MODULE__, spec)`
- Each job is a supervision tree as well: 1 supervisor (restart strategy - `one_for_one`) -> 2 workers
- `Progress Monitor` worker knows when the given job is done
- On job done, `Progress Monitor` worker makes an attempt to terminate the whole job supervision tree, by calling: `DynamicSupervisor.terminate_child(__MODULE__, pid)`
- `Progress Monitor` is expected to do cleanup steps in `terminate` callback - it is trapping exit signals
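
The life cycle above could look roughly like this in code. This is only a minimal sketch: the module names (`Sketch.Sup1`, `Sketch.Job`, `Sketch.ProgressMonitor`, `Sketch.Work`) and the `start_job/1` helper are made up for illustration, and both workers are empty GenServers standing in for the real ones.

```elixir
defmodule Sketch.Work do
  use GenServer

  def start_link(arg), do: GenServer.start_link(__MODULE__, arg)

  @impl true
  def init(arg), do: {:ok, arg}
end

defmodule Sketch.ProgressMonitor do
  use GenServer

  def start_link(arg), do: GenServer.start_link(__MODULE__, arg)

  @impl true
  def init(arg) do
    # Trap exits so terminate/2 runs when the job supervisor sends :shutdown.
    Process.flag(:trap_exit, true)
    {:ok, arg}
  end

  @impl true
  def terminate(_reason, _state) do
    # Cleanup steps would go here (hypothetical placeholder).
    :ok
  end
end

defmodule Sketch.Job do
  use Supervisor

  def start_link(arg), do: Supervisor.start_link(__MODULE__, arg)

  @impl true
  def init(arg) do
    # One supervisor per job, two workers, restart strategy :one_for_one.
    children = [{Sketch.ProgressMonitor, arg}, {Sketch.Work, arg}]
    Supervisor.init(children, strategy: :one_for_one)
  end
end

defmodule Sketch.Sup1 do
  use DynamicSupervisor

  def start_link(arg), do: DynamicSupervisor.start_link(__MODULE__, arg, name: __MODULE__)

  # Starts one job supervision tree under the dynamic supervisor.
  def start_job(arg), do: DynamicSupervisor.start_child(__MODULE__, {Sketch.Job, arg})

  @impl true
  def init(_arg), do: DynamicSupervisor.init(strategy: :one_for_one)
end
```

After `Sketch.Sup1.start_link([])`, each call to `Sketch.Sup1.start_job(arg)` would add one job subtree with its two workers.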
Problems and observations:
- `DynamicSupervisor.terminate_child` is a blocking call, which means it waits for all child processes to terminate as well, including the calling process - `Progress Monitor`
- `Progress Monitor` is in a deadlock and can not terminate. Parent supervisor sends `:kill` signal, which does not trigger `terminate` callback
Quick workarounds:
- Call `DynamicSupervisor.terminate_child` from `Progress Monitor` worker asynchronously: `spawn(fn -> DynamicSupervisor.terminate_child(__MODULE__, pid) end)`
- Define shutdown strategy for `Sup1: Dynamic Supervisor`: `shutdown: 5_000`. It will wait at most 5s for a job supervision tree termination and then it will send `shutdown` exit signal. This will ensure `terminate` callback being called for `Progress Monitor` process.
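
For reference, the first workaround might be sketched as follows. The names here are assumptions made for the example: the `AsyncSketch.*` modules, the `:job_done` cast that stands in for "the job is done", and the `notify` pid that only exists so the cleanup can be observed from outside. The job supervisor passes its own pid to the monitor, and the `Work` worker is left out to keep the sketch short.

```elixir
defmodule AsyncSketch.Sup1 do
  use DynamicSupervisor

  def start_link(arg), do: DynamicSupervisor.start_link(__MODULE__, arg, name: __MODULE__)

  @impl true
  def init(_arg), do: DynamicSupervisor.init(strategy: :one_for_one)
end

defmodule AsyncSketch.ProgressMonitor do
  use GenServer

  # `job_sup` is the pid of the owning job supervisor,
  # `notify` is a pid that gets a message when cleanup runs.
  def start_link({job_sup, notify}), do: GenServer.start_link(__MODULE__, {job_sup, notify})

  @impl true
  def init(state) do
    # Needed so that terminate/2 is invoked on a :shutdown exit signal.
    Process.flag(:trap_exit, true)
    {:ok, state}
  end

  @impl true
  def handle_cast(:job_done, {job_sup, _notify} = state) do
    # Run the blocking call in a separate, unlinked process so this
    # GenServer stays free to handle its own shutdown.
    spawn(fn -> DynamicSupervisor.terminate_child(AsyncSketch.Sup1, job_sup) end)
    {:noreply, state}
  end

  @impl true
  def terminate(reason, {_job_sup, notify}) do
    # Cleanup steps would go here; the message only makes them observable.
    send(notify, {:cleaned_up, reason})
    :ok
  end
end

defmodule AsyncSketch.Job do
  use Supervisor

  def start_link(notify), do: Supervisor.start_link(__MODULE__, notify)

  @impl true
  def init(notify) do
    # self() is the job supervisor here, so the monitor learns its pid.
    children = [{AsyncSketch.ProgressMonitor, {self(), notify}}]
    Supervisor.init(children, strategy: :one_for_one)
  end
end
```

In this sketch the monitor receives the `:shutdown` signal from its job supervisor while it is idle in its receive loop, so `terminate/2` gets a chance to run before the process exits.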
Not happy with either of them.
Questions:
- How to trigger supervision tree termination from a worker process and avoid deadlocks?
- If terminating a supervision tree from a worker is not the best practice, what is the recommended way then?
- Any recommendations on how to redesign the supervision tree to make graceful termination easier?
Just call it in an async task:
`Task.async(fn -> Supervisor.stop(Sup1, :shutdown) end)`
It will terminate Sup1, and with it all children will shut down.
EDIT:
If you need a prettier solution, it depends on what else you need. In most cases, I create a Bootstrapper worker that does initialization and some other stuff. You could easily add other features.
So considering the above, and just roughly speaking, I would add, in a layer above (`AppSupervisor`), another DynamicSupervisor so it can start the Bootstrapper and pass `self()` to it (or register it under a local name to avoid this injection). After that, on start, the Bootstrapper worker will start Sup1 (your dynamic supervisor) and wait for other messages, e.g. `:terminate_sup1`, which will shut down the `Sup1` process. Later, in any of the workers below, you can shut down `Sup1` by casting a `:terminate_sup1` message to the Bootstrapper. This also leaves the door open to start Sup1 again when another message is sent to the Bootstrapper worker.

Furthermore, if you just need to shut down Sup1, just go with the Task. But if you need control, then put it into a single worker process that has control over when it is up or down.
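
A rough sketch of the Bootstrapper idea, under a few assumptions: the module names (`BootSketch.Sup1`, `BootSketch.Bootstrapper`) and the `:start_sup1` message are invented for the example, both processes are registered under local names, and the extra `AppSupervisor` layer is left out. The sketch uses `DynamicSupervisor.stop/2` to shut Sup1 down, which is one possible choice rather than the only one.

```elixir
defmodule BootSketch.Sup1 do
  use DynamicSupervisor

  def start_link(arg), do: DynamicSupervisor.start_link(__MODULE__, arg, name: __MODULE__)

  @impl true
  def init(_arg), do: DynamicSupervisor.init(strategy: :one_for_one)
end

defmodule BootSketch.Bootstrapper do
  use GenServer

  def start_link(arg), do: GenServer.start_link(__MODULE__, arg, name: __MODULE__)

  # Casts return immediately, so a worker below Sup1 can send them
  # without waiting for its own termination.
  def terminate_sup1, do: GenServer.cast(__MODULE__, :terminate_sup1)
  def start_sup1, do: GenServer.cast(__MODULE__, :start_sup1)

  @impl true
  def init(_arg) do
    # Sup1 is linked to this process; trapping exits keeps the
    # Bootstrapper alive when Sup1 shuts down.
    Process.flag(:trap_exit, true)
    {:ok, sup} = BootSketch.Sup1.start_link([])
    {:ok, %{sup: sup}}
  end

  @impl true
  def handle_cast(:terminate_sup1, %{sup: sup} = state) when is_pid(sup) do
    # Blocks until Sup1 and all of its children have terminated.
    :ok = DynamicSupervisor.stop(sup, :shutdown)
    {:noreply, %{state | sup: nil}}
  end

  def handle_cast(:start_sup1, %{sup: nil} = state) do
    {:ok, sup} = BootSketch.Sup1.start_link([])
    {:noreply, %{state | sup: sup}}
  end

  # Requests that do not fit the current state are ignored.
  def handle_cast(_msg, state), do: {:noreply, state}

  @impl true
  def handle_info({:EXIT, pid, _reason}, %{sup: pid} = state) do
    # Sup1 went down on its own; forget the stale pid.
    {:noreply, %{state | sup: nil}}
  end

  def handle_info(_msg, state), do: {:noreply, state}
end
```

A worker somewhere below Sup1 would then call `BootSketch.Bootstrapper.terminate_sup1()` when its job is done, and `BootSketch.Bootstrapper.start_sup1()` could bring the dynamic supervisor back later.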