I'm working on a make-like build system written in Python, and I want to throttle how many cores are in use during parallel builds, similar to the -j/--jobs option supported by GNU make. Each build "job" is an asyncio.Task and may spawn subprocesses as part of its work. I require jobs to spawn external processes only through a function I provide, so I can track how many external processes are running at any given time, add to or subtract from the in-use core count accordingly (for now I naively assume every external process is single-threaded), and block waiting for free cores if necessary using a semaphore.
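Roughly, the spawn wrapper looks like this (simplified; `CoreThrottle` and its methods are placeholder names for my own code, not an existing API):

```python
import asyncio
import sys

class CoreThrottle:
    """Limit concurrent external processes to `jobs` cores (like make -j)."""

    def __init__(self, jobs: int):
        self._sem = asyncio.Semaphore(jobs)

    async def run(self, *argv: str) -> int:
        # Acquire a core before spawning; release it when the process exits.
        async with self._sem:
            proc = await asyncio.create_subprocess_exec(*argv)
            return await proc.wait()

async def build():
    throttle = CoreThrottle(jobs=2)
    # Four single-threaded jobs, at most two running at once.
    cmds = [(sys.executable, "-c", "pass")] * 4
    return await asyncio.gather(*(throttle.run(*c) for c in cmds))

results = asyncio.run(build())
print(results)  # → [0, 0, 0, 0]
```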
Separate from the external processes, there is the question of CPython itself. A build job may launch a subprocess asynchronously, in which case 2 cores should be required: one for CPython and one for the external job. However, if the build job launches a subprocess and waits on it, and no other build jobs are running (every other asyncio.Task is blocked), then only 1 core should be required, since CPython as a whole is blocked.
In other words, since the CPython interpreter is single-threaded even when using asyncio.Tasks, I think CPython itself should only ever count as consuming exactly 0 cores (all tasks blocked) or exactly 1 core (at least one task runnable). However, this requires being able to query, from within the current task, whether any other tasks are currently runnable. If none are, and we are about to block waiting on a process to complete, we should temporarily decrement the in-use core count since CPython is about to sleep, and then increment it back as soon as any task resumes running.
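In code, the pattern I'd like is something like the following, where `other_tasks_runnable()` is the hypothetical query I'm asking about (stubbed here to always answer yes so the sketch runs):

```python
import asyncio
import sys

def other_tasks_runnable() -> bool:
    # Hypothetical query: "is any other asyncio.Task currently runnable?"
    # asyncio exposes no such thing; stubbed pessimistically for this sketch.
    return True

class CoreCount:
    def __init__(self, jobs: int):
        self._sem = asyncio.Semaphore(jobs)

    async def wait_for_process(self, proc) -> int:
        if not other_tasks_runnable():
            # CPython itself is about to sleep: give its core back while we wait.
            self._sem.release()
            try:
                return await proc.wait()
            finally:
                await self._sem.acquire()
        return await proc.wait()

async def demo():
    cores = CoreCount(jobs=2)
    proc = await asyncio.create_subprocess_exec(sys.executable, "-c", "pass")
    return await cores.wait_for_process(proc)

rc = asyncio.run(demo())
print(rc)  # → 0
```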
Is this possible with asyncio? I see that I can query whether a task is done, but I don't see any notion of "runnable". Do I need my own event loop implementation for this? How would I do it?
I think it's hard, next to impossible, to implement without a custom event loop or some monkey-patching shenanigans.
Remember, a task can wait on another internal awaitable such as `asyncio.sleep(0)`. Does that mean the loop is going to stop? Theoretically the task is waiting; practically it is sleeping for 0 seconds. You could set an ugly timer for 0.00001 seconds and check whether the task resumed, but that's ugly.

What you're looking for is either to patch the selectors and force a selector event loop, so that if `select()` receives a timeout greater than 0 you know the loop is about to sleep (barring incoming network activity or other I/O), or to write a completely new event loop, with the major reconstruction in the `_run_once()` section.

I don't think it is possible through introspection of `all_tasks()`: that would require lots of ugly jumps into the internals, would probably end up needing the same monkey patching anyway, and would also mean building a dependency tree for the tasks.

Altogether, it might be easier to approach this from outside the box: introspect the Python process's CPU usage, or just let the OS schedule and choose the cores, launching one additional process just in case.
If you want, I can probably take a stab at monkey patching the selector event loop. Shouldn't be too hard...
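For instance, a first pass at the idea (it relies on the CPython implementation detail that `_run_once()` passes a timeout of 0 to `select()` when callbacks are ready, and `None` or a positive value when the loop is about to block):

```python
import asyncio
import selectors

class SleepReportingSelector(selectors.DefaultSelector):
    """Report when select() is about to block, i.e. the loop will sleep."""

    def __init__(self, on_sleep):
        super().__init__()
        self._on_sleep = on_sleep

    def select(self, timeout=None):
        # _run_once() passes timeout == 0 when callbacks are ready to run;
        # None or > 0 means nothing is runnable and the loop is about to
        # sleep (barring incoming I/O).
        if timeout is None or timeout > 0:
            self._on_sleep()
        return super().select(timeout)

sleeps = []
loop = asyncio.SelectorEventLoop(SleepReportingSelector(lambda: sleeps.append(1)))
try:
    # asyncio.sleep(0.01) forces the loop to block on select() at least once.
    loop.run_until_complete(asyncio.sleep(0.01))
finally:
    loop.close()
print(len(sleeps) > 0)  # → True
```

From the `on_sleep` callback you could decrement your in-use core count, and increment it again when `select()` returns.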