The problem (I think)
The contextlib.asynccontextmanager documentation gives this example:
@asynccontextmanager
async def get_connection():
conn = await acquire_db_connection()
try:
yield conn
finally:
await release_db_connection(conn)
It looks to me like this can leak resources. If this code's task is cancelled while this code is on its await release_db_connection(conn) line, the release could be interrupted. The asyncio.CancelledError will propagate up from somewhere within the finally block, preventing subsequent cleanup code from running.
So, in practical terms, if you're implementing a web server that handles requests with a timeout, a timeout firing at the exact wrong time could cause a database connection to leak.
Full runnable example
import asyncio
from contextlib import asynccontextmanager
async def acquire_db_connection():
await asyncio.sleep(1)
print("Acquired database connection.")
return "<fake connection object>"
async def release_db_connection(conn):
await asyncio.sleep(1)
print("Released database connection.")
@asynccontextmanager
async def get_connection():
conn = await acquire_db_connection()
try:
yield conn
finally:
await release_db_connection(conn)
async def do_stuff_with_connection():
async with get_connection() as conn:
await asyncio.sleep(1)
print("Did stuff with connection.")
async def main():
task = asyncio.create_task(do_stuff_with_connection())
# Cancel the task just as the context manager running
# inside of it is executing its cleanup code.
await asyncio.sleep(2.5)
task.cancel()
try:
await task
except asyncio.CancelledError:
pass
print("Done.")
asyncio.run(main())
Output on Python 3.7.9:
Acquired database connection.
Did stuff with connection.
Done.
Note that Released database connection is never printed.
My questions
- This is a problem, right? Intuitively to me, I expect
.cancel()to mean "cancel gracefully, cleaning up any resources used along the way." (Otherwise, why would they have implemented cancellation as exception propagation?) But I could be wrong. Maybe, for example,.cancel()is meant to be fast instead of graceful. Is there an authoritative source that clarifies what.cancel()is supposed to do here? - If this is indeed a problem, how do I fix it?
Focusing on protecting the cleanup from cancellation is a red herring. There is a multitude of things that can go wrong and the context manager has no way to know
It is the responsibility of the resource handling utilities to properly handle errors.
release_db_connectionmust not be cancelled, it must protect itself against cancellation.async withcontext manager. Further protection, e.g. against cancellation, may be involved internally as well.Note: Asynchronous cleanup is tricky. For example, a simple
asyncio.shieldis not sufficient if the event loop does not wait for shielded tasks. Avoid inventing your own protection and rely on the underlying frameworks to do the right thing.The cancellation of a task is a graceful shutdown that a) still allows async operations and b) may be delayed/suppressed. Coroutines being prepared to handle the
CancelledErrorfor cleanup is explicitly allowed.A forceful shutdown is
coroutine.close/GeneratorExit. This corresponds to an immediate, synchronous shutdown and forbids suspension viaawait,async fororasync with.