Nested TaskGroup can silently swallow cancellation request from parent TaskGroup #116720

Closed
@arthur-tacca

Bug report

Bug description:

In the following code snippet, I start an asyncio.TaskGroup called outer_tg, then start another one within it called inner_tg. The inner task group is wrapped in a try/except* block that catches a specific type of exception. A background task in each task group raises a different exception at roughly the same time, so the inner task group (correctly) raises the inner exception in an ExceptionGroup, which the except* block catches and discards. However, this means that no exception is bubbling up through the call stack any more, so the main body of the outer task group just continues. It can even await further coroutines, although it cannot create more tasks in the outer group, because that group is still shutting down.

import asyncio

class ExceptionOuter(Exception):
    pass
class ExceptionInner(Exception):
    pass

async def raise_after(t, e):
    await asyncio.sleep(t)
    print(f"Raising {e}")
    raise e()

async def my_main():
    try:
        async with asyncio.TaskGroup() as outer_tg:
            try:
                async with asyncio.TaskGroup() as inner_tg:
                    inner_tg.create_task(raise_after(1, ExceptionInner))
                    outer_tg.create_task(raise_after(1, ExceptionOuter))
            except* ExceptionInner:
                print("Got inner exception")
            print("should not get here")
            await asyncio.sleep(0.2)
            print("waited")
            # outer_tg.create_task(asyncio.sleep(0.2)) # raises RuntimeError("TaskGroup shutting down")
    except* ExceptionOuter:
        print("Got outer exception")
    print("Done")

asyncio.run(my_main())

Expected vs observed behaviour:

Observed behaviour:

Raising <class '__main__.ExceptionInner'>
Raising <class '__main__.ExceptionOuter'>
Got inner exception
should not get here
waited
Got outer exception
Done

Expected behaviour: the rest of the main body of the outer task group is skipped, so we just see:

Raising <class '__main__.ExceptionInner'>
Raising <class '__main__.ExceptionOuter'>
Got inner exception
Got outer exception
Done

Variations:

Making either of these changes (or both) still gives the same issue:

  • Add a third line, await asyncio.sleep(10), to the body of the inner task group. The main body of the inner task group then ends with asyncio.CancelledError, rather than finishing before any task raises an exception.

  • Replace inner_tg.create_task(raise_after(1, ExceptionInner)) with inner_tg.create_task(raise_in_cancel()), where:

async def raise_in_cancel():
    try:
        await asyncio.sleep(10)
    except asyncio.CancelledError:
        raise ExceptionInner()

(It's fair to dismiss this second case because it violates the rule about not suppressing CancelledError.)

Root cause:

I think the problem is that TaskGroup.__aexit__() never includes CancelledError in the ExceptionGroup it raises if there are any other types of exception, even if a parent task group is shutting down. That appears to be due to this code in TaskGroup.__aexit__() (specifically the and not self._errors part, because self._errors is the list of all non-cancellation exceptions):

# Propagate CancelledError if there is one, except if there
# are other errors -- those have priority.
if propagate_cancellation_error and not self._errors:
    raise propagate_cancellation_error

I'm not sure of the right solution. Here are a couple of possibilities that come to mind, but I'm not sure if either would work or what the wider implications would be.

  • Perhaps, after all child tasks are completed, TaskGroup.__aexit__() needs to walk up the stack of parent task groups and see if any of those are shutting down, and if so include a CancelledError in the ExceptionGroup.
  • Maybe CancelledError needs to include some metadata about which task group caused it to be raised, so that it is only suppressed by the task group it's associated with (that's what Trio and AnyIO do, I believe).

CPython versions tested on:

3.11, 3.12

Operating systems tested on:

Windows
