Is list[str] an iterable?


Python 3.10 doesn't think so:

Python 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:38:29) [Clang 13.0.1 ] \
    on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from typing import Iterable
>>> isinstance(list[str], Iterable)
False
>>> list(list[str])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'types.GenericAlias' object is not iterable

Python 3.11 thinks it is:

Python 3.11.0 | packaged by conda-forge | (main, Jan 15 2023, 05:44:48) [Clang 14.0.6 ] \
    on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from typing import Iterable
>>> isinstance(list[str], Iterable)
True
>>> list(list[str])
[*list[str]]

If it is an iterable, what should the result of iterating over it be? The *list[str] item appears to be the unpacking of itself or of a type variable tuple.
What's going on here? I know that typing in Python is in a state of flux and evolving rapidly, but I really don't know how to interpret this.

Update: Fixed typo in 3.10 example as noted by Daniil Fajnberg

Update: I didn't want a long post for a seemingly edge issue, but I suppose a bit of background is necessary.

I know of at least one case where this has caused problems. I frequently use the Fastcore lib. In notebook environments, its test_eq helper function is handy for documenting/testing code. It checks for equality (==).

Before Python 3.11, the following code was okay:

test_eq(_generic_order((list[str],)), (list[str],))

(_generic_order is a function that orders annotations by "genericity", which is not important here)

In Python 3.11:

test_eq(_generic_order((list[str],)), (list[str],))

RecursionError                            Traceback (most recent call last)
Cell In[81], line 1
----> 1 test_eq(_generic_order((list[str],)), (list[str],))
...
File ~/dev/repo/project/rei/.micromamba/envs/rei/lib/python3.11/site-packages/fastcore/imports.py:33, in <genexpr>(.0)
     31 "Compares whether `a` and `b` are the same length and have the same contents"
     32 if not is_iter(b): return a==b
---> 33 return all(equals(a_,b_) for a_,b_ in itertools.zip_longest(a,b))
...
File ~/dev/repo/project/rei/.micromamba/envs/rei/lib/python3.11/typing.py:1550, in _SpecialGenericAlias.__subclasscheck__(self, cls)
   1548     return issubclass(cls.__origin__, self.__origin__)
   1549 if not isinstance(cls, _GenericAlias):
-> 1550     return issubclass(cls, self.__origin__)
   1551 return super().__subclasscheck__(cls)

File <frozen abc>:123, in __subclasscheck__(cls, subclass)

RecursionError: maximum recursion depth exceeded in comparison

test_eq internally checks whether its arguments are iterable and, if so, compares them item by item. Because list[str] is now an iterable that yields an unpacked version of itself, which is in turn iterable, the comparison recurses forever.

If test_eq wants to be truly general, it should probably guard against recursion. But it's understandable that a case where a container contains itself at the first level is rare. And test_eq is not the focus of the question, just an example of a problem caused by an undocumented 3.11 change.
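For illustration, here is a minimal sketch of what such a guard could look like. This is not Fastcore's code: safe_equals and is_type_hint are hypothetical names, and the idea is simply to treat type hints (and strings) as atoms before attempting an item-wise comparison.

```python
import types
import typing
from collections.abc import Iterable
from itertools import zip_longest


def is_type_hint(obj):
    """Hypothetical helper: True for list[str], typing.List[str] and similar."""
    return isinstance(obj, types.GenericAlias) or typing.get_origin(obj) is not None


def safe_equals(a, b):
    """Hypothetical item-wise equality that never iterates over type hints."""
    # Strings iterate over themselves too, so compare them directly.
    if isinstance(a, (str, bytes)) or isinstance(b, (str, bytes)):
        return a == b
    # Guard: generic aliases are compared as atoms, never unpacked.
    if is_type_hint(a) or is_type_hint(b):
        return a == b
    if isinstance(a, Iterable) and isinstance(b, Iterable):
        missing = object()  # marks a length mismatch between a and b
        return all(
            safe_equals(x, y)
            for x, y in zip_longest(a, b, fillvalue=missing)
        )
    return a == b


print(safe_equals((list[str],), (list[str],)))  # True
print(safe_equals((list[str],), (list[int],)))  # False
```

The sketch behaves the same on 3.10 and 3.11 because the alias check runs before any iterability check. It ignores details a real helper would need, such as dicts or one-shot iterators.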

Note that the old generics (List, Tuple, etc) are implemented in the typing module, but the generic aliases of built-in types (list, tuple) are implemented by GenericAlias in C. Thus, a Python maintainer has put substantial effort into changing the behavior of GenericAlias to make its instances iterables, and not only that, but iterables of themselves. There isn't a single mention of this fact in the documentation or change logs (and I have not yet had time to dig through the CPython git history).

Is it an edge case? Probably. Python docs frequently warn us that using type annotations for purposes other than type hinting is discouraged. At the same time, type hint introspection becomes more powerful with every Python version.

Around the time typing was introduced, we also obtained dataclasses that are not only useful data containers by themselves but also elegant examples of leveraging type annotations dynamically at runtime. Pydantic and an ever-growing number of tools are using annotations to check/change code, and I have found myself using them more and more in my own code.

We put a lot of effort into typing because it's a powerful and useful tool, one that is likely useful aside from type checking. As a developer, I want to understand my tools, particularly one as important as typing.

So, the question remains: I'm curious not about how, which is trivial, but about why GenericAlias suddenly became iterable.


4
ventaquil On

The typing module doesn't contain parent classes; it only provides support for type hints (mostly used by IDEs, linters and so on).

list[str] is an iterable, and you can check it with the iter function.

>>> iter(["a", "b", "c"])
<list_iterator object at 0x7fee38ee7580>

You can read this answer for more details - https://stackoverflow.com/a/1952481

0
Fallible On

Thanks to @anthonysotille and @SUTerliakov for the hints.

What we are seeing here is an unintended consequence of the decisions taken when designing Variadic Generics (PEP 646) support for 3.11, in particular the implementation of the unpack operator * in type annotations that involve TypeVarTuple.

Unpack is

A typing operator that conceptually marks an object as having been unpacked. ...

In 3.10:

>>> from typing_extensions import Unpack
>>> from typing import Tuple
>>> Tuple[int, str], type(Tuple[int, str]), Unpack[Tuple[int, str]]
(typing.Tuple[int, str], <class 'typing._GenericAlias'>, typing_extensions.Unpack[typing.Tuple[int, str]])

We already know that generic aliases weren't iterables,

>>> list(list[str])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'types.GenericAlias' object is not iterable

because they didn't need to be.

Unpack works with any generic, in fact any type:

>>> Unpack[list[int]], Unpack[str]
(typing_extensions.Unpack[list[int]], typing_extensions.Unpack[str])

The runtime representation of Unpack is very lean, like any other typing construct, and involves just the typing module. Although type hints are useful outside type-checking contexts, Python tries very hard not to let them impact runtime performance.

In 3.11, however, the Python maintainers decided to use the star operator * as syntactic sugar for Unpack, which required minor grammar changes and implementing __iter__ for generic aliases, since * calls __iter__ on its operand. For List, Tuple and the other old-school generics, changing typing._GenericAlias is enough, but list and the rest of the built-in container types need changes to types.GenericAlias in CPython. This has implications outside type contexts, see details here.

>>> Tuple[int, str], type(Tuple[int, str]), [*Tuple[int, str]]
(typing.Tuple[int, str], <class 'typing._GenericAlias'>, [*typing.Tuple[int, str]])

>>> list[str], type(list[str]), [*list[str]], [Unpack[list[str]]]
(list[str], <class 'types.GenericAlias'>, [*list[str]], [*list[str]])

When unpacking, GenericAlias.__iter__ simply yields a single item: another alias instance marked as unpacked.

>>> from typing import List
>>> type(list(List[str])[0]), hasattr(list(List[str])[0], '__unpacked__')
(<class 'typing._UnpackGenericAlias'>, False)

>>> type(list(list[str])[0]), list(list[str])[0].__unpacked__
(<class 'types.GenericAlias'>, True)
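To make the mechanism concrete, here is a toy pure-Python imitation. ToyAlias is a hypothetical class invented for this sketch, not CPython's implementation; it only shows how the star operator ends up calling __iter__, and how an __iter__ that yields a flagged copy of the object produces the [*list[str]] output seen above.

```python
class ToyAlias:
    """Hypothetical stand-in for a generic alias such as list[str]."""

    def __init__(self, origin, args, unpacked=False):
        self.origin = origin
        self.args = args
        self.unpacked = unpacked

    def __repr__(self):
        params = ", ".join(arg.__name__ for arg in self.args)
        star = "*" if self.unpacked else ""
        return f"{star}{self.origin.__name__}[{params}]"

    def __iter__(self):
        # Yield exactly one item: a copy of this alias flagged as unpacked.
        yield ToyAlias(self.origin, self.args, unpacked=True)


alias = ToyAlias(list, (str,))
items = [*alias]          # the star calls ToyAlias.__iter__
print(alias)              # list[str]
print(items)              # [*list[str]]
print(items[0].unpacked)  # True
```

Note that the yielded copy is itself iterable, so any code that naively recurses into "iterables" never reaches a leaf. That is the same shape of problem as the RecursionError in the question.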

In conclusion, GenericAlias instances are iterable in Python 3.11 because of how the star form of Unpack was implemented, which required minor grammar changes and an __iter__ method on generic aliases. For further details and implications, please refer to the resources provided in the answer.
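If you need code that runs on both sides of this change, a small feature check avoids hard-coding version numbers. The following is only a sketch; the variable names are mine, and it relies solely on the behavior shown in the transcripts above.

```python
import sys
from collections.abc import Iterable

alias = list[str]

# False on 3.10 and earlier, True from 3.11 on (see the transcripts above).
alias_is_iterable = isinstance(alias, Iterable)

if alias_is_iterable:
    # Iteration yields a single item: the same alias marked as unpacked.
    (item,) = list(alias)
    print(repr(item), item.__origin__, item.__args__)
    print(getattr(item, "__unpacked__", None))
else:
    print("list[str] is not iterable on Python", sys.version_info[:2])
```

Checking the object rather than sys.version_info keeps the code honest if the behavior ever changes again.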