Context
I have just read PEP 586. In the motivation, the authors say the following:
"numpy.unique will return either a single array or a tuple containing anywhere from two to four arrays depending on three boolean flag values.
(...)
There is currently no way of expressing the type signatures of these functions: PEP 484 does not include any mechanism for writing signatures where the return type varies depending on the value passed in.
We propose adding Literal types to address these gaps."
But I don't really get how adding Literal types helps with that. I also wouldn't agree with the statement that
"PEP 484 does not include any mechanism for writing signatures where the return type varies depending on the value passed in."
As far as I understand, Union can be used in such a case.
Question
How can the numpy.unique return type be annotated with Literal?
The problem that Literal solves
You took exactly the right passage from PEP 586. To highlight the two crucial words here one more time, this is about signatures where the return type varies depending on the value passed in.
That is one of the applications for the Literal type. And the statement is in fact correct.
Could you annotate a function that returns one of two different types under various (not further defined) circumstances before? Sure, as you correctly pointed out, a Union can be used for that.
Could you annotate a function that returns one of two different types depending on different argument types (or combinations thereof) passed into it? Yes, that is what the @overload decorator is for.
But annotate a function that returns one of two different types depending on the value of an argument passed into it? This was not possible before Literal.
To accomplish that, we now use Literal in combination with the @overload decorator. Consider the following example before we get to np.unique.
Simple example
Say I have a very silly function double that doubles a float passed as an argument to it. But it can return either a float again or return it as a str, if a special flag is set as well:
Now, this annotation is perfectly fine. The return type captures both possible situations, the float and the str being returned.
But, say now I have another function that accepts only a str:
What do I do if I want to pass the output of double as an argument to need_str?
This is a problem for a strict type checker. The code runs fine, because we know that output is a string since we pass as_string=True. But the static type checker (mypy here) only sees the return type of the first and the parameter type of the second function and rightfully complains:
It sees that output could well be a float. It doesn't know what double does inside. How do we fix that? Well, before Literal, the simplest solution I can think of would have been to do something like this:
That is reasonable, satisfies the type checker and gets the job done.
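As a rough reconstruction of the setup described so far, the sketch below shows the Union-annotated double, the str-only need_str, and a manual narrowing step of the kind a pre-Literal workaround would need. The names double, need_str, as_string and output come from the text; the parameter names num and string, the function bodies, and the choice of an isinstance assertion as the workaround are assumptions for illustration.

```python
from typing import Union


def double(num: float, as_string: bool = False) -> Union[float, str]:
    """Double `num`; return the result as a `str` if `as_string` is set."""
    result = num * 2
    return str(result) if as_string else result


def need_str(string: str) -> None:
    """Stand-in for any function that accepts only a `str`."""
    print(string)


output = double(1.1, as_string=True)
# A strict checker only sees `Union[float, str]` here, so passing `output`
# directly to `need_str` gets flagged. Narrowing by hand resolves that:
assert isinstance(output, str)
need_str(output)
```

The assertion works, but it has to be repeated at every call site where the caller already knows which type comes back.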
But now that we have Literal, we can solve this (arguably) much more elegantly:
Now, if I try this again, the type checker understands the specific call to double, infers the returned value to be of type str and considers the next function call to be type safe:
Adding reveal_type(output) makes mypy tell us Revealed type is "builtins.str".
I hope this illustrates the capabilities this introduces and that they did not exist before. There are other things you can do with Literal, but that is off-topic.
How this helps np.unique
As the documentation you linked reveals, np.unique has essentially four different possible return types:
- an array with the same dtype as ar
- an array with the same dtype as ar, followed by one integer array
- an array with the same dtype as ar, followed by two integer arrays
- an array with the same dtype as ar, followed by three integer arrays
Which type it is (as well as the meaning of the values) depends entirely on the values passed to the parameters return_index, return_inverse, and return_counts: with all three set to False (the default) you get the single array, and each flag set to True adds one integer array.
Thus, the situation is analogous to the simple example from above. It's just that there are a lot more @overloads to define, since we have 2^3 = 8 combinations of arguments to reflect in our calls.
Now, if I had too much time on my hands and wanted to write a useless wrapper around np.unique, I would demonstrate how Literal can be used to properly annotate all different call variations and satisfy even the strictest type checker...
*sigh*
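Since the np.unique case is the same pattern repeated eight times, it may help to see the pattern once in isolation. The following is a minimal sketch of the simple double example annotated with Literal and @overload. The names double, need_str, as_string and output are taken from the text; the parameter names num and string and the function bodies are assumptions.

```python
from typing import Literal, Union, overload


@overload
def double(num: float, as_string: Literal[False] = ...) -> float: ...


@overload
def double(num: float, as_string: Literal[True]) -> str: ...


def double(num: float, as_string: bool = False) -> Union[float, str]:
    """Double `num`; return the result as a `str` if `as_string` is set."""
    result = num * 2
    return str(result) if as_string else result


def need_str(string: str) -> None:
    """Stand-in for any function that accepts only a `str`."""
    print(string)


# The literal `True` selects the second overload, so a type checker can
# infer `str` for `output` without any manual narrowing.
output = double(1.1, as_string=True)
need_str(output)
```

Each overload ties one literal flag value to one return type; the undecorated definition at the end is the only one that exists at runtime. PEP 586's own examples also add a fallback overload for non-literal arguments, which is left out here for brevity.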
A useless wrapper around np.unique
It is worth noting that with such extensive overloading, the possibilities are theoretically much greater. If it so happened that one of the options would produce an array of yet another different dtype of elements, we could still properly annotate that case here.
It is also worth mentioning that IMHO this goes too far. I don't think this is good style. A function should not have this many fundamentally distinct call signatures. It's what some would call "code smell"...
But as for the typing capabilities, I say it is better to have them and not need them than the other way around.
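For a feel of what such an overload set looks like, here is a reduced, dependency-free sketch. It is not NumPy's API: unique_list is a hypothetical pure-Python stand-in that works on plain lists of ints and mimics only two of the three flags (return_index and return_counts, made keyword-only here), so it needs 2^2 = 4 overloads rather than 8. A wrapper around the real np.unique would presumably follow the same shape, with array types in place of the lists.

```python
from typing import List, Literal, Sequence, Tuple, Union, overload

Ints = List[int]  # stand-in for an array type


@overload
def unique_list(ar: Sequence[int], *, return_index: Literal[False] = ...,
                return_counts: Literal[False] = ...) -> Ints: ...


@overload
def unique_list(ar: Sequence[int], *, return_index: Literal[True],
                return_counts: Literal[False] = ...) -> Tuple[Ints, Ints]: ...


@overload
def unique_list(ar: Sequence[int], *, return_index: Literal[False] = ...,
                return_counts: Literal[True]) -> Tuple[Ints, Ints]: ...


@overload
def unique_list(ar: Sequence[int], *, return_index: Literal[True],
                return_counts: Literal[True]) -> Tuple[Ints, Ints, Ints]: ...


def unique_list(
    ar: Sequence[int], *, return_index: bool = False, return_counts: bool = False
) -> Union[Ints, Tuple[Ints, ...]]:
    """Sorted unique values, optionally with first indices and/or counts."""
    values = sorted(set(ar))
    extras: List[Ints] = []
    if return_index:
        extras.append([ar.index(v) for v in values])  # first occurrence of each value
    if return_counts:
        extras.append([ar.count(v) for v in values])  # occurrences of each value
    return (values, *extras) if extras else values


# The literal flag picks the overload, so the checker knows this is a 2-tuple.
values, counts = unique_list([3, 1, 3, 2], return_counts=True)
```

Adding the third flag doubles the number of overloads, which is where the eight variants mentioned above come from.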
Hope this helps.