Different behavior between numpy arrays and array scalars


This is a follow-up on this question.

When we use a numpy array with a specific dtype, it preserves that dtype through numeric operations.
For example, adding 1 to a uint32 array wraps the value around to 0 when the array holds the maximum uint32 value, and the array stays of type uint32:

import numpy
a = numpy.array([4294967295], dtype='uint32')
a += 1   # will wrap to 0
print(a)
print(a.dtype)

Output:

[0]
uint32

This behavior does not hold for an array scalar with the same type:

import numpy
a = numpy.uint32(4294967295)
print(a.dtype)
a += 1   # will NOT wrap to 0, and change the scalar type
print(a)
print(a.dtype)

Output:

uint32
4294967296
int64

But according to the array scalars documentation:

The primary advantage of using array scalars is that they preserve the array type

...

Therefore, the use of array scalars ensures identical behaviour between arrays and scalars, irrespective of whether the value is inside an array or not.

(emphasis is mine)

My question:
Why do I observe this difference in behavior between arrays and scalars, despite the documentation explicitly stating that they should behave identically?

Matt Haberland On BEST ANSWER

As mentioned in the comments: yes, this documentation is imprecise at best. I think it is referring to the behavior between scalars of the same type:

import numpy
a = numpy.uint32(4294967295)
print(a.dtype)  # uint32
a += numpy.uint32(1)   # WILL wrap to 0 with warning
print(a)  # 0
print(a.dtype)  # uint32
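
If you need the wraparound regardless of the NumPy version, giving both operands the same explicit type should make the array and the scalar agree. A minimal sketch, assuming you are happy to silence the overflow warning with numpy.errstate; the names MAX_U32, arr_sum and scalar_sum are mine, for illustration only:

```python
import numpy

MAX_U32 = numpy.iinfo(numpy.uint32).max  # largest value a uint32 can hold

# Array case: adding a uint32 operand keeps the uint32 dtype and wraps around.
arr = numpy.array([MAX_U32], dtype=numpy.uint32)
arr_sum = arr + numpy.uint32(1)

# Scalar case: both operands are uint32, so the result stays uint32.
# errstate(over="ignore") is meant to suppress the overflow RuntimeWarning.
with numpy.errstate(over="ignore"):
    scalar_sum = numpy.uint32(MAX_U32) + numpy.uint32(1)

print(arr_sum, arr_sum.dtype)
print(scalar_sum, scalar_sum.dtype)
```

Both results should come out as 0 with dtype uint32, because no Python int takes part in the operation.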

The behavior of your example, however, will change due to NEP 50 in NumPy 2.0. So as frustrating as the old behavior is, there's not much to be done but wait, unless you want to file an issue about backporting a documentation change. As documented in the Migration Guide:

The largest backwards compatibility change of this is that it means that the precision of scalars is now preserved consistently... np.float32(3) + 3. now returns a float32 when it previously returned a float64.

I've confirmed on a NumPy 2 development build that, in your example, the type is preserved as expected:

import numpy
a = numpy.uint32(4294967295)
print(a.dtype)  # uint32
a += 1  # will wrap to 0
print(a)  # 0
print(a.dtype)  # uint32
numpy.__version__  # '2.1.0.dev0+git20240318.6059db1'
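
The float32 case quoted from the Migration Guide can be checked in a similar way. This is only a rough sketch: it assumes the major version is the first dot-separated field of numpy.__version__, and the names result, major and expected are mine, for illustration:

```python
import numpy

# Mixed operation: a float32 array scalar plus a plain Python float.
result = numpy.float32(3) + 3.0

# Assumption: the major version is the first field of the version string.
major = int(numpy.__version__.split(".")[0])

# Per the Migration Guide: float32 under NumPy 2 (NEP 50), float64 before.
expected = numpy.float32 if major >= 2 else numpy.float64

print(numpy.__version__, result.dtype, result.dtype == expected)
```

If the last printed value is False on your install, the promotion rules in effect are not the ones I assumed for that version.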

The second NumPy 2.0 release candidate is out, in case you'd like to try it: https://mail.python.org/archives/list/[email protected]/thread/EGXPH26NYW3YSOFHKPIW2WUH5IK2DC6J/