How to iterate numpy array (of tuples) in list manner


I am getting the error "TypeError: Iterator operand or requested dtype holds references, but the REFS_OK flag was not enabled" when iterating over a numpy array of tuples, as below:

import numpy as np

tmp = np.empty((), dtype=object)
tmp[()] = (0, 0)
arr = np.full(10, tmp, dtype=object)

for a, b in np.nditer(arr):
    print(a, b)

How to fix this?

There are 2 best solutions below

hpaulj On BEST ANSWER
In [71]: tmp = np.empty((), dtype=object)
    ...: tmp[()] = (0, 0)
    ...: arr = np.full(10, tmp, dtype=object)

You don't need nditer to iterate through this array:

In [74]: for i in arr:print(i)
(0, 0)
(0, 0)
(0, 0)
...
(0, 0)

nditer just makes life more complicated, and isn't any faster, especially for something like print. Who or what recommended nditer?
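As a minimal sketch of the same idea applied to the a, b unpacking from the question: a plain for loop over a 1-d object array yields the stored tuples themselves, so they can be unpacked directly. The sample tuples and the pairs list are made up for illustration.

```python
import numpy as np

# Build a 1-d object array with one tuple per slot (sample data, for illustration).
arr = np.empty(3, dtype=object)
arr[:] = [(0, 0), (1, 2), (3, 4)]

# Iterating a 1-d object array yields the stored Python objects,
# so each tuple unpacks directly in the for statement.
pairs = []
for a, b in arr:
    pairs.append((a, b))
    print(a, b)
```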

For that matter, you can simply print the array:

In [75]: arr
Out[75]: 
array([(0, 0), (0, 0), (0, 0), (0, 0), (0, 0), (0, 0), (0, 0), (0, 0),
       (0, 0), (0, 0)], dtype=object)

But let's look at something else - the id of elements of this object dtype array:

In [76]: [id(i) for i in arr]
Out[76]: 
[1562261311040,
 1562261311040,
 ...
 1562261311040,
 1562261311040]

You made an array with 10 references to the same tuple. Is that what you intended? It's np.full that has done that.
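A quick way to confirm the sharing without reading id values by eye is an identity check. This is only a sketch; the names all_same and distinct_ids are mine.

```python
import numpy as np

tmp = np.empty((), dtype=object)
tmp[()] = (0, 0)
arr = np.full(10, tmp, dtype=object)

# Every slot refers to one and the same tuple object.
all_same = all(x is arr[0] for x in arr)

# Equivalent check: the set of ids collapses to a single value.
distinct_ids = len({id(x) for x in arr})

print(all_same, distinct_ids)
```

For immutable tuples the sharing is mostly harmless, but it would matter if the shared object were mutable, such as a list.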

To make a different tuple in each slot, I was going to suggest this list comprehension, but then realized it just produced a 2d array:

In [83]: arr1 = np.array([(0,0) for _ in range(5)]); arr1
Out[83]: 
array([[0, 0],
       [0, 0],
       [0, 0],
       [0, 0],
       [0, 0]])

To make an object dtype array that holds actual (distinct) tuples, we have to do something like:

In [84]: arr1 = np.empty(5, object); arr1
Out[84]: array([None, None, None, None, None], dtype=object)    
In [85]: arr1[:] = [(0,i) for i in range(5)]    
In [86]: arr1
Out[86]: array([(0, 0), (0, 1), (0, 2), (0, 3), (0, 4)], dtype=object)

But that brings us back to the basic question - why make an array of tuples in the first place? What's the point? numpy is best with multidimensional numeric arrays. Object dtype arrays are, in many ways, just glorified (or debased) lists.
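To illustrate the numeric alternative, here is a rough sketch that stores the same pairs as a (5, 2) integer array instead of an object array. Whether this fits depends on what the tuples are actually for; the variable names are mine.

```python
import numpy as np

# The same pairs as in the object array above, stored as a (5, 2) integer array.
pairs = np.array([(0, i) for i in range(5)])

# Iterating a 2-d array yields its rows; each length-2 row unpacks into a, b.
seen = [(int(a), int(b)) for a, b in pairs]

# Columns are available without any Python-level loop.
first, second = pairs[:, 0], pairs[:, 1]
total = int(second.sum())

print(seen, total)
```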

Brian61354270 On

The error message actually tells you the problem and the fix: you're attempting to iterate over an array of reference types (i.e. objects, dtype=object), but you didn't enable np.nditer to iterate over reference types.

To fix that, set the refs_ok flag when calling np.nditer:

for x in np.nditer(arr, flags=["refs_ok"]):
    ...

Do note that there is also a second issue in your code. After making this fix, np.nditer is going to yield zero-dimensional arrays, which can't be unpacked into a, b (even though each zero-dimensional array contains a 2-tuple). Instead, you can extract the tuple using .item() and unpack it in the loop body:

for x in np.nditer(arr, flags=["refs_ok"]):
    a, b = x.item()
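Putting both fixes together, a self-contained sketch might look like the following. The sample tuples and the result list are only there for illustration.

```python
import numpy as np

# Object array holding one tuple per slot (sample data, for illustration).
arr = np.empty(3, dtype=object)
arr[:] = [(0, 0), (1, 2), (3, 4)]

result = []
for x in np.nditer(arr, flags=["refs_ok"]):
    # x is a zero-dimensional object array; .item() returns the stored tuple.
    a, b = x.item()
    result.append((a, b))
    print(a, b)
```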