Indexing numpy array with half-values efficiently


I would like to index a numpy array with integer and half values. This is roughly what I have in mind:

>>> a = HalfIndexedArray([[0,1,2,3],[10,11,12,13],[20,21,22,23]])
>>> print(a[0,0])
0
>>> print(a[0.5,0.5])
11
>>> print(a[0,1+0.5])
3

Only half-values would be used as indices, so I imagine some kind of wrapper could be constructed that stores the values at integer indices and accesses them by multiplying the given fractional index by two.

One could construct some kind of helper function that can get and set the values accordingly; however, it would be even better if the native numpy indexing functionality (slices etc.) could be still used. Additionally, significant overhead is not really acceptable, as these arrays would be used for numerical computations.

It is basically some efficient syntactic sugar that I am looking for: is this something that can be achieved in numpy (or more broadly, in Python)?

Edit: To make it clearer: my goal is to use both the integer and the half indices. The whole point is to make the code cleaner, as the half-indices would correspond to half-steps during time-stepping in a numerical solver (where integer-steps also exist). E.g. see here as a simple example where fractional indices are routinely used in mathematical notation.

Edit #2: Per @matszwecja's suggestion, I tried reimplementing numpy.ndarray's __(get/set)item__ functions like this:

import numpy as np

class half_indexed_ndarray(np.ndarray):
    def __getitem__(self, key):
        print('Getting by key {0}'.format(key))
        if isinstance(key, tuple):
            tuple_double = tuple(int(i*2) for i in key)
            return super(half_indexed_ndarray, self).__getitem__(tuple_double)
        if isinstance(key, int) or isinstance(key, float):
            return super(half_indexed_ndarray, self).__getitem__(int(key * 2))
    
    def __setitem__(self, key, value):
        print('Setting by key {0}'.format(key))
        if isinstance(key, tuple):
            tuple_double = tuple(int(i*2) for i in key)
            return super(half_indexed_ndarray, self).__setitem__(tuple_double, value)
        if isinstance(key, int) or isinstance(key, float):
            return super(half_indexed_ndarray, self).__setitem__(int(key * 2), value)

For simple indexing, this does work:

a = half_indexed_ndarray((3,3))
a[0,0]=1
a[0,0.5]=5
a[0.5,0.5]=505
assert a[0,0]==1
assert a[0,0.5]==5
assert a[0.5,0.5]==505

However, indexing ranges does not work yet, and numpy's behaviour is a bit puzzling. For example:

>>> print(a[0])

Getting by key 0
Getting by key (-3,)

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-239-23821ec76b06> in <module>
----> 1 print(a[0])

~/anaconda3/lib/python3.8/site-packages/numpy/core/arrayprint.py in _array_str_implementation(a, max_line_width, precision, suppress_small, array2string)
   1504         return _guarded_repr_or_str(np.ndarray.__getitem__(a, ()))
   1505 
-> 1506     return array2string(a, max_line_width, precision, suppress_small, ' ', "")
   1507 
   1508 

<several more calls omitted by me>

<ipython-input-233-eb1d93dc766b> in __getitem__(self, key)
      6         if isinstance(key, tuple):
      7             tuple_double = tuple(int(i*2) for i in key)
----> 8             return super(half_indexed_ndarray, self).__getitem__(tuple_double)
      9         if isinstance(key, int) or isinstance(key, float):
     10             return super(half_indexed_ndarray, self).__getitem__(int(key * 2))

IndexError: index -6 is out of bounds for axis 0 with size 3

My interpretation is that numpy translates the tuple index (0,) to (-3,) for some reason and then calls __getitem__ again. However, during both of these calls, the index is multiplied by two, when it only should be multiplied once. Not sure how this could be circumvented.
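A possible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: indexing an ndarray subclass returns another instance of that subclass, so the row returned by a[0] still carries the overridden __getitem__. When print formats that row, NumPy's printing code indexes it again (here with negative positions such as (-3,)), and those internal indices get doubled as well. The sketch below uses a hypothetical LoggedArray class that only records keys, to make the extra calls visible:

```python
import numpy as np

calls = []  # every key that reaches __getitem__ is recorded here


class LoggedArray(np.ndarray):
    """Hypothetical subclass that logs keys and leaves them unchanged."""

    def __getitem__(self, key):
        calls.append(key)
        return super().__getitem__(key)


a = np.zeros((3, 3)).view(LoggedArray)
row = a[0]                 # one explicit call, with key 0
print(type(row).__name__)  # LoggedArray: the row is still the subclass
text = str(row)            # formatting may index `row` again internally
print(calls)               # inspect which keys NumPy itself passed in
```

The exact keys logged during formatting depend on the NumPy version, so they are not listed here.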


There are 2 best solutions below

0
Otto Hanski On

My recommendation would be to handle this on the input side instead of creating a separate class for half-integer indexing. If you multiply each half-integer index by two, it translates trivially to a normal integer index.

This would likely result in cleaner code that is easier to maintain.
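A minimal sketch of the input-side approach, assuming a small helper (the name half is made up for illustration) that doubles the index and rejects anything that is not a multiple of 0.5. The array stays a plain numpy.ndarray, so slicing and all other native indexing keep working:

```python
import numpy as np


def half(index):
    """Translate a half-step index (0, 0.5, 1, ...) to a storage position."""
    doubled = 2 * index
    if doubled != int(doubled):
        raise IndexError(f"{index!r} is not a multiple of 0.5")
    return int(doubled)


a = np.array([[0, 1, 2, 3], [10, 11, 12, 13], [20, 21, 22, 23]])
print(a[half(0), half(0)])      # 0
print(a[half(0.5), half(0.5)])  # 11
print(a[half(0), half(1.5)])    # 3
print(a[half(0), half(0.5):])   # [1 2 3]
```

The translation is a single multiplication per index, so the overhead in a time-stepping loop should be small compared with the array operations themselves.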

However, if you want to go ahead and create a custom iterable, this could be helpful: https://thispointer.com/python-how-to-make-a-class-iterable-create-iterator-class-for-it/

6
matszwecja On

To change the behavior of [], you need to override the __getitem__ method. Since your class would otherwise behave like a standard list, you can do this:

class HalfIndexedList(list):
    def __getitem__(self, key):
        return super().__getitem__(int(key * 2))

a = HalfIndexedList([10,11,12,13,14,15])
for i in range(0, 6):
    print(f"{i/2 = }, {a[i/2] = }") 

(This of course only affects getting items with the [] operator; things like the value returned by a.index will be unaffected.)
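For a NumPy array, the same idea could be sketched as a wrapper that holds the array instead of subclassing it. This is only an illustration under stated assumptions: the class name HalfIndexed and its data attribute are hypothetical, only scalar indices and slices are translated (no index arrays or boolean masks), and slice bounds and steps are interpreted in half-step units. Because the results are plain ndarrays, NumPy's own internal indexing, for example while printing, never goes through the translation a second time:

```python
import numpy as np


class HalfIndexed:
    """Hypothetical wrapper: index 0.5 maps to storage position 1, and so on."""

    def __init__(self, data):
        self.data = np.asarray(data)

    @staticmethod
    def _double(k):
        # Translate one index component from half-steps to storage positions.
        if isinstance(k, slice):
            return slice(*(None if v is None else HalfIndexed._double(v)
                           for v in (k.start, k.stop, k.step)))
        if k is Ellipsis or k is None:
            return k
        doubled = 2 * k
        if doubled != int(doubled):
            raise IndexError(f"{k!r} is not a multiple of 0.5")
        return int(doubled)

    def _translate(self, key):
        if isinstance(key, tuple):
            return tuple(self._double(k) for k in key)
        return self._double(key)

    def __getitem__(self, key):
        return self.data[self._translate(key)]

    def __setitem__(self, key, value):
        self.data[self._translate(key)] = value


a = HalfIndexed([[0, 1, 2, 3], [10, 11, 12, 13], [20, 21, 22, 23]])
print(a[0, 0], a[0.5, 0.5], a[0, 1.5])  # 0 11 3
print(a[0])                             # [0 1 2 3]
print(a[0, 0:1])                        # [0 1]
a[0.5, 0.5] = 505
print(a.data[1, 1])                     # 505
```

Arithmetic would be done on a.data directly, so the wrapper adds cost only where half-indices are actually used.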

However, I agree with @Otto's answer that it's much better to handle this on the input side for cleaner code. Half-indexing doesn't make sense and is really unintuitive.

As a side note, a 2D structure built from nested Python lists is indexed with a[i][j], not a[i, j], because it is effectively a list of lists. NumPy arrays, on the other hand, do support a[i, j].