Efficient shift and roll in numpy without pd.Series


Consider the code below, which produces the desired output:

import numpy as np
import pandas as pd
sumvalues = 2
touchdown = 3
arr = np.array([1, 2, 3, 4, 5, 6, 7])
series = pd.Series(arr)
shifted = pd.Series(np.append(series, np.zeros(touchdown)))
rolled = shifted.rolling(window=sumvalues, min_periods=1).sum().fillna(0).astype(float)[touchdown:]
print(rolled.values)

As you can see, I want to shift my values back by "touchdown" positions and then compute, for every entry, the sum of the last "sumvalues" entries (the current one included).

The issue with the code above is that it is slow; for example, we create a whole Series object just to perform the rolling sum. Is there a smarter (faster) way to achieve the same result?

I tried playing around with numpy's roll function, but it behaves a bit differently (it wraps values around instead of filling with zeros). I also tried shift in pandas, but it seems inefficient.
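
To pin down what the pandas snippet computes, here is a minimal plain-loop sketch of the same operation. The function name shifted_rolling_sum_reference is made up for illustration, and the loop is meant as a readable reference to test faster variants against, not as a fast solution.

```python
import numpy as np

def shifted_rolling_sum_reference(arr, sumvalues, touchdown):
    """Hypothetical reference helper: out[i] sums the window of `sumvalues`
    entries ending at position i + touchdown; entries outside arr count as 0."""
    arr = np.asarray(arr)
    n = arr.size
    out = np.zeros(n, dtype=arr.dtype)
    for i in range(n):
        end = i + touchdown + 1           # one past the last index of the window
        start = max(0, end - sumvalues)   # clip the window at the left edge
        out[i] = arr[start:min(end, n)].sum()  # clip at the right edge; empty slice sums to 0
    return out

print(shifted_rolling_sum_reference(np.array([1, 2, 3, 4, 5, 6, 7]), 2, 3))
```

For the example array this reproduces the values printed by the pandas version, only as integers rather than floats.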

There are 3 best solutions below

0
mozway On

You can use a sliding_window_view with pad:

from numpy.lib.stride_tricks import sliding_window_view as swv

sumvalues = 2
touchdown = 3
arr = np.array([1, 2, 3, 4, 5, 6, 7])

out = swv(np.pad(arr, (sumvalues-1, touchdown)),
          sumvalues).sum(axis=1)[touchdown:]

This can be further optimized by not building the windows that get sliced off at the end:

diff = sumvalues-touchdown
out = swv(np.pad(arr[max(0, 1-diff):], (max(0, diff-1), touchdown)),
          sumvalues).sum(axis=1)

Output:

array([ 7,  9, 11, 13,  7,  0,  0])

Output with sumvalues = 5 ; touchdown = 0:

array([ 1,  3,  6, 10, 15, 20, 25])

Output with sumvalues = 3 ; touchdown = 1:

array([ 3,  6,  9, 12, 15, 18, 13])
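
To make the padding logic concrete, the sketch below wraps the two variants above in functions (the names padded_window_sum and trimmed_window_sum are made up for illustration), prints the intermediate windows for the default parameters, and compares the two variants over a small grid of parameters. It assumes touchdown - sumvalues + 1 does not exceed the array length; outside that range the trimmed variant returns a different number of entries.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view as swv

def padded_window_sum(arr, sumvalues, touchdown):
    # First variant: pad both sides, sum every window, drop the first `touchdown` sums.
    padded = np.pad(arr, (sumvalues - 1, touchdown))
    return swv(padded, sumvalues).sum(axis=1)[touchdown:]

def trimmed_window_sum(arr, sumvalues, touchdown):
    # Second variant: trim or pad the left edge so no discarded window is ever built.
    diff = sumvalues - touchdown
    padded = np.pad(arr[max(0, 1 - diff):], (max(0, diff - 1), touchdown))
    return swv(padded, sumvalues).sum(axis=1)

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# Intermediate steps for sumvalues = 2, touchdown = 3.
padded = np.pad(arr, (2 - 1, 3))
print(padded)          # zero-padded input
print(swv(padded, 2))  # one row per window of length 2

# Compare the two variants on a small grid of parameters.
agree = all(
    np.array_equal(padded_window_sum(arr, s, t), trimmed_window_sum(arr, s, t))
    for s in range(1, 6)
    for t in range(0, 5)
)
print(agree)
```

Note that sliding_window_view only creates a view; the copy happens in np.pad and in the final sum.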
1
Onyambu On

You can use np.convolve after padding the array with zeros:

a1 = np.convolve(np.append(arr, np.zeros(touchdown)), np.ones(sumvalues))
a1[touchdown:touchdown + arr.size]

array([ 7.,  9., 11., 13.,  7.,  0.,  0.])
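
The call above relies on np.convolve defaulting to mode='full', which implicitly zero-pads both edges. As a rough sketch of how that relates to mode='valid' (where all padding has to be explicit), the two computations below should give the same result; the variable names are only for illustration.

```python
import numpy as np

sumvalues = 2
touchdown = 3
arr = np.array([1, 2, 3, 4, 5, 6, 7])
kernel = np.ones(sumvalues)

# 'full' mode pads the left edge implicitly; only the trailing zeros are added by hand.
full = np.convolve(np.append(arr, np.zeros(touchdown)), kernel, mode="full")
from_full = full[touchdown:touchdown + arr.size]

# 'valid' mode adds no implicit padding, so both edges are padded explicitly.
valid = np.convolve(np.pad(arr, (sumvalues - 1, touchdown)), kernel, mode="valid")
from_valid = valid[touchdown:]

print(from_full)
print(from_valid)
```

Because the kernel is a float array, both results come back as floats even for integer input.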

NB: When timing the various methods, the pandas approach that the OP already has seems to outperform the rest when sumvalues and touchdown are large, and it is still on par with the rest when the values are small. I believe the OP should stick with pandas.
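
One reason the windowed and convolution approaches can slow down for large sumvalues is that their work grows with the window length. A cumulative-sum variant avoids that. This technique is not taken from any of the answers here, and the function name shifted_rolling_sum_cumsum is made up for illustration. Whether it actually beats pandas on a given machine would need to be measured; for float input, differences of cumulative sums can also accumulate rounding error.

```python
import numpy as np

def shifted_rolling_sum_cumsum(arr, sumvalues, touchdown):
    """Hypothetical helper: shifted rolling sum via differences of a cumulative sum."""
    arr = np.asarray(arr)
    n = arr.size
    # csum[k] is the sum of the first k entries, so sum(arr[a:b]) == csum[b] - csum[a].
    csum = np.concatenate(([0], np.cumsum(arr)))
    raw_ends = np.arange(n) + touchdown + 1         # one past the last index of each window
    ends = np.minimum(raw_ends, n)                  # clip at the right edge
    starts = np.clip(raw_ends - sumvalues, 0, n)    # clip at both edges
    return csum[ends] - csum[starts]

print(shifted_rolling_sum_cumsum(np.array([1, 2, 3, 4, 5, 6, 7]), 2, 3))
```

The work here is a single cumsum plus two indexed lookups, independent of sumvalues and touchdown.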

1
Soudipta Dutta On

Using padding and convolution

import numpy as np

sumvalues = 2
touchdown = 3
arr = np.array([1, 2, 3, 4, 5, 6, 7])

# Pad with sumvalues - 1 zeros in front (so the first windows are partial sums)
# and with touchdown zeros at the end (these implement the shift)
padded_arr = np.pad(arr, (sumvalues - 1, touchdown), mode='constant')  # [0 1 2 3 4 5 6 7 0 0 0]

# Rolling sum via convolution, then drop the first touchdown sums
rolled = np.convolve(padded_arr, np.ones(sumvalues), mode='valid')[touchdown:]

print(rolled)
"""
[ 7.  9. 11. 13.  7.  0.  0.]
"""