This code aims to compute the rolling average of a signal for a range of window widths (i.e. how many points are averaged), and then, for each window width, calculate the sum of all squared pairwise differences between the averaged values.
import numpy as np

testData = np.random.normal(0, 1, 10000)
windowWidths = np.arange(1, 1000)
sliding_averages = []
diffs = []
for windowWidth in windowWidths:
    # Rolling average using convolution
    sliding_average = np.convolve(testData, np.ones(windowWidth) / windowWidth, mode='valid')
    sliding_averages.append(sliding_average)
    # All pairwise differences
    pairwiseDiffs = sliding_average[:, None] - sliding_average[None, :]
    # Mask to extract only the strict upper triangle of the difference matrix,
    # so each pair is counted once and the zero diagonal is excluded
    mask = np.triu(np.ones_like(pairwiseDiffs, dtype=bool), k=1)
    pairwiseDiffs = pairwiseDiffs[mask]
    pairwiseDiffsSqrd = pairwiseDiffs**2
    diffs.append(np.sum(pairwiseDiffsSqrd))
This is intended as one component in reproducing the computation described in the "Calculation" section of a paper (screenshot of that section omitted here).
My question is whether there is a more efficient way to run this calculation.
I have tried vectorizing further and replacing the convolution steps with other approaches, but I am not confident in the results.
While I can't fix any further bugs you might have found during profiling, I can answer your original question regarding the performance issues.
Looking at your code, it seems you're interested in the sum of squared differences, which is closely related to the mathematical definition of variance. If we rearrange how the variance is calculated, we arrive at a form where the squares are computed first and the difference is taken only afterwards:

$\mathrm{Var}(x) = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2$
Now looking back at your code, the entire pairwise-difference block can be replaced with a single variance call. A sketch using your variable names, with the original block left commented out for comparison:
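for windowWidth in windowWidths:
    sliding_average = np.convolve(testData, np.ones(windowWidth) / windowWidth, mode='valid')
    # pairwiseDiffs = sliding_average[:, None] - sliding_average[None, :]
    # mask = np.triu(np.ones_like(pairwiseDiffs, dtype=bool), k=1)
    # pairwiseDiffs = pairwiseDiffs[mask]
    # pairwiseDiffsSqrd = pairwiseDiffs**2
    # diffs.append(np.sum(pairwiseDiffsSqrd))
    diffs.append(np.var(sliding_average))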
The commented-out part gives the same total whether we take the differences first or the squares first, because the sum of squared pairwise differences can be rewritten in terms of sums of squares:

$\sum_{i<j}\left(x_i - x_j\right)^2 = n\sum_i x_i^2 - \left(\sum_i x_i\right)^2 = n^2\,\mathrm{Var}(x)$
However, we cannot forget that the population variance formula divides the sum by the number of observations, and that factor is still completely missing from our code. Per the identity above, the pairwise sum is $n^2$ times the variance, so we multiply that back in; a sketch of the resulting loop:
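for windowWidth in windowWidths:
    sliding_average = np.convolve(testData, np.ones(windowWidth) / windowWidth, mode='valid')
    n = len(sliding_average)
    # sum of squared pairwise differences == n**2 * population variance
    diffs.append(np.var(sliding_average) * n**2)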
As expected, with a fixed seed this returns the same diffs list, with sub-second execution time on my machine. Avoiding masking and large intermediate matrices is often possible when working with statistical calculations, and it is usually among the fastest ways to compute the result.
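For reference, here is a quick self-contained way to check the equivalence on a smaller input (the variable names are just illustrative):

import numpy as np

np.random.seed(0)
x = np.random.normal(0, 1, 500)  # small enough that the O(n^2) version stays cheap

# Original approach: explicit pairwise differences over the upper triangle
pairwiseDiffs = x[:, None] - x[None, :]
mask = np.triu(np.ones_like(pairwiseDiffs, dtype=bool), k=1)
bruteForce = np.sum(pairwiseDiffs[mask] ** 2)

# Variance-based shortcut
shortcut = np.var(x) * len(x) ** 2

print(np.isclose(bruteForce, shortcut))  # prints True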