I want to write a NumPy gufunc Python extension using the NumPy C API that takes two arrays of arbitrary dimensionality, takes the mean across one dimension of each, and then subtracts one result from the other. The simplest case of this is two 1-d arrays with the signature '(i),(j)->()', which returns a scalar.
Any additional dimensions can be assumed to be shared core dimensions, and therefore the same in both inputs. For example, a 2-d signature might be '(n,i),(n,j)->(n)', where the shared dimension n is axis=0 and the function would simply loop over that axis. Here is what I have so far:
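To pin down what I mean, here is a pure-NumPy sketch of the behavior I'm after (np.nanmean over the last axis, matching the NaN-skipping in my C code; mean_diff_ref is just a name for this reference version):

```python
import numpy as np

def mean_diff_ref(x, y):
    """Pure-NumPy reference for the desired gufunc: nan-aware mean over
    the last axis of each input, then the difference.  Any leading axes
    are loop axes and must match between x and y."""
    return np.nanmean(x, axis=-1) - np.nanmean(y, axis=-1)

print(mean_diff_ref([1., 2., 3., 4.], [2., 7., 29., 3.]))   # -7.75
```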
#include <Python.h>
#include <numpy/arrayobject.h>
#include <numpy/ufuncobject.h>
#include <math.h>
static void mean_diff(char **args,
                      const npy_intp *dimensions,
                      const npy_intp *steps,
                      void *extra) {
    npy_intp i;
    npy_intp n1 = 0, n2 = 0;
    double s1 = 0.0, s2 = 0.0;
    char *in1 = args[0], *in2 = args[1], *out = args[2];
    npy_intp len1 = dimensions[0], len2 = dimensions[1];
    for (i = 0; i < len1; i++) {
        double val = *((double *)in1);
        if (!isnan(val)) {
            s1 += val;
            n1++;
        }
        in1 += steps[0];
    }
    for (i = 0; i < len2; i++) {
        double val = *((double *)in2);
        if (!isnan(val)) {
            s2 += val;
            n2++;
        }
        in2 += steps[1];
    }
    double mean1 = (n1 > 0) ? s1 / n1 : 0.0;
    double mean2 = (n2 > 0) ? s2 / n2 : 0.0;
    *((double *)out) = mean1 - mean2;
}
The problem I have is that even a simple pair of 1-d arrays only ever uses the first element of each array. For example:
>>> mean_diff([1.,2.,3.,4.], [2.,7.,29.,3.])
-1.0
>>> np.mean([1.,2.,3.,4.]) - np.mean([2.,7.,29.,3.])
-7.75
I assume this has to do with grabbing the wrong dimensions or something, but I can't figure out what.
The first thing to know is that your function is expected to compute the result multiple times in each call. The number of computations is stored in dimensions[0]. You specified the gufunc signature (i),(j)->(). When the function is called, dimensions[1] and dimensions[2] will hold i and j of the gufunc signature (i.e. the lengths of the 1-d inputs of the core operation).
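For example, consider a call like the following (a sketch: my_extension and mean_diff name the compiled module and gufunc, which only exist once your extension is built and registered; np.nanmean is used here just to show the per-row values the gufunc should return):

```python
import numpy as np
# from my_extension import mean_diff   # hypothetical compiled gufunc

x = np.array([[0., 1., 2., 3., 4.],
              [0., 0., 0., 1., 1.5]])   # shape (2, 5)
y = np.array([[-1., -3., 7.],
              [2., 2., 2.]])            # shape (2, 3)

# mean_diff(x, y) should produce one result per row pair, the same as:
expected = np.nanmean(x, axis=1) - np.nanmean(y, axis=1)
print(expected)   # per-row results: 1.0 and -1.5
```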
Suppose x is the 2-d array [[0, 1, 2, 3, 4], [0, 0, 0, 1, 1.5]] and y is [[-1, -3, 7], [2, 2, 2]]. Then your function will be called once, with dimensions[0] set to 2, and dimensions[1] and dimensions[2] will be 5 and 3, respectively. Your outer loop will compute the result for the pair ([0, 1, 2, 3, 4], [-1, -3, 7]) and then for ([0, 0, 0, 1, 1.5], [2, 2, 2]).

args[0] and args[1] will point to the first elements of the input arrays, and args[2] will point to the first element of the output array. Your code does in1 = args[0], in2 = args[1] and out = args[2], so it must compute the result for the data pointed to by in1 and in2, and store the result in *out. Then it must increment in1, in2 and out to point to the next set of data and do the computation again. So you'll implement an outer loop that performs the computation for each input pair and increments the pointers to move to the next set of inputs and output. The amounts by which to increment in1, in2 and out are stored in steps[0], steps[1] and steps[2], respectively. (For example, steps[0] is the number of bytes to jump in memory to get from the first row of x to the second row.)

Then, within this outer loop, you do the computation on the current pair of arrays from x and y (pointed to by in1 and in2). In general, you can't assume that the elements of the arrays are contiguous, so when computing the means, you have to increment the data pointers by the appropriate step sizes; in this case, those are steps[3] and steps[4].

By the way, don't be surprised if a gufunc implemented like this is actually slower than computing the means with the mean method and subtracting the results. NumPy is continually improving its performance, and with the recent SIMD implementations, naive implementations of ufuncs and gufuncs can be slower than the composition of a few calls to optimized NumPy methods.

Here's one way you could implement this. I have commented the variables that I pull out of dimensions and steps to help clarify the terse description above.
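A sketch along those lines follows. Two hedges: npy_intp is stubbed with a typedef so the sketch compiles stand-alone (in the real extension, drop the typedef and keep the Python.h/NumPy includes from the question, and register the loop with PyUFunc_FromFuncAndDataAndSignature as usual), and the 0.0 result for an all-NaN slice simply mirrors the question's code.

```c
#include <math.h>
#include <stddef.h>

/* Stand-in for the NumPy headers so this sketch compiles on its own.
 * In the actual extension, delete this typedef and include
 * <numpy/arrayobject.h> etc., as in the question. */
typedef ptrdiff_t npy_intp;

static void mean_diff(char **args,
                      const npy_intp *dimensions,
                      const npy_intp *steps,
                      void *extra) {
    npy_intp nloops = dimensions[0];  /* number of outer-loop iterations */
    npy_intp len1   = dimensions[1];  /* i: length of each slice of x    */
    npy_intp len2   = dimensions[2];  /* j: length of each slice of y    */

    char *in1 = args[0];              /* pointer to first slice of x     */
    char *in2 = args[1];              /* pointer to first slice of y     */
    char *out = args[2];              /* pointer to first output element */

    npy_intp in1_outer = steps[0];    /* jump from one slice of x to the next */
    npy_intp in2_outer = steps[1];    /* jump from one slice of y to the next */
    npy_intp out_outer = steps[2];    /* jump from one output to the next     */
    npy_intp in1_inner = steps[3];    /* jump between elements within a slice of x */
    npy_intp in2_inner = steps[4];    /* jump between elements within a slice of y */

    (void)extra;                      /* unused */

    for (npy_intp k = 0; k < nloops; k++) {
        double s1 = 0.0, s2 = 0.0;
        npy_intp n1 = 0, n2 = 0;
        char *p = in1;
        for (npy_intp i = 0; i < len1; i++, p += in1_inner) {
            double val = *(double *)p;
            if (!isnan(val)) { s1 += val; n1++; }
        }
        p = in2;
        for (npy_intp i = 0; i < len2; i++, p += in2_inner) {
            double val = *(double *)p;
            if (!isnan(val)) { s2 += val; n2++; }
        }
        double mean1 = (n1 > 0) ? s1 / n1 : 0.0;
        double mean2 = (n2 > 0) ? s2 / n2 : 0.0;
        *(double *)out = mean1 - mean2;

        in1 += in1_outer;             /* advance to the next pair of slices */
        in2 += in2_outer;
        out += out_outer;
    }
}
```

With this loop registered under the signature '(i),(j)->()', a call with inputs of shape (2, 5) and (2, 3) produces the two row-wise results described above.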