I defined a C-level function that takes a typed memoryview so that it can work with a NumPy array, but I cannot pass it a temporary float array declared with cdef inside another nogil function. Cython rejects the call to base_func with:
Operation not allowed without gil
How can I modify the C function base_func so that it works with both a numpy.array and a cdef C array?
import numpy as np

cdef void base_func(float[:] vec1) noexcept nogil:
    return

def python_entry(vec: np.ndarray):
    cdef float[:] vec_view = vec
    base_func(vec_view)

cdef void cfunc(float[:] vec2) noexcept nogil:
    cdef float[10] tmp_vec
    base_func(tmp_vec)
Error Message
cdef void cfunc(float[:] vec2) noexcept nogil:
    cdef float[10] tmp_vec
    base_func(tmp_vec)
              ^
------------------------------------------------------------
c_test.pyx:21:14: Operation not allowed without gil
Project Idea
I want to cythonize the GROUP BY operation on a 1D or 2D np.ndarray. The Python interface will look like group_mean(data, group), where group_mean = FuncWrapper(c_group_mean). That way I can write other C functions, such as c_group_std, to implement further Python interfaces such as group_std.
Problems and Resolutions:
- Shape alignment and NaN input: data is sometimes shaped (m, n) while group is shaped (n,), so I have to align them and use np.where to set group to -1 wherever data is NaN.
- Working on 2D arrays: the C function only works on 1D input, so for 2D data I use prange to process the rows in parallel, which requires nogil mode.
- Differently shaped results: for (m, n) shaped input data, the output is shaped (m, group_number) for statistical functions such as group_mean, and (m, n) for element-wise operations such as group_demean (subtract the corresponding group mean), so I have a different FuncWrapper for each case.
- A Python-object FuncWrapper can't take a C function pointer as a parameter: I create a CFuncWrapper (cdef class) to wrap C functions like c_group_mean, according to this post.
- Initializing temporary C arrays and reusing C functions: for example, in c_group_mean I need two arrays to accumulate the sum and the count for each group. So my C function template looks like the following, which raises an error on Windows 11 but works on macOS. I also can't call c_mean from c_std and pass my temporary float array mean as the result parameter.
cdef void c_mean(float[:] data, int[:] group, float[:] result, const int length, const int group_num):
    cdef float[group_num] sum_up
    cdef int[group_num] count
So my final code should be:
group_mean = FuncWrapper(CFuncWrapper.bind_cfunc_group_num(c_group_mean))
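To illustrate the shape-alignment and NaN-handling step listed under Problems and Resolutions, here is a minimal NumPy sketch. The helper name `align_and_mask` and the float32/int32 dtypes are assumptions made for this example and do not come from the original code. It broadcasts a (n,) `group` to the (m, n) shape of `data` and sets the label to -1 wherever `data` is NaN.

```python
import numpy as np


def align_and_mask(data, group):
    """Hypothetical helper: align `group` with `data` and mark NaN cells as -1.

    Returns a 2D float32 copy/view of `data` and an int32 label array of the
    same shape, so that a row-wise kernel can skip every label below zero.
    """
    # Treat 1D input as a single row so later code only handles the 2D case.
    data = np.atleast_2d(np.asarray(data, dtype=np.float32))
    group = np.asarray(group, dtype=np.int32)

    # Broadcasting raises ValueError if the shapes cannot be aligned,
    # e.g. a (n + 1,) group against (m, n) data.
    group_2d = np.broadcast_to(group, data.shape)

    # np.where builds a new array, so the read-only broadcast view is not modified.
    masked = np.where(np.isnan(data), -1, group_2d).astype(np.int32)
    return data, masked
```

Returning a fresh, writable label array also means each row of `masked` can be handed to a row-wise kernel without touching the caller's `group`.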
Further Thinking
- Change every float[group_num] sum_up to float* sum_up = <float*>malloc(...) so that it works on Windows 11?
- Change the C function template so that both a memoryview slice and a float* can be passed in? Or change the architecture to:
1. Python Interface: group_mean
2. FuncWrapper: takes a base function, such as c_group_mean
3. CFuncWrapper: makes sure the Python-object FuncWrapper can accept cdef functions
4. TODO: convert MemoryviewSlice to float*?
5. base c functions: c_group_mean, c_group_std
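
The layering listed above can be prototyped in pure Python before any Cython is written. The sketch below is only an analogy under stated assumptions: `py_group_mean` and `py_group_demean` are hypothetical stand-ins for `c_group_mean` and a demean kernel, the `reduces` flag on `FuncWrapper` is invented here to cover the two result shapes, and the inputs are assumed to be already aligned, with NaN positions labelled -1. Each kernel takes `(data, group, result, length, group_num)` and fills `result` in place, so one kernel can hand a scratch buffer to another, which is the kind of reuse described for c_std calling c_mean.

```python
import numpy as np


def py_group_mean(data, group, result, length, group_num):
    """Stand-in for c_group_mean: writes one mean per group into `result`."""
    sums = [0.0] * group_num      # temporaries a C kernel would allocate itself
    counts = [0] * group_num
    for i in range(length):
        g = int(group[i])
        if g >= 0:                # labels below zero mark cells to skip
            sums[g] += float(data[i])
            counts[g] += 1
    for g in range(group_num):
        result[g] = sums[g] / counts[g] if counts[g] else float("nan")


def py_group_demean(data, group, result, length, group_num):
    """Stand-in for a demean kernel that reuses py_group_mean."""
    scratch = [0.0] * group_num   # local buffer passed as the `result` argument
    py_group_mean(data, group, scratch, length, group_num)
    for i in range(length):
        g = int(group[i])
        result[i] = float(data[i]) - scratch[g] if g >= 0 else float("nan")


class FuncWrapper:
    """Hypothetical wrapper that applies a 1D kernel to every row."""

    def __init__(self, kernel, reduces):
        self.kernel = kernel
        self.reduces = reduces    # True: (m, group_num) output, False: (m, n)

    def __call__(self, data, group):
        data = np.atleast_2d(np.asarray(data, dtype=np.float32))
        group = np.broadcast_to(np.asarray(group, dtype=np.int32), data.shape)
        m, n = data.shape
        group_num = int(group.max()) + 1
        result = np.empty((m, group_num if self.reduces else n), dtype=np.float32)
        for row in range(m):      # a Cython version would use prange here
            self.kernel(data[row], group[row], result[row], n, group_num)
        return result


group_mean = FuncWrapper(py_group_mean, reduces=True)
group_demean = FuncWrapper(py_group_demean, reduces=False)
```

Because the kernels only index their arguments, the same call works for a NumPy row and for a plain scratch list. In the C layer, a pointer-plus-length signature would presumably play that role, which is what the TODO about converting a memoryview slice to float* points at.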