Since Numpy is implemented in the C language for efficiency reasons, I would like to find out how exactly Numpy calls a C function, like np.array, from Python, as in which part of the Numpy source code is responsible for the calling?
I have tried to follow through the source code, the C implementation of multiarraymodule.c in the numpy/_core/src/multiarray directory and I focused specifically on: PyMethodDef array_module_methods[ ] to get an overview of the functionality the module will provide when imported to python and I also focused on:
PyModuleDef moduledef = {
PyModuleDef_HEAD_INIT,
"_multiarray_umath",
NULL,
-1,
array_module_methods,
NULL,
NULL,
NULL,
NULL
}
which is the module name during its initialisation. I have also observed that: _multiarray_umath is imported into: numpy/_core/multiarray.py, the python module defined functions such as:
@array_function_from_c_func_and_dispatcher(_multiarray_umath.empty_like)
def empty_like(
prototype, dtype=None, order=None, subok=None, shape=None, *,device=None)
followed by a lot of documentation and examples, then a return of:
return (prototype,)
I thought this is where C functions implementation must be called from, like of empty_like() function?
They are called from your code, when you write
np.array(). If you try to disassemble anp.arraycall:You get
Just a nice function call, that's all. Now if we try to disassemble the function itself, if it was Python, we would get its bytecode, as above; but we don't:
So Python treats
np.arrayjust like it treats, say,math.sin: it is some binary code in a library, somewhere, that Python loads. Specifically, on my system it is in(On a Windows system, it would be a
.dllfile.) This file is the meat of the packagenumpy.core._multiarray_umath, and it has the functionarraydefined:How does Python know to associate the function
arraythere with the corresponding C code?_multiarray_umathmodule was defined usingPyModule_Createfunction using the definitionhereas having certain methods, namelyarray_module_methods, and among them:referencing the
array_arrayfunction defined here. When the module is imported, the shared library with the appropriate name at the appropriate place inPYTHONPATHis found and dynamically linked, and its functions become accessible to the program. No other Python code is involved in the call itself:numpy.core._multiarray_umath.array()directly calls the binary implementation, just likemath.sin(0)does.Tl;dr: module
numpy.core._multiarray_umathis defined witharrayattribute bound to the C functionarray_arrayby a shared library namednumpy/core/_multiarray_umath...inPYTHONPATH.All of this is described in detail in Extending Python with C or C++ and Building C and C++ Extensions.
We can verify that this is indeed the
numpy.arraywe know and love:Now, how exactly it ends up being also assigned to
numpy.arrayis a series of convoluted imports, getattrs, globals assignments and whatnot, which I don't want to try to trace at the moment, but it has nothing to do with how it is called, nor does it have anything to do with specifically C functions.