My code is simply this:
```python
import numpy as np
from numba import njit

@njit()
def corr(arr: np.ndarray):
    return np.corrcoef(arr)

arr = np.random.random((10000, 10000))
corr_matrix = corr(arr)
```
It takes around 50 seconds to finish on my computer, and just 18 seconds without `@njit`. If I increase the size to 30,000, the function takes forever.
Is there a way to improve the performance of numba's `@njit` on `np.corrcoef` in a situation like this, or is `np.corrcoef` already as fast as it can be? I think I'm misunderstanding numba here, because it's much slower with `@njit` than without.
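For what it's worth, here is a minimal timing sketch that warms the function up on a small array first, so numba's one-time JIT compilation isn't counted in the measurement (whether compilation accounts for part of the 50 seconds is an assumption worth checking):

```python
import time

import numpy as np
from numba import njit

@njit()
def corr(arr: np.ndarray):
    return np.corrcoef(arr)

# Warm-up call on a tiny array with the same dtype/dimensionality,
# so JIT compilation happens here rather than inside the timed call.
corr(np.random.random((10, 10)))

arr = np.random.random((10000, 10000))
start = time.perf_counter()
corr_matrix = corr(arr)
print(f"njit corrcoef: {time.perf_counter() - start:.1f} s")
```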
Based on comments, the OP seemed interested in a speedup using a GPU, so I tested CuPy's `corrcoef` on a Colab T4 GPU instance. If we're concerned about the possibility of caching or lazy evaluation, we can generate new data, run once, and even include a `print` statement for good measure. I also checked the NumPy calculations on a regular CPU instance with both float64 and float32, and the CuPy calculations on the T4 were faster.
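A sketch of this kind of comparison (the array size and timing harness here are illustrative, not the exact script used):

```python
import time

import numpy as np
import cupy as cp

n = 10_000

# Generate fresh data each run to rule out caching effects.
arr_gpu = cp.random.random((n, n), dtype=cp.float32)

start = time.perf_counter()
result = cp.corrcoef(arr_gpu)
cp.cuda.Device().synchronize()  # wait for the GPU kernels to finish
print(result[0, 1])             # force the value out, ruling out lazy evaluation
print(f"CuPy float32 on T4: {time.perf_counter() - start:.2f} s")

# CPU baseline with NumPy for comparison.
arr_cpu = np.random.random((n, n)).astype(np.float32)
start = time.perf_counter()
result = np.corrcoef(arr_cpu)
print(result[0, 1])
print(f"NumPy float32 on CPU: {time.perf_counter() - start:.2f} s")
```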
Results may vary by GPU, but I figured this comparison was fair since anyone can use Colab (for some amount of time) for free. I don't know the constraints of the OP, but use of a GPU - especially with `float32` arithmetic - seems to be a way to speed up correlation calculation for large arrays.
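For example, if the data starts as float64 on the CPU, the transfer and cast can be done in one step with `cp.asarray` (whether float32 precision is acceptable depends on the data, so treat this as a sketch):

```python
import numpy as np
import cupy as cp

arr = np.random.random((10000, 10000))       # float64 on the CPU
arr_gpu = cp.asarray(arr, dtype=cp.float32)  # transfer to GPU and cast to float32
corr_matrix = cp.corrcoef(arr_gpu)
```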