I am trying to do the following using NumPy. Because the size of aa is large, the NumPy version is slow. I tried to speed it up with numba, which gave some improvement, but I would like to speed it up further, because this runs inside another loop. Any advice is much appreciated!
Using numpy:
import numpy as np

def get_prob(aa):
    # distance of each entry from its column-wise maximum (over axis 1)
    allmax = aa.max(axis=1)[:, None]
    findmax = aa - allmax
    # tie-breaking: if rows 1 and 2 are both at the max, demote row 1
    mask = (findmax[:, 1, :] == 0) & (findmax[:, 2, :] == 0)
    findmax[:, 1, :][mask] = -1
    # if rows 0 and 1 are both at the max, demote row 0
    mask = (findmax[:, 0, :] == 0) & (findmax[:, 1, :] == 0)
    findmax[:, 0, :][mask] = -1
    # if all three rows are still at the max, keep only row 2
    mask = (findmax[:, 0, :] == 0) & (findmax[:, 1, :] == 0) & (findmax[:, 2, :] == 0)
    findmax[:, 0, :][mask] = -1
    findmax[:, 1, :][mask] = -1
    # one-hot indicator of the (tie-broken) maximum, with axes reordered
    p = np.where(findmax < 0, 0.0, 1.0).transpose(0, 2, 1)
    return p
Using numba:
import numba

@numba.jit(nopython=True)
def get_prob_nb(aa, num_params, num_action):
    p = np.zeros_like(aa)
    for i in range(num_params):
        for j in range(num_action):
            a1 = aa[i, 0, j]
            a2 = aa[i, 1, j]
            a3 = aa[i, 2, j]
            # one-hot of the maximum; ties go to the higher index
            if a1 > a2 and a1 > a3:
                p[i, 0, j] = 1.0
            elif a2 >= a1 and a2 > a3:
                p[i, 1, j] = 1.0
            elif a3 >= a2 and a3 >= a1:
                p[i, 2, j] = 1.0
    p = p.transpose(0, 2, 1)
    return p
import time

rng = np.random.default_rng()
aa = rng.uniform(0.0, 1.0, 9000000)
aa = aa.reshape(1000, 3, 3000)
start = time.time()
get_prob_nb(aa, 1000, 3000)
print("elapsed", time.time() - start)
There's a surprisingly simple way to parallelize your numba call:
Running your test, I saw a time improvement of ~30% versus get_prob_nb.