I have two ndarrays with shapes:
A = (32,512,640)
B = (4,512)
I need to multiply A and B such that I get a new ndarray:
C = (4,32,512,640)
Another way to think of it is that each row of B is multiplied along axis=-2 of A, which results in a new (1,32,512,640) cube. The rows of B can be looped over, forming one (1,32,512,640) cube each, and C can then be built up from those cubes with np.concatenate or np.vstack, such as:
# Sample inputs, where the dimensions aren't necessarily known
a = np.arange(32*512*465, dtype='f4').reshape((32,512,465))
b = np.ones((4,512), dtype='f4')
# Using a loop
d = []
for row in b:
    d.append(np.expand_dims(row[None,:,None]*a, axis=0))
# Or using list comprehension
d = [np.expand_dims(row[None,:,None]*a,axis=0) for row in b]
# Stacking the final list
result = np.vstack(d)
But I am wondering if it's possible to use something like np.einsum or np.tensordot to get this vectorized all in one line. I'm still learning how to use those two methods, so I'm not sure if it's appropriate here.
Thanks!
We can leverage broadcasting after extending the dimensions of B with None/np.newaxis, that is, by multiplying A with B[:,None,:,None].

With einsum, the same product can also be written as a single call. There's no sum-reduction happening here, so einsum won't be any better than the explicit-broadcasting one. But since we are looking for a Pythonic solution, it could be used once we get past its string notation.

Let's get some timings to finish things off.
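As a minimal sketch of the two one-liners described above, the snippet below checks both against the loop-and-stack approach from the question and then times them. The array sizes are reduced and the variable names (C_bcast, C_einsum, C_loop) are illustrative assumptions, and the einsum subscript string is one way to spell this product, not necessarily the only one.

```python
import timeit
import numpy as np

# Reduced sizes (an assumption, to keep the sketch fast);
# the question uses A of shape (32,512,640) and B of shape (4,512).
A = np.arange(3 * 5 * 7, dtype='f4').reshape((3, 5, 7))
B = np.arange(4 * 5, dtype='f4').reshape((4, 5))

# Broadcasting: B[:,None,:,None] has shape (4,1,5,1) and A is treated
# as (1,3,5,7), so the product has shape (4,3,5,7).
C_bcast = A * B[:, None, :, None]

# einsum: every input index (i, j, k, l) also appears in the output,
# so nothing is summed; this is a plain elementwise product.
C_einsum = np.einsum('ijk,lj->lijk', A, B)

# Reference result: the loop-and-stack approach from the question.
C_loop = np.vstack([np.expand_dims(row[None, :, None] * A, axis=0) for row in B])

print(C_bcast.shape)
print(np.allclose(C_bcast, C_loop), np.allclose(C_einsum, C_loop))

# Rough timing harness; the numbers depend on the machine and array sizes.
t_bcast = timeit.timeit(lambda: A * B[:, None, :, None], number=100)
t_einsum = timeit.timeit(lambda: np.einsum('ijk,lj->lijk', A, B), number=100)
print(f"broadcasting: {t_bcast:.6f}s  einsum: {t_einsum:.6f}s")
```

To compare on realistic data, swap in the shapes from the question and lower the number argument, since each result then holds 4*32*512*640 float32 values.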
Leverage multi-core

We could leverage the multi-core capability of numexpr, which is suited for arithmetic operations and large data, and thus gain some performance boost here. In one line, with numexpr imported as ne:

ne.evaluate('A*B4D',{'A':A,'B4D':B[:,None,:,None]})

See the related post on how to control the multi-core functionality.
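A hedged sketch of the numexpr variant follows, including one way to control how many threads it uses. The reduced array sizes, the thread count of 2, and the fallback to plain NumPy when numexpr is not installed are all assumptions made for illustration.

```python
import numpy as np

try:
    import numexpr as ne  # optional third-party package
except ImportError:
    ne = None  # fall back to plain NumPy below

# Reduced sizes (an assumption); the question uses (32,512,640) and (4,512).
A = np.arange(3 * 5 * 7, dtype='f4').reshape((3, 5, 7))
B = np.arange(4 * 5, dtype='f4').reshape((4, 5))
B4D = B[:, None, :, None]  # shape (4,1,5,1), broadcasts against A

if ne is not None:
    # set_num_threads returns the previous setting, so it can be restored.
    previous = ne.set_num_threads(2)
    C_ne = ne.evaluate('A*B4D', {'A': A, 'B4D': B4D})
    ne.set_num_threads(previous)
else:
    C_ne = A * B4D

print(C_ne.shape)
```

Any gain from the extra threads is likely to show up only on large inputs such as those in the question; on arrays this small the overhead probably dominates.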