Sum each column of a sparse matrix multiplied by a vector

52 Views Asked by Eric Johnson At 22 March 2024 at 17:37

I have a large SciPy sparse matrix X.
I also have a vector y whose number of elements matches the number of rows of X.

I want to calculate the sum of each column after it has been multiplied elementwise by y.
If X were dense, this would be equivalent to np.sum(X * y[:, None], axis=0), with y as a 1-D array broadcast down the rows.

How can it be done efficiently for a sparse matrix?

I tried:

import numpy as np

z = np.zeros(X.shape[1])

for i in range(X.shape[1]):
    # densify column i (CSR/CSC input), then sum it weighted by y
    z[i] = np.sum(X[:, i].toarray().ravel() * y)

Yet it was terribly slow.
Is there a better way to achieve this?


Onyambu On 22 March 2024 at 17:56 BEST ANSWER

Use the dot product that sparse matrices provide:

X.transpose().dot(y)

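This should be much faster: it is a single sparse matrix-vector product that only touches the stored nonzero entries, instead of a Python loop that densifies one column at a time.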
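As a rough sanity check, here is a minimal sketch that compares the sparse product with the dense column sums. The matrix size, density, and seed are arbitrary illustration values, not taken from the question.

```python
import numpy as np
from scipy import sparse

# Hypothetical test data: sizes, density and seed are arbitrary choices.
rng = np.random.default_rng(0)
X = sparse.random(1000, 50, density=0.01, format="csr", random_state=0)
y = rng.standard_normal(X.shape[0])

# Sparse matrix-vector product: z[j] = sum_i X[i, j] * y[i]
z = X.transpose().dot(y)

# Dense reference, with y broadcast down the rows of X
z_dense = np.sum(X.toarray() * y[:, None], axis=0)

print(z.shape)                  # (50,)
print(np.allclose(z, z_dense))  # True
```

The result is a plain 1-D NumPy array with one entry per column of X, so no dense copy of X is needed outside of the check itself.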

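Also note that indexing a sparse matrix as in your example is not supported by every format: CSR, CSC and LIL allow X[:, i], while COO traditionally does not. The getcol method extracts a column regardless of format.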
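If individual columns are still needed, one option (an alternative to getcol, shown here only as a sketch) is to convert to CSC first, since that format supports column slicing. The 3x2 matrix and the values of y below are made up for illustration.

```python
import numpy as np
from scipy import sparse

# Hypothetical 3x2 example matrix, stored in COO format
X_coo = sparse.coo_matrix(np.array([[1.0, 0.0],
                                    [0.0, 2.0],
                                    [3.0, 0.0]]))
y = np.array([1.0, 10.0, 100.0])

# COO traditionally does not support X[:, i]; CSC slices columns cheaply
X_csc = X_coo.tocsc()
col0 = X_csc[:, 0].toarray().ravel()  # dense 1-D copy of column 0

# Weighted sum of column 0 only
s0 = np.sum(col0 * y)

# All columns at once via the sparse product
z = X_csc.T.dot(y)

print(s0)  # 301.0
print(z)   # [301.  20.]
```

For the full set of column sums the single product remains the better choice; per-column slicing is only worthwhile when a few columns are needed.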