Why does the dot product of two arrays produce a scalar value, but the dot product with a transposed array produce a matrix?


In numpy, I realized the following two calculations produce different results.

a = np.array([1,2,3,4])
b = np.array([1,2,3,4])
dot_product1 = np.dot(a, b)    # <--- I think this should be an error
print(dot_product1)

a = np.array([1,2,3,4])
b = np.array([1,2,3,4]).reshape(-1,1)
dot_product2 = np.dot(a, b)
print(dot_product2)

dot_product1 is the scalar value 30, but dot_product2 is a 1x1 matrix, [30].

My understanding of linear algebra is that we cannot take the dot product of a 1 x 4 matrix with another 1 x 4 matrix. I expected the third line to fail, but it succeeds.

The second part of the code multiplies a 1 x 4 matrix by a 4 x 1 matrix, which produces a 1 x 1 matrix. This is what I expected.

Can someone help explain the difference between these two calculations?
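A quick check (my own sketch, not part of the original question) of the linear-algebra intuition: with genuinely 2-D (1, 4) arrays, np.dot does raise the expected error, and (1, 4) times (4, 1) gives a true 1x1 matrix.

```python
import numpy as np

a2 = np.array([[1, 2, 3, 4]])   # shape (1, 4) -- a real 2-D row matrix
b2 = np.array([[1, 2, 3, 4]])   # shape (1, 4)

try:
    np.dot(a2, b2)              # (1,4) . (1,4): inner dimensions don't align
except ValueError as e:
    print("ValueError:", e)

print(np.dot(a2, b2.T))         # (1,4) . (4,1) -> [[30]], shape (1, 1)
```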


Answered by hpaulj:

Did you read the np.dot docs? Pay attention to what they say about 1-D arguments.

In [209]: a = np.array([1,2,3,4])
     ...: b = np.array([1,2,3,4])
     ...: dot_product1 = np.dot(a, b)
     ...: print(dot_product1, type(dot_product1))
     ...: 
     ...: a = np.array([1,2,3,4])
     ...: b = np.array([1,2,3,4]).reshape(-1,1)
     ...: dot_product2 = np.dot(a, b)
     ...: print(dot_product2, type(dot_product2), dot_product2.shape)
30 <class 'numpy.int32'>
[30] <class 'numpy.ndarray'> (1,)

In [210]: a.shape, b.shape
Out[210]: ((4,), (4, 1))

The first does produce a scalar: an inner product, the same as np.sum(a*b).
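A minimal check of that equivalence (my sketch, not from the answer): the 1-D dot is exactly the elementwise "sum of products".

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([1, 2, 3, 4])

print(np.dot(a, b))       # 30 -- inner product of two (4,) arrays
print(np.sum(a * b))      # 30 -- same thing, written elementwise
```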

It's the dot of a (4,) array with another (4,) array. These are not (1,4) 'row vectors'.

The second combines a (4,) with a (4,1), producing a (1,) shape. Not a 1x1!

If you want a (1,1) result, dot a (1,4) with a (4,1):

In [211]: (a[None,:]@b).shape
Out[211]: (1, 1)

One 'dot product' page says: "Algebraically, the dot product is defined as the sum of the products of the corresponding entries of the two sequences of numbers." That's exactly what your first example does.

Matrix multiplication can be thought of as the application of the dot product to all row/column combinations of two matrices. That's what np.dot does, with the intermediate option of working with a 2-D and a 1-D array (your second example).

np.dot can also work with 3+d arrays, though the matmul version is generally more useful.
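A sketch (mine, under the behavior documented for both functions) of why matmul is usually preferred in 3-D: np.dot combines all leading axes of both arguments, while matmul (@) treats leading axes as a batch dimension.

```python
import numpy as np

A = np.ones((2, 3, 4))
B = np.ones((2, 4, 5))

# np.dot: A.shape[:-1] + B.shape[:-2] + B.shape[-1:]
print(np.dot(A, B).shape)   # (2, 3, 2, 5) -- outer combination of batches
# matmul: matching leading axes are a shared batch dimension
print((A @ B).shape)        # (2, 3, 5)
```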

If you want a further challenge, look at np.einsum, which applies 'Einstein notation' to these multidimensional products.
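To make the einsum connection concrete, here is my sketch of the three cases from this question in Einstein notation, where a repeated index is the summed axis:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([1, 2, 3, 4])

print(np.einsum('i,i->', a, b))                       # 30, scalar: (4,).(4,)
print(np.einsum('i,ij->j', a, b.reshape(-1, 1)))      # [30], shape (1,)
print(np.einsum('ij,jk->ik', a[None, :], b[:, None])) # [[30]], shape (1, 1)
```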

Generally, in np.dot(A,B), the last dimension of A pairs with the second-to-last dimension (or only dimension, if 1-D) of B. In einsum terms I like to think of that as the 'sum of products' dimension.
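A small sketch of that pairing rule (my example): the summed axes must have the same length, and a 1-D second argument pairs on its only axis.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)    # (2, 3)
b = np.array([1, 1, 1])           # (3,) -- its only axis pairs with A's last

print(np.dot(A, b))               # shape (2,); here the row sums: [3 12]
print(np.einsum('ij,j->i', A, b)) # same result, j is the 'sum of products' axis
```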

https://mkang32.github.io/python/2020/08/23/dot-product.html#:~:text=Matrix%20multiplication%20is%20basically%20a,of%20vectors%20in%20each%20matrix.

Matrix multiplication is basically a matrix version of the dot product. Remember the result of dot product is a scalar. The result of matrix multiplication is a matrix, whose elements are the dot products of pairs of vectors in each matrix.
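Spelling out that quoted definition in code (my sketch): each element C[i, j] of a matrix product is the dot product of row i of A with column j of B.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Build the product element by element from 1-D dot products
C = np.array([[np.dot(A[i], B[:, j]) for j in range(2)] for i in range(2)])
print(C)                          # [[19 22] [43 50]]
print(np.array_equal(C, A @ B))   # True
```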