Is there a most efficient way to multiply three matrices A * B * C = D using cuBLAS?


I want to find the most efficient way to multiply three matrices using cuBLAS. My current solution makes the obvious two calls to cublas<t>gemm:

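// First product: AB = alpha * op(A) * op(B) + beta * AB, stored with leading dimension ldab
cublas<t>gemm(cublasH, transa, transb, m, n, k, &alpha, d_A, lda, d_B, ldb, &beta, d_AB, ldab)
// Second product: d_AB is already formed, so it is passed untransposed (reusing m, n, k assumes square matrices)
cublas<t>gemm(cublasH, CUBLAS_OP_N, transc, m, n, k, &alpha, d_AB, ldab, d_C, ldc, &beta, d_D, ldd)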

It's not my opinion that this is a bad solution, only that it would be better if there were some way to do it with a single kernel/function call rather than two, as a single kernel would presumably be a bit faster.

I've looked at cublas<t>gemmBatched hoping it could be adapted, but the documentation states that the multiplications must be independent of each other, so that seems off the table.

Is there some way to use cuBLAS, or some other mathematical shortcut worth trying, to achieve this optimization?

Robert Crovella On

CUBLAS doesn't have any direct support for this (a single function call that accepts three matrices to be multiplied together).

The way to do it in CUBLAS is the way you have already indicated.
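On the "mathematical shortcut" part of the question, one thing that may be worth checking is the order of the two multiplications. Matrix multiplication is associative, so (A * B) * C and A * (B * C) give the same D, but the number of multiply-adds can differ a lot when the matrices are not square. The sketch below is a host-side helper only; it is not part of cuBLAS. It assumes A is m x k, B is k x n and C is n x p, where p and the names chain_cost and ab_first_is_cheaper are introduced here purely for illustration.

```cpp
#include <cstdint>

// Multiply-add counts for the two ways to parenthesize D = A * B * C.
// Assumed shapes (not from the original post): A is m x k, B is k x n, C is n x p.
struct ChainCost {
    std::uint64_t ab_first;  // cost of (A * B) * C
    std::uint64_t bc_first;  // cost of A * (B * C)
};

// Hypothetical helper: counts multiply-adds only, ignoring memory traffic
// and any per-call launch overhead.
ChainCost chain_cost(std::uint64_t m, std::uint64_t k,
                     std::uint64_t n, std::uint64_t p) {
    // A * B is m x n and costs m*k*n; multiplying it by C costs m*n*p.
    // B * C is k x p and costs k*n*p; multiplying A by it costs m*k*p.
    return ChainCost{m * k * n + m * n * p, k * n * p + m * k * p};
}

// True when forming A * B first needs no more multiply-adds than forming B * C first.
bool ab_first_is_cheaper(std::uint64_t m, std::uint64_t k,
                         std::uint64_t n, std::uint64_t p) {
    const ChainCost c = chain_cost(m, k, n, p);
    return c.ab_first <= c.bc_first;
}
```

For square matrices the two orders cost the same, so this only matters for rectangular shapes. For example, if C is a single column (p = 1), computing B * C first replaces a full matrix-matrix product with two matrix-vector products. Either way the work is still two gemm calls; only their order and temporary buffer change.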