Efficient Generation of Random Orthogonal Matrix in Python

I need to generate a lot of random mean-invariant orthogonal matrices for my work. A mean-invariant matrix has the property A*1_n = 1_n, where 1_n is a vector of size n containing the scalar 1, basically np.ones(n). I use Python, specifically NumPy, to create my matrices, but I want to make sure that my method is both correct and as efficient as possible. I also want to present my findings on 3 separate orthogonalization methods I have tried, and hopefully get an explanation of why one method is faster than the others. I ask four questions at the end of the post about my findings.

In general, in order to create a mean-invariant random orthogonal matrix A, you create a random square matrix M1, replace its first column with a column of ones, and orthogonalize the matrix. Then you do the same with another matrix M2, and the final mean-invariant random orthogonal matrix is A = M1*(M2.T), the matrix product of the two orthogonalized matrices. The bottleneck of this process is the orthogonalization. There are 3 main orthogonalization methods: the Gram–Schmidt process, which uses projections; the Householder transformation, which uses reflections; and Givens rotations.

Creating an n×n random matrix is straightforward with NumPy: M1 = np.random.normal(size=(n,n)). Then, I replace the first column of M1 with 1_n.
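
For concreteness, here is a minimal sketch of that setup step (the value of n is just an example):

import numpy as np

n = 10_000
M1 = np.random.normal(size=(n, n))
M1[:, 0] = 1.0  # replace the first column with the all-ones vector 1_n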

As far as I know, the Gram–Schmidt process is not implemented in any of the popular libraries, so I found this code, which works fine:

def gram_schmidt(M1):
    """Orthogonalize a set of vectors stored as the columns of matrix M1."""
    # Get the number of vectors.
    n = M1.shape[1]
    for j in range(n):
        # To orthogonalize the vector in column j with respect to the
        # previous vectors, subtract from it its projection onto
        # each of the previous vectors.
        for k in range(j):
            M1[:, j] -= np.dot(M1[:, k], M1[:, j]) * M1[:, k]
        M1[:, j] = M1[:, j] / np.linalg.norm(M1[:, j])
    return M1

Obviously, the above code has to be executed for both M1 and M2.
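
Putting it together, the Gram–Schmidt route looks roughly like this (the sanity checks at the end are my own addition):

q1 = gram_schmidt(M1)  # M1 and M2 already have their first columns set to ones
q2 = gram_schmidt(M2)
A = q1 @ q2.T          # matrix product of the two orthogonal factors

# Optional sanity checks: A should be orthogonal and mean-invariant.
# assert np.allclose(A @ A.T, np.eye(n))
# assert np.allclose(A @ np.ones(n), np.ones(n))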

For a 10,000x10,000 random mean-invariant orthogonal matrix, this process takes about 1 hour on my computer (8-core @3.7GHz, 16GB RAM, 512GB SSD).

I found that instead of the Gram–Schmidt process I could orthogonalize the matrix in NumPy with q1, r1 = np.linalg.qr(M1), where q1 is the orthogonal matrix and r1 is an upper triangular matrix (we don't need to keep r1). I do the same for M2 to get q2, and then A = q1 @ q2.T. This process for the same 10,000x10,000 matrix takes about 70 seconds on the same computer. I believe np.linalg.qr() uses the Householder transformation, but I would like someone to confirm this.
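
As a sketch, the whole QR-based construction can be wrapped into one function (the names are my own; this just restates the steps described above):

import numpy as np

def random_mean_invariant_orthogonal(n, rng=None):
    """Sketch of the QR-based construction described above."""
    rng = np.random.default_rng() if rng is None else rng
    M1 = rng.normal(size=(n, n))
    M2 = rng.normal(size=(n, n))
    M1[:, 0] = 1.0  # first columns become the all-ones vector
    M2[:, 0] = 1.0
    q1, _ = np.linalg.qr(M1)
    q2, _ = np.linalg.qr(M2)
    return q1 @ q2.T  # orthogonal and mean-invariant, per the construction above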

Finally, I tried changing the way the initial random matrices M1 and M2 are produced. Instead of M1 = np.random.normal(size=(n,n)) I used the Dirichlet distribution: M1 = np.random.default_rng().dirichlet(np.ones(n), size=n). Then I used np.linalg.qr() as before, and I got the 10,000x10,000 matrix in about the same time as with M1 = np.random.normal(size=(n,n)).

My questions are:

  1. Does NumPy's np.linalg.qr() method actually use the Householder transformation? Or possibly the Givens rotation?
  2. Why is the Gram-Schmidt process so much slower than np.linalg.qr()?
  3. I know that the Dirichlet distribution produces an almost orthogonal matrix. Is it because we are working in 10,000 dimensions, so a randomly drawn vector is likely to be nearly orthogonal to all the others? np.linalg.qr() does not care how close the matrix already is to orthogonality.
  4. Is there an even faster way to generate random orthogonal matrices? Are there any optimizations I can make to my code to make it faster/more efficient?

Edit: CuPy's cp.linalg.qr() on the same 10,000x10,000 random matrix takes only 16 seconds on my 2080 Ti, instead of 70 seconds for NumPy on the CPU (8-core @ 3.7GHz with multithreading, 16GB RAM and 512GB SSD).
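
For reference, the GPU variant is nearly a drop-in replacement. A minimal sketch, assuming CuPy is installed for the matching CUDA version:

import cupy as cp

n = 10_000
M1 = cp.random.normal(size=(n, n))
M1[:, 0] = 1.0
q1, r1 = cp.linalg.qr(M1)  # runs on the GPU
# ...same for M2, then A = q1 @ q2.T, and cp.asnumpy(A) to copy the result back to the host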

Answer by dmuir:

Here's another method for manufacturing such matrices. I have no idea what the distribution is like, but it might be faster than the methods you describe.

First off, a Householder reflector is a matrix of the form

H = I - 2*h*h'/(h'*h)

where h is a vector.

Note that:

- H is orthogonal and symmetric,
- H can be applied to a vector in O(dim) operations,
- if x and y are any two vectors with the same norm, then we can find such a matrix that maps x to y (take h = x - y).

One way to manufacture 'random' orthogonal matrices is to compute products of 'random' Householder matrices.

If Q is an orthogonal matrix and u is the vector of all 1's and

q = Q*u 

Then q and u have the same norm, so if H is the Householder reflector that maps q to u,

R = H*Q is orthogonal
R*u = H*Q*u = H*q = u
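
A minimal Python sketch of this construction (the function names and the number of random reflectors are my own choices, not part of the answer):

import numpy as np

def apply_householder(h, X):
    # Apply H = I - 2*h*h'/(h'*h) to the columns of X without forming H explicitly.
    return X - (2.0 / h.dot(h)) * np.outer(h, h.dot(X))

def random_mean_invariant(n, n_reflectors=20, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    u = np.ones(n)
    Q = np.eye(n)
    for _ in range(n_reflectors):
        h = rng.normal(size=n)
        Q = apply_householder(h, Q)  # Q <- H*Q stays orthogonal
    q = Q @ u
    h = q - u                        # reflector that maps q back to u
    if np.allclose(h, 0.0):          # q already equals u; nothing left to fix
        return Q
    R = apply_householder(h, Q)      # R = H*Q is orthogonal and R @ u == u
    return R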