Stan: Exceptions returned: “wishart_lpdf: random variable is not symmetric” and “gamma_lpdf: Random variable is inf, but must be positive finite!”

Question

Stan: Exceptions returned: “wishart_lpdf: random variable is not symmetric” and “gamma_lpdf: Random variable is inf, but must be positive finite!”

55 Views Asked by Pål Bjartan At 22 September 2023 at 09:53

I am currently trying to translate and fit a SEM using CmdStan and the R package cmdstanr (see below). The model was originally written for WinBUGS (code at the bottom of page). The current model will run to completion, however, I get the following exceptions returned in the output, repeatedly, when fitting the model (lines causing exceptions are marked with comments in the code sample below):

Chain 1 Exception: wishart_lpdf: random variable is not symmetric. random variable[1,2] = -inf, but random variable[2,1] = -inf (in '/tmp/RtmpJyfIeK/model-a0b05229fa6a6.stan', line 48, column 4 to column 24)

Chain 1 Exception: gamma_lpdf: Random variable[1] is inf, but must be positive finite! (in '/tmp/RtmpJyfIeK/model-a0b05229fa6a6.stan', line 46, column 4 to column 22)

Chain 3 Exception: gamma_lpdf: Random variable is inf, but must be positive finite! (in '/tmp/RtmpJyfIeK/model-a0b05229fa6a6.stan', line 47, column 4 to column 22)

All messages are preceded and followed by the following messages, respectively:

Chain X Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:

and

Chain X If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,

Chain X but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

Questions:

I am fairly new to Stan, and could use some help understanding what's causing these errors. How can I avoid these messages?
The messages gives me the notion that the code might not optimal. I'd like some input on how to write the code with improved efficiency in mind.
With regards to improved efficiency, I was hoping running the model on the GPU would speedup the computations. In reality, running the model below on the GPU is extremely slow: I.e. I interrupted after >4 mins, as it had not completed the first 500 iterations yet, while the model fit finished on the CPU after only 26 seconds! (In contrast, the model from the OpenCL example vignette on the official pages will run ~3.4 times faster on my GPU.) I am trying to understand whether this unexpected discrepancy is caused by poorly optimized or erroneous code, or if the CPU is simply better suited for fitting the current model.

Stan model code:

data {
    int<lower=0> N; // number of observations
    int<lower=1> P; // number of variables
    array[N] vector[P] y; // observed data
}

transformed data {
    vector[2] u = rep_vector(0, 2); // prior mean for xi
    matrix[2, 2] R = diag_matrix(rep_vector(1, 2)); // prior covariance matrix for xi
}


parameters {
    
    vector[P] alp; // intercepts
    array[6] real lam; // loadings
    vector[N] eta; // latent variable
    array[N] vector[2] xi; // latent variables
    // matrix[N, 2] xi; // latent variables
    row_vector[2] gam; // coefficients for nu[i]
    vector<lower=0>[P] psi; // precisions of y
    real<lower=0> psd; // precision of eta
    cov_matrix[2] phi;

}

transformed parameters {
    vector[P] sgm = 1/psi;
    real sgd = 1/psd;
    matrix[2, 2] phx = inverse(phi);   
}


model {
    // Local variables
    array[N] vector[P] mu;
    array[N] vector[P] ephat;
    vector[N] nu;
    vector[N] dthat; 
    

    // Priors on intercepts
    alp ~ normal(0, 1);

    // Priors om precisions
    psi ~ gamma(9, 4);                              // line 46
    psd ~ gamma(9, 4); // precision of eta          // line 47
    phi ~ wishart(5, R);                            // line 48
    
    // Priors on loadings and coefficients
    lam[1] ~ normal(0.8, psi[2]);
    lam[2] ~ normal(0.8, psi[3]);
    lam[3] ~ normal(0.8, psi[5]);
    lam[4] ~ normal(0.8, psi[6]);
    lam[5] ~ normal(0.8, psi[8]);
    lam[6] ~ normal(0.8, psi[9]);
    
    gam ~ normal(0.5, psd);

    // Measurement equation model
    
    for (i in 1:N) {
        
        mu[i,1] = eta[i] + alp[1];
        mu[i,2] = lam[1] * eta[i] + alp[2];
        mu[i,3] = lam[2] * eta[i] + alp[3];
        mu[i,4] = xi[i, 1] + alp[4];
        mu[i,5] = lam[3] * xi[i, 1] + alp[5];
        mu[i,6] = lam[4] * xi[i, 1] + alp[6];
        mu[i,7] = xi[i, 2] + alp[7];
        mu[i,8] = lam[5] * xi[i, 2] + alp[8];
        mu[i,9] = lam[6] * xi[i, 2] + alp[9];

        ephat[i] = y[i] - mu[i];
        y[i] ~ normal(mu[i], psi);
    
    // Structural equation model
        
        nu[i] = gam * xi[i];
    }

    dthat = eta - nu;

    xi ~ multi_normal(u, phi);
    eta ~ normal(nu, psd);
}

The original BUGS model:

model  <- function() {
    for(i in 1:N){
        #measurement equation model
        for(j in 1:P){
            y[i,j] ~ dnorm(mu[i,j],psi[j])
            ephat[i,j] <- y[i,j] - mu[i,j]
        }
        mu[i,1] <- eta[i]+alp[1]
        mu[i,2] <- lam[1] * eta[i] + alp[2]
        mu[i,3] <- lam[2] * eta[i] + alp[3]
        mu[i,4] <- xi[i,1] + alp[4]
        mu[i,5] <- lam[3] * xi[i,1] + alp[5]
        mu[i,6] <- lam[4] * xi[i,1] + alp[6]
        mu[i,7] <- xi[i,2] + alp[7]
        mu[i,8] <- lam[5] * xi[i,2] + alp[8]
        mu[i,9] <- lam[6] * xi[i,2] + alp[9]

        #structural equation model
        xi[i,1:2] ~ dmnorm(u[1:2], phi[1:2,1:2])
        eta[i] ~ dnorm(nu[i], psd)
        nu[i] <- gam[1] * xi[i,1] + gam[2] * xi[i,2]
        dthat[i] <- eta[i] - nu[i]
    } #end of i

    #priors on intercepts
    for(j in 1:9){alp[j]~dnorm(0.0, 1.0)}

    #priors on loadings and coefficients
    lam[1] ~ dnorm(0.8, psi[2])
    lam[2] ~ dnorm(0.8, psi[3])
    lam[3] ~ dnorm(0.8, psi[5])
    lam[4] ~ dnorm(0.8, psi[6])
    lam[5] ~ dnorm(0.8, psi[8])
    lam[6] ~ dnorm(0.8, psi[9])
    for(j in 1:2){ gam[j] ~ dnorm(0.5, psd) }
    
    #priors on precisions
    for(j in 1:P){
        psi[j] ~ dgamma(9.0, 4.0)
        sgm[j] <- 1/psi[j]
    }
    psd ~ dgamma(9.0, 4.0)
    sgd <- 1/psd
    phi[1:2,1:2] ~ dwish(R[1:2,1:2], 5)
    phx[1:2,1:2] <- inverse(phi[1:2, 1:2])

} #end of model

LISREL model to be estimated:

(From: Lee, S. Y. (2007). Structural equation modeling: A Bayesian approach`,. ch. 4.5, pp. 99 & 101. John Wiley & Sons.)

System info

R: 4.3.1
CmdStan: 2.33.0
cmdstanr: 0.6.1

OS: Arch Linux x86_64 
CPU: AMD Ryzen 7 5800H with Radeon Graphics (16) @ 4.463GHz [45.8°C] 
GPU: NVIDIA GeForce RTX 3070 Mobile / Max-Q 
Memory: 13.87GiB / 31.20GiB 
Kernel: Linux 6.5.3-arch1-1 
GPU Driver: NVIDIA 535.104.05

Original Q&A

Stan: Exceptions returned: “wishart_lpdf: random variable is not symmetric” and “gamma_lpdf: Random variable is inf, but must be positive finite!”

Questions:

Stan model code:

The original BUGS model:

LISREL model to be estimated:

System info

There are 0 best solutions below

Related Questions in R

Related Questions in STAN

Trending Questions

Popular # Hahtags

Popular Questions