I am trying to write a model using jags. The problem is that I am really new to jags language. I have a dataset that is basically a survey with 6 questions and one predictive variable. All are categorical some of which have 3 possible answers other have 2, (the predictive has 3 possible answers). In order to create a Bayesian model using jags, I decided to use categorical distribution to use in order to model Y (for the likelihood) and dirichlet for the priors.
As I have missing values in both Xs and Y. After writing a model to include and calculate those values, R gives me a fatal error and I can not run the model. I can not find any source online for such a case/model to try and replicate the model into my case. Anyways below I have my model so please If you find anything strange in there please indicate it to me so I fix the problem else if everything with model seems fine and you know what causes the error I would be glad to hear that as well.
Specifications : - OS : MacOS Intel Core i5 - R Studio Version : R 4.1.2 - rjags version : 4-13
Model :
model {
# Likelihood
for (i in 1:209) {
FIN[i] ~ dcat(p[i,1:3])
logit(p[i,1]) <- alpha[1] + beta[1]*AGE[i] + beta[2]*SEX[i] + beta[3]*INC[i] + beta[4]*POL[i] + beta[5]*AREA[i] + beta[6]*STAT[i]
logit(p[i,2]) <- alpha[2] + beta[1]*AGE[i] + beta[2]*SEX[i] + beta[3]*INC[i] + beta[4]*POL[i] + beta[5]*AREA[i] + beta[6]*STAT[i]
logit(p[i,3]) <- alpha[3] + beta[1]*AGE[i] + beta[2]*SEX[i] + beta[3]*INC[i] + beta[4]*POL[i] + beta[5]*AREA[i] + beta[6]*STAT[i]
# -- Add model for dependent variables --
AGE[i] ~ dcat(p_age)
SEX[i] ~ dbern(p_sex)
INC[i] ~ dcat(p_inc)
POL[i] ~ dcat(p_pol)
STAT[i]~ dbern(p_stat)
}
# Priors
for (j in 1:6) {
beta[j] ~ dnorm(0, 0.001) # Normal prior for beta
}
for (k in 1:3) {
alpha[k] ~ dnorm(0, 0.001) # Normal prior for alpha
}
# -- Priors for dependent variables --
p_age ~ ddirch(c(0.3444976,0.3157895,0.3349282))
p_inc ~ ddirch(c(0.2248804,0.3971292,0.2870813))
p_pol ~ ddirch(c(0.4019139,0.1913876,0.3732057 ))
p_sex ~ dbeta(2,1)
p_stat ~ dbeta(2,1)
Feel free to propose any modification in the model that may work for my case of data.
The first 10 lines of my data are the following:
AGE SEX INC POL AREA FIN STAT
1 3 0 2 2 1 2 1
2 2 0 3 3 1 3 1
3 1 0 2 NA 1 2 1
4 3 1 2 1 1 1 0
5 3 1 3 3 3 2 NA
6 1 0 2 1 3 3 NA
7 3 1 1 3 3 1 1
8 1 0 1 3 2 1 0
9 3 1 NA 3 3 2 0
10 1 0 NA 1 1 2 1
predictive var is FIN and the rest are response ones.
It looks like your likelihood is a kind of mix between the multinomial and ordered. In the multinomial likelihood, you would have different coefficients for all variables across categories of the outcome, but all coefficients for the reference group would be set to zero. For the ordinal logit likelihood, you would as you have done here - all coefficients are constrained to be the same across categories of the outcome, except for the intercepts. However, the intercepts are forced to be ordered and the probabilities that are calculated from the ordered cut points are cumulative (rather than being for each category). Since I'm not sure what you intend, let's look at both. First, the multinomial likelihood:
Note here that we change
logit(p[i,k])tolog(q[i,k])as the probability of observation i, being in category k isWe set the first category coefficients to zero as the reference.
The ordinal model likelihood would look like this:
Here, the alpha parameters are the cut points - assuming that alpha0 = -Inf and alpha3 = Inf. The important point is that the alpha parameters must be increasing in size.
Let's try to run both models:
Multinomial Model
Model Summary:
Diagnostics:
Ordered Logit Model
Model Summary:
Diagnostics:
Created on 2023-04-09 with reprex v2.0.2
Note, in the above code I increased the precision on the coefficients to get something that looked close to convergence, but if you've got lots of data, you should be able to loosen those up in your real model if you like.