I am a student trying to do Latent Transition Analysis (LTA) / hidden Markov modeling for the first time for my thesis, using LMest in R. I am analyzing various behaviors (binary variables) measured at two time points, and I am interested in how some covariates affect the change between the time points. LTA seems fitting for that, but it appears to be done mostly in Mplus. I do not have access to Mplus, and there seems to be a lack of literature using LMest, so I don't know how to implement many of the suggestions regarding the process of doing a correct LTA.
For assessing model fit, multiple sources suggest constraining the model and/or using likelihood ratio tests, but in this case I am not even sure which model I would compare mine to. My model has 3 classes, so would I compare it to a model with only one class (or no classes)? Or would I use `modBasic = 1` (time-homogeneous transition matrices, i.e., constraining the transition probabilities to be equal across waves), which some literature suggests? Is this what "testing the model for measurement invariance" means? Also, there is no option to calculate a likelihood ratio test in LMest, and the lmtest package doesn't work on LMest objects. I tried calculating it by hand, but the p-value was 0, which seems wrong to me.
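My by-hand attempt looked roughly like this (here `mod_free` and `mod_homog` are just my names for two `lmest()` fits, and I am assuming the fitted objects store the maximized log-likelihood in `$lk` and the number of free parameters in `$np`):

```r
# Rough likelihood ratio test by hand: unconstrained transition matrices
# vs. time-homogeneous ones (modBasic = 1); I am not sure this is even the
# right comparison. Assumes lmest() objects expose $lk and $np.
lr_stat <- 2 * (mod_free$lk - mod_homog$lk)
df      <- mod_free$np - mod_homog$np
p_value <- pchisq(lr_stat, df = df, lower.tail = FALSE)
```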
Using only BIC/AIC doesn't seem sufficient to me to assess a model. Further literature (Bartolucci et al., 2009, DOI:10.1214/08-AOAS230; Bartolucci et al., 2014, https://doi.org/10.1007/s11749-014-0381-7) also suggests using R-squared and entropy to assess fit, and while Mplus does seem to offer these, LMest doesn't (and its R documentation doesn't really discuss assessing fit). So how would I assess my model fit in LMest and compare it to a different model?

I am also including covariates that might affect the transition probabilities in my model, and I am having trouble understanding how to interpret the output. I copied the output below.
```
Ga - Parameters affecting the logit for the transition probabilities:

, , logit = 1

            logit
                    2       3
  intercept  -1.7821 -2.8981
  X1gender    0.2112  0.7080

, , logit = 2

            logit
                    2       3
  intercept  -1.4892 -2.7488
  X1gender    0.2025  0.0134

, , logit = 3

            logit
                    2       3
  intercept  -0.6304 -1.4776
  X1gender    0.0084  0.5398
```
For testing purposes I have only included one covariate so far, but the final model will include more. I assume that logit = 1, 2, 3 refers to the classes, but in that case, why do the results only refer to 2 and 3 and not to 1?
Furthermore, how do I know whether the effect of the covariate is significant or not? I assume the numbers shown are log-odds, since this is a multinomial logit regression? LMest doesn't seem to include any further statistics for this either. With a standard multinomial logit regression in R, I can get further information such as the test statistic, p-value, and confidence intervals, for example via `tidy(model)`, and I can calculate marginal effects easily with other packages. How would I do that here, given that I can't use the standard packages on LMest objects? Am I missing something, or do I have to do everything manually? I am specifically interested in my covariates' effects on the transition probabilities.
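For comparison, this is the kind of summary I mean for an ordinary multinomial logit (using a hypothetical data frame `mydata` with columns `outcome` and `gender`):

```r
# Hypothetical example of the output I can get outside LMest
library(nnet)    # multinom()
library(broom)   # tidy()

fit <- multinom(outcome ~ gender, data = mydata)
tidy(fit, conf.int = TRUE)   # estimates, std. errors, p-values, confidence intervals
```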
Thanks in advance for any help
You are asking three distinct questions, and I will focus on the last two (because I am not sure what the best way is to perform model checking in a hidden Markov model with binary outcomes).
Transition probability parameters
The reason that 12 parameters are output by LMest is that the rows of the transition probability matrix are constrained to sum to 1. For example, if we know Pr(1 -> 2) and Pr(1 -> 3), we can compute Pr(1 -> 1) = 1 - Pr(1 -> 2) - Pr(1 -> 3). This means that, in a model with k states, LMest only needs to estimate k*(k-1) transition probabilities. In your case, k = 3, so there are 6 transition probabilities to estimate. For each one, LMest estimates an intercept and the slope for the covariate you included (X1gender). If you include more covariates, there will be more parameters.
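If you want to see what these numbers imply on the probability scale, you can map a slice of `Ga` back to transition probabilities. The sketch below is only illustrative: it assumes that `Ga[, , i]` contains the logits for transitions out of state i, that the reference category of each logit is staying in state i, and that the first row of the slice is the intercept with the covariate slopes underneath; please check the parametrization described in the LMest documentation for your version before relying on it.

```r
# Sketch: transition probabilities out of state i for a covariate vector x,
# under the assumptions stated above (verify against the LMest docs).
trans_probs_from_state <- function(Ga, i, x) {
  slice <- Ga[, , i]                    # (1 + n covariates) x (k - 1) logit coefficients
  eta   <- drop(c(1, x) %*% slice)      # linear predictors for the k - 1 logits
  k     <- dim(Ga)[3]
  dest  <- setdiff(seq_len(k), i)       # the states other than the origin
  p     <- numeric(k)
  p[i]    <- 1 / (1 + sum(exp(eta)))    # probability of staying (assumed reference)
  p[dest] <- exp(eta) * p[i]            # probabilities of moving to the other states
  p
}

# e.g. transitions out of state 1 for X1gender = 0 and X1gender = 1:
# trans_probs_from_state(mod$Ga, i = 1, x = 0)
# trans_probs_from_state(mod$Ga, i = 1, x = 1)
```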
Standard errors
To assess whether there seems to be a clear association between a covariate and a transition probability (i.e., "statistical significance"), one option would be to use the standard errors computed by LMest. If you specify the argument `out_se = TRUE` in `lmest()`, the fitted object will contain a component named `seGa`, which gives the standard error for each parameter in `Ga`. Using the large-sample approximate normality of maximum likelihood estimators, you can then form approximate 95% confidence intervals (see the sketch below, where `mod` is the object returned by `lmest()`) and check whether the interval for the parameter of interest covers zero.
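A minimal sketch of that calculation, assuming `mod` was fitted with `out_se = TRUE` and that `mod$Ga` and `mod$seGa` are arrays with matching dimensions:

```r
# Approximate 95% Wald confidence intervals for the transition parameters
ci_lower <- mod$Ga - 1.96 * mod$seGa
ci_upper <- mod$Ga + 1.96 * mod$seGa

# Equivalently, Wald z statistics and two-sided p-values
z <- mod$Ga / mod$seGa
p <- 2 * pnorm(-abs(z))
```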