How to succesfully convert GenStat Multiple Experiments REML to R lme() code?

17 Views Asked by At

I performed a multiple experiments REML (where for each experiment separate residuals are calculated) in GenStat (version 22.1.0.195) and would like to be able to perform the same analysis in R (version 4.3.1). I am very close to converting the code, but can't seem to find why there are still small differences in F statistics and p-values (but not in the LS/EM means). My experiments are Year.Period combination (which I convert into 1 factor first). The fixed effects in the model are Treatment, Period, Year and the interaction between Treatment and Period, and the random effects are Block nested within Year.Period and Year.Period.

This is my GenStat code which I want to convert:

  FACPRODUCT !P(Year,Period); YearPeriod

  DELETE [REDEFINE=yes] _remlsave
  VCOMPONENTS [FIXED=Treatment+Period+Year+Treatment.Period; FACTORIAL=32; EXPERIMENTS=YearPeriod]\
  RANDOM=YearPeriod/Block;\
  CONSTRAINTS=pos
  REML [PRINT=model,components,means,waldTests; MAXCYCLE=30; FMETHOD=automatic; PSE=ALLESTIMATES;\
  MVINCLUDE=*; METHOD=AI] Y; SAVE=_remlsave

This is the R code I got so far:

data2 <- data1 %>% 
  mutate(YearPeriod = as.factor(interaction(Year, Period)))

model6 <- lme(Y ~ Treatment * Period + Year, 
              random = ~1 | YearPeriod/Block,
              weights = varIdent(form = ~1 | YearPeriod),
              data = data2,
              na.action = na.omit)

summary(model6)
anova(model6)
emmeans(model6, ~ Treatment*Period)

I used varIdent to calculate separate residuals for each YearPeriod combination just like in GenStat. That seems to work fine (except the parameter estimates are a bit different, do they estimate these differently?). Now the emmeans provides the same output as the estimates in GenStat, but there seem to be small differences in the F-statistics and p-values. For example, the GenStat output is:

Tests for fixed effects
Sequentially adding terms to fixed model
 
Fixed term  Wald statistic  n.d.f.  F statistic d.d.f.  F pr
Treatment         189.69       2    94.84   238.3    <0.001
Period              4.84       2    1.62    2.0       0.380
Year                0.09       1    0.09    2.0       0.795
Treatment.Period   21.43       4    5.33    175.8    <0.001

And the R output is:

> anova(model6)
               numDF denDF   F-value p-value
(Intercept)        1   167 2215.1035  <.0001
Treatment          2   167   94.8582  <.0001
Period             2     2    2.4222  0.2922
Year               1     2    0.0874  0.7954
Treatment:Period   4   167    5.3577  0.0004

As you can see, the results are almost equal, but why is the p-value for Period different? I feel like I am missing one of the default settings for one of the codes, but I can't seem to figure it out.

So, my question is, what is the difference between these codes? What do I need to add in the R code to provide the same outcome?

I hope someone can help me out, thanks!

0

There are 0 best solutions below