I'm trying to convert an lmer R code to python and I'm having a hard time getting the same results. The data has 2 random effects
Here is the R code:
library(openxlsx)
library(lme4)
xx <- read.xlsx("test_data.xlsx")
print(summary(lmer(Val ~ (1 | R1) + (1 | R2) + (1 | R1:R2), data = xx)))
the output of this is
Linear mixed model fit by REML ['lmerMod']
Formula: Val ~ (1 | R1) + (1 | R2) + (1 | R1:R2)
Data: xx
REML criterion at convergence: 184.3
Scaled residuals:
Min 1Q Median 3Q Max
-1.52934 -0.77882 -0.09979 0.69857 2.32918
Random effects:
Groups Name Variance Std.Dev.
R1:R2 (Intercept) 0.0 0.000
R1 (Intercept) 733.9 27.090
R2 (Intercept) 0.0 0.000
Residual 19.9 4.461
Number of obs: 30, groups: R1:R2, 6; R1, 3; R2, 2
Fixed effects:
Estimate Std. Error t value
(Intercept) 122.17 15.66 7.8
for python equivalent I'm using statsmodels.
import pandas as pd
import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf
xx = pd.read_excel("test_data.xlsx")
xx['group'] = 1
formula = "Val ~ 1"
vcf = {"R1": "0 + C(R1)", "R2": "0 + C(R2)"}
model = sm.MixedLM.from_formula(formula, data=xx, groups=xx["group"], vc_formula=vcf,re_formula="~1+R1:R2")
result = model.fit()
# Print the summary of the model
print(result.summary())
The output for this is
Mixed Linear Model Regression Results
========================================================
Model: MixedLM Dependent Variable: Val
No. Observations: 30 Method: REML
No. Groups: 1 Scale: 19.6936
Min. group size: 30 Log-Likelihood: -92.4598
Max. group size: 30 Converged: No
Mean group size: 30.0
--------------------------------------------------------
Coef. Std.Err. z P>|z| [0.025 0.975]
--------------------------------------------------------
Intercept 121.678
Group Var 19.694
Group x R1:R2 Cov 0.000
R1:R2 Var 0.578
R1 Var 789.843 201.169
R2 Var 0.217
========================================================
The intercept and the var value are not similar. Is there a way to get the exact same output? how should the python vc_formula and re_formula be set to be equivalent to the R formula?
test_data.xlsx:
R1 R2 Val
1 1 123
1 1 114
1 1 119
1 1 119
1 1 113
1 2 122
1 2 115
1 2 119
1 2 129
1 2 113
2 1 95
2 1 96
2 1 100
2 1 94
2 1 99
2 2 95
2 2 93
2 2 102
2 2 96
2 2 100
3 1 151
3 1 144
3 1 149
3 1 154
3 1 155
3 2 147
3 2 145
3 2 149
3 2 157
3 2 158