I'm pooling estimates from a raking weighting procedure on multiple imputed data and I'm interested in describing how the variability from the imputation by chained equation carry over my estimates. I want to do this by reporting U-hat (i.e.: the average within variance of the estimate), B (the in-between variance or the extra variance caused by the fact that there is missing data in the sample) and λ (i.e.: the proportion of total variance due to missingness or the ratio of the B on the total variance).
These parameters are usually easily accessible with the pool() function from mice package. However, I'm using MIcombine because it is more convenient if we want to apply raking procedure to multiple imputed data.
So my question is : how can I get the within and in-between variances and the lambda ratio from MIcombine?
Here is a reproducible code.
library(mitools)
library(survey)
library(mice)
data(nhanes)
nhanes2$hyp <- as.factor(nhanes2$hyp)
imp <- mice(nhanes2,method=c("polyreg","pmm","logreg","pmm"), seed = 23109)
imp_list <- lapply( 1:5 , function( n ) complete( imp , action = n ) )
des<-svydesign(id=~1, data=imputationList(imp_list))
small.svy.rake<-des
age.dist <- data.frame(age = c("20-39","40-59", "60-99"),
Freq = nrow(des) * c(0.5, 0.3, .2))
# loop through each of the implicates
# applying the `rake` function to each
small.svy.rake$designs <-
lapply(
des$designs ,
rake ,
sample.margins = list(~age),
population.margins = list(age.dist)
)
# new mean bmi estimated after raking with pooled standard error
summary(MIcombine( with( small.svy.rake , svymean( ~ bmi ) ) ))
Within standard error can be extract from the list of raking operations on the imputed data
rake.list<-with( des , svymean( ~ bmi ) )#create a list object
datalong<-do.call(rbind, lapply(rake.list , function(x) data.frame(x)))#transform into long format data frame
mean(datalong[,2]) #mean standard error=within standard error
However, I don't know how to obtain the two other components, i.e.: the in-between variance and the lambda parameter and would enjoy any help to get closer to that goal.
Following @Thomas Lumley comment, here are the adjustments to print the in-between variance and the lambda ratio:
In-between variance is computed directly by the function so we retain
evaras B in the output. For the proportion of variance due to missingness, the function computes the ratior, which is the relative increase in variance due to nonresponse. Following Van Buuren, it is linked tolambdaby r=λ/(1−λ), so can be solved forlambdaby λ=r/(r+1).