lapply over both group and outcome variable lists in svyby (R)


I have code that uses lapply to run the svyby function from the survey package in R over multiple variables, generating group means and confidence intervals for each variable. My code looks like the following:

library(survey)
library(tidyverse)

dem_design <- svydesign(
  ids = ~1, 
  data = na.omit(dplyr::select(data,dv1,dv2,group1,iv1,iv2)) 
) 

dv_variables = c("dv1","dv2")
iv_variables = c("iv1","iv2")

df <- lapply(dv_variables, function(x) svyby(as.formula(paste0("~",x)),
                     by = ~group1 + iv1, 
                     design = dem_design, 
                     FUN = svymean,
                     keep.names = FALSE, vartype = "ci")) %>% bind_rows() %>%  pivot_longer(!c(iv1,group1,ci_l,ci_u),names_to = "dv",values_to = "mean",values_drop_na = TRUE)

However, I would now like to modify this code so that I also lapply over iv_variables in the by argument. I would then get one final df with group means by group1 + iv1 and by group1 + iv2, calculated separately.

I tried the following:

df <- lapply(dv_variables, function(x) lapply(iv_variables, function(k)svyby(as.formula(paste0("~",k)),
                  by = ~ as.formula(paste0(k," + group1")), 
              design = dem_design, 
                  FUN = svymean,
                keep.names = FALSE, vartype = "ci"))) %>% bind_rows() %>%  pivot_longer(!c(group1,ci_l,ci_u),names_to = "dv",values_to = "mean",values_drop_na = TRUE)

However, this gave me the following error:

Error in class(ff) <- "formula" : attempt to set an attribute on NULL

How can I lapply over both lists to calculate the group means for both and store them in one dataframe?


There is 1 solution below

promicrobial On

I'm not familiar with the survey package, and without a reproducible example it's hard to say whether this would fit your purposes, but have you tried using Map()? It lets you step through two vectors in parallel (first element with first element, second with second, and so on) using the following syntax:

Map(function(x, y) {

    paste(x,y)

}, x_variables, y_variables)
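To make that concrete, here is a small base-R sketch using the variable names from the question. Note that Map() pairs the two vectors element by element, so it does not give you every dv/iv combination on its own. One way around that is to build the grid of combinations first; expand.grid() is my own choice here, not something from the question.

```r
dv_variables <- c("dv1", "dv2")
iv_variables <- c("iv1", "iv2")

# Map() walks both vectors in parallel: (dv1, iv1), then (dv2, iv2)
paired <- unlist(
  Map(function(x, y) paste(x, y), dv_variables, iv_variables),
  use.names = FALSE
)

# expand.grid() builds every dv/iv combination; Map() then walks its columns
combos <- expand.grid(dv = dv_variables, iv = iv_variables,
                      stringsAsFactors = FALSE)
crossed <- unlist(
  Map(function(x, y) paste(x, y), combos$dv, combos$iv),
  use.names = FALSE
)

print(paired)
print(crossed)
```

With two outcomes and two grouping variables, paired has two entries and crossed has four, which is the shape you need if every outcome should be summarised under every grouping.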

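I don't know if this is the whole source of the problem, but I think there are a couple of mistakes in your second lapply code. In the by argument, as.formula() is wrapped in an extra ~ and the pasted string has no leading ~ of its own, so no valid formula gets built. The whole right-hand side, including group1, should be pasted into one string that starts with "~" and passed to as.formula() once. I also think the first formula argument should use x rather than k; otherwise you are taking the mean of the grouping variable instead of the outcome. Finally, the combined data frame will contain both an iv1 and an iv2 column, so pivot_longer() should pivot only the outcome columns. Here's my suggested modification:

lapply(dv_variables, function(x) 
    lapply(iv_variables, function(k) 
        # outcome formula, e.g. ~dv1; grouping formula, e.g. ~iv1 + group1
        svyby(as.formula(paste0("~", x)),
             by = as.formula(paste0("~", k, " + group1")), 
             design = dem_design, 
             FUN = svymean,
             keep.names = FALSE, vartype = "ci"))) %>% 
bind_rows() %>%  
# pivot only the outcome columns; group1, iv1, iv2, ci_l and ci_u stay as they are
pivot_longer(all_of(dv_variables), names_to = "dv", values_to = "mean", values_drop_na = TRUE)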
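Since there is no reproducible example to test against, here is a rough self-contained sketch of the same idea on the api data set that ships with survey. Everything specific in it is an assumption for illustration: api00/api99 stand in for dv1/dv2, sch.wide/comp.imp for iv1/iv2, stype for group1, and the design (ids = ~1 with weights = ~pw) is just something simple that runs. It also stacks the results in long form with a shared iv_level column instead of separate iv1 and iv2 columns, which avoids NA-filled columns after binding.

```r
library(survey)

data(api)  # example data from the survey package, standing in for your data

# Assumed design for illustration only; use your own design here
design <- svydesign(ids = ~1, weights = ~pw, data = apistrat)

dv_variables <- c("api00", "api99")        # stand-ins for dv1, dv2
iv_variables <- c("sch.wide", "comp.imp")  # stand-ins for iv1, iv2
group_var    <- "stype"                    # stand-in for group1

# Every outcome/grouping combination, one row per svyby() call
combos <- expand.grid(dv = dv_variables, iv = iv_variables,
                      stringsAsFactors = FALSE)

results <- Map(function(dv, iv) {
  out <- svyby(as.formula(paste0("~", dv)),
               by = as.formula(paste0("~", iv, " + ", group_var)),
               design = design,
               FUN = svymean,
               keep.names = FALSE, vartype = "ci")
  # Rename to common column names so all results can be stacked
  data.frame(dv       = dv,
             iv       = iv,
             iv_level = as.character(out[[iv]]),
             group    = as.character(out[[group_var]]),
             mean     = out[[dv]],
             ci_l     = out$ci_l,
             ci_u     = out$ci_u)
}, combos$dv, combos$iv)

df <- do.call(rbind, results)
rownames(df) <- NULL
head(df)
```

Each row of df is then one group mean with its confidence interval, labelled by which outcome and which grouping variable it came from.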

I hope this helps in some way! If not and you're able to provide an example of your data, maybe I can help figure it out.