How to keep variables in the model output of lm() or glmmTMB() that are not used as factors in the function call itself in R?

33 Views Asked by Annick At 27 June 2023 at 11:50

This question has been asked before here: How to keep a variable in fit$model for lm() in R that I'm *not* using within the lm call itself?

But I'm looking for a more general answer, as in my case the input data frame has many variables that I want to retain without using them as factors in the model, and also the variable values are not unique (and so cannot be used as row names).

So my example data might look like this:

# country and Solvent_ref are repeated so that every column has 9 rows
df <- data.frame(a = c(1,2,3,4,5,6,1,2,3), 
                 b = c("A", "B", "C","A", "B", "C","A", "B", "C" ), 
                 country = rep(c("Malawi", "Malawi", "UK"), times = 3),
                 Solvent_ref = rep(c("DMSO", "DMSO", "H2O"), times = 3)
)

(Note that all cases with a given value of b have the same values in the variables I want to "carry over" but not use in the model, i.e. country and Solvent_ref in the example above.)

If I then run

library(glmmTMB)
library(emmeans)

model = glmmTMB(a~b, data = df)

emmean_df = as.data.frame(emmeans(model,
                                  type = "response",
                                  specs = ~ b))

the resulting emmean_df has lost the variables country and Solvent_ref:

> emmean_df
  b   emmean        SE df    lower.CL upper.CL
1 A 1.999998 0.8164967  5 -0.09887403 4.098869
2 B 2.999995 0.8164967  5  0.90112308 5.098866
3 C 4.000004 0.8164967  5  1.90113222 6.098876

The output I'd like to see would be:

  b   emmean        SE df    lower.CL upper.CL country Solvent_ref
1 A 1.999998 0.8164967  5 -0.09887403 4.098869  Malawi        DMSO
2 B 2.999995 0.8164967  5  0.90112308 5.098866  Malawi        DMSO
3 C 4.000004 0.8164967  5  1.90113222 6.098876      UK         H2O

One solution I can see would be to use a left_join to re-annotate the emmean data that comes out of the model with the 'lost variables', but is there a way to "carry them over" from the original data frame instead?

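library(dplyr)

# One row per level of b, holding the variables that the model dropped
df_summary = df %>%
  group_by(b) %>%
  summarise(
    country = unique(country),
    Solvent_ref = unique(Solvent_ref)
  )

# Re-attach the dropped variables to the emmeans output
emmean_df = emmean_df %>%
  left_join(df_summary, by = "b")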
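With many variables to retain, the join does not need each one spelled out. The following is only a minimal base-R sketch of that idea, not a built-in feature of glmmTMB or emmeans: it takes every column that does not appear in the model formula and builds a one-row-per-level lookup table from it. The names `model_formula`, `carry_vars`, `lookup` and `annotated` are made up for illustration. The `aggregate()` call is a stand-in for the emmeans output so that the block runs without extra packages, and the example data assume that country and Solvent_ref are repeated to nine rows.

```r
# Example data (assumption: annotation columns repeated to match 9 rows)
df <- data.frame(
  a = c(1, 2, 3, 4, 5, 6, 1, 2, 3),
  b = rep(c("A", "B", "C"), times = 3),
  country = rep(c("Malawi", "Malawi", "UK"), times = 3),
  Solvent_ref = rep(c("DMSO", "DMSO", "H2O"), times = 3)
)

# Stand-in for as.data.frame(emmeans(model, specs = ~ b)):
# group means of a per level of b (hypothetical substitute, not emmeans itself)
emmean_df <- aggregate(a ~ b, data = df, FUN = mean)
names(emmean_df)[names(emmean_df) == "a"] <- "emmean"

# Columns used by the model, taken from the formula
model_formula <- a ~ b
model_vars <- all.vars(model_formula)

# Everything else in df is a variable to carry over
carry_vars <- setdiff(names(df), model_vars)

# One row per level of b with its carried-over values
lookup <- unique(df[, c("b", carry_vars)])

# Guard: each level of b must map to exactly one set of carried values
stopifnot(anyDuplicated(lookup$b) == 0)

# Left join of the model summary with the lookup table
annotated <- merge(emmean_df, lookup, by = "b", all.x = TRUE)
print(annotated)
```

The `stopifnot()` guard matters because the approach relies on every case with a given value of b sharing the same carried-over values. If that assumption fails, the join would duplicate rows instead of annotating them.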
