Loop a function that returns a list over variables in a dataframe grouped by another variable

Question

238 Views Asked by jaydoc At 18 May 2025 at 08:07

Here is an example dataset:

example.df <- data.frame(
  species = sample(c("primate", "non-primate"), 50, replace = TRUE),
  treated = sample(c("Yes", "No"), 50, replace = TRUE),
  gender = sample(c("male", "female"), 50, replace = TRUE),
  var1 = rnorm(50, 100, 5), var2 = rnorm(50, 10, 5), var3 = rnorm(50, 25, 5))

I am trying to use pairw.kw() from the asbio package to calculate Dunn test p-values after grouping by a variable.

by(example.df,example.df$treated, function(X) pairw.kw(X$var1, X$species, conf = 0.95))

returns a valid result.

How can I modify this code (or use some other approach) to loop over the other numeric variables? I have 23 of them in my actual dataset.

Edit: I used the following code to solve my problem, based on the excellent answer from @jay.sf below.

library(dplyr)  # select_if(), %>%
library(purrr)  # map_depth()
vars <- colnames(select_if(example.df, is.numeric))
res <- by(example.df, example.df$treated, simplify = FALSE, function(X)
  sapply(vars, simplify = FALSE, USE.NAMES = TRUE, function(i)
    pairw.kw(X[[i]], X$species, conf = 0.95)))
res_summary <- res %>% map_depth(2, "summary")
res_summary.df <- do.call(rbind, lapply(sapply(res_summary, `[`, simplify = FALSE, USE.NAMES = TRUE), data.frame))

This extracts the summary element, which is the only part of res I need, and converts it into a data frame that is easy to work with.
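
The flattening step can also be sketched in base R so that the grouping level and the variable name end up as explicit columns. This is only an illustration under stated assumptions: res_summary_toy is a hypothetical nested list (group, then variable, then a one-row data frame) shaped like what map_depth(2, "summary") would plausibly return, and its two columns and numbers are placeholders, not real pairw.kw() output.

```r
# Toy stand-in for res_summary: group -> variable -> one-row data frame.
# All values are placeholders chosen for illustration only.
res_summary_toy <- list(
  No  = list(var1 = data.frame(Diff = 1.5,  P = 0.40),
             var2 = data.frame(Diff = -0.5, P = 0.80)),
  Yes = list(var1 = data.frame(Diff = 2.0,  P = 0.10),
             var2 = data.frame(Diff = 0.25, P = 0.90))
)

# Bind the per-variable rows within each group, then bind the groups,
# recording the group and variable names as ordinary columns.
flat <- do.call(rbind, lapply(names(res_summary_toy), function(g) {
  per_var <- res_summary_toy[[g]]
  do.call(rbind, lapply(names(per_var), function(v)
    cbind(treated = g, variable = v, per_var[[v]])))
}))
rownames(flat) <- NULL
flat
```

Keeping the identifiers as columns rather than row names may make later filtering or plotting simpler, since one row then corresponds to one group and variable combination.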

**jay.sf** · Accepted Answer

You could just build in an sapply() that loops through the variables. First, we need a character vector that contains the names of the numeric variables.

(vars <- names(example.df)[4:6])
# [1] "var1" "var2" "var3"

Now we put that inside the by() call:

library("asbio")
res <- by(example.df, example.df$treated, function(X) sapply(vars, function(i)
  pairw.kw(X[[i]], X$species, conf = 0.95)))

Finally, you can run str(res) to see what is in the result and how to access it.

For example:

> res$Yes[[4]]
                                        Diff    Lower   Upper Decision Adj. P-value
Avg.ranknon-primate-Avg.rankprimate -0.19444 -5.55705 5.16817   FTR H0     0.943345
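
Positional indexing such as res$Yes[[4]] can be hard to read, because sapply() may simplify the per-variable results into a matrix-like list. A minimal sketch of one alternative keeps every result as a named list element by passing simplify = FALSE. It assumes stats::kruskal.test() as a stand-in for pairw.kw() so that it runs without asbio installed, and the seed is added only for reproducibility; neither is part of the original answer.

```r
set.seed(1)  # assumption: seed added only to make the sketch reproducible
example.df <- data.frame(
  species = sample(c("primate", "non-primate"), 50, replace = TRUE),
  treated = sample(c("Yes", "No"), 50, replace = TRUE),
  var1 = rnorm(50, 100, 5),
  var2 = rnorm(50, 10, 5),
  var3 = rnorm(50, 25, 5)
)
vars <- c("var1", "var2", "var3")

# kruskal.test() is a stand-in here; swap asbio::pairw.kw() back in for real use
res <- by(example.df, example.df$treated, function(X)
  sapply(vars, function(i) kruskal.test(X[[i]], factor(X$species)),
         simplify = FALSE, USE.NAMES = TRUE))

# Each result is now reachable by group name and variable name
res$Yes$var2$p.value
```

With pairw.kw() in place of the stand-in, the same pattern would presumably give access such as res$Yes$var2$summary, which matches the element the question's edit extracts.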
