I am trying to perform chi-square tests on about 30 variables. I tried to write a for loop, with no luck. The loop should also save the p-value of each test.
I have used this kind of setup before, in other instances, but I recognise that using a vector with the dataframe$variable names does not work on this occasion. I suspect that there is something fundamental I do not understand about the translation from text to variable name.
Example:
survey <- data.frame(
sex = c(1, 2, 2, 1, 1, 2, 1, 1, 2, 1),
health = c(1, 2, 3, 4, 5, 1, 3, 2, 4, 5),
happiness = c(1, 3, 4, 5, 1, 2, 4, 2, 3, 5)
)
variables <- c("survey$health", "data$happiness")
nLoops <- length(variables)
result <- matrix(nrow = nLoops, ncol = 2)
for (i in 1:nLoops){
test <- chisq.test(variables[i], survey$sex)
result[, 1] <- test$data.name
result[, 2] <- test$p.value
}
A base R solution is to change your for loop to an `lapply` call. The warning messages are caused by your small sample size.
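As a minimal sketch of what such an `lapply` call might look like (an illustration, not necessarily the exact code originally posted): it assumes the column names are stored as plain strings such as `"health"` rather than `"survey$health"`, and looks each column up with `survey[[v]]`. The names `tests` and `result` are chosen here for illustration.

```r
survey <- data.frame(
  sex = c(1, 2, 2, 1, 1, 2, 1, 1, 2, 1),
  health = c(1, 2, 3, 4, 5, 1, 3, 2, 4, 5),
  happiness = c(1, 3, 4, 5, 1, 2, 4, 2, 3, 5)
)

# Store bare column names; a string like "survey$health" is just text to R
variables <- c("health", "happiness")

# One chi-square test per column; [[ ]] fetches a column by its name as a string
tests <- lapply(variables, function(v) chisq.test(survey[[v]], survey$sex))

# Collect the p-values into one row per variable
result <- data.frame(
  variable = variables,
  p.value = vapply(tests, function(tst) tst$p.value, numeric(1))
)
print(result)
```

The key point is that `chisq.test("survey$health", survey$sex)` passes a length-one character vector, not the column, whereas `survey[["health"]]` returns the column itself.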
A `tidyverse` solution is, I believe, more robust, as it is independent of the names of the columns you wish to analyse. It can easily be generalised to be robust with respect to your grouping variable as well. Results are identical to the above.
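One possible shape for a `tidyverse` version is sketched below. It is an assumption about the approach rather than the code originally posted: it reshapes every column except `sex` into long form with `tidyr::pivot_longer()`, then runs one test per variable inside `dplyr::summarise()`. It assumes reasonably recent `dplyr` and `tidyr` versions are installed; `result_tidy` is a name chosen here for illustration.

```r
library(dplyr)
library(tidyr)

survey <- data.frame(
  sex = c(1, 2, 2, 1, 1, 2, 1, 1, 2, 1),
  health = c(1, 2, 3, 4, 5, 1, 3, 2, 4, 5),
  happiness = c(1, 3, 4, 5, 1, 2, 4, 2, 3, 5)
)

result_tidy <- survey %>%
  # Stack all columns except the grouping variable into name/value pairs
  pivot_longer(-sex, names_to = "variable", values_to = "value") %>%
  group_by(variable) %>%
  # Within each group, value and sex are the rows for that one variable
  summarise(p.value = chisq.test(value, sex)$p.value, .groups = "drop")

print(result_tidy)
```

Because the columns are selected with `-sex` rather than by name, adding more survey columns requires no change to the code.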