I am trying to supply a vector that contains multiple column names to a mutate() call using the dplyr package. Reproducible example below:
library(dplyr)
stackdf <- data.frame(jack = c(1,NA,2,NA,3,NA,4,NA,5,NA),
                      jill = c(1,2,NA,3,4,NA,5,6,NA,7),
                      jane = c(1,2,3,4,5,6,NA,NA,NA,NA))
two_names <- c('jack','jill')
one_name <- c('jack')
#   jack jill jane
#      1    1    1
#     NA    2    2
#      2   NA    3
#     NA    3    4
#      3    4    5
#     NA   NA    6
#      4    5   NA
#     NA    6   NA
#      5   NA   NA
#     NA    7   NA
I have figured out how to use the "one variable" versions, but I do not know how to extend this to multiple variables.
# the below works as expected, and is an example of the output I desire
stackdf %>% rowwise %>% mutate(test = anyNA(c(jack,jill)))
# A tibble: 10 x 4
    jack  jill  jane test
   <dbl> <dbl> <dbl> <lgl>
 1     1     1     1 FALSE
 2    NA     2     2 TRUE
 3     2    NA     3 TRUE
 4    NA     3     4 TRUE
 5     3     4     5 FALSE
 6    NA    NA     6 TRUE
 7     4     5    NA FALSE
 8    NA     6    NA TRUE
 9     5    NA    NA TRUE
10    NA     7    NA TRUE
# using the one_name variable works if I evaluate it and then convert to
# a name before unquoting it
stackdf %>% rowwise %>% mutate(test = anyNA(!!as.name(eval(one_name))))
# A tibble: 10 x 4
    jack  jill  jane test
   <dbl> <dbl> <dbl> <lgl>
 1     1     1     1 FALSE
 2    NA     2     2 TRUE
 3     2    NA     3 FALSE
 4    NA     3     4 TRUE
 5     3     4     5 FALSE
 6    NA    NA     6 TRUE
 7     4     5    NA FALSE
 8    NA     6    NA TRUE
 9     5    NA    NA FALSE
10    NA     7    NA TRUE
How can I extend the above approach so that I can use the two_names vector? as.name only takes a single object, so it does not work here.
This question here is similar: Pass a vector of variable names to arrange() in dplyr. That solution "works" in that I can use the below code:
two_names2 <- quos(c(jack, jill))
stackdf %>% rowwise %>% mutate(test = anyNA(!!!two_names2))
But it defeats the purpose if I have to type c(jack, jill) directly rather than using the two_names variable. Is there some similar procedure where I can use two_names directly? This answer, How to pass a named vector to dplyr::select using quosures?, uses rlang::syms. Although this works for selecting variables (i.e. stackdf %>% select(!!! rlang::syms(two_names))), it does not seem to work for supplying arguments when mutating (i.e. stackdf %>% rowwise %>% mutate(test = anyNA(!!! rlang::syms(two_names)))). This answer is similar but does not work: How to evaluate a constructed string with non-standard evaluation using dplyr?
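There are two keys to solving this question: using column names stored in a character vector with dplyr, and the way arguments are supplied to the function used inside mutate, here anyNA.
The goal here is to replicate the call mutate(test = anyNA(c(jack,jill))) from the question, but using the named variable two_names instead of manually typing out c(jack,jill).
1. Using dynamic variables with dplyr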
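A minimal sketch of how as.name, rlang::sym and rlang::syms treat a character vector of column names. It assumes the rlang package is installed; the object names first_symbol, one_symbol and all_symbols are only illustrative and do not come from the original answer.

```r
library(rlang)

two_names <- c("jack", "jill")

# as.name() silently keeps only the first element of the vector.
first_symbol <- as.name(two_names)

# sym() expects a single string, so it is given one element here.
one_symbol <- sym(two_names[1])

# syms() accepts the whole vector and returns a list of symbols.
all_symbols <- syms(two_names)

print(first_symbol)
print(all_symbols)
```

Only syms gives one symbol per column name, which is why it is the candidate for splicing with !!! inside a dplyr verb.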
Using quo/quos: These do not accept strings as input. The solution using this method would be the one already shown in the question: define two_names2 <- quos(c(jack, jill)) and call mutate(test = anyNA(!!!two_names2)). Note that quo takes a single argument and is therefore unquoted using !!, whereas for multiple arguments you can use quos and !!! respectively. This is not desirable because I do not use two_names and instead have to type out the columns I wish to use.
Using as.name or rlang::sym/rlang::syms: as.name and sym take only a single input, whereas syms takes multiple inputs and returns a list of symbols as output. Note that as.name ignores everything after the first element. However, syms appears to work appropriately here, so now we need to use this within the mutate call.
2. Using dynamic variables within mutate using anyNA or other variables
Using syms and anyNA directly, as in mutate(test = anyNA(!!! rlang::syms(two_names))), does not actually produce the correct result. Inspection of the test column shows that this only takes the first element into account and ignores the second. However, if I use a different function, e.g. sum or paste0, it is clear that both elements are being used.
The reason for this becomes clear when you look at the arguments for anyNA vs sum. anyNA expects a single object x, whereas sum can take a variable list of objects (...).
Simply supplying c() fixes this problem (see answer from alistaire), because the spliced symbols are combined into a single vector before anyNA sees them.
Alternatively, for educational purposes, one could use a combination of sapply, any, and anyNA to produce the correct result. Here we use list so that the results are provided as a single list object.
Supplying list fixes this problem because it binds all the results into a single object. Why these two approaches behave differently makes sense once their behavior is compared.
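The points above can be sketched as follows. This is one possible reconstruction rather than the answer's original code: it assumes dplyr and rlang are installed, and the object names first_only, both, via_sapply and reference are hypothetical. The sapply variant in particular is a guess at the form the answer describes.

```r
library(dplyr)
library(rlang)

stackdf <- data.frame(
  jack = c(1, NA, 2, NA, 3, NA, 4, NA, 5, NA),
  jill = c(1, 2, NA, 3, 4, NA, 5, 6, NA, 7),
  jane = c(1, 2, 3, 4, 5, 6, NA, NA, NA, NA)
)
two_names <- c("jack", "jill")

# Splicing straight into anyNA() builds anyNA(jack, jill),
# so only jack fills the argument x that is actually checked.
first_only <- stackdf %>%
  rowwise() %>%
  mutate(test = anyNA(!!!syms(two_names)))

# Splicing into c() first builds anyNA(c(jack, jill)),
# which checks both columns for each row.
both <- stackdf %>%
  rowwise() %>%
  mutate(test = anyNA(c(!!!syms(two_names))))

# Assumed form of the sapply/any/anyNA alternative: collect the
# row's values in a list, test each one, then combine with any().
via_sapply <- stackdf %>%
  rowwise() %>%
  mutate(test = any(sapply(list(!!!syms(two_names)), anyNA)))

# Reference: the hand-typed call from the question.
reference <- stackdf %>%
  rowwise() %>%
  mutate(test = anyNA(c(jack, jill)))

# Compare each dynamic version with the hand-typed reference.
print(identical(both$test, reference$test))
print(identical(via_sapply$test, reference$test))
print(identical(first_only$test, reference$test))
```

Wrapping the spliced symbols in c() is the smaller change; the list plus sapply form is mainly useful for seeing that the issue is how many arguments anyNA receives, not the unquoting itself.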