Referential transparency in dplyr::filter: making column name variable

Core question (what it seems to boil down to)

How do I construct a call to rlang::quo with the "left" instead of the "right" side of the expression being referentially transparent?

Taken from the help page of rlang::quo, this works:

quo(foo(!! quo(bar)))
# <quosure: global>
# ~foo(~bar)

while this doesn't:

quo(!! quo(foo)(bar))
# Error in (function (x)  : attempt to apply non-function

Question put into a bit more context

dplyr::mutate allows "both sides of an expression" to be variable in the sense that both expression parts can be made referentially transparent (see vignette):

library(dplyr)
set.seed(89234)
df <- data.frame(id = rep(1:3, 3), value = rpois(9, 10))

c_id <- as.name("id")
c_value <- as.name("value")
# NOTE: in our prototyping, actual column names are often subject to
# change (e.g. `id` might become `id_global`), thus I would like to stay
# as flexible as possible in all of my subsequent `dplyr` calls.

my_multiply <- function(x, by) x * by

df %>% mutate(!!c_value := my_multiply(!!c_value, 10))
#   id value
# 1  1    70
# 2  2    90
# 3  3   130
# 4  1    80
# 5  2    80
# 6  3   120
# 7  1   140
# 8  2   120
# 9  3   110

How can I achieve the same, or something similar, in dplyr::filter? The focus is on making the column name (the "left side") referentially transparent/flexible.

I would like to ideally end up with something like this (pseudo code):

v_id <- 1
df %>% filter(!!c_id :== v_id)

What I tried

I'm aware that dplyr::filter differs from dplyr::mutate in the type of expressions it expects. So, based on the vignette, I came up with this version, where the entire expression to be evaluated is passed as an argument:

my_filter <- function(x, expr) {
  quo_expr <- enquo(expr)
  print(quo_expr)
  x %>% filter(!!quo_expr)
}
v_id <- 1
my_filter(df, id == v_id)
# <quosure: global>
# ~id == v_id
#   id value
# 1  1     7
# 2  1     8
# 3  1    14

However, that "forces" me to use the actual column name, while I would like to use the reference c_id:

my_filter(df, c_id == v_id)
# <quosure: global>
# ~c_id == v_id
# [1] id    value
# <0 rows> (or 0-length row.names)

I'm basically at a loss as to how to construct a call to dplyr::quo or dplyr::enquo where the left part contains the evaluated reference to the column name while the right part contains the non-evaluated reference to the value used in the logical query:

my_filter <- function(x, left, right) {
  quo_expr <- quo(quo(!!left) == right)
  print(quo_expr)
  x %>% filter(!!quo_expr)
}
my_filter(df, c_id, v_id)

# <quosure: frame>
# ~quo(id) == right
# [1] id    value
# <0 rows> (or 0-length row.names)

Put another way, I think the quosure should end up being ~id == right, and I don't know how to do that.

There are 1 best solutions below

Rappster On

With some help from another community I was able to piece it together and arrive at a much simpler solution:

df %>% filter((!! c_id) == v_id)
#   id value
# 1  1     7
# 2  1     8
# 3  1    14

So it simply comes down to wrapping the call to !! with parentheses!
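
The same parenthesised form should also work inside a wrapper like the my_filter(x, left, right) attempt from the question. The following is only a sketch under a few assumptions: the data frame values are copied from the output shown in the question rather than regenerated with rpois, and right is unquoted as well so that its value is inlined and cannot clash with a column of the same name.

```r
library(dplyr)

# Values copied from the output printed in the question (assumption: same data)
df <- data.frame(id = rep(1:3, 3), value = c(7, 9, 13, 8, 8, 12, 14, 12, 11))

# Sketch of the wrapper from the question: `left` is a symbol naming the
# column, `right` is the value to compare against.
my_filter <- function(x, left, right) {
  # Parentheses make !! apply to `left` only, not to the whole comparison
  x %>% filter((!!left) == !!right)
}

c_id <- as.name("id")
v_id <- 1
res <- my_filter(df, c_id, v_id)
# res keeps the three rows with id 1 (values 7, 8, 14)
```

If the column is later renamed (e.g. to id_global), only the definition of c_id has to change.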

Using !! without parentheses does not work because ! has low operator precedence, so it captures everything to its right. There is no error; the filter simply returns no rows:

df %>% filter(!! c_id == v_id)
# [1] id    value
# <0 rows> (or 0-length row.names)

What's going on behind the scenes is that == binds more tightly than !, so !! c_id == v_id is parsed as !(!(c_id == v_id)): the !! applies to the whole comparison rather than to c_id alone. That comparison involves the symbol stored in c_id, not the id column, so it comes out FALSE. Thus, we're actually running df %>% filter(FALSE), which is clearly not what we wanted ;-)
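
The grouping can be checked without dplyr at all. This small base R sketch only inspects how the parser builds the two spellings; the variable names are made up for illustration, and nothing is evaluated against a data frame.

```r
# Base R only: quote() returns the parsed call without evaluating it
unparenthesised <- quote(!! c_id == v_id)
parenthesised   <- quote((!! c_id) == v_id)

# Outermost function of each call
top_unparenthesised <- as.character(unparenthesised[[1]])  # "!"
top_parenthesised   <- as.character(parenthesised[[1]])    # "=="

# Peeling off both ! operators leaves the complete comparison
inner <- unparenthesised[[2]][[2]]
identical(inner, quote(c_id == v_id))  # TRUE
```

In the first form the comparison sits underneath both ! operators, while in the second form == is the outermost call and !! only wraps c_id.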