Passing a class (column) name as a string to process imbalanced data (ROSE::ROSE)


I am trying to pass a column name as a string to an R function that should take the class name, apply the ROSE method to balance classes and return the balanced data frame. I have tried variations of rlang functions, but nothing has worked yet. I would appreciate it if anyone could guide me in resolving this error.

Thanks :)

process_ROSE_imbalanced_data <- function(class_var,train_input){
    class_var <- rlang::enquo(class_var)
    rose_model <- ROSE(!!class_var ~., data=train_input)$data
    return(rose_model)
}

set.seed(922)

data_for_ml_analysis <- wider_ml_data %>%
    mutate(id = row_number()) %>%
    na.omit() %>%
    dplyr::mutate(cancer_class = if_else(cancer_class == 0, 'Healthy','Cancer')) %>%
    dplyr::mutate(cancer_class = as.factor(cancer_class)) 



train_frame <- data_for_ml_analysis %>%
    sample_frac(.80) %>%
    as.data.frame()

balanced_train <- process_ROSE_imbalanced_data('cancer_class', train_frame)

## Error in eval(predvars, data, env) : object 'class_var' not found

table(train_frame$cancer_class)

##  Cancer Healthy
##     101     195

I_O On BEST ANSWER

How about composing the model formula from string fragments (one of which is the dependent variable, class_var):

process_ROSE_imbalanced_data <- function(class_var, train_input){
  ROSE(sprintf('%s ~ .', class_var) |> as.formula(), ## derive formula from string
       data = train_input
       )$data
}
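
The original attempt likely fails because `!!` is only unquoted by functions that support rlang's tidy evaluation, and ROSE() appears to rely on base R formula handling instead. As a variation on the answer above, here is a minimal sketch that builds the same formula with base R's reformulate() and checks the column name first. The helper name build_class_formula and the toy data frame are made up for illustration, and the wrapper assumes the ROSE package is installed.

```r
# Hypothetical helper: build `class_var ~ .` from a column name given as a string.
build_class_formula <- function(class_var, train_input) {
  # Fail early with a readable message instead of an error from deep inside ROSE.
  stopifnot(
    is.character(class_var),
    length(class_var) == 1,
    class_var %in% names(train_input)
  )
  # reformulate() is base R (stats); "." stands for all remaining columns.
  reformulate(".", response = class_var)
}

process_ROSE_imbalanced_data <- function(class_var, train_input) {
  # Assumes the ROSE package is installed; as in the answer above, the
  # balanced data frame is taken from the `data` element of the result.
  ROSE::ROSE(build_class_formula(class_var, train_input), data = train_input)$data
}

# Formula check on a toy data frame (column names are made up for illustration).
toy <- data.frame(
  cancer_class = factor(c("Healthy", "Cancer")),
  marker = c(1.2, 3.4)
)
print(build_class_formula("cancer_class", toy))
```

With the question's data, the call would be process_ROSE_imbalanced_data('cancer_class', train_frame). The up-front check is a design choice: a misspelled column name then stops with a clear message rather than an error raised inside the model-frame code.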