Rpart and Tidymodels Integration

I have code that fits a Poisson decision tree with the rpart package, and I am trying to replicate it with tidymodels.

Code Using rpart:

library(rpart)

# Predictor columns used in every model below
features <- c('is_renewed', 'gender', 'age_group', 'cluster_k3', 'insured_amount')

dt_freq <- rpart(
  formula = as.formula(paste('cbind(exposure, n_claims) ~', paste(features, collapse = ' + '))),
  data    = dat_train,
  method  = "poisson",
  control = list(minsplit = 16, maxdepth = 3, cp = 0.00025))

Code Using Tidymodels:

library(tidymodels)

dt_spec <-
  decision_tree(
    cost_complexity = 0.00025,
    tree_depth = 3,
    min_n = 16) |>
  set_engine("rpart", method = "poisson") |>
  set_mode("regression")

fit_dt <-
  parsnip::fit.model_spec(
    object = dt_spec,
    formula = as.formula(paste('cbind(exposure, n_claims) ~', paste(features, collapse = ' + '))),
    data    = dat_train)

Both versions produce the same fitted tree and exactly the expected results. However, the hyperparameter values used above were found beforehand. When I try to reproduce the tuning process with tidymodels, I get errors related to the formula specification.

Code using tidymodels to tune the decision tree:

tree_spec <- decision_tree(
  cost_complexity = tune(),
  tree_depth = tune(),
  min_n = tune()) %>%
  set_engine("rpart", method = "poisson") %>%
  set_mode("regression")

tree_wf <- workflow() %>%
  add_model(tree_spec) %>%
  add_formula(as.formula(paste('cbind(exposure, n_claims) ~', paste(features, collapse = ' + '))))

tree_grid <- grid_latin_hypercube(x = extract_parameter_set_dials(x = tree_spec), size = 50)

doParallel::registerDoParallel()

set.seed(345)
tree_rs <-
  tree_wf %>%
  tune_grid(
    resamples = folds,
    grid = tree_grid,
    metrics = metric_set(yardstick::rmse)
  )

Errors:

Warning message: All models failed. Run show_notes(.Last.tune.result) for more information.

"Error in str2lang(x): <text>:1:32: unexpected symbol\n1: cbind(cbind(exposure, n_claims).exposure\n ^"
