Error when performing HPO with Hyperband or MBO via the helper function auto_tuner(), but not with tune(), in mlr3


I have a dataset with 100 features that I want to analyze with mlr3, using XGBoost as the learner and Hyperband or MBO as the tuner. However, I run into an error when using the helper function auto_tuner(), but not when using tune().

learner <-
  po("encode", method = "treatment", affect_columns = selector_type("factor")) %>>%
  po("scale") %>>%
  po("learner",
      lrn("classif.xgboost", predict_type = "prob",
          nrounds           = to_tune(p_int(1, 5000, tags = "budget")),
          eta               = to_tune(1e-4, 1, logscale = TRUE),
          max_depth         = to_tune(1, 20),
          colsample_bytree  = to_tune(1e-1, 1),
          colsample_bylevel = to_tune(1e-1, 1),
          lambda            = to_tune(1e-3, 1e3, logscale = TRUE),
          alpha             = to_tune(1e-3, 1e3, logscale = TRUE),
          subsample         = to_tune(1e-1, 1)))

at <- auto_tuner(                                                   
  tuner = tnr("hyperband", eta = 2),
  learner = learner,
  resampling = rsmp("cv", folds = 2),
  measure = msr("classif.auc"),
  terminator = trm("none"),
  store_models = TRUE,
  evaluate_default = TRUE)

at$train(task_US)
Error in .__OptimInstance__eval_batch(self = self, private = private,  : 
  Assertion on 'colnames(xdt)' failed: Names must include the elements {'classif.xgboost.nrounds','classif.xgboost.eta','classif.xgboost.max_depth','classif.xgboost.colsample_bytree','classif.xgboost.colsample_bylevel','classif.xgboost.lambda','classif.xgboost.alpha','classif.xgboost.subsample'}, but is missing elements {'classif.xgboost.nrounds'}.

When I use tune() instead, the error doesn't come up. I've also tried with

tuner = tnr("mbo")

and the same error appears.
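For reference, the equivalent call through tune() that runs without the error looks roughly like this (a sketch; learner and task_US are the objects from the snippet above, and argument names such as measures may differ slightly between mlr3tuning versions):

```r
library(mlr3verse)

# Direct tuning via tune() instead of auto_tuner(); note there is no
# evaluate_default argument here, which is why this path does not trigger
# the error shown above.
instance <- tune(
  tuner      = tnr("hyperband", eta = 2),
  task       = task_US,
  learner    = learner,
  resampling = rsmp("cv", folds = 2),
  measures   = msr("classif.auc"),
  terminator = trm("none")
)

# Best hyperparameter configuration found
instance$result
```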

Answer by Sebastian

Thanks a lot for bringing this to our attention; this is simply a bug! The culprit is the line evaluate_default = TRUE. If you remove it (or set it to FALSE), everything works.

I have created an issue for you here: https://github.com/mlr-org/mlr3tuning/issues/406
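If you still want the performance of the default configuration while the bug is open, one possible workaround (a sketch; it rebuilds the pipeline without tuning tokens, and nrounds = 100 is an arbitrary choice, not a value from the question) is to resample the untuned pipeline separately:

```r
library(mlr3verse)

# Same preprocessing pipeline, but with a plain (untuned) XGBoost learner.
# to_tune() tokens must not be present when training a learner directly.
default_graph <-
  po("encode", method = "treatment", affect_columns = selector_type("factor")) %>>%
  po("scale") %>>%
  po("learner", lrn("classif.xgboost", predict_type = "prob", nrounds = 100))

# resample() coerces the Graph to a learner via as_learner()
rr <- resample(tsk("sonar"), default_graph, rsmp("cv", folds = 2))
rr$aggregate(msr("classif.auc"))
```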

library(mlr3verse)
#> Loading required package: mlr3

lgr::get_logger("bbotk")$set_threshold("warn")
lgr::get_logger("mlr3")$set_threshold("warn")

learner <-
  po("encode", method = "treatment", affect_columns = selector_type("factor")) %>>%
  po("scale") %>>%
  po("learner",
      lrn("classif.xgboost", predict_type = "prob",
          nrounds           = to_tune(p_int(1, 20, tags = "budget")),
          eta               = to_tune(1e-4, 1, logscale = TRUE),
          max_depth         = to_tune(1, 20),
          colsample_bytree  = to_tune(1e-1, 1),
          colsample_bylevel = to_tune(1e-1, 1),
          lambda            = to_tune(1e-3, 1e3, logscale = TRUE),
          alpha             = to_tune(1e-3, 1e3, logscale = TRUE),
          subsample         = to_tune(1e-1, 1)))

at <- auto_tuner(
  tuner = tnr("hyperband", eta = 2),
  learner = learner,
  resampling = rsmp("cv", folds = 2),
  measure = msr("classif.auc"),
  term_evals = 2,
  store_models = TRUE,
  evaluate_default = FALSE)

at$train(tsk("sonar"))

Created on 2024-01-25 with reprex v2.0.2