Tune GAMs based on multiple formulas using the mlr3 package


I would like to tune a Generalized Additive Model (GAM) based on several formulas associated with different combinations of k (i.e., dimension of the basis used to represent the smooth term). I am using a grid search to accomplish this. However, I encountered the following error message:

Error: 'TunerGridSearch' does not support param types: 'ParamUty'

How can I test several formulas when tuning a GAM? I'm sorry, I'm new to machine learning models.

Here is a reproducible example using the mlr3 package:

UPDATE:

## Install the learner "classif.gam"
## Useful link: https://github.com/mlr-org/mlr3extralearners
## https://mlr3extralearners.mlr-org.com/reference/mlr_learners_classif.gam.html
remotes::install_github("mlr-org/mlr3extralearners@*release")
mlr3extralearners::install_learners("classif.gam")

## Load mlr3 so that tsk() and rsmp() are available
library(mlr3)

## Task
task_sonar = tsk("sonar")
## summary(task_sonar)

## Search space
search_space <- paradox::ps(formula = paradox::p_uty(c("Class ~ s(V15, k = 1)", "Class ~ s(V15, k = 2)")))

## Learner
learner <- mlr3extralearners::lrn("classif.gam", predict_type = "prob")

## Performance measure
measure <- mlr3::msr("classif.auc")

## Terminator
terminator <- mlr3tuning::trm("none")

## Tuner
tuner <- mlr3tuning::tnr("grid_search")

## Resampling
resampling <- rsmp("cv", folds = 5)
inner_resampling <- rsmp("cv", folds = 5)
outer_resampling <- rsmp("cv", folds = 5)

## Run an automatic tuning process
at = mlr3tuning::auto_tuner(tuner = tuner,
                            learner = learner,
                            resampling = resampling,
                            measure = measure,
                            search_space = search_space,
                            terminator = terminator)

at$train(task_sonar)

DuesserBaest On

Disclaimer: I do not have access to mlr3extralearners, since it is not on CRAN. I used kknn instead because it also has a hyperparameter named k (there it is the number of neighbours, not a basis dimension).

I think the error comes from defining the formula inside the search_space.

To obtain the formula (Class ~ V15), subset the dataset or use a suitable pipe operator. Then define the tuning range for k as a p_int() (see the mlr3 book on this).
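The pipe-operator route can be sketched as follows. This is a minimal sketch, assuming mlr3verse (which attaches mlr3pipelines and paradox) and the kknn package are installed; the names graph and graph_learner are only illustrative. Note that inside a graph learner the hyperparameter id is prefixed with the learner id, so k becomes classif.kknn.k.

```r
library(mlr3verse)

## Full sonar task; the feature selection happens inside the pipeline
task_sonar <- tsk("sonar")

## Keep only V15 as a feature, then fit the learner (Class ~ V15)
graph <- po("select", selector = selector_name("V15")) %>>%
  lrn("classif.kknn", predict_type = "prob")
graph_learner <- as_learner(graph)

## Inside the graph learner, k is addressed as <learner id>.k
search_space <- ps(
  classif.kknn.k = p_int(1, 2)
)
```

graph_learner and this search_space can then be passed to auto_tuner() in place of the plain learner and its search space.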

Code for the subsetting approach:

library(mlr3verse)
library(tidyverse)

## take the data and reduce it to the variables of interest
df <- tsk("sonar")$data()
df_red <- df %>% select("Class", "V15")

## define the task (Class ~ V15)
task_sonar <- as_task_classif(
  df_red,
  target="Class",
  id = "sonar"
)

learner <- lrn("classif.kknn", predict_type="prob")

## create searchspace independently of task
search_space <- ps(
  k=p_int(1,2)
)

## Performance measure
measure <- mlr3::msr("classif.auc")

## Terminator
terminator <- mlr3tuning::trm("none")

## Tuner
tuner <- mlr3tuning::tnr("grid_search")

## Resampling
resampling <- rsmp("cv", folds = 5)
inner_resampling <- rsmp("cv", folds = 5)
outer_resampling <- rsmp("cv", folds = 5)

## Run an automatic tuning process
at = mlr3tuning::auto_tuner(tuner = tuner,
                            learner = learner,
                            resampling = resampling,
                            measure = measure,
                            search_space = search_space,
                            terminator = terminator)

at$train(task_sonar)
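
For the GAM case in the question, where k sits inside the formula itself, one possible workaround is to keep the formulas but present them to the tuner as a categorical parameter. The sketch below assumes the paradox behaviour that p_fct() accepts a named list of arbitrary objects and maps each level name to its value through a generated trafo. The level names k3 and k5 and the values k = 3 and k = 5 are illustrative and not taken from the question; I have not run this against classif.gam.

```r
library(paradox)

## Candidate formulas as real formula objects, named so that the tuner
## only sees the plain factor levels "k3" and "k5" (illustrative values)
formulas <- list(
  k3 = Class ~ s(V15, k = 3),
  k5 = Class ~ s(V15, k = 5)
)

## A factor parameter is supported by grid search, unlike p_uty();
## the trafo generated by p_fct() turns each level name into its formula
search_space <- ps(
  formula = p_fct(levels = formulas)
)

## Inspect the grid that a grid search would enumerate
design <- generate_design_grid(search_space)
xs <- design$transpose()  # one list of transformed values per grid point
```

Passing this search_space to auto_tuner() instead of the p_uty() version should let the grid search step through the listed formulas, under the assumption that the learner's formula parameter accepts a formula object.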