I was reading a tutorial for tidymodels and came across the following code block:
lr_recipe <-
  recipe(children ~ ., data = hotel_other) %>%
  step_date(arrival_date) %>%                           # derive year/month/day-of-week features
  step_holiday(arrival_date, holidays = holidays) %>%   # add holiday indicator columns
  step_rm(arrival_date) %>%                             # drop the raw date column
  step_dummy(all_nominal_predictors()) %>%              # indicator columns for categorical predictors
  step_zv(all_predictors()) %>%                         # remove zero-variance columns
  step_normalize(all_predictors())                      # center and scale every predictor
(Source of the code: https://www.tidymodels.org/start/case-study/#first-model)
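For context, in the linked case study this recipe feeds a penalized (glmnet) logistic regression, which is sensitive to predictor scale. A rough sketch of that pairing, based on my reading of the tutorial rather than a verbatim copy:

library(tidymodels)

# Penalized logistic regression; glmnet fits are scale-sensitive,
# which is the usual motivation for step_normalize() upstream
lr_mod <-
  logistic_reg(penalty = tune(), mixture = 1) %>%
  set_engine("glmnet")

lr_workflow <-
  workflow() %>%
  add_model(lr_mod) %>%
  add_recipe(lr_recipe)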
Basically, the code defines a set of pre-processing operations on the predictors and stores them in a recipe object. My question arises from the following: first, step_dummy(all_nominal_predictors()) converts the categorical predictors into binary indicator (dummy) columns. Then, in a following step, step_normalize(all_predictors()) applies centering and scaling to all predictors, and therefore also to the encoded categorical ones. I am used to training models directly on one-hot encoded categorical predictors, without passing them through a further normalizing step.
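To make the two steps concrete, here is a minimal sketch on a made-up toy data frame (the names toy, x1, and color are hypothetical, not from the tutorial):

library(recipes)

# Hypothetical toy data: one numeric and one categorical predictor
toy <- data.frame(
  y     = c(1.2, 3.4, 2.1, 4.8, 0.7, 2.9),
  x1    = c(10, 20, 15, 30, 12, 25),
  color = factor(c("red", "blue", "red", "green", "blue", "green"))
)

rec <- recipe(y ~ ., data = toy) %>%
  step_dummy(all_nominal_predictors()) %>%   # color -> color_green, color_red (0/1 columns)
  step_normalize(all_predictors())           # center/scale every column, dummies included

baked <- rec %>% prep() %>% bake(new_data = NULL)
baked
# The former 0/1 indicator columns now each take two standardized
# values (one negative, one positive) with mean 0 and sd 1,
# instead of 0 and 1.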
What is the advantage of normalizing the dummy-encoded predictors? And how does it affect the interpretability of the model when making predictions?
Thanks for any clarification.
If the binary variables are the only predictors, the set of predictors is already standardized (they are all on the same 0/1 units/scale), so there is no need to do anything else.
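A quick illustration of that point, with toy 0/1 columns made up for demonstration:

# Three hypothetical 0/1 indicator predictors
X <- data.frame(
  d1 = c(1, 0, 1, 0, 1, 0),
  d2 = c(0, 0, 1, 1, 0, 1),
  d3 = c(1, 1, 0, 0, 0, 1)
)

# Every column already spans the same 0-1 range, i.e. the same units,
# so a scale-sensitive model treats them comparably without rescaling
sapply(X, range)
#>      d1 d2 d3
#> [1,]  0  0  0
#> [2,]  1  1  1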