Transformation of data during cross validation

I am using an H2OGeneralizedLinearEstimator in h2o.ai.

I am planning to use the built-in cross validation option to get cross-validated performance. Before fitting the model, I perform some transformations (mainly scaling and translating) whose parameters depend on the data they are applied to.

Ideally these transformations should be "trained" only on the train set and applied as is to the test data. In principle, the same should happen during cross validation: at each cross validation step, the transformation should be trained on that step's train data and then applied to that step's test data.

Is it possible to do so in H2O, without having to manually implement a cross validation loop?

Thanks

Erin LeDell

If you're using the H2O GLM, you don't need to scale the data yourself, because the GLM can do that automatically when standardize is set to True (which is the default). If there are other transformations you need to apply for some reason, then you'd want to set up a manual CV loop, but hopefully you can just use the built-in scaling.
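
If you do end up needing a manual CV loop, a minimal sketch of the idea in plain Python might look like the following. It is not H2O code: the helper names (fit_scaler, apply_scaler, fold_indices, cross_validate) and the fit_and_score callback are hypothetical, made up for illustration. The point is only that the scaling parameters are learned from each training fold and then reused unchanged on the held-out fold.

```python
import random
import statistics


def fit_scaler(rows):
    """Learn per-column mean and standard deviation from the given rows."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 for constant columns so the division below is safe.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def apply_scaler(rows, scaler):
    """Apply previously learned means/stds without re-estimating them."""
    means, stds = scaler
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def fold_indices(n_rows, n_folds, seed=0):
    """Shuffle row indices and deal them into n_folds disjoint folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def cross_validate(X, y, n_folds, fit_and_score, seed=0):
    """Run CV, fitting the scaler on each training fold only.

    fit_and_score is a hypothetical callback: it receives the scaled
    training and held-out data and returns a score for that fold.
    """
    scores = []
    for holdout in fold_indices(len(X), n_folds, seed):
        held = set(holdout)
        train = [i for i in range(len(X)) if i not in held]
        # The transformation is "trained" on the training fold only...
        scaler = fit_scaler([X[i] for i in train])
        X_train = apply_scaler([X[i] for i in train], scaler)
        # ...and applied as is to the held-out fold.
        X_test = apply_scaler([X[i] for i in holdout], scaler)
        scores.append(
            fit_and_score(
                X_train, [y[i] for i in train],
                X_test, [y[i] for i in holdout],
            )
        )
    return scores
```

In practice, fit_and_score is where you would train the model on the transformed training fold and score it on the transformed held-out fold, then average the returned scores yourself.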