Retrieve patsy's levels and encoding of categorical variables when transforming data to a design matrix


When a formula contains categorical variables, patsy needs the full original dataset to rebuild the category levels and encoding.

After data is transformed to a design matrix, is there a way to retrieve patsy's levels and encoding for that data? I would like to avoid keeping the full dataset around just so that patsy can rebuild the category levels and encoding.

The context is that I'm transforming training data to a design matrix with patsy during model training, and then would like to know the level/encoding to get a model prediction without having to keep the original training data around.
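For illustration, here is a minimal sketch of one possible approach: keep the `design_info` attribute of the training design matrix instead of the training data. It records the category levels and the column layout, and `patsy.build_design_matrices` can reuse it on new rows. The formula, the column names `color` and `x`, and the toy data are assumptions made up for this example, and it relies on `DesignInfo.factor_infos`, which I believe requires patsy 0.4.0 or later. Whether `design_info` can be serialized to disk is a separate question and is not shown here.

```python
import numpy as np
from patsy import dmatrix, build_design_matrices

# Hypothetical training data (a plain dict of lists; patsy accepts dict-like data).
train = {
    "color": ["red", "green", "blue", "green"],
    "x": [1.0, 2.0, 3.0, 4.0],
}

# Build the training design matrix; the formula is only an example.
X_train = dmatrix("C(color) + x", train)

# design_info holds what patsy learned from the training data.
design_info = X_train.design_info

# Category levels for each categorical factor, keyed by the factor's code.
levels = {
    factor.name(): list(info.categories)
    for factor, info in design_info.factor_infos.items()
    if info.type == "categorical"
}
print(levels)
print(design_info.column_names)

# Encode new rows with the same levels, without the training data.
new = {"color": ["blue"], "x": [5.0]}
(X_new,) = build_design_matrices([design_info], new)
X_new_array = np.asarray(X_new)
print(X_new_array)
```

After this, `train` can be discarded as long as `design_info` stays available in the same process.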
