I wanted to ask how CatBoost handles missing categorical features. I am predicting clicks on websites, and I see a very strong bias on samples with missing features or feature values that were not encountered in the training set: the average target on the training set is 0.0004, but on unseen values the predictions average 0.002, which is roughly 5x off. Ideally, trees should either send missing/unseen values down both branches and average the resulting predictions, or target-encode them internally with the overall mean, but neither of these seems to be happening; otherwise there would not be such a strong bias. Is there a parameter in CatBoost that can adjust this?
I tried mapping unseen values to None and changing parameters; the only thing that stabilized predictions was inserting special labels into the training set and mapping unseen test values onto them. But that is a trick that should really live in the library.
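For reference, the workaround looks roughly like this (a minimal sketch; the `__unknown__` sentinel name and the `remap_unseen` helper are my own, not anything from CatBoost):

```python
def remap_unseen(train_values, test_values, sentinel="__unknown__"):
    """Replace test-set categories that never appear in the training
    column (or are missing) with a sentinel label. The same sentinel
    must be deliberately injected into some training rows so the model
    learns an encoding for it."""
    seen = set(train_values)
    return [v if v is not None and v in seen else sentinel
            for v in test_values]

# Inject the sentinel into the training column, then remap the test
# column before calling predict.
train_col = ["google.com", "bing.com", "__unknown__"]
test_col = ["google.com", "unseen.example", None]
print(remap_unseen(train_col, test_col))
# -> ['google.com', '__unknown__', '__unknown__']
```

With this remapping in place, predictions for unseen or missing site values fall back to whatever the model learned for the sentinel rows, instead of extrapolating wildly.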