I'm working on the Kaggle Titanic dataset and using LightGBM's LGBMClassifier to predict whether a given passenger survived. I've built a pipeline that imputes and preprocesses the data, and I'm trying to tune the LightGBM hyperparameters with BayesSearchCV. When I call fit on the BayesSearchCV object, I receive the following error:
"TypeError: '<' not supported between instances of 'Version' and 'tuple'"
I have no idea why I'm getting this error. I can fit the pipeline directly to the data, and it also works with scikit-learn's GridSearchCV, so I don't know whether this is an issue with BayesSearchCV or a mistake on my side. My pipeline and the code that triggers the error are below, with a comment marking where the error occurs.
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer
from lightgbm import LGBMClassifier
from skopt import BayesSearchCV
from skopt.space import Integer

# df is the Titanic data as a pandas DataFrame (loading code not shown)
target = 'survived'
categorical_features = ['sex',
                        #'ticket',
                        #'cabin',
                        'embarked']
numeric_features = ['pclass',
                    'age',
                    'sibsp',
                    'parch',
                    'fare']
train, test = train_test_split(df, test_size=0.20)
numerical_pipe = Pipeline([('imputer', SimpleImputer(strategy='mean'))])
categorical_pipe = Pipeline([('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
                             ('onehot', OneHotEncoder(handle_unknown='ignore'))])
preprocessing = ColumnTransformer(transformers=[
    ('cat', categorical_pipe, categorical_features),
    ('num', numerical_pipe, numeric_features)])
lgb_pipe = Pipeline([
    ('preprocess', preprocessing),
    ('classifier', LGBMClassifier())])
search_space_lgb = {'num_leaves': Integer(1, 500),
                    'max_depth': Integer(1, 500)}
bayes_search_lgb = BayesSearchCV(lgb_pipe,
                                 search_space_lgb)
bs_lgb = bayes_search_lgb.fit(train[numeric_features + categorical_features],
                              train[target])  # ERROR HERE
print(bs_lgb.best_params_)
This is the part of the traceback that I think is most useful for identifying what exactly is wrong:
/Applications/anaconda3/lib/python3.7/site-packages/skopt/space/space.py in rvs(self, n_samples, random_state)
    762
    763         for dim in self.dimensions:
--> 764             if sp_version < (0, 16):
    765                 columns.append(dim.rvs(n_samples=n_samples))
    766             else:
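The failing comparison on line 764 of the traceback mixes two types: the left side is a `Version` object and the right side is a tuple. One plausible reading, which the traceback alone does not confirm, is that `sp_version` is a parsed version object in this environment while the skopt code still compares it with a plain tuple. The sketch below uses a hypothetical stand-in `Version` class (not the real one used by the libraries) to show how Python arrives at this exact message when a class only supports comparison with its own type.

```python
class Version:
    """Hypothetical stand-in for a parsed version object (not the real class)."""

    def __init__(self, text):
        # "1.7.3" -> (1, 7, 3)
        self.release = tuple(int(part) for part in text.split("."))

    def __lt__(self, other):
        # Only Version-to-Version comparison is defined; anything else is refused,
        # which lets Python raise TypeError once the tuple side refuses as well.
        if not isinstance(other, Version):
            return NotImplemented
        return self.release < other.release


def compare_with_tuple(version):
    """Return the error message raised by `version < (0, 16)`, or None."""
    try:
        version < (0, 16)
    except TypeError as exc:
        return str(exc)
    return None


sp_version = Version("1.7.3")  # illustrative version number only
message = compare_with_tuple(sp_version)
print(message)  # '<' not supported between instances of 'Version' and 'tuple'

# Comparing two objects of the same type works as expected.
same_type_ok = sp_version < Version("0.16")
print(same_type_ok)  # False
```

If this reading is right, the error comes from the installed library versions rather than from the pipeline or the search space.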
I found another Stack Overflow question with the same error (a TypeError raised inside the `scikit-optimize` package), but none of the solutions there worked for me.
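Since the error is raised inside an installed package, the versions in the environment are probably relevant to anyone trying to reproduce it. A minimal sketch for listing them follows. It assumes Python 3.8 or newer, where `importlib.metadata` is in the standard library; the traceback above shows Python 3.7, where the `importlib_metadata` backport package offers the same interface. The helper name `installed_version` is made up for this example.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as published on PyPI.
for name in ("scipy", "scikit-learn", "scikit-optimize", "lightgbm"):
    print(name, installed_version(name))
```

Posting this output alongside the traceback makes it easier to tell whether a version mismatch between these packages is involved.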
BayesSearchCV comes from the skopt library. You don't have to import the whole library; you can import just the class with the following statement:
from skopt import BayesSearchCV
Also, you might want to use Shift+Tab (in a Jupyter notebook, this shows the signature and docstring) to double-check the parameter requirements. Most errors happen at the parameter level. Let me know if that solves the issue.
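One parameter-level detail that may be worth checking, separately from the TypeError: when the estimator passed to a search is a scikit-learn Pipeline, hyperparameters of a step are addressed as `<step name>__<parameter>`. In the question the classifier step is named `classifier`, so the search-space keys would normally be `classifier__num_leaves` and `classifier__max_depth`. The sketch below shows the renaming with a hypothetical helper, `prefix_for_step`, and uses plain `(low, high)` tuples as placeholders for skopt's `Integer(low, high)` so that it runs without skopt installed.

```python
def prefix_for_step(step_name, search_space):
    """Prefix each key with '<step_name>__', the pipeline parameter convention."""
    return {f"{step_name}__{param}": dims for param, dims in search_space.items()}


# Plain (low, high) tuples stand in for skopt's Integer(low, high) here.
search_space_lgb = {"num_leaves": (1, 500), "max_depth": (1, 500)}

pipeline_space = prefix_for_step("classifier", search_space_lgb)
print(pipeline_space)
```

This does not address the `Version` versus tuple comparison itself; it only covers a naming issue that could surface once the search gets past that point.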