BayesSearchCV TypeError: '<' not supported between instances of 'Version' and 'tuple'


I'm working on the Kaggle Titanic dataset and trying to use LightGBM's LGBMClassifier to predict whether a given passenger survived. I've created a pipeline that imputes and processes the data, and I'm trying to use BayesSearchCV to optimize the LightGBM hyperparameters. I receive the following error when I use BayesSearchCV:

"TypeError: '<' not supported between instances of 'Version' and 'tuple'"

I have no idea why I'm getting this error: I can fit the pipeline I created to the data, and it works with scikit-learn's GridSearchCV, so I don't know whether this is an issue with BayesSearchCV or with my code. My pipeline and the code that triggers the error are below, with a comment marking where the error occurs.

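from lightgbm import LGBMClassifier
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from skopt import BayesSearchCV
from skopt.space import Integer

# df is assumed to be a DataFrame holding the Titanic data with lowercase column names
target = 'survived'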

categorical_features = ['sex',
                        #'ticket',
                        #'cabin',
                        'embarked']

numeric_features = ['pclass',
                    'age',
                    'sibsp',
                    'parch',
                    'fare']

train, test = train_test_split(df,test_size=0.20)

numerical_pipe = Pipeline([('imputer', SimpleImputer(strategy = 'mean'))])

categorical_pipe = Pipeline([('imputer', SimpleImputer(strategy = 'constant', fill_value = 'missing')),
                             ('onehot', OneHotEncoder(handle_unknown = 'ignore'))])

preprocessing = ColumnTransformer(transformers = [
                ('cat', categorical_pipe, categorical_features),
                ('num', numerical_pipe, numeric_features)])

lgb_pipe = Pipeline([
          ('preprocess', preprocessing),
          ('classifier', LGBMClassifier())])

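# Parameters of a pipeline step are addressed as '<step name>__<parameter>'
search_space_lgb = {'classifier__num_leaves': Integer(2, 500),  # LightGBM requires num_leaves > 1
                    'classifier__max_depth': Integer(1, 500)}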

bayes_search_lgb = BayesSearchCV(lgb_pipe, 
                                 search_space_lgb)

bs_lgb = bayes_search_lgb.fit(train[numeric_features + categorical_features], 
                              train[target]) #ERROR HERE

print(bs_lgb.best_params_)

Here is the part of the traceback that I think helps identify what exactly is wrong:

/Applications/anaconda3/lib/python3.7/site-packages/skopt/space/space.py in rvs(self, n_samples, random_state)

762 
763         for dim in self.dimensions:
--> 764             if sp_version < (0, 16):
765                 columns.append(dim.rvs(n_samples=n_samples))
766             else:

I found another Stack Overflow question with the same error (a TypeError inside the `scikit-optimize` package), but none of the solutions there work for me.
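The failing comparison in the traceback can be reproduced without skopt. The sketch below is only an illustration: the `Version` class is a hypothetical stand-in for whatever version object `sp_version` holds in the installed environment (the error message says it is an instance of a class named `Version`), and the value `1.7` is an assumed example, not taken from the question. It shows that an ordered object which only knows how to compare itself with its own type raises this exact TypeError when compared with a tuple, while comparing like with like works.

```python
from dataclasses import dataclass


@dataclass(order=True, frozen=True)
class Version:
    """Hypothetical stand-in for a parsed version object (not the real class)."""
    major: int
    minor: int


def compare_to_tuple(version):
    """Mimic the check `sp_version < (0, 16)` and return the result or the error text."""
    try:
        return version < (0, 16)
    except TypeError as exc:
        return str(exc)


sp_version = Version(1, 7)  # assumed value, for illustration only

# Version vs. tuple: neither side knows how to order the pair, so Python raises TypeError.
message = compare_to_tuple(sp_version)
print(message)

# Like with like: both comparisons are well defined.
print(sp_version < Version(0, 16))
print((sp_version.major, sp_version.minor) < (0, 16))
```

If this matches what happens in the real environment, the check in space.py was written for a tuple-valued `sp_version` and the installed dependency now supplies an object instead, which would explain why the error appears inside skopt rather than in the pipeline code.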


There are 2 best solutions below

1
Stephan

BayesSearchCV comes from the skopt library. You don't have to import the whole library; you can import just the class:

from skopt import BayesSearchCV

Also, you might want to use Shift+Tab to double-check the parameter requirements. Most errors happen at the parameter level. Let me know if that solves the issue.

0
Will

I solved this by changing lines 763-768 of skopt/space/space.py:

for dim in self.dimensions:

    if sp_version < (0, 16):
        columns.append(dim.rvs(n_samples=n_samples))
    else:
        columns.append(dim.rvs(n_samples=n_samples, random_state=rng))

into

for dim in self.dimensions:

    try:
        columns.append(dim.rvs(n_samples=n_samples, random_state=rng))
    except Exception:
        # Fall back to the call without random_state if the first call fails
        columns.append(dim.rvs(n_samples=n_samples))
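The fallback idea behind this patch can be sketched outside skopt. In the example below, `sample_dimension`, `NewStyleDim` and `OldStyleDim` are hypothetical names invented for illustration; they are not part of skopt. The sketch assumes that an `rvs` method which does not accept `random_state` fails with a TypeError, which is how Python reports an unexpected keyword argument, so only that exception triggers the fallback.

```python
def sample_dimension(dim, n_samples, rng):
    """Call dim.rvs with random_state if it is accepted, otherwise without it."""
    try:
        return dim.rvs(n_samples=n_samples, random_state=rng)
    except TypeError:
        # Python raises TypeError when rvs() has no random_state keyword.
        return dim.rvs(n_samples=n_samples)


class NewStyleDim:
    """Hypothetical dimension whose rvs() accepts random_state."""

    def rvs(self, n_samples, random_state=None):
        return ["new"] * n_samples


class OldStyleDim:
    """Hypothetical dimension whose rvs() predates random_state."""

    def rvs(self, n_samples):
        return ["old"] * n_samples


print(sample_dimension(NewStyleDim(), 2, rng=None))
print(sample_dimension(OldStyleDim(), 3, rng=None))
```

One trade-off of this pattern is that a TypeError raised for an unrelated reason inside `rvs` also triggers the fallback, so it hides such errors on the first call. Editing an installed package is also lost on the next upgrade or reinstall.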