I am trying to build a model with two stages: an unsupervised and a supervised learning step. First, I want to perform dimensionality reduction with kernel PCA (KPCA) and extract the main components. Then I would like to fit an XGBoost classifier on the reduced features to model a target variable. As a first step, I try to determine the optimal hyperparameters for the KPCA using cross-validation. My initial approach was as follows:
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

pipeline = Pipeline([('scaler', StandardScaler()),
                     ('kpca', KernelPCA()),
                     ('xgb', XGBClassifier())])
param_grid = {'kpca__n_components': [3, 4, 6, 8],
              'kpca__kernel': ['linear', 'rbf', 'poly'],
              'kpca__gamma': np.linspace(0.03, 0.05, 10)}
grid_search = GridSearchCV(pipeline, param_grid, cv=5, scoring='roc_auc')
grid_search.fit(X, y)
My data set X consists of several metric variables (company key figures), which are to be standardized in the first step, and a few categorical variables that have already been encoded with pd.get_dummies. The target variable y is a binary indicator for the event whose probability the XGBoost model should later estimate.
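For illustration, X and y are assembled roughly like this (the file name and column names are just placeholders for the real data):

import pandas as pd

df = pd.read_csv('company_figures.csv')                    # placeholder file name
numeric_cols = ['revenue', 'ebit_margin', 'equity_ratio']  # metric key figures (placeholders)
categorical_cols = ['industry', 'legal_form']              # categorical variables (placeholders)

X = pd.concat([df[numeric_cols],
               pd.get_dummies(df[categorical_cols])], axis=1)
y = df['event_flag']                                        # binary target (placeholder)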
Does it make sense to define both procedures in a single pipeline, or should the KPCA be tuned separately from the downstream classifier? And if it should be tuned separately, what is the right way to define the scoring parameter of GridSearchCV, given that the KPCA step produces no predictions that could be scored against y? Does the scoring then have to be customized, as described in this thread:
https://github.com/ageron/handson-ml/issues/629
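For context, this is a minimal sketch of the kind of custom scorer I understand that thread to describe, i.e. scoring the KPCA by its reconstruction error in the original feature space. The scorer name and the use of fit_inverse_transform=True are my own additions, not part of my current code:

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import GridSearchCV

def reconstruction_scorer(estimator, X, y=None):
    # Score the (scaler + KPCA) pipeline by how well it reconstructs X;
    # negated MSE so that GridSearchCV can maximize the score.
    X_back = estimator.inverse_transform(estimator.transform(X))
    return -np.mean((np.asarray(X) - X_back) ** 2)

kpca_pipe = Pipeline([('scaler', StandardScaler()),
                      ('kpca', KernelPCA(fit_inverse_transform=True))])
kpca_grid = {'kpca__n_components': [3, 4, 6, 8],
             'kpca__kernel': ['linear', 'rbf', 'poly'],
             'kpca__gamma': np.linspace(0.03, 0.05, 10)}

kpca_search = GridSearchCV(kpca_pipe, kpca_grid, cv=5, scoring=reconstruction_scorer)
kpca_search.fit(X)  # no y needed for this unsupervised search

Is something along these lines the intended way to do it, or is keeping everything in one pipeline scored with 'roc_auc' preferable?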